obstkel.com logo

14 Redshift Data Types to make you the office hero(2022)

redshift data type

Redshift data types are not a whole lot different from the standard relational database types. 

Relational Databases store data in tables, which are made up of rows and columns. A column is the smallest granularity of logical data storage. Each of these columns have attributes associated with it. A Redshift Data Type, in this context defines the attributes of a column.

There are 4 categories of built-in Redshift data types: Character, Numeric, Datetime and Boolean. Knowing these data types and their attributes is key to writing quality DDL statements in Redshift.

The tables below list the types within each of these categories.

Character Redshift data types

CHAR

The important stuff about CHAR:

  • A CHAR in Redshift is a fixed length character string with a maximum length of 4096 bytes.

  • CHAR, CHARACTER, NCHAR are the same data types in Redshift.

  • You declare a CHAR data type as shown below –

char(10) or character(10) or nchar(10)

Let us look at an example of creating a table in Redshift with the char data type.

Since a char datatype uses up the entire allocated space, use char types for small fields. For larger character fields, use VARCHAR.

CREATE TABLE employees
(
marital_status char(1) default 'U'
);

VARCHAR

The important stuff about VARCHAR:

  • In Redshift, VARCHAR is a variable length character data type string.

  • The default length of VARCHAR is 256.

  • The Redshift VARCHAR max length is 65,535 bytes.

  • VARCHAR, NVARCHAR, TEXT and CHARACTER VARYING are the same data types in Redshift.

  • You declare a VARCHAR data type as shown below. 
varchar(20)or nvarchar(10) or text(10) or character varying (10)

Below is an example of a redshift create table statement with two VARCHAR fields, first name and last name.

CREATE TABLE employees
(
first_name varchar(30),
last_name varchar(30)
);

Redshift Numeric data types

An incorrectly defined Redshift numeric datatype can wreak havoc on performance and throw off your calculations. So, lets focus on the simple basics! 

A Redshift numeric data type is used to store numbers, we all know that. But what kind of numbers?

  • Integers – Also known as whole numbers. Picture you at a grocery store buying apples. One apple, a dozen (12) apples, and so on. You do not buy half an apple!

  • Decimals – These are numbers where a quantity less than one is denoted on the right of a decimal point. Example 2.5, 9.9, 12.5. 
    For instance, assuming you are still at the grocery store, and decide to buy potatoes or onions. You could be picking up a 2.5 lb bag. That’s an example of a decimal.

  • Floating-Point – A floating point number is similar to a decimal, except that the number of digits to the right of the decimal point can vary. Your paycheck for instance could be 950.50, 950.333, 950.0154.

Rule of thumb, if the number of digits to the right of the decimal is constant, use a decimal type. If they vary based on computation, then use a floating-point.

Read the above definitions a couple of times, and let it sink in. 

SMALLINT

The important stuff about SMALLINT:

  • A Redshift smallint can store upto 2 bytes of information.

  • Use a Redshift smallint data type to store whole numbers in the range -32,768 to +32,767.

  • Syntax for a SMALLINT is

smallint or int2

INTEGER

The important stuff about INTEGER:

  • Use the INTEGER data type in Redshift to store whole numbers in the range -2,147,483,648 to +2,147,483,647.

  • Syntax for an INTEGER is –

integer or int or int4

BIGINT

The important stuff about BIGINT:

  • If you need to store really large whole numbers in the range -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807, then use the Redshift BIGINT data type.

  • A Redshift BIGINT can store up to 8 bytes of information.

  • Syntax for a BIGINT is –

bigint or int8

DECIMAL

The important stuff about DECIMAL:

  • DECIMAL uses up to 128 bytes to store numeric data as signed integers with a precision of up to 38 digits.

  • If you need to store numbers with scale and precision, then use the Redshift DECIMAL data type.

  • Precision refers to the sum of the digits to the left and right of the decimal point. The default precision is 18 and the max precision limit is 38.

  • Scale refers to the number of digits to the right of the decimal point. The default scale is 0, and the max scale can be as high as 37.

  • Syntax for a Redshift DECIMAL data type is –

decimal(precision, scale)

Now let’s look at an example on defining a Redshift decimal column.

Example1: Define a Redshift decimal column based on the number 3219.22
  • Precision: Number of digits to the left of the decimal + number of digits to the right of the decimal = 6
  • Scale: Number of digits to the right of the decimal = 2
  • Column definition = decimal (6,2)

REAL

The important stuff about REAL:

  • Use the REAL or FLOAT4 data type to store numbers with up to 6 digits of variable precision.

  • Syntax for a REAL data type is –

real or float4

FLOAT

The important stuff about FLOAT:

  • FLOAT stores numeric data with up to 15 digits of variable precision.

  • Syntax for a Redshift FLOAT data type is –
float or float8 or double precision

DateTime data types

DATE

The important stuff about Redshift DATE data type:

  • The DATE data type uses 4 bytes to store the Calendar date in the default format YYYY-MM-DD.

  • The date range goes from 4713 BC to 294276 AD.

  • Syntax for a DATE data type is as shown below.

date

TIME
STAMP

The important stuff about TIMESTAMP:

  • TIMESTAMP uses 8 bytes to store date and time of day in default format YYYY-MM-DD HH:MI:SS.

  • This type does not include TIME ZONE. 

  • Similar to the DATE data type, the range goes from 4713 BC to 294276 AD.

  • Syntax for a Redshift TIMESTAMP is

timestamp

TIME

The important stuff about TIME:

  • TIME uses 8 bytes to store the time of day without the TIME ZONE.

  • For displaying time in a 24-hour clock format use HH24:MI:SS.

  • If you are displaying time in a 12-hour clock format, then use HH12:MI:SS.

  • Syntax for TIME is –

time

TIMETZ

The important stuff about TIMETZ:

  • TIMETZ uses 8 bytes to store the time of day with the time zone,

  • Syntax for Redshift time of day with time zone is –

timetz

TIME
STAMPTZ

The important stuff about TIMESTAMPTZ:

  • To capture timestamp with the time zone, use TIMESTAMPTZ.

  • TIMESTAMPTZ uses 8 bytes to store data in the format YYYY-MM-DD HH:MI:SS TZ.

  • Syntax for a Redshift timestamp with time zone type is –

timestamptz

Redshift Boolean Data Type

BOOLEAN

The important stuff about Boolean data type:

  • A Redshift Boolean data type is a single byte column used to store true or false values.

  • You can use ‘1’, ‘t’,’ y’, ‘yes’, ‘true’ or ‘TRUE’ to represent a True value in your input.

  • False values can be represented as ‘0’, ‘f’, ‘n’, ‘no’, ‘false’ or ‘FALSE‘ in the input.

  • Unknowns are represented as NULL.

  • The syntax for a Boolean data type in Redshift is –

boolean

Trending questions on Redshift Data Types

What are Redshift column types?

Redshift column types are the same as Redshift data types.

Tables in a database are made up of one or more columns. Each column is intended to store a certain kind of data. Depending on the type of data stored, the column in a table needs to be different.

For example, a NAME column requires a data type of Character, whereas a PRICE column requires a Numeric type. 

A data type in Redshift is the attribute of a single column in a table. In other words, the Redshift data type property used to define the attribute of a Redshift column.

Often, we tend to use column and field interchangeably as well. So don’t let that confuse you.

In summary, if I have 3 columns in a table with FIRST_NAME, LAST_NAME and MIDDLE_NAME, the data types for them (assume VARCHAR) are the same as the Redshift column types.

Table of Contents

Redshift helpful links

Amazon Redshift Documentation

This is the latest version of Amazon Redshift Documentation

Amazon Redshift & Analytics

Another great blog post by Nick Corbett, AWS Professional Services on Agile Analytics with Amazon Redshift

Recent Posts on Redshift

Interested in our services ?

email us at : info@obstkel.com

Copyright 2022 © OBSTKEL LLC. All rights Reserved