14 Redshift Data Types to make you the office hero

girl working on laptop

Redshift data types are not a whole lot different from the standard relational database types. 

Relational Databases store data in tables, which are made up of rows and columns. A column is the smallest granularity of logical data storage. Each of these columns have attributes associated with it. A Redshift Data Type, in this context defines the attributes of a column.

There are 4 categories of built-in data types in Redshift: Character, Numeric, Datetime and Boolean. 

The table below lists the types within each of these categories.

Character Types
CHAR
VARCHAR
Boolean Types
BOOLEAN

Character Data Types in Redshift

CHAR

The important stuff:

  • A character type is a fixed length string and can have a maximum length of 4096 bytes. A byte is the same as a character.
  • CHAR, CHARACTER, BPCHAR and NCHAR essentially mean the same thing in Redshift
  • Syntax for a Char Type is shown below  
char(10) or character(10) or bpchar(10) or nchar(10)

Let us look at an example of creating a table in Redshift with the char data type. Since a char datatype uses up the entire allocated space, use char types for small fields. For larger character fields, use VARCHAR.

CREATE TABLE employees
(
marital_status char(1) default 'U'
);

VARCHAR

The important stuff:

  • A VARCHAR in Redshift is a variable length character data type string with a maximum length of 65,535 bytes.
  • VARCHAR, NVARCHAR, TEXT and CHARACTER VARYING are the same data types in Redshift
  • You declare a VARCHAR data type as shown below 
varchar(20)or nvarchar(10) or text(10) or character varying (10)

Below is an example of a redshift create table statement with two VARCHAR fields, first name and last name.

CREATE TABLE employees
(
first_name varchar(30),
last_name varchar(30)
);

Numeric Data Types in Redshift

SMALLINT

The important stuff:

  • Use a smallint data type to store whole numbers in the range -32,768 to +32,767
  • Syntax for a SMALLINT is
smallint or int2

INTEGER

The important stuff:

  • Use the INTEGER data type in Redshift to store whole numbers in the range -2,147,483,648 to +2,147,483,647
  • Syntax for an INTEGER is 
integer or int or int4

BIGINT

The important stuff:

  • If you need to store really large whole numbers in the range -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807, then use the BIGINT data type
  • Syntax for a BIGINT is
bigint or int8

DECIMAL

The important stuff:

  • If you need to store numbers with scale and precision, then use the DECIMAL data type
  • DECIMAL  uses up to 128 bytes to store numeric data as signed integers with a precision of up to 38 digits
  • Syntax for a decimal is
decimal(precision, scale)

REAL

The important stuff:

  • Use the REAL or FLOAT4 data type to store numbers with up to 6 digits of variable precision
  • Syntax for a decimal is
real or float4

FLOAT

The important stuff:

  • FLOAT stores numeric data with up to 15 digits of variable precision  
  • Syntax for a decimal is
float or float8 or double precision

DateTime Data Types in Redshift

DATE

The important stuff:

  • The DATE data type uses 4 bytes to store the Calendar date in the default format YYYY-MM-DD
  • The date range goes from 4713 BC to 294276 AD
  • Syntax for a DATE data type is as shown below
date

TIMESTAMP

The important stuff:

  • TIMESTAMP uses 8 bytes to store date and time of day in default format YYYY-MM-DD HH:MI:SS
  • This type does not include TIME ZONE 
  • Similar to the DATE data type, the range goes from 4713 BC to 294276 AD
  • Syntax for a Redshift TIMESTAMP is
timestamp

TIME

The important stuff:

  • TIME uses 8 bytes to store the time of day without the TIME ZONE
  • For displaying time in a 24 hour clock format use  HH24:MI:SS
  • If you are displaying time in a 12 hour clock format, then use HH12:MI:SS  
  • Syntax for TIME is
time

TIMETZ

The important stuff:

  • TIMETZ uses 8 bytes to store the time of day with the time zone
  • Syntax for Redshift time of day with time zone is
timetz

TIMESTAMPTZ

The important stuff:

  • To capture timestamp with the time zone, use TIMESTAMPTZ
  • TIMESTAMPTZ uses 8 bytes to store data in the format YYYY-MM-DD HH:MI:SS TZ 
  • Syntax for a Redshift timestamp with time zone type is 
timestamptz

Boolean Data Types in Redshift

BOOLEAN

The important stuff:

  • A Boolean data type is a single byte column used to store true or false values.
  • You can use ‘1’, ‘t’,’ y’, ‘yes’, ‘true’ or ‘TRUE’ to represent  a True value in your input.
  • False values can be represented as ‘0’, ‘f’, ‘n’, ‘no’, ‘false’ or ‘FALSE’ in the input.
  • Unknowns are represented as NULL
  • The syntax for a boolean data type in Redshift is
boolean

Amazon Redshift helpful links

Amazon Redshift Documentation

This is the latest version of Amazon Redshift Documentation

Amazon Redshift & Analytics

Another great blog post by Nick Corbett, AWS Professional Services on Agile Analytics with Amazon Redshift

Service Offerings by Obstkel

Get to know the AWS Cloud Services offered by Obstkel

Building a Data Warehouse

This is an excellent blog post from AWS by Stephen Borg on Building a Data Warehouse using Amazon Redshift

Email us at : info@obstkel.com

Copyright 2020 © OBSTKEL LLC. All rights Reserved.
Scroll to Top