redshift-scripts

Sample scripts and SQL commands for RedShift

General Information

Utility	Reference
RedShift Admin Scripts	https://github.com/awslabs/amazon-redshift-utils/tree/master/src/AdminScripts
RedShift Vacuum Utility	https://github.com/awslabs/amazon-redshift-utils/tree/master/src/AnalyzeVacuumUtility - Only for table data under 700 GB
RedShift Monitoring	https://github.com/awslabs/amazon-redshift-monitoring
S3 manifest File Over SSH	For Connections - https://docs.aws.amazon.com/redshift/latest/dg/copy-parameters-data-source-ssh.html#copy-parameters-data-source-ssh-parameters
RedShift Deep Copy	https://docs.aws.amazon.com/redshift/latest/dg/performing-a-deep-copy.html
RedShift Snapshot Manager	https://github.com/awslabs/amazon-redshift-utils/tree/master/src/SnapshotManager#amazon-redshift-snapshot-manager
Redshift Database Loader	https://github.com/awslabs/aws-lambda-redshift-loader - Load Data from S3 to RedShift using AWS Lambda

WLM

Specification	Limits/Values
Concurrent User Connection	500
Total Concurrent Queries	50
User Defined Queues	8
Default Concurrency per Queue	5
Automatic Compression	ENCODE
DATE DataType	YYYY/MM/DD
Varchar DataType	Comments
Char DataType	Flag(Y\N)
Decimal Datatype	Price

UPDATE and INSERT Operations

**UPSERT (combination of UPDATE and INSERT) not allowed **
Use staging tables instead

Ref - https://docs.aws.amazon.com/redshift/latest/dg/t_updating-inserting-using-staging-tables-.html

Key Distribution Styles

Key Name	Usage	Data Distribution
ALL Distribution Style	Tables that change infrequently	entire table is distributed to every node.
EVEN Distribution Style	Appropriate when a table does not participate in joins	EVEN distribution is the default distribution style.
KEY Distribution Style	Tables that need to be JOIN for further analysis	If you distribute a pair of tables on the joining keys, the leader node collocates the rows on the slices according to the values in the joining columns so that matching values from the common columns are physically stored together.

Ref - https://docs.aws.amazon.com/redshift/latest/dg/c_choosing_dist_sort.html

COPY command only supports

S3
Dynamodb
EMR/EC2 Instance

NOTE When you use the UNLOAD command in Redshift to write data to S3, it automatically creates files using Amazon S3 server-side encryption with AWS-managed encryption keys (SSE-S3).

Encryption-at-rest

CloudHSM	KMS
Hourly based on region	Usage based pricing
Need to setup HA & durability	Highly Available and durable
Single Tenant	Multi-tenant
Customer Managed	AWS Managed
EAL4 + U.S. Gov. NIST FIPS 140-2 Certified	Auditing
Symmetric and Asymmetric Encryption	Symmetric Encryption Only

Symmetric encryption - same keys are used to perform encryption and decryption.

Note Once you enable encryption for a Redshift cluster upon launch, you can cannot then change it to an unencrypted cluster. You’ll have to unload the data and reload the data into a new cluster with your new encryption setting.