Create Amazon Redshift Cluster
data:image/s3,"s3://crabby-images/4ffa3/4ffa3af1fa3134991ea32a3f0050fe762022b4a8" alt=""
Amazon Redshift is a fast, scalable data warehouse that makes it simple and cost-effective to analyze data across your data warehouse and data lake. With a few clicks, we can create an Amazon Redshift cluster in minutes.
In this article we are going to set up & deploy an Amazon Redshift cluster.
We can deploy a new data warehouse in minutes and save time with automated administrative tasks that manage, monitor, and scale the data warehouse. Amazon Redshift can also extend queries to the data lake.
data:image/s3,"s3://crabby-images/65872/65872753727441947699c33ce50f0dfa6198ed5c" alt="Amazon Redshift Clusters"
Before we deploy our first Redshift cluster, let us set up a few prerequisite resources: a Security Group, an IAM Policy & an IAM Role.
First, we create a Security Group that allows access to Redshift on port 5439, restricted to traffic from our VPC.
data:image/s3,"s3://crabby-images/b0fc2/b0fc2ce76a7202d41753fb69b45372c94320e247" alt="Security Group"
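For readers who prefer scripting this step, the same rule can be sketched in Python. This is a minimal sketch, not the method used in the screenshots: it assumes boto3 is installed with AWS credentials configured, and the function names, the rule description, and any CIDR or group ID passed in are hypothetical.

```python
import ipaddress

REDSHIFT_PORT = 5439  # default Redshift port, as used in this article


def redshift_ingress_rule(vpc_cidr: str, port: int = REDSHIFT_PORT) -> dict:
    """Build one IpPermissions entry allowing TCP `port` from `vpc_cidr` only."""
    # Validate the CIDR early so a typo fails here rather than in the API call.
    network = ipaddress.ip_network(vpc_cidr, strict=True)
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [
            {"CidrIp": str(network), "Description": "Redshift access from the VPC"}
        ],
    }


def authorize_redshift_ingress(group_id: str, vpc_cidr: str) -> None:
    """Add the rule to an existing security group (needs boto3 and credentials)."""
    import boto3  # imported lazily so the helper above works without boto3

    ec2 = boto3.client("ec2")
    ec2.authorize_security_group_ingress(
        GroupId=group_id,
        IpPermissions=[redshift_ingress_rule(vpc_cidr)],
    )
```

Calling `authorize_redshift_ingress` with the security group ID and the VPC CIDR range would add the inbound rule; building the rule in a separate function makes it easy to inspect before anything is sent to AWS.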
Next, we will create an IAM Policy that lets Redshift read & write datasets in our S3 buckets.
data:image/s3,"s3://crabby-images/0d8ae/0d8ae83f5b7528a5b7b50bc740c3841c3d43c614" alt="IAM Policy"
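The article does not list the exact permissions in the policy, so the following is only an illustrative sketch of what a read & write S3 policy might look like. The chosen actions, the `Sid` values, the function names, and the bucket name are all assumptions; boto3 and AWS credentials are needed only for the second function.

```python
import json


def s3_read_write_policy(bucket_name: str) -> dict:
    """Build a policy document granting list, read and write on one bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": bucket_arn,
            },
            {
                # Reading and writing apply to the objects inside the bucket.
                "Sid": "ReadWriteObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


def create_s3_policy(policy_name: str, bucket_name: str) -> str:
    """Create the managed policy and return its ARN (needs boto3 and credentials)."""
    import boto3  # imported lazily so the helper above works without boto3

    iam = boto3.client("iam")
    response = iam.create_policy(
        PolicyName=policy_name,
        PolicyDocument=json.dumps(s3_read_write_policy(bucket_name)),
    )
    return response["Policy"]["Arn"]
```

Scoping the policy to a single bucket, rather than all of S3, keeps the access Redshift receives as narrow as the datasets require.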
Next we will create an IAM Role to allow Redshift to access other AWS services on our behalf.
data:image/s3,"s3://crabby-images/c8bb0/c8bb0d2739ca6cf7d1481e5ace6ef0cb4bb5d9ba" alt="IAM Role"
The role's Trust Relationship defines the entities that can assume the role and the access conditions for it. Here the trusted entity is the redshift.amazonaws.com service:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "redshift.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
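As a rough scripted equivalent, the role could be created with this trust policy and the S3 policy attached to it. This is a sketch under assumptions: it needs boto3 and AWS credentials, and the function names, role name, role description, and policy ARN are hypothetical.

```python
import json

# Same trust relationship as shown above: only the Redshift service may assume the role.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "",
            "Effect": "Allow",
            "Principal": {"Service": "redshift.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}


def trusted_services(policy: dict) -> list:
    """Return the service principals that the trust policy allows."""
    return [
        statement["Principal"]["Service"]
        for statement in policy["Statement"]
        if statement["Effect"] == "Allow" and "Service" in statement.get("Principal", {})
    ]


def create_redshift_role(role_name: str, policy_arn: str) -> str:
    """Create the role, attach the policy, return the role ARN (needs boto3)."""
    import boto3  # imported lazily so the helpers above work without boto3

    iam = boto3.client("iam")
    role = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(TRUST_POLICY),
        Description="Allows Redshift to call AWS services on our behalf",
    )
    iam.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn)
    return role["Role"]["Arn"]
```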
Next we will set up Redshift Cluster Subnet Groups & Parameter Groups.
data:image/s3,"s3://crabby-images/a299c/a299c4f7b448753b48846566d275fa4c534ee62c" alt="Redshift Configurations"
Click on the Create subnet group button.
data:image/s3,"s3://crabby-images/c8ca2/c8ca28302b65a579f4dc1f9423b8465a6aeed532" alt="Redshift Cluster Subnet Group"
Add a name & description to the Cluster subnet group.
Next, we will select our VPC & choose the private subnets across the Availability Zones. Finally, click the Create cluster subnet group button.
data:image/s3,"s3://crabby-images/17c2e/17c2e32f27f7ad7dea4e840e8641685e8a6edd46" alt="Create Cluster Subnet Group"
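The subnet group can also be created from code. The sketch below assumes boto3 and AWS credentials; the function names and any group name or subnet IDs passed in are hypothetical.

```python
def subnet_group_params(name: str, description: str, subnet_ids: list) -> dict:
    """Build the arguments for creating a Redshift cluster subnet group."""
    # Drop accidental duplicates and keep a stable order.
    unique_ids = sorted(set(subnet_ids))
    if not unique_ids:
        raise ValueError("at least one subnet ID is required")
    return {
        "ClusterSubnetGroupName": name,
        "Description": description,
        "SubnetIds": unique_ids,
    }


def create_subnet_group(name: str, description: str, subnet_ids: list) -> None:
    """Create the subnet group (needs boto3 and credentials)."""
    import boto3  # imported lazily so the helper above works without boto3

    redshift = boto3.client("redshift")
    redshift.create_cluster_subnet_group(
        **subnet_group_params(name, description, subnet_ids)
    )
```

Passing private subnets from more than one Availability Zone mirrors the selection made in the console.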
Next, from the Configurations menu, click on the Workload management link.
data:image/s3,"s3://crabby-images/4c706/4c7061c1ddd1cdf5071dbc18977c042f65051868" alt="Cluster Subnet Group"
Now let us create a new Parameter group.
data:image/s3,"s3://crabby-images/085a9/085a958ca4bd872e3c3850ca390bcd15c6f244a8" alt="Workload Management Parameter Groups"
Enter the Parameter Group name and a description.
data:image/s3,"s3://crabby-images/efac2/efac2c3ca53af003c25209535e5f1518b8a82edc" alt="Create Parameter Group"
Modify the cluster parameters as needed. For the demo, let's keep them as is.
data:image/s3,"s3://crabby-images/66409/66409465372c399d8aa6e8e4ec955049ab5f72ca" alt="Cluster Parameter Group"
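A parameter group with default values can be sketched the same way. This assumes boto3 and AWS credentials, and it assumes the `redshift-1.0` parameter group family; the function names and group name are hypothetical.

```python
def parameter_group_params(
    name: str, description: str, family: str = "redshift-1.0"
) -> dict:
    """Build the arguments for creating a Redshift cluster parameter group."""
    if not name or not description:
        raise ValueError("name and description are both required")
    return {
        "ParameterGroupName": name,
        "ParameterGroupFamily": family,  # assumed family, adjust if needed
        "Description": description,
    }


def create_parameter_group(name: str, description: str) -> None:
    """Create the parameter group with default parameter values (needs boto3)."""
    import boto3  # imported lazily so the helper above works without boto3

    redshift = boto3.client("redshift")
    redshift.create_cluster_parameter_group(
        **parameter_group_params(name, description)
    )
```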
Next, click on the Clusters menu link. Now we are ready to set up our first Redshift cluster. Click on the Create cluster button.
data:image/s3,"s3://crabby-images/f7ef0/f7ef016de46c10899b9ab1d759bd92b8824756e4" alt="Create Redshift Cluster"
Let's name the cluster & select the Node type as dc2.large. For our demo it is good to start with 1 node in the cluster.
data:image/s3,"s3://crabby-images/7c00f/7c00f69f48e498426f65ef61f1a1d90c6f435022" alt="Cluster Configuration"
Choose a database username & password for the login credentials.
Next, select the IAM role we created earlier from the dropdown menu & associate it with the Redshift cluster.
data:image/s3,"s3://crabby-images/fc3ef/fc3efe8b6eae8477d47bdf98c40f049d84d13bb4" alt="Cluster Permissions"
Let's select our VPC & Security group for the Redshift cluster.
Next, select the Subnet group we created earlier from the dropdown menu & choose one of the Availability Zones.
Let us also disable public access to our Redshift cluster.
data:image/s3,"s3://crabby-images/431a7/431a712a5fdde94b14c5771a9c152cf1ab2bd2a3" alt="Network & Security"
Let's add an initial database to the Redshift cluster. The default Redshift cluster port is 5439.
Next, select the Parameter group we created earlier from the dropdown menu.
data:image/s3,"s3://crabby-images/17aad/17aad818207225a81767af35861ad28a5cc82a34" alt="Database Configurations"
For the demo, let's keep the default Maintenance window settings.
data:image/s3,"s3://crabby-images/f3f3f/f3f3f007fb4837fff05ce7edac928cb2d80f4d1e" alt="Cluster Maintenance"
For the demo, let's keep the default Snapshot retention of 1 day.
data:image/s3,"s3://crabby-images/c5ed1/c5ed12767797b128c8f5d37af82a28a527bb498c" alt="Cluster Backup"
Now that all the configuration is done, let's click on the Create cluster button. After a few minutes, our Redshift cluster will be available.
data:image/s3,"s3://crabby-images/26bc7/26bc7076738afea16cccc1c123212103f6780b8e" alt="Redshift Cluster Available"
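The console choices made above map onto a single API call. The following sketch collects them in one place: a single dc2.large node, port 5439, no public access, and 1 day of snapshot retention. It assumes boto3 and AWS credentials, and every identifier passed in (cluster name, credentials, ARNs, group names) as well as the function names are hypothetical.

```python
def cluster_params(
    *,
    cluster_id: str,
    master_username: str,
    master_password: str,
    db_name: str,
    role_arn: str,
    security_group_id: str,
    subnet_group: str,
    parameter_group: str,
    availability_zone: str = "",
) -> dict:
    """Build create-cluster arguments matching the settings chosen in this article."""
    params = {
        "ClusterIdentifier": cluster_id,
        "ClusterType": "single-node",  # 1 node, so no node count is passed
        "NodeType": "dc2.large",
        "MasterUsername": master_username,
        "MasterUserPassword": master_password,
        "DBName": db_name,
        "Port": 5439,
        "IamRoles": [role_arn],
        "VpcSecurityGroupIds": [security_group_id],
        "ClusterSubnetGroupName": subnet_group,
        "ClusterParameterGroupName": parameter_group,
        "PubliclyAccessible": False,
        "AutomatedSnapshotRetentionPeriod": 1,  # days
    }
    if availability_zone:
        params["AvailabilityZone"] = availability_zone
    return params


def create_cluster_and_wait(params: dict) -> str:
    """Create the cluster, wait until it is available, return its endpoint address."""
    import boto3  # imported lazily so the helper above works without boto3

    redshift = boto3.client("redshift")
    redshift.create_cluster(**params)
    # Block until the cluster status becomes "available".
    redshift.get_waiter("cluster_available").wait(
        ClusterIdentifier=params["ClusterIdentifier"]
    )
    cluster = redshift.describe_clusters(
        ClusterIdentifier=params["ClusterIdentifier"]
    )["Clusters"][0]
    return cluster["Endpoint"]["Address"]
```

In practice the password should come from a secret store or an environment variable rather than being written into the script.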
Verify the configuration & properties of the Redshift cluster. Now let's go ahead and do some data analysis using Redshift. Click on the Query data button, followed by the Query in query editor link.
data:image/s3,"s3://crabby-images/f3cdd/f3cdd13af7a51dc406d255d68b8c38a5ae6d9b94" alt="Redshift Cluster Query Data"
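The query editor is the route taken in this article. As an alternative for scripted access, a query could be run through the Redshift Data API, roughly as sketched below. This assumes boto3, AWS credentials, and permission to use the Data API with temporary credentials for the database user; the function names and all arguments are hypothetical. The sketch handles only statements that return rows and does not page through large results.

```python
import time


def rows_from_records(records: list) -> list:
    """Flatten Data API records such as [[{"stringValue": "a"}]] into plain tuples."""
    rows = []
    for record in records:
        row = []
        for field in record:
            if field.get("isNull"):
                row.append(None)
            else:
                # Each field is a small dict such as {"longValue": 3}.
                row.append(next((v for k, v in field.items() if k != "isNull"), None))
        rows.append(tuple(row))
    return rows


def run_query(cluster_id: str, database: str, db_user: str, sql: str,
              poll_seconds: float = 1.0) -> list:
    """Run one SELECT statement and return its rows (needs boto3 and credentials)."""
    import boto3  # imported lazily so the helper above works without boto3

    client = boto3.client("redshift-data")
    statement_id = client.execute_statement(
        ClusterIdentifier=cluster_id, Database=database, DbUser=db_user, Sql=sql
    )["Id"]
    # The Data API is asynchronous, so poll until the statement finishes.
    while True:
        status = client.describe_statement(Id=statement_id)
        if status["Status"] in ("FINISHED", "FAILED", "ABORTED"):
            break
        time.sleep(poll_seconds)
    if status["Status"] != "FINISHED":
        raise RuntimeError(status.get("Error", status["Status"]))
    result = client.get_statement_result(Id=statement_id)
    return rows_from_records(result["Records"])
```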
In our next article, we will perform data analysis using Redshift & build visualizations on top of it using Metabase.