Redshift complements Amazon's existing Relational Database Service, along with Elastic MapReduce and the NoSQL datastore DynamoDB. According to Amazon CTO Werner Vogels, it takes performance in analyzing voluminous datasets a step further. Vogels says Redshift gains this performance by storing each column sequentially, unlike traditional relational databases, which store each row sequentially. Because similar data ends up stored together, it compresses efficiently, which reduces the amount of I/O needed to return results.
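To make the column-versus-row point concrete, here is a minimal sketch in Python. The toy table, the function names, and the choice of run-length encoding are assumptions made for illustration; nothing here describes Redshift's actual storage format or compression scheme.

```python
from itertools import groupby

# Hypothetical toy table: (order_id, country, status)
rows = [
    (1, "US", "shipped"),
    (2, "US", "shipped"),
    (3, "US", "pending"),
    (4, "DE", "pending"),
    (5, "DE", "pending"),
    (6, "DE", "shipped"),
]

def row_layout(rows):
    """Flatten the table row by row, as a row store would lay it out."""
    return [value for row in rows for value in row]

def column_layout(rows):
    """Flatten the table column by column, as a column store would."""
    return [value for column in zip(*rows) for value in column]

def run_length_encode(values):
    """Collapse runs of equal adjacent values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]

row_runs = run_length_encode(row_layout(rows))
column_runs = run_length_encode(column_layout(rows))

# Fewer runs means less data to store and less I/O to read back.
print(len(row_runs), "runs in row order")
print(len(column_runs), "runs in column order")
```

In row order, adjacent values almost never repeat, so the encoder finds nothing to collapse. In column order, the repeated country and status values sit next to each other and shrink into a handful of runs.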
The other optimizing factor for Redshift is its massively parallel processing (MPP) architecture, which lets it distribute and parallelize queries across multiple low-cost nodes. The nodes themselves are designed specifically for data warehousing workloads. According to Amazon, initial pilots saw speedups ranging from 10x to 150x on a two-billion-row data set.
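The MPP idea can be sketched in a few lines of Python: split the data across workers, let each compute a partial result on its own slice, and have a leader merge the partials. This is only a sketch under stated assumptions. Threads stand in for nodes, round-robin slicing stands in for a real data distribution scheme, and the function names are hypothetical; it does not show how Redshift plans or executes queries.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(rows, node_count):
    """Assign rows to nodes round-robin (a stand-in for a real distribution scheme)."""
    return [rows[i::node_count] for i in range(node_count)]

def partial_sum_and_count(slice_rows):
    """Work done independently on one node: aggregate only the local slice."""
    return sum(slice_rows), len(slice_rows)

def parallel_average(rows, node_count=4):
    """Leader step: fan the query out, then merge the partial results."""
    slices = partition(rows, node_count)
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(partial_sum_and_count, slices))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count  # assumes at least one row

# Hypothetical column of sale amounts.
sales = list(range(1, 1001))
print(parallel_average(sales))
```

An average is a convenient example because it decomposes cleanly: each node only needs to return a sum and a count, so very little data travels back to the leader.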
Amazon Redshift is already certified by partners Jaspersoft and MicroStrategy, and it lets you connect SQL clients or business intelligence tools to a Redshift data warehouse cluster using standard PostgreSQL JDBC or ODBC drivers. With a few clicks in the AWS Management Console, you can launch a Redshift cluster, starting with a few hundred gigabytes of data and scaling to a petabyte or more, for under $1,000 per terabyte per year.
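Because the connection goes through standard PostgreSQL drivers, a client is pointed at the cluster with an ordinary PostgreSQL-style JDBC URL. The small Python helper below only assembles such a URL. The host name and database name are hypothetical placeholders, and the default port of 5439 is an assumption to check against your own cluster settings.

```python
def postgres_jdbc_url(host, database, port=5439):
    """Build a PostgreSQL-style JDBC URL: jdbc:postgresql://host:port/database."""
    return f"jdbc:postgresql://{host}:{port}/{database}"

# Hypothetical endpoint and database name, for illustration only.
url = postgres_jdbc_url("example-cluster.example.com", "analytics")
print(url)
```

The resulting string is what you would paste into the connection settings of a SQL client or BI tool that uses the PostgreSQL JDBC driver.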




Amazon has set another benchmark, and hopefully a new challenge for its competitors, in the big data space. It has launched a fully managed, petabyte-scale data warehouse service in the cloud, named Amazon Redshift.