AWS Data Pipeline is a web service that helps you reliably process and move data between AWS compute and storage services, as well as on-premises data sources, at specified intervals. You create a pipeline definition in the AWS Management Console consisting of a set of data sources, preconditions, destinations, processing steps, and an operational schedule. This definition specifies where the data comes from, what processing to perform on it, and where to store the results.
The pipeline waits for any preconditions to be satisfied and then runs on the defined schedule. AWS Data Pipeline lets you automate the movement and processing of any amount of data using data-driven workflows with built-in dependency checking. You can access it from the AWS Management Console, the command line, or the APIs.
After each scheduled run, the results can be written to AWS services such as Amazon S3, Amazon RDS, Amazon DynamoDB, and Amazon Elastic MapReduce (EMR). It is also integrated with Amazon Simple Notification Service (Amazon SNS), which can notify you of any failures during processing.
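To make the idea of a pipeline definition concrete, here is a minimal sketch of building one programmatically, assuming the shape the boto3 Data Pipeline API expects: `put_pipeline_definition` takes a list of objects, each with an `id`, a `name`, and key/value `fields`. Every id, name, S3 path, and ARN below (e.g. `CopyStep`, `s3://example-bucket/input/`) is a hypothetical placeholder, and a real `CopyActivity` would also need a compute resource to run on.

```python
def make_object(obj_id, name, **fields):
    """Convert keyword fields into the {key, stringValue/refValue} format
    used by pipeline definitions. Values prefixed with '@ref:' become
    references to other pipeline objects."""
    formatted = []
    for key, value in fields.items():
        if isinstance(value, str) and value.startswith("@ref:"):
            formatted.append({"key": key, "refValue": value[len("@ref:"):]})
        else:
            formatted.append({"key": key, "stringValue": str(value)})
    return {"id": obj_id, "name": name, "fields": formatted}

pipeline_objects = [
    # Operational schedule: run once per day.
    make_object("DailySchedule", "DailySchedule",
                type="Schedule", period="1 day",
                startDateTime="2024-01-01T00:00:00"),
    # Data source and destination (hypothetical S3 paths).
    make_object("InputData", "InputData", type="S3DataNode",
                directoryPath="s3://example-bucket/input/",
                schedule="@ref:DailySchedule"),
    make_object("OutputData", "OutputData", type="S3DataNode",
                directoryPath="s3://example-bucket/output/",
                schedule="@ref:DailySchedule"),
    # Processing step: copy input to output on the schedule, and
    # publish to an SNS topic if the run fails.
    make_object("CopyStep", "CopyStep", type="CopyActivity",
                input="@ref:InputData", output="@ref:OutputData",
                schedule="@ref:DailySchedule", onFail="@ref:FailureAlarm"),
    make_object("FailureAlarm", "FailureAlarm", type="SnsAlarm",
                topicArn="arn:aws:sns:us-east-1:123456789012:pipeline-failures",
                subject="Pipeline run failed",
                message="The daily copy activity failed."),
]

# The definition would then be submitted along the lines of:
#   boto3.client("datapipeline").put_pipeline_definition(
#       pipelineId=..., pipelineObjects=pipeline_objects)
```

The `onFail` reference on the activity is how the SNS integration mentioned above is wired in: when the copy step fails, Data Pipeline publishes to the referenced alarm's topic.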
For enterprises and small businesses that want automated, scheduled tasks to process data of varying formats and volumes on a daily basis, this capability is now available with a few clicks.