- Data Pipeline is a web service that makes it easy to schedule regular data movement and data processing activities in the AWS cloud.
- It provides built-in support for the following activities: CopyActivity, HiveActivity, EmrActivity and ShellCommandActivity.
- It provides built-in support for the following preconditions: DynamoDBDataExists, DynamoDBTableExists, S3KeyExists, S3PrefixExists and ShellCommandPrecondition.
- It supports two types of compute resources: AWS Data Pipeline-managed and self-managed.
- It runs and monitors the user's processing activities on a highly reliable, fault-tolerant infrastructure.
- It is designed to facilitate the steps that are common across the majority of data-driven workflows.
- To run activities on on-premises resources, it supplies a Task Runner package that can be installed on on-premises hosts.
- If failures occur in activity logic or data sources, it automatically retries the activity.
- It provides a library of pipeline templates.
- It is inexpensive to use and is billed at a low monthly rate.
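
To make the activity, precondition and resource concepts above more concrete, here is a minimal sketch of a pipeline definition built as a Python dictionary and printed as JSON. It assumes the usual `objects` list layout of a pipeline definition file; the object ids, the bucket name, the shell command and the helper names `build_pipeline_definition` and `unresolved_refs` are all hypothetical, and the result has not been validated against the service.

```python
import json


def build_pipeline_definition(bucket="example-bucket"):
    """Return a pipeline definition as a plain dict.

    All ids, the bucket name and the command are illustrative assumptions.
    """
    return {
        "objects": [
            {
                # Defaults inherited by the other objects; these are the
                # standard default role names, assumed to exist already.
                "id": "Default",
                "name": "Default",
                "role": "DataPipelineDefaultRole",
                "resourceRole": "DataPipelineDefaultResourceRole",
            },
            {
                # Run once a day, starting when the pipeline is activated.
                "id": "DailySchedule",
                "type": "Schedule",
                "period": "1 day",
                "startAt": "FIRST_ACTIVATION_DATE_TIME",
            },
            {
                # Precondition: wait until a marker file exists in S3.
                "id": "InputReady",
                "type": "S3KeyExists",
                "s3Key": f"s3://{bucket}/input/ready.txt",
            },
            {
                # AWS Data Pipeline-managed compute resource.
                "id": "Worker",
                "type": "Ec2Resource",
                "terminateAfter": "1 Hour",
                "schedule": {"ref": "DailySchedule"},
            },
            {
                # The activity itself, retried up to twice on failure.
                "id": "ProcessInput",
                "type": "ShellCommandActivity",
                "command": "echo processing",
                "runsOn": {"ref": "Worker"},
                "precondition": {"ref": "InputReady"},
                "maximumRetries": "2",
                "schedule": {"ref": "DailySchedule"},
            },
        ]
    }


def unresolved_refs(definition):
    """List every {"ref": ...} target that has no matching object id."""
    ids = {obj["id"] for obj in definition["objects"]}
    missing = []
    for obj in definition["objects"]:
        for value in obj.values():
            if isinstance(value, dict) and "ref" in value and value["ref"] not in ids:
                missing.append(value["ref"])
    return missing


if __name__ == "__main__":
    definition = build_pipeline_definition()
    # Local sanity check only; the service performs its own validation.
    assert unresolved_refs(definition) == []
    print(json.dumps(definition, indent=2))
```

The printed JSON could be saved to a file and used as a starting point for a real definition. The `unresolved_refs` check only catches dangling references locally and is not a substitute for the service's own validation.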