20 November 2020

AWS Data Pipeline

  • Data Pipeline is a web service that makes it easy to schedule regular data movement and data processing activities in the AWS cloud.
  • It provides built-in support for activities including CopyActivity, HiveActivity, EmrActivity, and ShellCommandActivity.
  • It provides built-in support for the following preconditions: DynamoDBDataExists, DynamoDBTableExists, S3KeyExists, S3PrefixExists and ShellCommandPrecondition.
  • Types of compute resources: AWS Data Pipeline–managed and self-managed.
  • It runs and monitors your processing activities on highly reliable, fault-tolerant infrastructure.
  • It is designed to facilitate the steps that are common to most data-driven workflows.
  • To run activities on on-premises resources, it supplies a Task Runner package that can be installed on on-premises hosts.
  • If failures occur in activity logic or data sources, it automatically retries the activity.
  • It provides a library of pipeline templates.
  • It is inexpensive to use and is billed at a low monthly rate.
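
To make the pieces above concrete, here is a minimal sketch of how an activity, a precondition, a schedule, and a managed compute resource might fit together in a pipeline definition. It only builds the definition as a Python dict and prints it as JSON; it makes no AWS calls. The object ids (`DailySchedule`, `WorkerInstance`, `InputReady`, `RunCommand`), the bucket path, and the helper names `build_pipeline_definition` and `find_object` are illustrative assumptions, and a real pipeline would likely need more fields (IAM roles, for instance) than are shown here.

```python
import json


def build_pipeline_definition(input_key: str, command: str) -> dict:
    """Build a small pipeline definition as a plain dict (no AWS calls)."""
    return {
        "objects": [
            {
                # Runs the pipeline once a day, starting at first activation.
                "id": "DailySchedule",
                "type": "Schedule",
                "period": "1 day",
                "startAt": "FIRST_ACTIVATION_DATE_TIME",
            },
            {
                # A Data Pipeline-managed compute resource.
                "id": "WorkerInstance",
                "type": "Ec2Resource",
                "terminateAfter": "30 Minutes",
                "schedule": {"ref": "DailySchedule"},
            },
            {
                # Precondition: the activity waits until this S3 key exists.
                "id": "InputReady",
                "type": "S3KeyExists",
                "s3Key": input_key,
            },
            {
                # The activity itself, retried on failure up to 3 times.
                "id": "RunCommand",
                "type": "ShellCommandActivity",
                "command": command,
                "runsOn": {"ref": "WorkerInstance"},
                "precondition": {"ref": "InputReady"},
                "schedule": {"ref": "DailySchedule"},
                "maximumRetries": "3",
            },
        ]
    }


def find_object(definition: dict, object_id: str) -> dict:
    """Look up one pipeline object by its id (hypothetical helper)."""
    return next(o for o in definition["objects"] if o["id"] == object_id)


if __name__ == "__main__":
    # Placeholder bucket path and command, chosen only for illustration.
    definition = build_pipeline_definition(
        "s3://example-bucket/input/data.csv", "echo hello"
    )
    print(json.dumps(definition, indent=2))
```

Objects refer to each other with `{"ref": "<id>"}`, which is how the activity is tied to its schedule, its compute resource, and its precondition. To run on a self-managed host instead, the activity would typically name a worker group that a Task Runner installation polls, rather than using `runsOn`.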
