18 January 2021

AWS Lake Formation

  • Lake Formation is an integrated data lake service that makes it easy for users to ingest, clean, catalog, transform & secure data & make it available for analysis & ML.
  • It can manage data ingestion via AWS Glue.
  • It can ingest data from S3, RDS databases & CloudTrail logs, understand their formats & make data clean and queryable.
  • It encompasses all Glue features and provides additional capabilities designed to help build, secure & manage a data lake.
  • It automatically discovers all AWS data sources that AWS IAM policies allow it to access.
  • Users can define JDBC connections to allow Lake Formation to access their AWS databases and on-premises databases including Oracle, MySQL, PostgreSQL, SQL Server & MariaDB.
  • It provides jobs that run ML algorithms to perform de-duplication and link matching records.
  • It provides a way to import an existing catalog and metastore into the Data Catalog.
  • It currently supports Server-Side Encryption on S3 (SSE-S3, AES-256).
  • It provides APIs and a CLI to integrate Lake Formation functionality into users' custom applications.
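
As a rough illustration of the API integration mentioned in the last point, here is a minimal Python sketch that grants a principal SELECT access on a catalog table through the boto3 `lakeformation` client's `grant_permissions` call. The role ARN, database name, table name and region are hypothetical placeholders, and the helper functions `build_table_grant` and `grant_table_access` are names made up for this example, not part of any AWS SDK.

```python
from typing import Any, Dict, List


def build_table_grant(principal_arn: str, database: str, table: str,
                      permissions: List[str]) -> Dict[str, Any]:
    """Build the keyword arguments for a Lake Formation GrantPermissions call."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {"Table": {"DatabaseName": database, "Name": table}},
        "Permissions": list(permissions),
    }


def grant_table_access(request: Dict[str, Any], region: str = "us-east-1") -> None:
    """Send the grant to Lake Formation (needs boto3 and valid AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    client = boto3.client("lakeformation", region_name=region)
    client.grant_permissions(**request)


# Hypothetical role, database and table names, used only for illustration.
request = build_table_grant(
    "arn:aws:iam::123456789012:role/AnalystRole",
    "sales_db",
    "orders",
    ["SELECT"],
)
```

Calling `grant_table_access(request)` would submit the grant to a real account. Keeping the request builder separate from the network call makes it possible to inspect or log the payload before anything is sent.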
