18 November 2020

AWS-Polly

  • Polly is a service that turns text into lifelike speech.
  • It supports Speech Synthesis Markup Language (SSML) tags like prosody so users can adjust the speech rate, pitch or volume.
  • It is a secure service that operates at high scale and with low latency.
  • Users can cache and replay Amazon Polly’s generated speech at no additional cost.
  • Users can use Polly to power their application with high-quality spoken output.
  • Users can synthesize speech for certain Neural voices using the Newscaster style to make them sound like a TV or radio newscaster.
  • Users can detect when specific words or sentences in the text are being spoken to the user based on the metadata included in the audio stream.
  • It generates speech marks of four types: sentence, word, viseme and SSML.
  • It can be used in announcement systems in public transportation and industrial control systems for notifications and emergency announcements.
  • Applications such as quiz games, animations, avatars and narration generation are common use cases for cloud-based text-to-speech solutions like Polly.
  • Cloud-based text-to-speech (Polly) is platform independent, so it minimizes development time and effort.
  • It supports all the programming languages included in the AWS SDK (Java, Node.js, .NET, PHP, Python, Ruby, Go and C++) and AWS Mobile SDK (iOS/Android).
  • It supports an HTTP API so users can implement their own access layer.
  • It supports MP3, Ogg Vorbis and raw PCM audio stream formats.
  • It is a HIPAA Eligible Service covered under the AWS Business Associate Addendum (AWS BAA).
  • It makes it easy to request an additional stream of metadata with information about when particular sentences, words and sounds are being pronounced.
  • Its pay-per-use model means there are no setup costs. Users can start small and scale up as their application grows.
  • It provides simple API operations that users can easily integrate with their existing applications.
  • It has a Neural TTS (NTTS) system that can produce higher-quality voices than its standard voices. The NTTS system is designed to make text-to-speech voices sound as natural and human-like as possible.
  • Neural voices aren't available in all AWS Regions, nor do they support all Polly features.
  • It provides API operations that users can use to store lexicons in an AWS region.
  • Lexicons give additional control over how Polly pronounces words uncommon to the selected language.
  • The SynthesizeSpeech operation produces audio in near-real time, with relatively little latency in most cases.
  • Polly's Asynchronous Synthesis feature overcomes the challenge of processing a larger text document by changing the way the document is both synthesized and returned.
  • With the Polly plugin for WordPress, users can provide visitors to their WordPress website audio recordings of their content.
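
The SSML, speech mark and SynthesizeSpeech points above can be sketched in Python roughly as follows. This is a minimal illustration, not an official sample: the helper names (build_ssml, parse_speech_marks, synthesize) and the default voice "Joanna" are assumptions chosen for the example, and the synthesize function needs the boto3 package plus configured AWS credentials to run.

```python
import json
from xml.sax.saxutils import escape


def build_ssml(text, rate="medium", pitch="medium", volume="medium"):
    """Wrap plain text in an SSML <prosody> tag, escaping XML special characters."""
    return (
        f'<speak><prosody rate="{rate}" pitch="{pitch}" volume="{volume}">'
        f"{escape(text)}</prosody></speak>"
    )


def parse_speech_marks(payload):
    """Parse speech marks, which arrive as one JSON object per line."""
    return [json.loads(line) for line in payload.splitlines() if line.strip()]


def synthesize(ssml, voice_id="Joanna", output_format="mp3", mark_types=None):
    """Call SynthesizeSpeech and return the raw bytes of the response stream.

    Hypothetical wrapper: requires boto3 and AWS credentials. For speech marks,
    pass output_format="json" and mark_types such as ["word", "sentence"].
    """
    import boto3  # imported lazily so the helpers above work without the SDK

    polly = boto3.client("polly")
    params = {
        "Text": ssml,
        "TextType": "ssml",
        "VoiceId": voice_id,
        "OutputFormat": output_format,
    }
    if mark_types:
        params["SpeechMarkTypes"] = mark_types
    response = polly.synthesize_speech(**params)
    return response["AudioStream"].read()
```

As a usage sketch, synthesize(build_ssml("Hello", rate="slow")) would return MP3 bytes that can be written to a file and cached for replay, while synthesize(ssml, output_format="json", mark_types=["word"]) returns bytes that can be decoded as UTF-8 and passed to parse_speech_marks to get the timing of each word.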
