If you are like me, you’ve decided to study for the AWS Solutions Architect Associate certification. That’s great news! I’m also sure you’ve heard that Amazon changed the test as of August 30, 2022. If you are in my situation, you’ve already started studying for the old test (SAA-C02) and are unsure where to find study guides that will prepare you for the new version (SAA-C03). Looking at the AWS Exam Guide, the biggest change on the test is a shift in focus toward more of AWS’s machine learning offerings. In this article, I’m going to go over each of the Amazon Web Services machine learning offerings found in the exam guide and provide a high-level overview of what they are, to help you prepare.
What is Machine Learning
Machine learning is a flavor of Artificial Intelligence that uses data and algorithms to imitate the way humans learn, gradually improving accuracy. Statistical methods and algorithms are used to train models that are later used to classify data. These classifications are used to uncover insights into a dataset or drive business decisions. A real-world application of machine learning can be as simple as Netflix movie recommendations or as complex as self-driving cars. The basic machine learning lifecycle is as follows:
- Collect: Collect data about the domain that will be fed into the algorithms to train models
- Features and labels: Labels are the outcomes you are trying to predict and can be assigned manually. Features are the input attributes used to make the prediction
- Feature reduction: Leaving out data that has nothing to do with what we are trying to predict
- Encoding: Converting values such as categories or text into a numeric form that algorithms can read
- Formatting: Arranging datasets in the layout and file format that the chosen algorithm expects
- Choose an algorithm: Choose an algorithm suitable to train a model that meets your feature requirements (AWS has a library of algorithms available for the user)
- Train a model: Passing the data through the algorithm
- Test: Verify the newly trained model with datasets that have known outcomes
- Use: Use the trained model to make predictions about new, incoming data
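To make the lifecycle concrete, here is a minimal sketch in plain Python. Everything in it is made up for illustration: the tiny viewing dataset, the `encode`, `train`, `predict`, and `accuracy` helpers, and the choice of a nearest-centroid classifier. It is not how any AWS service works internally; it only walks through the steps listed above.

```python
from collections import defaultdict
from math import dist

# Collect: hypothetical rows of (user_id, hours_watched, genre, label).
RAW = [
    ("u1", 9.0, "action", "recommend"),
    ("u2", 8.0, "action", "recommend"),
    ("u3", 1.0, "drama", "skip"),
    ("u4", 2.0, "drama", "skip"),
    ("u5", 7.5, "drama", "recommend"),
    ("u6", 0.5, "action", "skip"),
]
GENRES = ["action", "drama"]


def encode(hours, genre):
    """Encoding: turn a row into numbers (hours plus a one-hot genre)."""
    return [hours] + [1.0 if genre == g else 0.0 for g in GENRES]


def train(rows):
    """Train: compute the mean feature vector (centroid) for each label."""
    grouped = defaultdict(list)
    # Feature reduction: user_id says nothing about the label, so drop it.
    for _user_id, hours, genre, label in rows:
        grouped[label].append(encode(hours, genre))
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def predict(model, hours, genre):
    """Use: pick the label whose centroid is closest to the new data point."""
    features = encode(hours, genre)
    return min(model, key=lambda label: dist(model[label], features))


def accuracy(model, rows):
    """Test: fraction of rows with known outcomes that are predicted correctly."""
    hits = sum(predict(model, h, g) == label for _uid, h, g, label in rows)
    return hits / len(rows)


model = train(RAW[:4])   # train on the first four rows
held_out = RAW[4:]       # keep two rows with known outcomes for testing
```

Calling `accuracy(model, held_out)` checks the model against the held-out rows, and `predict(model, 8.0, "drama")` scores a new data point.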
AWS Comprehend
AWS Comprehend uses an AWS-trained natural language processing (NLP) model to uncover insights from unstructured text documents, finding meaning and providing context.
The three most common use-cases of AWS Comprehend are:
- Voice of Customer: Voice of the customer analyzes customer feedback from social media, support calls, emails, and other online channels to determine if the overall customer sentiment is positive, negative, or mixed.
- Semantic Search: Semantic search provides context to internal search engine indexes, allowing you to focus searches on intent using context as opposed to basic keywords.
- Knowledge management and discovery: Knowledge management and discovery analyzes a collection of documents and organizes them by topics.
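As a rough sketch of the Voice of Customer use case, the function below tallies the sentiment label returned for each piece of feedback. It assumes `client` is a boto3 Comprehend client (for example `boto3.client("comprehend")`) with credentials already configured; the function name `overall_sentiment` and the sample feedback are hypothetical.

```python
from collections import Counter


def overall_sentiment(client, feedback, language_code="en"):
    """Count the dominant sentiment Comprehend reports for each feedback item.

    `client` is assumed to be a boto3 Comprehend client.
    """
    counts = Counter()
    for text in feedback:
        response = client.detect_sentiment(Text=text, LanguageCode=language_code)
        # Sentiment is one of POSITIVE, NEGATIVE, NEUTRAL, or MIXED.
        counts[response["Sentiment"]] += 1
    return dict(counts)


# Example usage (requires AWS credentials):
#   import boto3
#   overall_sentiment(boto3.client("comprehend"), ["Great service", "Arrived late"])
```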
How to deploy AWS Comprehend?
Unstructured documents are stored in an AWS S3 data lake, and AWS Comprehend is pointed at that source. Insights can be stored in any Amazon Web Services data store, database, or data warehouse, such as Redshift.
Other details
- AWS Comprehend is an AWS-managed service
- The user currently cannot upload or use their own NLP model.
- SSL and AWS IAM are used for security
- AWS Comprehend Medical is available for medical use cases
AWS Forecast
AWS Forecast uses time-series data to predict or forecast future data points. Examples of time-series data are sales data, inventory levels, website visits, etc.
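Once a forecast has been generated, its data points can be read back through the query API. The sketch below assumes `client` is a boto3 `forecastquery` client, that the forecast was created with the 0.5 quantile (returned under the `p50` key), and that the dataset uses the `item_id` dimension. The ARN and item ID you pass in, and the helper name `median_forecast`, are placeholders.

```python
def median_forecast(client, forecast_arn, item_id):
    """Return (timestamp, value) pairs for the median (p50) forecast of one item.

    `client` is assumed to be a boto3 "forecastquery" client.
    """
    response = client.query_forecast(
        ForecastArn=forecast_arn,
        Filters={"item_id": item_id},
    )
    points = response["Forecast"]["Predictions"]["p50"]
    return [(point["Timestamp"], point["Value"]) for point in points]
```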
Use-cases
- Retail demand
- Manufacturing demand
- Revenue
- Inventory
- IT capacity
- Web-traffic
Other details
- AWS Forecast is an AWS-managed service
- Analysis can be retrieved via API calls, exported as CSV, or viewed in the AWS console
AWS Fraud Detector
AWS Fraud Detector uses your company’s historical data along with Amazon Web Services fraud models to train the Fraud Detector’s algorithms to make real-time identity fraud decisions.
This works by uploading your existing fraud case data into AWS S3. In AWS Fraud Detector, a model is selected and pointed at the data to train, test, and deploy a fraud model. The company’s application can then call the AWS Fraud Detector API to get a confidence rating for a real-time transaction. Based on the confidence results, rules can be set up to allow, reject, or send the transaction out for review.
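The API call described above might look roughly like the following. It assumes `client` is a boto3 `frauddetector` client and that a detector with rules and outcomes has already been deployed. The helper name `decide` and the fallback to `"review"` when no rule matches are assumptions made for this sketch, not behavior of the service.

```python
def decide(client, detector_id, event_type, event_id, timestamp, entity, variables):
    """Ask a deployed detector to evaluate one event and return its first outcome.

    `client` is assumed to be a boto3 "frauddetector" client.
    `timestamp` is an ISO 8601 string; `entity` is a dict with
    "entityType" and "entityId"; `variables` maps variable names to strings.
    """
    response = client.get_event_prediction(
        detectorId=detector_id,
        eventId=event_id,
        eventTypeName=event_type,
        eventTimestamp=timestamp,
        entities=[entity],
        eventVariables=variables,
    )
    # Each matched rule contributes the outcomes configured for it.
    outcomes = [
        outcome
        for rule in response.get("ruleResults", [])
        for outcome in rule.get("outcomes", [])
    ]
    # Assumption for this sketch: send the event to review if no rule matched.
    return outcomes[0] if outcomes else "review"
```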
AWS-provided models include
- New account fraud, within an account sign-up process
- Online identity fraud
- Payment fraud for online orders
- Guest checkout fraud
- Loyalty account protection
- Account takeover detection
- Seller fraud in online marketplaces
Other details
- AWS Fraud Detector is an AWS-managed service
- Custom use-cases can be configured
AWS Kendra
AWS Kendra is an enterprise search engine that uses natural language processing to index and search unstructured and semi-structured documents and data. The most common use for Kendra is to index and search AWS S3 and RDS. Kendra also integrates with other Amazon Web Services ML offerings, such as Transcribe and Comprehend, to transcribe, enrich, and index data.
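A natural-language search against an existing index could be sketched as below. It assumes `client` is a boto3 Kendra client and that `index_id` refers to an index that has already been created and synced; the helper name `search` and the `limit` parameter are hypothetical.

```python
def search(client, index_id, question, limit=3):
    """Return (title, excerpt) pairs for the top results of a Kendra query.

    `client` is assumed to be a boto3 Kendra client.
    """
    response = client.query(IndexId=index_id, QueryText=question)
    results = []
    for item in response.get("ResultItems", [])[:limit]:
        title = item.get("DocumentTitle", {}).get("Text", "")
        excerpt = item.get("DocumentExcerpt", {}).get("Text", "")
        results.append((title, excerpt))
    return results
```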
Use-cases
- Legal and compliance documentation and reports
- Research and development
- Customer interactions
Other details
- AWS Kendra is an AWS-managed service
- Custom connectors can be developed to access non-AWS data
- AWS KMS can be used to store keys and encrypt data
AWS Lex
AWS Lex can be thought of as a conversational AI service for building chatbots that integrate with Kik, Facebook Messenger, SMS, Slack, and many more. AWS Lex natively integrates with AWS Lambda, CloudWatch, DynamoDB, and Cognito.
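For a sense of what talking to a bot looks like from application code, here is a minimal sketch. It assumes a Lex V2 bot has already been built and that `client` is a boto3 `lexv2-runtime` client; the bot ID, alias ID, and session ID are values you would supply, and `ask_bot` is a hypothetical helper name.

```python
def ask_bot(client, bot_id, bot_alias_id, session_id, text, locale_id="en_US"):
    """Send one line of user text to a Lex V2 bot and return its replies.

    `client` is assumed to be a boto3 "lexv2-runtime" client.
    """
    response = client.recognize_text(
        botId=bot_id,
        botAliasId=bot_alias_id,
        localeId=locale_id,
        sessionId=session_id,
        text=text,
    )
    # A response may carry zero or more messages for the user.
    return [m["content"] for m in response.get("messages", []) if "content" in m]
```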
Use-cases
- Self-service voice assistants and chatbots
- Informational bot
- Automated customer support agent
- Application/Transactional bot
- Enterprise Productivity bot
- Device Control bot
Other details
- AWS Lex is an AWS-managed service
- Integrates with AWS Polly for text-to-speech interactions
AWS Polly
AWS Polly is a text-to-speech service that delivers life-like voice reproduction of text. It can be integrated using the AWS Polly API: the client sends text to the API, and the API returns an audio stream. Polly allows the user to set up custom lexicons and vocabularies.
Use-cases
- E-learning and education
- Enabling people with reading disabilities to use an application
- Assisting blind and visually impaired people in consuming digital media
- Gaming applications
- Automated telephone services
Other details
- AWS Polly is an AWS-managed service
- Multi-lingual
- Utilizes Speech Synthesis Markup Language (SSML) to alter the emphasis, pitch, or pronunciation of the text
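The text-in, audio-out flow with SSML can be sketched as follows. It assumes `client` is a boto3 Polly client and uses the `Joanna` voice as an example; `build_ssml` and `synthesize` are hypothetical helper names. Support for individual SSML tags such as `<emphasis>` varies by voice engine, so treat the markup as illustrative.

```python
from xml.sax.saxutils import escape


def build_ssml(text, emphasized_word=None):
    """Wrap text in SSML, optionally adding strong emphasis to one word."""
    body = escape(text)  # escape &, <, and > so the markup stays valid
    if emphasized_word:
        word = escape(emphasized_word)
        tagged = f'<emphasis level="strong">{word}</emphasis>'
        body = body.replace(word, tagged, 1)  # tag the first occurrence only
    return f"<speak>{body}</speak>"


def synthesize(client, ssml, voice_id="Joanna"):
    """Send SSML to Polly and return the MP3 bytes from the audio stream.

    `client` is assumed to be a boto3 Polly client.
    """
    response = client.synthesize_speech(
        Text=ssml,
        TextType="ssml",
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    return response["AudioStream"].read()
```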
AWS Rekognition
AWS Rekognition provides image and video analysis. Image analysis can identify objects, recognize faces, extract text, and flag inappropriate content. Video analysis can provide object tracking, person recognition, and object indexing. Rekognition processes images and videos stored in S3 buckets or streamed to the APIs. Rekognition returns the objects it identifies along with confidence scores, which can be used to trigger alarms and events or to route results for human review using AWS Augmented AI (A2I).
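As an illustration of labels and confidence scores, the sketch below asks for the labels detected in an image stored in S3. It assumes `client` is a boto3 Rekognition client; the bucket and key are whatever you pass in, and the 90 percent threshold and the helper name `labels_above` are arbitrary choices for this example.

```python
def labels_above(client, bucket, key, min_confidence=90.0):
    """Return (label, confidence) pairs for an image stored in S3.

    `client` is assumed to be a boto3 Rekognition client. Labels below
    `min_confidence` are filtered out by the service itself.
    """
    response = client.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MinConfidence=min_confidence,
    )
    return [(label["Name"], label["Confidence"]) for label in response["Labels"]]
```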
Use-cases
- Images
  - Searchable Image Library
  - Face-Based User Verification
  - Sentiment Analysis
  - Facial Recognition
  - Image Moderation
- Video
  - Search Index for video archives
  - Easy filtering of video for explicit and suggestive content
Other details
- AWS Rekognition is an AWS-managed service
- Integrates with AWS Lambda and CloudTrail
AWS SageMaker
AWS SageMaker is the framework AWS uses to create, test, train, and deploy machine learning models. It is the backbone that all the other Amazon Web Services machine learning services are built on. Amazon Web Services makes this framework available to customers so they can create and train their own machine learning models. Along with SageMaker comes a suite of SageMaker tools that can enhance and speed up machine learning development.
A few key SageMaker services are
- Studio Lab: A free ML development environment for users to learn and create ML tools
- Data Labeling: Labels and identifies images, videos, and data sets to speed up ML training
- Data Wrangler: Aggregates, prepares, and filters data sets for exploration and visualization
- Autopilot: Automatically chooses, trains, and deploys the best model based on a given dataset.
- Feature Store: The central store for features and feature metadata
- Clarify: Detects potential bias and, paired with Model Monitor, can detect drift of deployed models
AWS Textract
AWS Textract extracts text from scanned documents. The extracted text is given a confidence score, which allows low-confidence data to be routed for human review using AWS Augmented AI (A2I). Textract can extract printed, cursive, or handwritten text in a variety of languages and file formats.
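The idea of using the confidence score to decide what needs a human look can be sketched in a few lines. The function below takes a response shaped like the output of the boto3 Textract `detect_document_text` call and splits the detected lines by a threshold. The 90 percent cutoff and the helper name `split_by_confidence` are assumptions, and a real A2I workflow would be configured in AWS rather than hand-rolled like this.

```python
def split_by_confidence(response, threshold=90.0):
    """Split detected lines into (accepted, needs_review) by confidence score.

    `response` is assumed to look like the result of a boto3 Textract
    client's detect_document_text call.
    """
    accepted, needs_review = [], []
    for block in response["Blocks"]:
        if block["BlockType"] != "LINE":
            continue  # skip PAGE and WORD blocks; only whole lines are routed
        target = accepted if block["Confidence"] >= threshold else needs_review
        target.append(block["Text"])
    return accepted, needs_review


# Example usage (requires AWS credentials):
#   import boto3
#   response = boto3.client("textract").detect_document_text(
#       Document={"S3Object": {"Bucket": "my-bucket", "Name": "scan.png"}})
#   accepted, needs_review = split_by_confidence(response)
```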
Use-cases
- Importing documents and forms into business applications
- Creating smart search indexes
- Building automated document processing workflows
- Maintaining compliance in document archives
- Extracting text for Natural Language Processing (NLP)
- Extracting text for document classification
Other details
- AWS Textract is an AWS-managed service
- Integrates with CloudTrail
AWS Transcribe
AWS Transcribe provides speech-to-text services that transcribe audio files into text. The transcript contains timestamps, making it easy to trace each part of the text back to its place in the source audio.
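To show what those timestamps look like in practice, the sketch below reads a finished transcript file and pulls out each spoken word with its start and end time in seconds. It assumes the standard transcript JSON layout, where `results.items` holds `pronunciation` and `punctuation` entries; `word_timestamps` is a hypothetical helper name.

```python
import json


def word_timestamps(transcript_json):
    """Return (word, start_seconds, end_seconds) for each spoken word."""
    data = json.loads(transcript_json)
    words = []
    for item in data["results"]["items"]:
        if item["type"] != "pronunciation":
            continue  # punctuation items carry no timestamps
        words.append((
            item["alternatives"][0]["content"],
            float(item["start_time"]),
            float(item["end_time"]),
        ))
    return words
```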
The transcribe service forms the basis for two domain-specific offerings
- AWS Transcribe Medical: The ML is trained to recognize medical terms and speech. It is used for clinical documentation, drug safety monitoring, and subtitling for telemedicine.
- AWS Transcribe Call Analytics: NLP is used to create conversation insights to improve customer experience and agent evaluation.
Other details
- AWS Transcribe is an AWS-managed service
- Supports multiple languages
- Custom vocabulary can be configured
AWS Translate
AWS Translate uses neural machine translation to translate text between languages. The neural machine translation is trained to allow for a more accurate and smooth translation than traditional translation applications.
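Calling the service from code is a single request per piece of text. The sketch below assumes `client` is a boto3 Translate client; the English-to-Spanish defaults and the helper name `translate_all` are arbitrary choices for this example.

```python
def translate_all(client, texts, source="en", target="es"):
    """Translate each string in `texts` and return the translations in order.

    `client` is assumed to be a boto3 Translate client.
    """
    translated = []
    for text in texts:
        response = client.translate_text(
            Text=text,
            SourceLanguageCode=source,
            TargetLanguageCode=target,
        )
        translated.append(response["TranslatedText"])
    return translated
```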
Other details
- AWS Translate is an AWS-managed service
Conclusion
Amazon Web Services offers a variety of managed machine learning services, including a framework that allows users to create their own machine learning models. This article attempts to capture a handful of the most popular service offerings that are now part of the AWS Solutions Architect certification (SAA-C03). At the moment it is unknown how in-depth the certification will go into these offerings, but having a broad understanding of what these services are will greatly improve your chances. Best of luck with the certification, and if you are looking for Professional Services to configure or set up any of the Amazon Web Services machine learning offerings, reach out to Ten Mile Square; we are happy to help.
Resources
- https://cloudacademy.com/blog/new-aws-certified-solutions-architect-associate-saa-c03/
- https://docs.aws.amazon.com/whitepapers/latest/aws-overview/machine-learning.html