Exambible offers free demo for AWS-Certified-Machine-Learning-Specialty exam. "AWS Certified Machine Learning - Specialty", also known as AWS-Certified-Machine-Learning-Specialty exam, is a Amazon Certification. This set of posts, Passing the Amazon AWS-Certified-Machine-Learning-Specialty exam, will help you answer those questions. The AWS-Certified-Machine-Learning-Specialty Questions & Answers covers all the knowledge points of the real exam. 100% real Amazon AWS-Certified-Machine-Learning-Specialty exams and revised by experts!

Online Amazon AWS-Certified-Machine-Learning-Specialty free dumps demo Below:

NEW QUESTION 1
A company is converting a large number of unstructured paper receipts into images. The company wants to create a model based on natural language processing (NLP) to find relevant entities such as date, location, and notes, as well as some custom entities such as receipt numbers.
The company is using optical character recognition (OCR) to extract text for data labeling. However, documents are in different structures and formats, and the company is facing challenges with setting up the manual workflows for each document type. Additionally, the company trained a named entity recognition (NER) model for custom entity detection using a small sample size. This model has a very low confidence score and will require retraining with a large dataset.
Which solution for text extraction and entity detection will require the LEAST amount of effort?

  • A. Extract text from receipt images by using Amazon Textrac
  • B. Use the Amazon SageMaker BlazingTextalgorithm to train on the text for entities and custom entities.
  • C. Extract text from receipt images by using a deep learning OCR model from the AWS Marketplac
  • D. Use the NER deep learning model to extract entities.
  • E. Extract text from receipt images by using Amazon Textrac
  • F. Use Amazon Comprehend for entity detection, and use Amazon Comprehend custom entity recognition for custom entity detection.
  • G. Extract text from receipt images by using a deep learning OCR model from the AWS Marketplac
  • H. Use Amazon Comprehend for entity detection, and use Amazon Comprehend custom entity recognition for custom entity detection.

Answer: C

NEW QUESTION 2
A Machine Learning Specialist needs to create a data repository to hold a large amount of time-based training data for a new model. In the source system, new files are added every hour Throughout a single 24-hour period, the volume of hourly updates will change significantly. The Specialist always wants to train on the last 24 hours of the data
Which type of data repository is the MOST cost-effective solution?

  • A. An Amazon EBS-backed Amazon EC2 instance with hourly directories
  • B. An Amazon RDS database with hourly table partitions
  • C. An Amazon S3 data lake with hourly object prefixes
  • D. An Amazon EMR cluster with hourly hive partitions on Amazon EBS volumes

Answer: C

NEW QUESTION 3
A data scientist is working on a public sector project for an urban traffic system. While studying the traffic patterns, it is clear to the data scientist that the traffic behavior at each light is correlated, subject to a small stochastic error term. The data scientist must model the traffic behavior to analyze the traffic patterns and reduce congestion.
How will the data scientist MOST effectively model the problem?

  • A. The data scientist should obtain a correlated equilibrium policy by formulating this problem as a multi-agent reinforcement learning problem.
  • B. The data scientist should obtain the optimal equilibrium policy by formulating this problem as a single-agent reinforcement learning problem.
  • C. Rather than finding an equilibrium policy, the data scientist should obtain accurate predictors of traffic flow by using historical data through a supervised learning approach.
  • D. Rather than finding an equilibrium policy, the data scientist should obtain accurate predictors of traffic flow by using unlabeled simulated data representing the new traffic patterns in the city and applying an unsupervised learning approach.

Answer: D

NEW QUESTION 4
A financial company is trying to detect credit card fraud. The company observed that, on average, 2% of credit card transactions were fraudulent. A data scientist trained a classifier on a year's worth of credit card transactions data. The model needs to identify the fraudulent transactions (positives) from the regular ones (negatives). The company's goal is to accurately capture as many positives as possible.
Which metrics should the data scientist use to optimize the model? (Choose two.)

  • A. Specificity
  • B. False positive rate
  • C. Accuracy
  • D. Area under the precision-recall curve
  • E. True positive rate

Answer: DE

NEW QUESTION 5
A library is developing an automatic book-borrowing system that uses Amazon Rekognition. Images of library members’ faces are stored in an Amazon S3 bucket. When members borrow books, the Amazon Rekognition CompareFaces API operation compares real faces against the stored faces in Amazon S3.
The library needs to improve security by making sure that images are encrypted at rest. Also, when the images are used with Amazon Rekognition. they need to be encrypted in transit. The library also must ensure that the images are not used to improve Amazon Rekognition as a service.
How should a machine learning specialist architect the solution to satisfy these requirements?

  • A. Enable server-side encryption on the S3 bucke
  • B. Submit an AWS Support ticket to opt out of allowing images to be used for improving the service, and follow the process provided by AWS Support.
  • C. Switch to using an Amazon Rekognition collection to store the image
  • D. Use the IndexFaces andSearchFacesByImage API operations instead of the CompareFaces API operation.
  • E. Switch to using the AWS GovCloud (US) Region for Amazon S3 to store images and for Amazon Rekognition to compare face
  • F. Set up a VPN connection and only call the Amazon Rekognition API operations through the VPN.
  • G. Enable client-side encryption on the S3 bucke
  • H. Set up a VPN connection and only call the Amazon Rekognition API operations through the VPN.

Answer: B

NEW QUESTION 6
A company is using Amazon Polly to translate plaintext documents to speech for automated company announcements However company acronyms are being mispronounced in the current documents How should a Machine Learning Specialist address this issue for future documents'?

  • A. Convert current documents to SSML with pronunciation tags
  • B. Create an appropriate pronunciation lexicon.
  • C. Output speech marks to guide in pronunciation
  • D. Use Amazon Lex to preprocess the text files for pronunciation

Answer: A

NEW QUESTION 7
A Machine Learning Specialist observes several performance problems with the training portion of a machine learning solution on Amazon SageMaker The solution uses a large training dataset 2 TB in size and is using the SageMaker k-means algorithm The observed issues include the unacceptable length of time it takes before the training job launches and poor I/O throughput while training the model
What should the Specialist do to address the performance issues with the current solution?

  • A. Use the SageMaker batch transform feature
  • B. Compress the training data into Apache Parquet format.
  • C. Ensure that the input mode for the training job is set to Pipe.
  • D. Copy the training dataset to an Amazon EFS volume mounted on the SageMaker instance.

Answer: B

NEW QUESTION 8
A manufacturer of car engines collects data from cars as they are being driven The data collected includes timestamp, engine temperature, rotations per minute (RPM), and other sensor readings The company wants to predict when an engine is going to have a problem so it can notify drivers in advance to get engine maintenance The engine data is loaded into a data lake for training
Which is the MOST suitable predictive model that can be deployed into production'?

  • A. Add labels over time to indicate which engine faults occur at what time in the future to turn this into a supervised learning problem Use a recurrent neural network (RNN) to train the model to recognize when an engine might need maintenance for a certain fault.
  • B. This data requires an unsupervised learning algorithm Use Amazon SageMaker k-means to cluster the data
  • C. Add labels over time to indicate which engine faults occur at what time in the future to turn this into a supervised learning problem Use a convolutional neural network (CNN) to train the model to recognize when an engine might need maintenance for a certain fault.
  • D. This data is already formulated as a time series Use Amazon SageMaker seq2seq to model the time series.

Answer: B

NEW QUESTION 9
A gaming company has launched an online game where people can start playing for free but they need to pay if they choose to use certain features The company needs to build an automated system to predict whether or not a new user will become a paid user within 1 year The company has gathered a labeled dataset from 1 million users
The training dataset consists of 1.000 positive samples (from users who ended up paying within 1 year) and 999.1 negative samples (from users who did not use any paid features) Each data sample consists of 200 features including user age, device, location, and play patterns
Using this dataset for training, the Data Science team trained a random forest model that converged with over 99% accuracy on the training set However, the prediction results on a test dataset were not satisfactory.
Which of the following approaches should the Data Science team take to mitigate this issue? (Select TWO.)

  • A. Add more deep trees to the random forest to enable the model to learn more features.
  • B. indicate a copy of the samples in the test database in the training dataset
  • C. Generate more positive samples by duplicating the positive samples and adding a small amount of noise to the duplicated data.
  • D. Change the cost function so that false negatives have a higher impact on the cost value than false positives
  • E. Change the cost function so that false positives have a higher impact on the cost value than false negatives

Answer: CD

NEW QUESTION 10
A Machine Learning Specialist needs to move and transform data in preparation for training Some of the data needs to be processed in near-real time and other data can be moved hourly There are existing Amazon EMR MapReduce jobs to clean and feature engineering to perform on the data
Which of the following services can feed data to the MapReduce jobs? (Select TWO )

  • A. AWSDMS
  • B. Amazon Kinesis
  • C. AWS Data Pipeline
  • D. Amazon Athena
  • E. Amazon ES

Answer: BC

Explanation:
https://aws.amazon.com/jp/emr/?whats-new-cards.sort-by=item.additionalFields.postDateTime&whats-new-car

NEW QUESTION 11
A data engineer at a bank is evaluating a new tabular dataset that includes customer data. The data engineer will use the customer data to create a new model to predict customer behavior. After creating a correlation matrix for the variables, the data engineer notices that many of the 100 features are highly correlated with each other.
Which steps should the data engineer take to address this issue? (Choose two.)

  • A. Use a linear-based algorithm to train the model.
  • B. Apply principal component analysis (PCA).
  • C. Remove a portion of highly correlated features from the dataset.
  • D. Apply min-max feature scaling to the dataset.
  • E. Apply one-hot encoding category-based variables.

Answer: BD

NEW QUESTION 12
A Machine Learning Specialist is assigned a TensorFlow project using Amazon SageMaker for training, and needs to continue working for an extended period with no Wi-Fi access.
Which approach should the Specialist use to continue working?

  • A. Install Python 3 and boto3 on their laptop and continue the code development using that environment.
  • B. Download the TensorFlow Docker container used in Amazon SageMaker from GitHub to their local environment, and use the Amazon SageMaker Python SDK to test the code.
  • C. Download TensorFlow from tensorflow.org to emulate the TensorFlow kernel in the SageMaker environment.
  • D. Download the SageMaker notebook to their local environment then install Jupyter Notebooks on their laptop and continue the development in a local notebook.

Answer: D

NEW QUESTION 13
A machine learning specialist is developing a proof of concept for government users whose primary concern is security. The specialist is using Amazon SageMaker to train a convolutional neural network (CNN) model for a photo classifier application. The specialist wants to protect the data so that it cannot be accessed and transferred to a remote host by malicious code accidentally installed on the training container.
Which action will provide the MOST secure protection?

  • A. Remove Amazon S3 access permissions from the SageMaker execution role.
  • B. Encrypt the weights of the CNN model.
  • C. Encrypt the training and validation dataset.
  • D. Enable network isolation for training jobs.

Answer: D

NEW QUESTION 14
A retail company is selling products through a global online marketplace. The company wants to use machine learning (ML) to analyze customer feedback and identify specific areas for improvement. A developer has built a tool that collects customer reviews from the online marketplace and stores them in an Amazon S3 bucket. This process yields a dataset of 40 reviews. A data scientist building the ML models must identify additional sources of data to increase the size of the dataset.
Which data sources should the data scientist use to augment the dataset of reviews? (Choose three.)

  • A. Emails exchanged by customers and the company’s customer service agents
  • B. Social media posts containing the name of the company or its products
  • C. A publicly available collection of news articles
  • D. A publicly available collection of customer reviews
  • E. Product sales revenue figures for the company
  • F. Instruction manuals for the company’s products

Answer: BDF

NEW QUESTION 15
A retail chain has been ingesting purchasing records from its network of 20,000 stores to Amazon S3 using Amazon Kinesis Data Firehose To support training an improved machine learning model, training records will require new but simple transformations, and some attributes will be combined The model needs lo be retrained daily
Given the large number of stores and the legacy data ingestion, which change will require the LEAST amount of development effort?

  • A. Require that the stores to switch to capturing their data locally on AWS Storage Gateway for loading into Amazon S3 then use AWS Glue to do the transformation
  • B. Deploy an Amazon EMR cluster running Apache Spark with the transformation logic, and have the cluster run each day on the accumulating records in Amazon S3, outputting new/transformed records to Amazon S3
  • C. Spin up a fleet of Amazon EC2 instances with the transformation logic, have them transform the data records accumulating on Amazon S3, and output the transformed records to Amazon S3.
  • D. Insert an Amazon Kinesis Data Analytics stream downstream of the Kinesis Data Firehouse stream that transforms raw record attributes into simple transformed values using SQL.

Answer: D

NEW QUESTION 16
A company wants to classify user behavior as either fraudulent or normal. Based on internal research, a Machine Learning Specialist would like to build a binary classifier based on two features: age of account and transaction month. The class distribution for these features is illustrated in the figure provided.
AWS-Certified-Machine-Learning-Specialty dumps exhibit
Based on this information which model would have the HIGHEST accuracy?

  • A. Long short-term memory (LSTM) model with scaled exponential linear unit (SELL))
  • B. Logistic regression
  • C. Support vector machine (SVM) with non-linear kernel
  • D. Single perceptron with tanh activation function

Answer: C

NEW QUESTION 17
A company is running a machine learning prediction service that generates 100 TB of predictions every day A Machine Learning Specialist must generate a visualization of the daily precision-recall curve from the predictions, and forward a read-only version to the Business team.
Which solution requires the LEAST coding effort?

  • A. Run a daily Amazon EMR workflow to generate precision-recall data, and save the results in Amazon S3 Give the Business team read-only access to S3
  • B. Generate daily precision-recall data in Amazon QuickSight, and publish the results in a dashboard shared with the Business team
  • C. Run a daily Amazon EMR workflow to generate precision-recall data, and save the results in Amazon S3 Visualize the arrays in Amazon QuickSight, and publish them in a dashboard shared with the Business team
  • D. Generate daily precision-recall data in Amazon ES, and publish the results in a dashboard shared with the Business team.

Answer: C

NEW QUESTION 18
A company wants to use automatic speech recognition (ASR) to transcribe messages that are less than 60 seconds long from a voicemail-style application. The company requires the correct identification of 200 unique product names, some of which have unique spellings or pronunciations.
The company has 4,000 words of Amazon SageMaker Ground Truth voicemail transcripts it can use to customize the chosen ASR model. The company needs to ensure that everyone can update their customizations multiple times each hour.
Which approach will maximize transcription accuracy during the development phase?

  • A. Use a voice-driven Amazon Lex bot to perform the ASR customizatio
  • B. Create customer slots within the bot that specifically identify each of the required product name
  • C. Use the Amazon Lex synonym mechanism to provide additional variations of each product name as mis-transcriptions are identified in development.
  • D. Use Amazon Transcribe to perform the ASR customizatio
  • E. Analyze the word confidence scores in the transcript, and automatically create or update a custom vocabulary file with any word that has a confidence score below an acceptable threshold valu
  • F. Use this updated custom vocabulary file in all future transcription tasks.
  • G. Create a custom vocabulary file containing each product name with phonetic pronunciations, and use it with Amazon Transcribe to perform the ASR customizatio
  • H. Analyze the transcripts and manually update the custom vocabulary file to include updated or additional entries for those names that are not being correctly identified.
  • I. Use the audio transcripts to create a training dataset and build an Amazon Transcribe custom language mode
  • J. Analyze the transcripts and update the training dataset with a manually corrected version of transcripts where product names are not being transcribed correctl
  • K. Create an updated custom language model.

Answer: A

NEW QUESTION 19
A company uses a long short-term memory (LSTM) model to evaluate the risk factors of a particular energy sector. The model reviews multi-page text documents to analyze each sentence of the text and categorize it as either a potential risk or no risk. The model is not performing well, even though the Data Scientist has experimented with many different network structures and tuned the corresponding hyperparameters.
Which approach will provide the MAXIMUM performance boost?

  • A. Initialize the words by term frequency-inverse document frequency (TF-IDF) vectors pretrained on a large collection of news articles related to the energy sector.
  • B. Use gated recurrent units (GRUs) instead of LSTM and run the training process until the validation loss stops decreasing.
  • C. Reduce the learning rate and run the training process until the training loss stops decreasing.
  • D. Initialize the words by word2vec embeddings pretrained on a large collection of news articles related to the energy sector.

Answer: C

NEW QUESTION 20
......

Recommend!! Get the Full AWS-Certified-Machine-Learning-Specialty dumps in VCE and PDF From Dumps-hub.com, Welcome to Download: https://www.dumps-hub.com/AWS-Certified-Machine-Learning-Specialty-dumps.html (New 307 Q&As Version)