A Machine Learning Specialist is building a prediction model for a large number of features using linear models, such as linear regression and logistic regression. During exploratory data analysis, the Specialist observes that many features are highly correlated with each other. This may make the model unstable. What should be done to reduce the impact of having such a large number of features?
Answer(s): C
A Machine Learning Specialist is implementing a full Bayesian network on a dataset that describes public transit in New York City. One of the random variables is discrete, and represents the number of minutes New Yorkers wait for a bus given that the buses cycle every 10 minutes, with a mean of 3 minutes. Which prior probability distribution should the ML Specialist use for this variable?
Answer(s): A
If you know the average (mean) number of events that occur in a given time interval, the Poisson distribution gives the probability of observing any other count in a future interval of the same length.
https://brilliant.org/wiki/poisson-distribution/
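To make the explanation above concrete, the following is a minimal sketch that evaluates the Poisson probability mass function for the mean of 3 minutes given in the question. The function name `poisson_pmf` and the constant `MEAN_WAIT` are illustrative names, not part of the original material, and the 0 to 10 range simply mirrors the 10-minute bus cycle.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of observing exactly k events when the average count is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


MEAN_WAIT = 3.0  # mean wait from the question, in minutes (illustrative constant name)

# Probability of each whole-minute wait from 0 to 10 under a Poisson prior with mean 3.
pmf = {k: poisson_pmf(k, MEAN_WAIT) for k in range(11)}

for k, p in pmf.items():
    print(f"P(wait = {k:2d} min) = {p:.4f}")
```

The Poisson distribution has no upper bound, but with a mean of 3 almost all of the probability mass falls within the 10-minute cycle, so the values printed above sum to just under 1.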
A Data Science team within a large company uses Amazon SageMaker notebooks to access data stored in Amazon S3 buckets. The IT Security team is concerned that internet-enabled notebook instances create a security vulnerability where malicious code running on the instances could compromise data privacy. The company mandates that all instances stay within a secured VPC with no internet access, and data communication traffic must stay within the AWS network. How should the Data Science team configure the notebook instance placement to meet these requirements?
A Machine Learning Specialist has created a deep learning neural network model that performs well on the training data but performs poorly on the test data. Which of the following methods should the Specialist consider using to correct this? (Choose three.)
Answer(s): B,C,F
A Data Scientist needs to create a serverless ingestion and analytics solution for high-velocity, real-time streaming data. The ingestion process must buffer and convert incoming records from JSON to a query-optimized, columnar format without data loss. The output datastore must be highly available, and Analysts must be able to run SQL queries against the data and connect to existing business intelligence dashboards. Which solution should the Data Scientist build to satisfy the requirements?
An online reseller has a large, multi-column dataset with one column missing 30% of its data. A Machine Learning Specialist believes that certain columns in the dataset could be used to reconstruct the missing data. Which reconstruction approach should the Specialist use to preserve the integrity of the dataset?
https://worldwidescience.org/topicpages/i/imputing+missing+values.html
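The answer options for this question are not reproduced here, so the following is not presented as the keyed answer. It is only a rough sketch of the general idea in the question, reconstructing a column from another column, under the assumption that the two are linearly related. The column names, the data, and the helpers `fit_line` and `impute_from_column` are all hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def impute_from_column(predictor, target):
    """Fill None entries in `target` using a line fitted on the rows where it is known."""
    known = [(x, y) for x, y in zip(predictor, target) if y is not None]
    slope, intercept = fit_line([x for x, _ in known], [y for _, y in known])
    return [y if y is not None else slope * x + intercept
            for x, y in zip(predictor, target)]


# Hypothetical data: shipping_cost is missing for some rows.
order_size = [1, 2, 3, 4, 5, 6]
shipping_cost = [2.0, 4.0, None, 8.0, None, 12.0]

filled = impute_from_column(order_size, shipping_cost)
print(filled)
```

Unlike filling every gap with a single constant such as the column mean, a model-based fill of this kind lets each reconstructed value depend on the other columns in the same row.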
A company is setting up an Amazon SageMaker environment. The corporate data security policy does not allow communication over the internet. How can the company enable the Amazon SageMaker service without enabling direct internet access to Amazon SageMaker notebook instances?
https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-dg.pdf (516)
https://docs.aws.amazon.com/zh_tw/vpc/latest/userguide/vpc-endpoints.html
A Machine Learning Specialist is training a model to identify the make and model of vehicles in images. The Specialist wants to use transfer learning and an existing model trained on images of general objects. The Specialist collated a large custom dataset of pictures containing different vehicle makes and models. What should the Specialist do to initialize the model to re-train it with the custom data?
Answer(s): B