You are developing ML models with AI Platform for image segmentation on CT scans. You frequently update your model architectures based on the newest available research papers, and have to rerun training on the same dataset to benchmark their performance. You want to minimize computation costs and manual intervention while having version control for your code. What should you do?
Answer(s): C
Option C is correct because Cloud Build with Cloud Source Repositories triggers retraining automatically on each code push, giving version-controlled, repeatable benchmarks with minimal manual intervention, and training runs only when the code actually changes. A) Incorrect: a Cloud Function reacting to Cloud Storage changes is event-driven, but it does not provide version control or a repository-backed trigger for ML benchmarks. B) Incorrect: submitting jobs with gcloud is a manual step for every update and lacks automated triggering tied to code changes. D) Incorrect: daily checks in Cloud Composer (Airflow) add unnecessary scheduling overhead and delay retraining until the next scheduled check.
https://cloud.google.com/ai-platform/training/docs/training-jobs
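As a rough illustration of what an automated build step might run after a push, the sketch below assembles a `gcloud ai-platform jobs submit training` command tagged with the commit that triggered it. The job name, bucket, region, and the `trainer` package layout are hypothetical placeholders, and a real job would likely need further flags (runtime version, scale tier) that are omitted here.

```python
import shlex


def build_training_command(job_name, commit_sha, bucket, region="us-central1"):
    """Assemble a gcloud command that submits one AI Platform training job.

    All argument values are illustrative; the trainer package layout
    (trainer/task.py) is an assumption, not something the question specifies.
    """
    # Job IDs may contain only letters, numbers, and underscores, so the
    # short commit hash is joined with an underscore.
    job_id = f"{job_name}_{commit_sha[:7]}"
    return [
        "gcloud", "ai-platform", "jobs", "submit", "training", job_id,
        f"--region={region}",
        "--module-name=trainer.task",
        "--package-path=trainer/",
        f"--job-dir=gs://{bucket}/jobs/{job_id}",
    ]


# Hypothetical values standing in for what a build trigger would supply.
cmd = build_training_command("ct_segmentation", "a1b2c3d4e5f6", "my-bucket")
print(shlex.join(cmd))
```

Tying the job ID to the commit hash keeps each benchmark run traceable to the exact code version that produced it.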
Your team needs to build a model that predicts whether images contain a driver's license, passport, or credit card. The data engineering team already built the pipeline and generated a dataset composed of 10,000 images with driver's licenses, 1,000 images with passports, and 1,000 images with credit cards. You now have to train a model with the following label map: ['drivers_license', 'passport', 'credit_card']. You cannot change the format of the labels. Which loss function should you use?
Answer(s): D
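Use sparse_categorical_crossentropy, which takes each label as a single integer class index rather than a one-hot vector. For the 3-class problem above, the labels would be [0], [1], [2] (class indices are zero-based in Keras).
https://stats.stackexchange.com/questions/326065/cross-entropy-vs-sparse-cross-entropy-when-to-use-one-over-the-other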
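To make the distinction concrete, here is a minimal sketch in plain Python (not Keras itself) showing that the sparse variant on an integer label gives the same value as the categorical variant on the matching one-hot vector. The function names and the example probabilities are illustrative assumptions; the index order follows the label map in the question.

```python
import math

LABELS = ["drivers_license", "passport", "credit_card"]


def categorical_crossentropy(one_hot, probs):
    """Cross-entropy when the target is a one-hot vector."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs) if t)


def sparse_categorical_crossentropy(class_index, probs):
    """Cross-entropy when the target is a single integer class index."""
    return -math.log(probs[class_index])


# Hypothetical model output: predicted probability for each class.
probs = [0.7, 0.2, 0.1]

# The true class is "passport", encoded two different ways.
idx = LABELS.index("passport")
one_hot = [1.0 if i == idx else 0.0 for i in range(len(LABELS))]

sparse_loss = sparse_categorical_crossentropy(idx, probs)
dense_loss = categorical_crossentropy(one_hot, probs)
print(sparse_loss, dense_loss)  # the two values agree
```

The two losses measure the same quantity; the choice between them depends only on how the labels are encoded.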
You are designing an ML recommendation model for shoppers on your company's ecommerce website. You will use Recommendations AI to build, test, and deploy your system. How should you develop recommendations that increase revenue while following best practices?
Answer(s): B
'Frequently bought together' recommendations aim to up-sell and cross-sell to customers by suggesting related products.
https://rejoiner.com/resources/amazon-recommendations-secret-selling-online/
You are designing an architecture with a serverless ML system to enrich customer support tickets with informative metadata before they are routed to a support agent. You need a set of models to predict ticket priority, predict ticket resolution time, and perform sentiment analysis to help agents make strategic decisions when they process support requests. Tickets are not expected to have any domain-specific terms or jargon. The proposed architecture has the following flow (diagram not reproduced here). Which endpoints should the Enrichment Cloud Functions call?
Option C is correct because (1) priority prediction and (2) resolution-time prediction can be served by custom models deployed on AI Platform, and (3) sentiment analysis can be served by the Cloud Natural Language API, whose pretrained sentiment model suits text without domain-specific terms. A) Incorrect: AutoML Vision is for image data, not text. B) Incorrect: AutoML Natural Language could handle sentiment, but training a custom model is unnecessary when the tickets contain no domain-specific terms and the pretrained API suffices. D) Incorrect: the Cloud Natural Language API cannot produce the custom priority and resolution-time predictions; those require custom models hosted on AI Platform.
You have trained a deep neural network model on Google Cloud. The model has low loss on the training data, but is performing worse on the validation data. You want the model to be resilient to overfitting. Which strategy should you use when retraining the model?
Option C is correct because Vertex AI Vizier can search hyperparameters such as the L2 regularization strength and dropout rate to find a balance between model capacity and regularization, which reduces overfitting and improves validation performance. A) Incorrect: dropout does help regularization, but lowering the learning rate with a fixed dropout value is not a systematic hyperparameter search and may not find a good regularization balance. B) Incorrect: an L2 value of 0.4 is large and may cause underfitting, and pairing it with a fixed learning-rate adjustment does not explore the broader hyperparameter space. D) Incorrect: adding neurons increases capacity and may worsen overfitting; the search should tune regularization and architecture jointly rather than simply increase model size.
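The idea of searching regularization hyperparameters can be sketched with a simple random search. This is not how Vizier works internally (Vizier is a managed service with its own optimization algorithms); random search is swapped in here only to show the shape of the loop. The search ranges and the toy objective are assumptions, and in practice `validation_loss` would train the network and return its loss on held-out data.

```python
import math
import random


def random_search(validation_loss, trials=20, seed=0):
    """Try random (l2, dropout) settings and keep the one with the lowest loss."""
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        params = {
            # Sample L2 strength log-uniformly between 1e-5 and 1e-1.
            "l2": 10 ** rng.uniform(-5, -1),
            # Sample dropout rate uniformly between 0 and 0.6.
            "dropout": rng.uniform(0.0, 0.6),
        }
        loss = validation_loss(params)
        if best is None or loss < best[0]:
            best = (loss, params)
    return best


def toy_validation_loss(params):
    """Stand-in objective with its minimum at l2=1e-3, dropout=0.3 (made up)."""
    return (math.log10(params["l2"]) + 3) ** 2 + (params["dropout"] - 0.3) ** 2


best_loss, best_params = random_search(toy_validation_loss, trials=50)
print(best_loss, best_params)
```

The key point is that the objective is the validation loss, not the training loss, so the search favors settings that generalize.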
You built and manage a production system that is responsible for predicting sales numbers. Model accuracy is crucial, because the production model is required to keep up with market changes. Since being deployed to production, the model hasn't changed; however the accuracy of the model has steadily deteriorated. What issue is most likely causing the steady decline in model accuracy?
Option B is correct because when an unchanged production model steadily loses accuracy, the usual cause is concept drift: the patterns in the data shift over time, and the model needs to be retrained on fresh data to keep up. A) Poor data quality could hurt accuracy, but it would typically surface in data validation and would not explain a steady decline over time in an otherwise unchanged system. C) Too few layers is a model capacity issue, not a time-varying production accuracy problem. D) An incorrect data split ratio affects evaluation metrics during training, not ongoing production performance.
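One way to act on this is to track prediction error in production and flag the model for retraining once the error drifts past a baseline. The sketch below is a minimal illustration under stated assumptions: the class name, the window size, and the 25% tolerance are all hypothetical choices, not a prescribed method.

```python
from collections import deque


class AccuracyMonitor:
    """Flags retraining when rolling absolute error exceeds a baseline."""

    def __init__(self, baseline_error, window=30, tolerance=1.25):
        self.baseline_error = baseline_error  # error measured at deployment
        self.tolerance = tolerance            # allowed ratio over baseline
        self.errors = deque(maxlen=window)    # most recent absolute errors

    def record(self, predicted, actual):
        self.errors.append(abs(predicted - actual))

    def needs_retraining(self):
        # Wait until the window is full before judging.
        if len(self.errors) < self.errors.maxlen:
            return False
        rolling = sum(self.errors) / len(self.errors)
        return rolling > self.baseline_error * self.tolerance


# Made-up sales predictions whose error grows over time.
monitor = AccuracyMonitor(baseline_error=10.0, window=3)
for predicted, actual in [(100, 95), (100, 80), (100, 70)]:
    monitor.record(predicted, actual)
flag = monitor.needs_retraining()
print(flag)
```

A check like this could trigger a retraining pipeline instead of relying on someone noticing the decline manually.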
You have been asked to develop an input pipeline for an ML training model that processes images from disparate sources at a low latency. You discover that your input data does not fit in memory. How should you create a dataset following Google-recommended best practices?
Option D is correct because TFRecords stored in Cloud Storage enable scalable, streaming ingestion for datasets that don’t fit in memory, in line with Google-recommended best practices for efficient I/O and parallel data loading with tf.data. A) prefetch reduces latency by overlapping data loading with training, but it operates on an existing dataset and does not address how data too large for memory is stored and streamed. B) and C) from_tensor_slices and from_tensors require the entire dataset to be loaded into memory, which causes out-of-memory errors and makes them impractical for large image datasets. D therefore decouples storage from processing and relies on tf.data to read the TFRecords.
https://www.tensorflow.org/api_docs/python/tf/data/Dataset
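The TFRecord approach described above might look roughly like the following tf.data pipeline. It is a sketch under assumptions: the feature keys `image` and `label`, JPEG encoding, the 224x224 target size, and the batch and shuffle sizes are all hypothetical and would need to match how the records were actually written.

```python
import tensorflow as tf


def parse_example(serialized):
    """Decode one serialized tf.train.Example into an (image, label) pair."""
    # Assumed schema: a JPEG-encoded image and an integer label.
    features = {
        "image": tf.io.FixedLenFeature([], tf.string),
        "label": tf.io.FixedLenFeature([], tf.int64),
    }
    parsed = tf.io.parse_single_example(serialized, features)
    image = tf.io.decode_jpeg(parsed["image"], channels=3)
    image = tf.image.resize(image, [224, 224]) / 255.0
    return image, parsed["label"]


def make_dataset(file_pattern, batch_size=32):
    """Stream TFRecord shards matching file_pattern (e.g. a gs:// glob)."""
    files = tf.data.Dataset.list_files(file_pattern, shuffle=True)
    # Read several shards in parallel instead of loading anything up front.
    ds = files.interleave(
        tf.data.TFRecordDataset,
        cycle_length=4,
        num_parallel_calls=tf.data.AUTOTUNE,
    )
    ds = ds.map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
    # prefetch overlaps input processing with training on the previous batch.
    return ds.shuffle(1024).batch(batch_size).prefetch(tf.data.AUTOTUNE)
```

Only the records for the current batches are held in memory at any time, which is why this works for datasets that `from_tensor_slices` cannot hold.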
You are an ML engineer at a large grocery retailer with stores in multiple regions. You have been asked to create an inventory prediction model. Your model's features include region, location, historical demand, and seasonal popularity. You want the algorithm to learn from new inventory data on a daily basis. Which algorithms should you use to build the model?
Option C is correct because Recurrent Neural Networks (RNNs) suit sequence data and time-series forecasting: they capture temporal dependencies such as demand history and seasonality and can be updated as new inventory data arrives each day. A) Incorrect: classification is not suited to predicting continuous inventory quantities or time-series values. B) Incorrect: reinforcement learning focuses on agent-environment interaction and long-term reward, not supervised demand forecasting with daily updates. D) Incorrect: Convolutional Neural Networks are best for spatial or visual data, not tabular time-series inventory prediction.