You work for a social media company. You need to detect whether posted images contain cars. Each training example is a member of exactly one class. You have trained an object detection neural network and deployed the model version to AI Platform Prediction for evaluation. Before deployment, you created an evaluation job and attached it to the AI Platform Prediction model version. You notice that the precision is lower than your business requirements allow. How should you adjust the model's final layer softmax threshold to increase precision?
Answer(s): B
Option B is correct because raising precision typically requires lowering the model’s propensity to classify positives, which is achieved by increasing the decision threshold, effectively reducing recall but reducing false positives. Incorrect — A suggests increasing recall, which would raise true positives but also false positives, hurting precision. Incorrect — C increases false positives, which directly lowers precision. Incorrect — D decreases false negatives; while reducing them can marginally affect precision, it does not directly target the primary trade-off, and can increase recall instead, not precision.
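The threshold trade-off described above can be checked on a small example. The following is a minimal sketch: the scores and labels are hypothetical values invented for illustration, and `precision_recall` is a helper defined here, not part of any AI Platform API.

```python
def precision_recall(scores, labels, threshold):
    """Return (precision, recall) when predicting positive for score >= threshold."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical softmax scores for the "car" class and their true labels.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.55, 0.40, 0.30]
labels = [1, 1, 1, 0, 1, 0, 0, 1]

low = precision_recall(scores, labels, 0.50)   # precision 4/6, recall 4/5
high = precision_recall(scores, labels, 0.75)  # precision 3/3, recall 3/5
print(low, high)
```

On this toy data, raising the threshold from 0.50 to 0.75 removes both false positives, so precision rises, and it also drops one true positive, so recall falls.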
You are responsible for building a unified analytics environment across a variety of on-premises data marts. Your company is experiencing data quality and security challenges when integrating data across the servers, caused by the use of a wide range of disconnected tools and temporary solutions. You need a fully managed, cloud-native data integration service that will lower the total cost of work and reduce repetitive work. Some members of your team prefer a codeless interface for building Extract, Transform, Load (ETL) processes. Which service should you use?
Answer(s): D
Option D is correct because Cloud Data Fusion is a fully managed, cloud-native data integration service that supports both code-based and codeless ETL/ELT workflows, enabling unified data access, governance, and secure data movement across on-premises and cloud environments, which lowers TCO and reduces repetitive work. Incorrect — A: Dataflow is a managed stream/batch data processing service (Apache Beam) best for pipelines, not primarily a codeless ETL integration platform across on-prem and cloud. Incorrect — B: Dataprep is a data preparation tool focused on data profiling and cleansing, not a full enterprise data integration platform. Incorrect — C: Apache Flink is an open-source stream processing framework, not a managed cloud integration service.
You are an ML engineer at a regulated insurance company. You are asked to develop an insurance approval model that accepts or rejects insurance applications from potential customers. What factors should you consider before building the model?
Option B is correct because in regulated insurance, traceability, reproducibility, and explainability are essential for compliance, auditability, and stakeholder trust: you must trace data lineage and model decisions, reproduce results for audits, and provide explanations to justify approvals or denials. A) Redaction focuses on privacy but lacks full emphasis on auditability and reproducibility; not sufficient alone for regulatory needs. C) Federated learning centers on distributed training, not primarily on regulatory traceability and explainability. D) Differential privacy and federated learning address privacy and collaboration, but do not directly ensure traceability and explainability required for regulatory approval decisions.
You are training a ResNet model on Vertex AI using TPUs to visually categorize types of defects in automobile engines. You capture the training profile using the Cloud TPU profiler plugin and observe that it is highly input-bound. You want to reduce the bottleneck and speed up your model training process. Which modifications should you make to the tf.data dataset? (Choose two.)
Answer(s): A,D
Option A is correct because using interleave can improve input pipeline throughput by overlapping I/O and preprocessing when reading from multiple data sources, reducing input-bound delays on TPUs. Option D is correct because prefetching allows overlapping data preparation with model execution; setting prefetch to at least the training batch size helps keep TPU steps fed continuously. B is incorrect: reducing repeat shortens dataset iterations but does not address I/O bottlenecks and can hurt convergence without justification. C is incorrect: increasing the shuffle buffer size changes how thoroughly examples are randomized, not how fast they are read, and a larger buffer uses more memory and takes longer to fill. E is incorrect: decreasing batch size usually worsens throughput on TPUs and does not mitigate input-bound bottlenecks.
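The overlap that prefetching provides can be illustrated without TensorFlow. The following is a minimal pure-Python sketch of the idea, not the tf.data implementation: `prefetch` and `load_batches` are hypothetical helpers, and a background thread fills a bounded buffer while the consumer (standing in for the training step) drains it.

```python
import queue
import threading

def prefetch(iterable, buffer_size):
    """Yield items from `iterable`, produced ahead of time by a background thread."""
    buffer = queue.Queue(maxsize=buffer_size)
    done = object()  # sentinel marking the end of the stream

    def producer():
        # Runs concurrently with the consumer; blocks when the buffer is full.
        for item in iterable:
            buffer.put(item)
        buffer.put(done)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = buffer.get()
        if item is done:
            return
        yield item

def load_batches(n):
    # Stand-in for slow file reads and preprocessing (no error handling here).
    for i in range(n):
        yield [i] * 4

batches = list(prefetch(load_batches(5), buffer_size=2))
print(batches)
```

Prefetching changes only when items are produced, not their order or content, so the consumer sees the same batches it would see without it.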
You have trained a model on a dataset that required computationally expensive preprocessing operations. You need to execute the same preprocessing at prediction time. You deployed the model on AI Platform for high-throughput online prediction. Which architecture should you use?
Option B is correct because using Pub/Sub as the entry point for prediction requests decouples preprocessing from the model serving, enabling scalable, real-time throughput while you transform data (e.g., via Dataflow) and query Vertex AI for predictions. A) Incorrect — Validating accuracy on preprocessed data is a model quality activity, not a real-time prediction workflow. C) Incorrect — Spanner is a transactional store, not an integration point for real-time prediction requests and preprocessing orchestration. D) Incorrect — While Pub/Sub is appropriate, this option does not specify the necessary orchestration (transformation, Vertex AI invocation, and output), making it incomplete.
Your team trained and tested a DNN regression model with good results. Six months after deployment, the model is performing poorly due to a change in the distribution of the input data. How should you address the input differences in production?
Answer(s): A
Option A is correct because Vertex AI Model Monitoring can detect data drift and skew in production inputs, triggering alerts and enabling timely retraining to maintain model accuracy. B is incorrect because feature selection and reducing features do not address drift detection or data distribution changes; they reduce model capacity and may worsen performance under distribution shift. C is incorrect because while retraining is helpful, choosing an L2 regularization parameter via hyperparameter tuning does not directly respond to data drift or monitoring signals. D is incorrect because monthly retraining with fewer features is arbitrary and does not leverage automated drift detection or monitoring insights, and may fail to adapt promptly to drift.
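Drift detection of the kind described above amounts to comparing the distribution of a feature at training time with its distribution in production. The following is a minimal sketch using Jensen-Shannon divergence over binned values. It does not call the Vertex AI API; the feature values, bin edges, and helper names are all hypothetical.

```python
import math

def histogram(values, edges):
    """Normalized histogram of `values` over bins defined by sorted `edges`."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts]

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2, range 0..1) between two distributions."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

edges = [0, 10, 20, 30, 40]
training = [5, 12, 15, 18, 22, 25, 8, 14]          # hypothetical feature values
serving_ok = [6, 11, 16, 19, 21, 26, 9, 13]        # same shape as training
serving_drifted = [28, 32, 35, 38, 31, 36, 33, 39] # shifted upward

baseline = histogram(training, edges)
js_ok = js_divergence(baseline, histogram(serving_ok, edges))
js_drift = js_divergence(baseline, histogram(serving_drifted, edges))
print(js_ok, js_drift)
```

A monitoring job would compare such a distance against an alert threshold chosen for each feature; the threshold is a tuning decision, not something this sketch determines.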
You need to train a computer vision model that predicts the type of government ID present in a given image using a GPU-powered virtual machine on Compute Engine. You use the following parameters: Optimizer: SGD; Image shape = 224×224; Batch size = 64; Epochs = 10; Verbose = 2. During training you encounter the following error: ResourceExhaustedError: Out Of Memory (OOM) when allocating tensor. What should you do?
The stated answer is A. A ResourceExhaustedError during GPU training means the GPU does not have enough memory to allocate a tensor. Changing the optimizer can alter memory use, but SGD is not memory-heavy relative to other optimizers, so the more typical resolution is to reduce the memory footprint directly with a smaller batch size or smaller images. The remaining options compare as follows: B) Reducing the batch size directly lowers per-step memory; batch size is a primary driver of memory usage, so this is a valid fix in general, although the stated answer is A. C) Incorrect: changing the learning rate alters optimization dynamics, not memory consumption. D) Reducing the image shape lowers memory per example and can fix the OOM error, although the stated answer is A.
https://github.com/tensorflow/tensorflow/issues/136
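To see why batch size and image shape drive memory use, the size of a single input batch can be computed directly. This is a rough sketch that assumes 3-channel images stored as 4-byte floats; neither assumption is stated in the question, and `input_batch_bytes` is a hypothetical helper. It counts only the input tensor, not the larger intermediate activations, which also grow with batch size.

```python
def input_batch_bytes(batch_size, height, width, channels=3, bytes_per_value=4):
    """Bytes needed to hold one batch of input images (assumed float32, RGB)."""
    return batch_size * height * width * channels * bytes_per_value

original = input_batch_bytes(64, 224, 224)        # settings from the question
half_batch = input_batch_bytes(32, 224, 224)      # batch size halved
smaller_images = input_batch_bytes(64, 112, 112)  # each image side halved

print(original, half_batch, smaller_images)
```

Halving the batch size halves this figure, while halving each image side cuts it to a quarter.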
You developed an ML model with AI Platform, and you want to move it to production. You serve a few thousand queries per second and are experiencing latency issues. Incoming requests are served by a load balancer that distributes them across multiple Kubeflow CPU-only pods running on Google Kubernetes Engine (GKE). Your goal is to improve the serving latency without changing the underlying infrastructure. What should you do?
Option A is correct because increasing the max_batch_size in TensorFlow Serving reduces per-request overhead by batching multiple inference requests, which lowers latency under high QPS without changing infrastructure. This aligns with scale-out via existing pods and load balancer. B is incorrect because tensorflow-model-server-universal does not inherently reduce latency for CPU-bound serving in this scenario; it targets a broader API compatibility rather than batch latency improvements. C is incorrect because max_enqueued_batches affects batching queue depth, which can increase latency if too large and does not directly optimize throughput/latency balance as effectively as increasing max_batch_size. D is incorrect because recompile and CPU baseline changes alter build-time optimizations, which is beyond the stated constraint of not changing underlying infrastructure and has uncertain latency impact.