Exam Professional Machine Learning Engineer topic 1 question 63 discussion

Actual exam question from Google's Professional Machine Learning Engineer
Question #: 63
Topic #: 1

You work for a large retailer and have been asked to segment your customers by their purchasing habits. The purchase history of all customers has been uploaded to BigQuery. You suspect that there may be several distinct customer segments; however, you are unsure how many, and you don’t yet understand the commonalities in their behavior. You want to find the most efficient solution. What should you do?

  • A. Create a k-means clustering model using BigQuery ML. Allow BigQuery to automatically optimize the number of clusters.
  • B. Create a new dataset in Dataprep that references your BigQuery table. Use Dataprep to identify similarities within each column.
  • C. Use the Data Labeling Service to label each customer record in BigQuery. Train a model on your labeled data using AutoML Tables. Review the evaluation metrics to understand whether there is an underlying pattern in the data.
  • D. Get a list of the customer segments from your company’s Marketing team. Use the Data Labeling Service to label each customer record in BigQuery according to the list. Analyze the distribution of labels in your dataset using Data Studio.
Suggested Answer: A

Comments

PhilipKoku
4 months, 3 weeks ago
Selected Answer: A
A) K-means is ideal for unsupervised clustering
upvoted 2 times
...
MultiCloudIronMan
7 months ago
Selected Answer: A
K-means algorithm is used for grouping/clustering data in unsupervised learning experiments.
upvoted 3 times
...
M25
1 year, 5 months ago
Selected Answer: A
Went with A
upvoted 4 times
...
CloudKida
1 year, 5 months ago
Selected Answer: A
When to use k-means: Your data may contain natural groupings or clusters of data. You may want to identify these groupings descriptively in order to make data-driven decisions. For example, a retailer may want to identify natural groupings of customers who have similar purchasing habits or locations. This process is known as customer segmentation. https://cloud.google.com/bigquery/docs/kmeans-tutorial
upvoted 4 times
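
For readers who want to see what "natural groupings" means in practice, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is not how BigQuery ML implements it; the customer tuples, the feature choice (orders per month, average basket value), and the first-k-points initialization are all assumptions made for illustration.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm on a list of equal-length numeric tuples."""
    # Deterministic start (an assumption for this sketch): use the first k points.
    centroids = [tuple(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignments

# Hypothetical customers: (orders per month, average basket value).
customers = [(1, 12.0), (2, 15.0), (1, 10.0), (9, 95.0), (10, 110.0), (8, 90.0)]
centroids, labels = kmeans(customers, k=2)
print(labels)
```

No labels are supplied anywhere: the two groups (occasional low spenders and frequent high spenders) come out of the distances alone, which is why the approach fits a question where the segments are not known in advance.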
...
tavva_prudhvi
1 year, 7 months ago
A. This is the most efficient solution for segmenting customers based on their purchasing habits, as it uses BigQuery's built-in machine learning capabilities to identify distinct clusters of customers based on their purchasing behavior. By allowing BigQuery to automatically optimize the number of clusters, you let the model settle on a number of segments based on the data, without having to select the number of clusters manually.
upvoted 2 times
...
ares81
1 year, 9 months ago
Selected Answer: A
I correct myself. It's A: According to the documentation, if you omit the num_clusters option, BigQuery ML will choose a reasonable default based on the total number of rows in the training data.
upvoted 2 times
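
A rough sketch of what omitting num_clusters looks like, written as a small Python helper that only builds the BigQuery ML statement text (it does not call BigQuery). The helper name, the model and table names, and the customer_id column are hypothetical; model_type and num_clusters are the CREATE MODEL options referred to above.

```python
def build_kmeans_statement(model, source_table, num_clusters=None):
    """Return the text of a BigQuery ML CREATE MODEL statement for k-means.

    If num_clusters is None the option is left out, so BigQuery ML
    falls back to its own default for the number of clusters.
    """
    options = ["model_type = 'KMEANS'"]
    if num_clusters is not None:
        options.append(f"num_clusters = {int(num_clusters)}")
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS ({', '.join(options)}) AS\n"
        # customer_id is a hypothetical identifier column excluded from features.
        f"SELECT * EXCEPT (customer_id)\n"
        f"FROM `{source_table}`"
    )

# Hypothetical dataset, model, and table names.
auto_stmt = build_kmeans_statement("retail.customer_segments", "retail.customer_purchases")
fixed_stmt = build_kmeans_statement("retail.customer_segments", "retail.customer_purchases", 5)
print(auto_stmt)
```

The printed statement could be pasted into the BigQuery console; the version without num_clusters is the one option A describes.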
...
hiromi
1 year, 10 months ago
Selected Answer: A
A https://cloud.google.com/bigquery-ml/docs/kmeans-tutorial https://towardsdatascience.com/how-to-use-k-means-clustering-in-bigquery-ml-to-understand-and-describe-your-data-better-c972c6f5733b
upvoted 3 times
...
wish0035
1 year, 10 months ago
Selected Answer: A
Answer: A, pretty sure. C, D => discarded, very time-consuming. B => yes, you can identify similarities within each column, but when I read "you don’t yet understand the commonalities in their behavior" I understand that this job would be difficult, because there could be many columns to analyze, and I don't think that this would be efficient. A => BigQuery ML supports k-means clustering, it's easy and efficient to create, and it would automatically detect the number of clusters. Also from the BigQuery ML docs: "K-means clustering for data segmentation; for example, identifying customer segments." (Source: https://cloud.google.com/bigquery-ml/docs/introduction#supported_models_in)
upvoted 4 times
...
LearnSodas
1 year, 10 months ago
Selected Answer: A
K-means is a good unsupervised learning algorithm for segmenting a population based on similarity. We can use K-means directly in BQ, so I think it's "the most efficient way". Labeling is not a good option since we don't really know what makes one customer similar to another, and why use Dataprep if we can use BQ directly?
upvoted 3 times
...
ares81
1 year, 10 months ago
It seems B, to me.
upvoted 1 times
...
neochaotic
1 year, 10 months ago
Selected Answer: B
It's B! Dataprep provides data profiling functionality.
upvoted 1 times
...
japoji
1 year, 10 months ago
The question is about commonalities of clients by characteristics, not about characteristics by client. I mean, with B you are looking for segments of the characteristics that define a client, but you need segments of clients defined by their characteristics.
upvoted 1 times
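
To make that distinction concrete, here is a small sketch with two made-up customer tables whose per-column value counts are identical but whose row-level groupings are different. The data and the helper names are assumptions for illustration only; this is not what Dataprep computes internally.

```python
from collections import Counter

# Hypothetical rows: (orders per month, average basket value).
store_a = [(1, 10), (1, 10), (9, 90), (9, 90)]  # low/low and high/high customers
store_b = [(1, 90), (1, 90), (9, 10), (9, 10)]  # low/high and high/low customers

def column_profiles(rows):
    """Value counts of each column on its own (a per-column view)."""
    return [Counter(column) for column in zip(*rows)]

def row_groups(rows):
    """Counts of whole rows, i.e. groups of customers with the same profile."""
    return Counter(rows)

same_columns = column_profiles(store_a) == column_profiles(store_b)
same_rows = row_groups(store_a) == row_groups(store_b)
print(same_columns, same_rows)
```

Looking at each column separately cannot tell the two tables apart, while grouping whole customer rows can, which is the kind of structure a clustering model works on.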
...
Vedjha
1 year, 10 months ago
Will go for 'A', as it is easy to build a model in BQML where the data is already present, and optimization would be automatic with the k-means algorithm.
upvoted 4 times
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other
Most Voted