Exam AWS Certified Machine Learning - Specialty topic 1 question 89 discussion

A Machine Learning Specialist is given a structured dataset on the shopping habits of a company's customer base. The dataset contains thousands of columns of data and hundreds of numerical columns for each customer. The Specialist wants to identify whether there are natural groupings for these columns across all customers and visualize the results as quickly as possible.
What approach should the Specialist take to accomplish these tasks?

  • A. Embed the numerical features using the t-distributed stochastic neighbor embedding (t-SNE) algorithm and create a scatter plot.
  • B. Run k-means using the Euclidean distance measure for different values of k and create an elbow plot.
  • C. Embed the numerical features using the t-distributed stochastic neighbor embedding (t-SNE) algorithm and create a line graph.
  • D. Run k-means using the Euclidean distance measure for different values of k and create box plots for each numerical column within each cluster.
Suggested Answer: A

Comments

ac71
Highly Voted 2 years, 7 months ago
A is correct. t-SNE can be used for segmentation or grouping as well. See: https://towardsdatascience.com/an-introduction-to-t-sne-with-python-example-5a3a293108d1
upvoted 21 times
...
SophieSu
Highly Voted 2 years, 6 months ago
A is definitely the correct answer. Pay attention to what the question is asking: "whether there are natural groupings for these columns across all customers and visualize the results as quickly as possible". The key point is to visualize the groupings, which is exactly what a t-SNE scatter plot does: it visualizes high-dimensional data points in 2D space. The question does not ask how many groups you would use. A k-means elbow plot does not visualize the groupings; it is used to determine the optimal number of clusters, k.
upvoted 18 times
...
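To make the elbow-plot point concrete, here is a minimal sketch, not a reference implementation. It uses only the standard library, a toy 2-D dataset, and a simplified farthest-first initialization; the helper names (`make_toy_blobs`, `kmeans_inertia`, `elbow_curve`) are made up for illustration. The output is one number per k (the within-cluster sum of squares), which is all an elbow plot ever shows: it suggests how many clusters to use but never shows which customer falls in which group.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_init(points, k, rng):
    """Pick k starting centroids: one random point, then repeatedly the
    point farthest from the centroids chosen so far (a simplification)."""
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids


def kmeans_inertia(points, k, n_iter=25, seed=0):
    """Run basic Lloyd iterations and return the within-cluster sum of squares."""
    rng = random.Random(seed)
    centroids = farthest_first_init(points, k, rng)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)


def make_toy_blobs(centers, n_per_center=50, spread=0.5, seed=1):
    """Hypothetical toy data: Gaussian blobs around the given 2-D centers."""
    rng = random.Random(seed)
    return [
        [rng.gauss(cx, spread), rng.gauss(cy, spread)]
        for cx, cy in centers
        for _ in range(n_per_center)
    ]


def elbow_curve(points, k_values):
    """Map each k to its inertia; these are the values an elbow plot draws."""
    return {k: kmeans_inertia(points, k) for k in k_values}


if __name__ == "__main__":
    toy_points = make_toy_blobs([(0, 0), (10, 0), (0, 10)])
    for k, inertia in elbow_curve(toy_points, range(1, 7)).items():
        print(k, round(inertia, 1))
```

On data with three well-separated blobs, the inertia should drop sharply up to k = 3 and flatten afterwards. Plotting k against inertia gives the elbow, but nothing in that curve shows the groups themselves.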
Mickey321
Most Recent 8 months ago
Selected Answer: A
option A
upvoted 1 times
...
kaike_reis
8 months, 3 weeks ago
B doesn't even answer the question: how are you going to see your customer groups in an elbow plot?
upvoted 1 times
windy9
6 months, 3 weeks ago
An elbow plot helps you identify the right number of clusters for k-means clustering. The clustering is done on the basis of all the features and so groups the customers. That is just to help your understanding. The correct answer is still t-SNE, because the question focuses on identifying relationships/similarities between the features/columns in the dataset. The correct answer is A.
upvoted 1 times
...
...
kaike_reis
8 months, 3 weeks ago
Selected Answer: A
Euclidean distance suffers on high-dimensional data. t-SNE can suffer as well, but from my perspective it is the correct one.
upvoted 1 times
...
Sylzys
1 year, 1 month ago
Selected Answer: A
An elbow plot will not help visualize groups; it only tries to estimate an optimal number of clusters. I think A is a better choice here.
upvoted 2 times
...
AjoseO
1 year, 2 months ago
Selected Answer: A
A. The t-SNE algorithm is a popular tool for visualizing high-dimensional datasets, as it can transform high-dimensional data into a 2D scatter plot, which makes it easier to visualize and understand the relationships between data points. The scatter plot produced by t-SNE can be interpreted as a map that reveals the structure of the data, showing whether there are natural groupings or clusters within the data. Option A is the quickest and simplest way to visualize the data in a meaningful way, allowing the Specialist to gain insights into the data more efficiently.
upvoted 3 times
...
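As a rough illustration of option A, the sketch below embeds synthetic numerical features in 2D with scikit-learn's `TSNE` and saves a scatter plot. Several things here are assumptions rather than part of the question: the data comes from `make_blobs`, rows stand in for customers, features are standardized first, and the `perplexity` value, function names, and output file name are arbitrary choices for the example.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler


def embed_2d(features, perplexity=30.0, seed=0):
    """Standardize the numerical columns, then embed the rows in 2D with t-SNE."""
    scaled = StandardScaler().fit_transform(np.asarray(features, dtype=float))
    tsne = TSNE(n_components=2, perplexity=perplexity, init="pca", random_state=seed)
    return tsne.fit_transform(scaled)


def save_scatter(embedding, path="tsne_scatter.png"):
    """Draw the 2D embedding as a scatter plot (file name is a placeholder)."""
    import matplotlib

    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(6, 6))
    ax.scatter(embedding[:, 0], embedding[:, 1], s=8, alpha=0.7)
    ax.set_xlabel("t-SNE dimension 1")
    ax.set_ylabel("t-SNE dimension 2")
    ax.set_title("t-SNE embedding of numerical features")
    fig.savefig(path, dpi=150)
    plt.close(fig)


if __name__ == "__main__":
    # Synthetic stand-in for the customer table: 300 rows, 50 numerical columns.
    features, _ = make_blobs(n_samples=300, n_features=50, centers=4, random_state=0)
    save_scatter(embed_2d(features))
```

If the data has natural groupings, they tend to show up as separate islands of points in the scatter plot. Distances between islands and island sizes in a t-SNE plot should not be over-interpreted, and the picture can change with `perplexity`.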
minkhant19
1 year, 5 months ago
A is correct
upvoted 1 times
...
Shailendraa
1 year, 7 months ago
12-sep exam
upvoted 3 times
...
Morsa
1 year, 9 months ago
Selected Answer: A
A, as the k-means elbow plot is the wrong tool. It does not help here. A scatter plot with t-SNE is the right answer.
upvoted 2 times
...
ovokpus
1 year, 10 months ago
Selected Answer: A
An elbow plot (B) will not give you what the question is asking for. A scatter plot will, and t-SNE is used first and foremost for visualization, ahead of general dimensionality reduction.
upvoted 2 times
...
Sadgamaya
2 years ago
A is correct, as k-means suffers from the curse of dimensionality and t-SNE will be a better option.
upvoted 1 times
...
Mircuz
2 years, 1 month ago
Selected Answer: A
The plots in B, C, and D are meaningless with respect to the problem, so A.
upvoted 2 times
...
Mircuz
2 years, 1 month ago
Selected Answer: B
t-SNE suffers from the curse of dimensionality and is best suited to small datasets.
upvoted 1 times
...
AddiWei
2 years, 2 months ago
Additionally the numeric features don't require "embedding". I think they meant to write "standardize"
upvoted 1 times
...
apprehensive_scar
2 years, 2 months ago
Rooting for A
upvoted 1 times
...
bitsplease
2 years, 3 months ago
B and D are wrong because the data contains "thousands of columns", and k-means with Euclidean distance suffers from the "curse of dimensionality". That leaves A and C. You cannot visualize clusters/groups/segments in a line graph, so the correct answer is A.
upvoted 1 times
...
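The "curse of dimensionality" argument against Euclidean distance can be illustrated with a small experiment. This is a sketch under stated assumptions: points are drawn uniformly at random from the unit hypercube, and `relative_contrast` is a hypothetical helper that compares the farthest and nearest neighbor of one query point. As the number of columns grows, the two distances get closer to each other, so "nearest centroid" carries less information.

```python
import math
import random


def relative_contrast(n_points, n_dims, seed=0):
    """(farthest - nearest) / nearest Euclidean distance from one query point
    to the other points, all drawn uniformly from the unit hypercube."""
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(n_dims)] for _ in range(n_points)]
    query, others = points[0], points[1:]
    distances = [math.dist(query, p) for p in others]
    return (max(distances) - min(distances)) / min(distances)


if __name__ == "__main__":
    for dims in (2, 10, 100, 1000):
        print(dims, round(relative_contrast(200, dims), 3))
```

With a couple of columns the farthest point is many times farther away than the nearest one; with a thousand columns the ratio shrinks toward zero. Real customer data is not uniform noise, so the effect is usually milder in practice, but this is the concern behind ruling out B and D.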
Community vote distribution
A (35%)
C (25%)
B (20%)
Other