AWS Certified Machine Learning - Specialty, Topic 1, Question 224 Discussion

A data scientist at a food production company wants to use an Amazon SageMaker built-in model to classify different vegetables. The current dataset has many features. The company wants to save on memory costs when the data scientist trains and deploys the model. The company also wants to be able to find similar data points for each test data point.

Which algorithm will meet these requirements?

  • A. K-nearest neighbors (k-NN) with dimension reduction
  • B. Linear learner with early stopping
  • C. K-means
  • D. Principal component analysis (PCA) with the algorithm mode set to random
Suggested Answer: A
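Worth noting: the SageMaker built-in k-NN algorithm supports dimension reduction natively through its dimension_reduction_type ("sign" or "fjlt") and dimension_reduction_target hyperparameters, so option A maps onto a single built-in model. Below is a minimal, untested sketch using the SageMaker Python SDK; the role ARN, S3 paths, and hyperparameter values are placeholders, not recommendations.

```python
# Sketch: training the SageMaker built-in k-NN algorithm with its
# native dimension reduction enabled. All paths/values are placeholders.
import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerRole"  # placeholder role

# Resolve the built-in k-NN container image for the current region
knn_image = image_uris.retrieve("knn", session.boto_region_name)

estimator = Estimator(
    image_uri=knn_image,
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://my-bucket/knn-output",  # placeholder bucket
    sagemaker_session=session,
)

estimator.set_hyperparameters(
    feature_dim=200,                  # original number of features
    k=10,                             # neighbors used per prediction
    sample_size=50000,                # training points sampled for the index
    predictor_type="classifier",      # classify vegetable types
    dimension_reduction_type="sign",  # random projection ("sign" or "fjlt")
    dimension_reduction_target=64,    # reduced dimension, saving memory
)

# Train channel accepts CSV (first column = label) or recordIO-protobuf
estimator.fit({"train": TrainingInput("s3://my-bucket/knn-train.csv",
                                      content_type="text/csv")})
```

At inference time, a model trained with predictor_type="classifier" returns the predicted class, and the neighbor search backing it is what satisfies the "find similar data points" requirement.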

Comments

Carpediem78
2 months, 1 week ago
Selected Answer: A
The key requirement: "to be able to find similar data points for each test data point."
upvoted 2 times
sheetalconect
10 months, 2 weeks ago
Selected Answer: C
K-means is unsupervised learning.
upvoted 1 times
ddaanndann
1 year ago
It is A.
Memory efficiency: K-nearest neighbors (k-NN) doesn't require storing a model with learned parameters, as it's an instance-based learning algorithm: it simply memorizes the training dataset. Therefore, it saves on memory costs compared to models with learned parameters like linear learners.
Dimension reduction: By employing dimension reduction techniques like principal component analysis (PCA) in conjunction with k-NN, you can reduce the dimensionality of the dataset, which helps in saving memory costs. This makes k-NN with dimension reduction a suitable choice when memory efficiency is a concern.
Similar data points: K-nearest neighbors naturally provides a measure of similarity between data points. Given a test data point, it finds the k nearest neighbors in the training data. This fulfills the requirement of being able to find similar data points for each test data point.
upvoted 1 times
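To make the pattern in the comment above concrete, here is a minimal scikit-learn sketch of PCA followed by k-NN (outside SageMaker); the synthetic dataset and all parameter values are assumptions for illustration only.

```python
# Illustrative sketch of the PCA + k-NN pattern; data is synthetic.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for a high-dimensional vegetable feature set
X, y = make_classification(n_samples=2000, n_features=200,
                           n_informative=40, n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Reduce 200 features to 20 components, then classify by nearest neighbors
model = make_pipeline(PCA(n_components=20),
                      KNeighborsClassifier(n_neighbors=10))
model.fit(X_train, y_train)
print("accuracy:", model.score(X_test, y_test))

# k-NN also exposes the "similar data points" requirement directly:
knn = model.named_steps["kneighborsclassifier"]
reduced = model.named_steps["pca"].transform(X_test[:1])
dist, idx = knn.kneighbors(reduced)
print("indices of nearest training points:", idx)
```

The SageMaker built-in algorithm folds a comparable reduction step (random projection rather than PCA) into training itself via its dimension_reduction_* hyperparameters.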
JonSno
1 year ago
A. K-nearest neighbors (k-NN) with dimension reduction. k-NN handles the classification task for the different types of veggies based on their features, and dimensionality reduction such as PCA can be applied prior to k-NN to reduce the number of features in the dataset, thereby saving memory costs during training and model deployment. It also removes noise and data redundancy.
upvoted 1 times
Gmishra
1 year ago
Selected Answer: C
https://www.linkedin.com/advice/3/what-difference-between-knn-k-means-skills-computer-science-cx1hc
upvoted 1 times
AIWave
1 year, 1 month ago
Selected Answer: C
This is an unsupervised clustering problem, not a classification one (A). K-means is a better choice from a memory-efficiency perspective.
upvoted 2 times
akgarg00
1 year, 4 months ago
Selected Answer: C
While A is the most voted answer, k-NN is actually heavy on memory usage, as it stores the training data points to make predictions. Voting for it just because it mentions dimensionality reduction seems obtuse. C is the next most probable candidate that fits the bill on every count.
upvoted 2 times
DimLam
1 year, 6 months ago
Selected Answer: A
Should be A, as only A can be used for classification, finding similar data points, and dimensionality reduction.
upvoted 2 times
loict
1 year, 7 months ago
Selected Answer: A
A. YES - k-NN will find the similar data points, and dimension reduction will save memory.
B. NO - Linear learner is for regression or classification, not finding similar data points.
C. NO - K-means is for unsupervised clustering, not finding the closest data points.
D. NO - Principal component analysis (PCA) with the algorithm mode set to random only reduces dimensions; on its own it cannot classify or find similar data points.
upvoted 2 times
kaike_reis
1 year, 8 months ago
Selected Answer: A
C doesn't solve the "too many features" problem, and the vegetable classes are well defined. A is the way.
upvoted 1 times
blanco750
2 years, 1 month ago
Selected Answer: A
k-NN with dimension reduction to cut dimensionality, which may help reduce memory utilisation.
upvoted 2 times
Valcilio
2 years, 1 month ago
Selected Answer: A
It's A; they need to reduce the dimensionality of the dataset.
upvoted 1 times
Chelseajcole
2 years, 1 month ago
Selected Answer: A
They want fewer features.
upvoted 2 times
lmimi
2 years, 2 months ago
I will go with A. C is not valid, as K-means is a clustering algorithm that can group similar data points together. However, it does not perform classification, and it is not clear how it addresses the memory cost and similarity search requirements mentioned in the question.
upvoted 1 times
Amit11011996
2 years, 2 months ago
Selected Answer: C
It should be C, because it is an unsupervised classification problem.
upvoted 1 times
DimLam
1 year, 6 months ago
Not sure that classification is an unsupervised problem.
upvoted 1 times
AjoseO
2 years, 2 months ago
Selected Answer: A
Option A suggests using the k-nearest neighbors (k-NN) algorithm with dimension reduction. The k-NN algorithm can be used for classification tasks, and dimension reduction can help reduce memory costs. Additionally, k-NN is a simple algorithm that works well with high-dimensional data and naturally finds similar data points.
upvoted 2 times
blanco750
2 years, 1 month ago
Agree. By reducing the number of dimensions, you may achieve comparable analysis results using less memory and in a shorter amount of time.
upvoted 1 times
drcok87
2 years, 2 months ago
https://ealizadeh.com/blog/knn-and-kmeans/
k-NN does use a lot of memory because, as a lazy learner, it stores (memorizes) the training dataset. AWS SageMaker, however, has an improved version of this algorithm. Because the question does not mention that we have labels, we cannot use supervised learning. K-means is unsupervised and helps us classify different vegetables based on their many features; it also finds "similar data points for each test data point". C
upvoted 1 times
Community vote distribution: A (35%), C (25%), B (20%), Other