AWS Certified Machine Learning - Specialty, Topic 1, Question 224 Discussion

A data scientist at a food production company wants to use an Amazon SageMaker built-in model to classify different vegetables. The current dataset has many features. The company wants to save on memory costs when the data scientist trains and deploys the model. The company also wants to be able to find similar data points for each test data point.

Which algorithm will meet these requirements?

  • A. K-nearest neighbors (k-NN) with dimension reduction
  • B. Linear learner with early stopping
  • C. K-means
  • D. Principal component analysis (PCA) with the algorithm mode set to random
Suggested Answer: A
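Worth noting: the SageMaker built-in k-NN algorithm supports dimension reduction natively through its dimension_reduction_type ("sign" or "fjlt") and dimension_reduction_target hyperparameters, so option A maps onto a single built-in model. Below is a minimal, untested sketch using the SageMaker Python SDK; the role ARN, S3 paths, and hyperparameter values are placeholders, not recommendations.

```python
# Sketch: training the SageMaker built-in k-NN algorithm with its
# native dimension reduction enabled. All paths/values are placeholders.
import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerRole"  # placeholder role

# Resolve the built-in k-NN container image for the current region
knn_image = image_uris.retrieve("knn", session.boto_region_name)

estimator = Estimator(
    image_uri=knn_image,
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://my-bucket/knn-output",  # placeholder bucket
    sagemaker_session=session,
)

estimator.set_hyperparameters(
    feature_dim=200,                  # original number of features
    k=10,                             # neighbors used per prediction
    sample_size=50000,                # training points sampled for the index
    predictor_type="classifier",      # classify vegetable types
    dimension_reduction_type="sign",  # random projection ("sign" or "fjlt")
    dimension_reduction_target=64,    # reduced dimension, saving memory
)

# Train channel accepts CSV (first column = label) or recordIO-protobuf
estimator.fit({"train": TrainingInput("s3://my-bucket/knn-train.csv",
                                      content_type="text/csv")})
```

At inference time, a model trained with predictor_type="classifier" returns the predicted class, and the neighbor search backing it is what satisfies the "find similar data points" requirement.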

Comments

Carpediem78
2 months, 1 week ago
Selected Answer: A
The key requirement: "to be able to find similar data points for each test data point."
upvoted 2 times
sheetalconect
10 months, 2 weeks ago
Selected Answer: C
K-means is unsupervised learning.
upvoted 1 times
ddaanndann
1 year ago
It is A.
Memory efficiency: K-nearest neighbors (k-NN) doesn't require storing a model with learned parameters, as it's an instance-based learning algorithm: it simply memorizes the training dataset. Therefore, it saves on memory costs compared to models with learned parameters like linear learners.
Dimension reduction: By employing dimension reduction techniques like principal component analysis (PCA) in conjunction with k-NN, you can reduce the dimensionality of the dataset, which helps in saving memory costs. This makes k-NN with dimension reduction a suitable choice when memory efficiency is a concern.
Similar data points: K-nearest neighbors naturally provides a measure of similarity between data points. Given a test data point, it finds the k nearest neighbors in the training data. This fulfills the requirement of being able to find similar data points for each test data point.
upvoted 1 times
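To make the pattern in the comment above concrete, here is a minimal scikit-learn sketch of PCA followed by k-NN (outside SageMaker); the synthetic dataset and all parameter values are assumptions for illustration only.

```python
# Illustrative sketch of the PCA + k-NN pattern; data is synthetic.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for a high-dimensional vegetable feature set
X, y = make_classification(n_samples=2000, n_features=200,
                           n_informative=40, n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Reduce 200 features to 20 components, then classify by nearest neighbors
model = make_pipeline(PCA(n_components=20),
                      KNeighborsClassifier(n_neighbors=10))
model.fit(X_train, y_train)
print("accuracy:", model.score(X_test, y_test))

# k-NN also exposes the "similar data points" requirement directly:
knn = model.named_steps["kneighborsclassifier"]
reduced = model.named_steps["pca"].transform(X_test[:1])
dist, idx = knn.kneighbors(reduced)
print("indices of nearest training points:", idx)
```

The SageMaker built-in algorithm folds a comparable reduction step (random projection rather than PCA) into training itself via its dimension_reduction_* hyperparameters.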
JonSno
1 year ago
A. K-nearest neighbors (k-NN) with dimension reduction. k-NN handles the classification task for the different types of veggies based on their features, and dimensionality reduction such as PCA can be applied prior to k-NN to reduce the number of features in the dataset, thereby saving memory costs during training and model deployment. It also removes noise and data redundancy.
upvoted 1 times
Gmishra
1 year ago
Selected Answer: C
https://www.linkedin.com/advice/3/what-difference-between-knn-k-means-skills-computer-science-cx1hc
upvoted 1 times
AIWave
1 year, 1 month ago
Selected Answer: C
This is an unsupervised clustering problem, not a classification one (A). K-means is a better choice from a memory-efficiency perspective.
upvoted 2 times
akgarg00
1 year, 4 months ago
Selected Answer: C
While A is the most voted answer, k-NN is actually heavy on memory usage, as it stores the training data points to make predictions. Voting for it just because it mentions dimensionality reduction seems obtuse. C is the next most probable candidate that fits the bill on every count.
upvoted 2 times
DimLam
1 year, 6 months ago
Selected Answer: A
Should be A, as only A can be used for classification, finding similar data points, and dimensionality reduction.
upvoted 2 times
loict
1 year, 7 months ago
Selected Answer: A
A. YES - k-NN will find the similar data points, and dimension reduction will save memory.
B. NO - Linear learner is for regression or classification, not finding similar data points.
C. NO - K-means is for unsupervised clustering, not finding the closest data points.
D. NO - Principal component analysis (PCA) with the algorithm mode set to random only reduces dimensions; on its own it cannot classify or find similar data points.
upvoted 2 times
kaike_reis
1 year, 8 months ago
Selected Answer: A
C doesn't solve the "too many features" problem, and the vegetable classes are well defined. A is the way.
upvoted 1 times
blanco750
2 years, 1 month ago
Selected Answer: A
k-NN with dimension reduction to cut dimensionality, which may help reduce memory utilisation.
upvoted 2 times
Valcilio
2 years, 1 month ago
Selected Answer: A
It's A; they need to reduce the dimensionality of the dataset.
upvoted 1 times
Chelseajcole
2 years, 1 month ago
Selected Answer: A
They want fewer features.
upvoted 2 times
lmimi
2 years, 2 months ago
I will go with A. C is not valid, as K-means is a clustering algorithm that can group similar data points together. However, it does not perform classification, and it is not clear how it addresses the memory cost and similarity search requirements mentioned in the question.
upvoted 1 times
Amit11011996
2 years, 2 months ago
Selected Answer: C
It should be C, because it is an unsupervised classification problem.
upvoted 1 times
DimLam
1 year, 6 months ago
Not sure that classification is an unsupervised problem.
upvoted 1 times
AjoseO
2 years, 2 months ago
Selected Answer: A
Option A suggests using the k-nearest neighbors (k-NN) algorithm with dimension reduction. The k-NN algorithm can be used for classification tasks, and dimension reduction can help reduce memory costs. Additionally, k-NN is a simple algorithm that works well with high-dimensional data and naturally finds similar data points.
upvoted 2 times
blanco750
2 years, 1 month ago
Agree. By reducing the number of dimensions, you may achieve comparable analysis results using less memory and in a shorter amount of time.
upvoted 1 times
drcok87
2 years, 2 months ago
https://ealizadeh.com/blog/knn-and-kmeans/
k-NN does use a lot of memory because, as a lazy learner, it stores (memorizes) the training dataset. AWS SageMaker, however, has an improved version of this algorithm. Because the question does not mention that we have labels, we cannot use supervised learning. K-means is unsupervised and helps us classify different vegetables based on their many features; it also finds "similar data points for each test data point". C
upvoted 1 times
Community vote distribution: A (35%), C (25%), B (20%), Other