Exam AWS Certified Machine Learning - Specialty topic 1 question 20 discussion

An interactive online dictionary wants to add a widget that displays words used in similar contexts. A Machine Learning Specialist is asked to provide word features for the downstream nearest neighbor model powering the widget.
What should the Specialist do to meet these requirements?

  • A. Create one-hot word encoding vectors.
  • B. Produce a set of synonyms for every word using Amazon Mechanical Turk.
  • C. Create word embedding vectors that store edit distance with every other word.
  • D. Download word embeddings pre-trained on a large corpus.
Suggested Answer: D 🗳️

Comments

JayK
Highly Voted 3 years, 6 months ago
The solution is word embedding. As it is an interactive online dictionary, we need pre-trained word embeddings, thus the answer is D. In addition, there is no mention that the online dictionary is unique and does not have a pre-trained word embedding. Thus I strongly feel the answer is D.
upvoted 31 times
...
cybe001
Highly Voted 3 years, 6 months ago
D is correct. It is not a specialized dictionary so use the existing word corpus to train the model
upvoted 16 times
...
JonSno
Most Recent 2 months, 1 week ago
Selected Answer: D
D. Download word embeddings pre-trained on a large corpus. Reason: for a nearest neighbor model that finds words used in similar contexts, word embeddings are the best choice. Pre-trained word embeddings capture semantic relationships and contextual similarity between words based on a large text corpus (e.g., Wikipedia, Common Crawl). The Specialist should:
  • Use pre-trained word embeddings such as Word2Vec, GloVe, or FastText.
  • Load the embeddings into the model for efficient similarity comparisons.
  • Use a nearest neighbor search algorithm (e.g., FAISS, k-d tree, Annoy) to quickly find similar words.
upvoted 1 times
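[Editor's note] A minimal sketch of the approach this comment describes, assuming the gensim package and its bundled downloader are available; the dataset name and API calls are gensim's, not anything specified by the question.

```python
import gensim.downloader as api

# 100-dimensional GloVe vectors pre-trained on Wikipedia + Gigaword
# (a one-time download on first use).
vectors = api.load("glove-wiki-gigaword-100")

# "Words used in similar contexts" is just a nearest-neighbor query
# (cosine similarity) in the embedding space.
print(vectors.most_similar("dictionary", topn=5))
```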
...
AjoseO
7 months ago
Selected Answer: D
D. Download word embeddings pre-trained on a large corpus. Word embeddings are a type of dense representation of words, which encode semantic meaning in a vector form. These embeddings are typically pre-trained on a large corpus of text data, such as a large set of books, news articles, or web pages, and capture the context in which words are used. Word embeddings can be used as features for a nearest neighbor model, which can be used to find words used in similar contexts. Downloading pre-trained word embeddings is a good way to get started quickly and leverage the strengths of these representations, which have been optimized on a large amount of data. This is likely to result in more accurate and reliable features than other options like one-hot encoding, edit distance, or using Amazon Mechanical Turk to produce synonyms.
upvoted 6 times
...
loict
7 months ago
Selected Answer: D
  • A. NO - one-hot encoding is a very early featurization stage
  • B. NO - we don't want human labelling
  • C. NO - too costly to do from scratch
  • D. YES - leverage existing training; the word embeddings will provide vectors that can be used to measure distance in the downstream nearest neighbor model
upvoted 3 times
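[Editor's note] A rough sketch of that last point (embedding vectors feeding a downstream nearest-neighbor model), assuming scikit-learn; the vocabulary and embedding matrix below are random placeholders standing in for the real downloaded vectors.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Placeholder vocabulary and embedding matrix; in practice these come from
# the downloaded pre-trained vectors (option D), one row per word.
words = ["cat", "dog", "car", "truck", "banana"]
embeddings = np.random.rand(len(words), 100)

# Fit a cosine-distance nearest-neighbor index over the embedding vectors.
index = NearestNeighbors(n_neighbors=3, metric="cosine").fit(embeddings)

# The widget shows the nearest words to the word the user is viewing.
distances, indices = index.kneighbors(embeddings[[words.index("cat")]])
print([words[i] for i in indices[0]])  # "cat" itself plus its closest neighbors
```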
...
game_changer
7 months ago
Selected Answer: D
Pre-trained word embeddings, such as Word2Vec, GloVe, or FastText, capture the semantic and contextual meaning of words based on a large corpus of text data. By downloading pre-trained word embeddings, the Specialist can leverage the semantic relationships between words to provide meaningful word features for the nearest neighbor model powering the widget. Utilizing pre-trained word embeddings allows the model to understand and display words used in similar contexts effectively.
upvoted 2 times
...
game_changer
7 months ago
Selected Answer: D
  • A. One-hot word encoding vectors: these represent words by marking them as present or absent in a fixed-length binary vector, but they don't capture relationships between words or their meanings.
  • B. Producing synonyms: generating similar words for each word manually would be time-consuming and might not cover all possible contexts.
  • C. Word embedding vectors based on edit distance: this approach focuses on how similar words are in terms of their spelling or characters, not their meaning or context in sentences.
  • D. Downloading pre-trained word embeddings: these are vectors that represent words based on their contextual usage in a large dataset, capturing relationships between words and their meanings.
upvoted 5 times
...
elvin_ml_qayiran25091992razor
1 year, 5 months ago
Selected Answer: D
Correct: D - this one is obvious.
upvoted 1 times
...
sonoluminescence
1 year, 6 months ago
Selected Answer: D
words that are used in similar contexts will have vectors that are close in the embedding space
upvoted 1 times
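[Editor's note] A toy illustration of this point; the three-dimensional vectors below are invented for readability (real embeddings are typically 100-300 dimensions), but the mechanic, cosine similarity in the embedding space, is the same.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity: values near 1.0 mean the vectors point the same way."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up 3-d "embeddings"; not real vectors, just for illustration.
physician = np.array([0.8, 0.1, 0.3])
doctor    = np.array([0.7, 0.2, 0.4])
banana    = np.array([0.0, 0.9, 0.1])

print(cosine(physician, doctor))  # high: used in similar contexts
print(cosine(physician, banana))  # low: unrelated contexts
```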
...
Mickey321
1 year, 8 months ago
Selected Answer: D
D is correct
upvoted 1 times
...
DavidRou
1 year, 9 months ago
I also believe that D is the correct answer. No reason to create word embeddings from scratch
upvoted 1 times
...
ortamina
1 year, 9 months ago
Selected Answer: D
  1. One-hot encoding will blow up the feature space - it is not recommended for a high-cardinality problem domain.
  2. One still needs to train the word features on large bodies of text to map context to each word.
upvoted 1 times
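[Editor's note] A small sketch of point 1; the vocabulary size below is an assumed figure, but it shows why one-hot vectors are a poor fit for a nearest-neighbor widget: every pair of distinct words is equally (and maximally) dissimilar.

```python
import numpy as np

vocab_size, embedding_dim = 170_000, 100   # assumed dictionary size vs. typical embedding size

cat = np.zeros(vocab_size); cat[0] = 1.0   # one-hot vector for "cat"
dog = np.zeros(vocab_size); dog[1] = 1.0   # one-hot vector for "dog"

# The dot product (and hence cosine similarity) between any two distinct
# one-hot vectors is always 0, so nearest-neighbor search learns nothing
# about context from them.
print(float(cat @ dog))                    # 0.0
print(f"{vocab_size} one-hot dims vs {embedding_dim} embedding dims")
```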
...
Shailendraa
2 years, 7 months ago
12-sep exam
upvoted 1 times
...
helpaws
2 years, 8 months ago
Selected Answer: D
DDDDDDDDDDDDD
upvoted 3 times
...
engomaradel
3 years, 5 months ago
D for sure
upvoted 2 times
...
yeetusdeleetus
3 years, 5 months ago
Definitely D.
upvoted 3 times
...
weslleylc
3 years, 5 months ago
A) It requires that document text be cleaned and prepared such that each word is one-hot encoded. Ref: https://machinelearningmastery.com/what-are-word-embeddings/
upvoted 1 times
...