Exam AWS Certified Machine Learning - Specialty topic 1 question 20 discussion

An interactive online dictionary wants to add a widget that displays words used in similar contexts. A Machine Learning Specialist is asked to provide word features for the downstream nearest neighbor model powering the widget.
What should the Specialist do to meet these requirements?

  • A. Create one-hot word encoding vectors.
  • B. Produce a set of synonyms for every word using Amazon Mechanical Turk.
  • C. Create word embedding vectors that store edit distance with every other word.
  • D. Download word embeddings pre-trained on a large corpus.
Suggested Answer: D 🗳️

Comments

JayK
Highly Voted 3 years, 6 months ago
The solution is word embedding. As it is an interactive online dictionary, we need pre-trained word embeddings, thus the answer is D. In addition, there is no mention that the online dictionary is unique and does not have a pre-trained word embedding. Thus I strongly feel the answer is D.
upvoted 31 times
...
cybe001
Highly Voted 3 years, 6 months ago
D is correct. It is not a specialized dictionary so use the existing word corpus to train the model
upvoted 16 times
...
JonSno
Most Recent 2 months, 1 week ago
Selected Answer: D
D. Download word embeddings pre-trained on a large corpus. Reason: for a nearest neighbor model that finds words used in similar contexts, word embeddings are the best choice. Pre-trained word embeddings capture semantic relationships and contextual similarity between words based on a large text corpus (e.g., Wikipedia, Common Crawl). The Specialist should:
  • Use pre-trained word embeddings such as Word2Vec, GloVe, or FastText.
  • Load the embeddings into the model for efficient similarity comparisons.
  • Use a nearest neighbor search algorithm (e.g., FAISS, k-d tree, Annoy) to quickly find similar words.
upvoted 1 times
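[Editor's note] A minimal sketch of the approach this comment describes, assuming the gensim package and its bundled downloader are available; the dataset name and API calls are gensim's, not anything specified by the question.

```python
import gensim.downloader as api

# 100-dimensional GloVe vectors pre-trained on Wikipedia + Gigaword
# (a one-time download on first use).
vectors = api.load("glove-wiki-gigaword-100")

# "Words used in similar contexts" is just a nearest-neighbor query
# (cosine similarity) in the embedding space.
print(vectors.most_similar("dictionary", topn=5))
```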
...
AjoseO
7 months ago
Selected Answer: D
D. Download word embeddings pre-trained on a large corpus. Word embeddings are a type of dense representation of words, which encode semantic meaning in a vector form. These embeddings are typically pre-trained on a large corpus of text data, such as a large set of books, news articles, or web pages, and capture the context in which words are used. Word embeddings can be used as features for a nearest neighbor model, which can be used to find words used in similar contexts. Downloading pre-trained word embeddings is a good way to get started quickly and leverage the strengths of these representations, which have been optimized on a large amount of data. This is likely to result in more accurate and reliable features than other options like one-hot encoding, edit distance, or using Amazon Mechanical Turk to produce synonyms.
upvoted 6 times
...
loict
7 months ago
Selected Answer: D
  • A. NO - one-hot encoding is a very early featurization stage
  • B. NO - we don't want human labelling
  • C. NO - too costly to do from scratch
  • D. YES - leverage existing training; the word embeddings will provide vectors that can be used to measure distance in the downstream nearest neighbor model
upvoted 3 times
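[Editor's note] A rough sketch of that last point (embedding vectors feeding a downstream nearest-neighbor model), assuming scikit-learn; the vocabulary and embedding matrix below are random placeholders standing in for the real downloaded vectors.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Placeholder vocabulary and embedding matrix; in practice these come from
# the downloaded pre-trained vectors (option D), one row per word.
words = ["cat", "dog", "car", "truck", "banana"]
embeddings = np.random.rand(len(words), 100)

# Fit a cosine-distance nearest-neighbor index over the embedding vectors.
index = NearestNeighbors(n_neighbors=3, metric="cosine").fit(embeddings)

# The widget shows the nearest words to the word the user is viewing.
distances, indices = index.kneighbors(embeddings[[words.index("cat")]])
print([words[i] for i in indices[0]])  # "cat" itself plus its closest neighbors
```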
...
game_changer
7 months ago
Selected Answer: D
Pre-trained word embeddings, such as Word2Vec, GloVe, or FastText, capture the semantic and contextual meaning of words based on a large corpus of text data. By downloading pre-trained word embeddings, the Specialist can leverage the semantic relationships between words to provide meaningful word features for the nearest neighbor model powering the widget. Utilizing pre-trained word embeddings allows the model to understand and display words used in similar contexts effectively.
upvoted 2 times
...
game_changer
7 months ago
Selected Answer: D
  • A. One-hot word encoding vectors: these represent words by marking them as present or absent in a fixed-length binary vector, but they don't capture relationships between words or their meanings.
  • B. Producing synonyms: generating similar words for each word manually would be time-consuming and might not cover all possible contexts.
  • C. Word embedding vectors based on edit distance: this approach focuses on how similar words are in terms of their spelling or characters, not their meaning or context in sentences.
  • D. Downloading pre-trained word embeddings: these are vectors that represent words based on their contextual usage in a large dataset, capturing relationships between words and their meanings.
upvoted 5 times
...
elvin_ml_qayiran25091992razor
1 year, 5 months ago
Selected Answer: D
Correct: D - this one is obvious.
upvoted 1 times
...
sonoluminescence
1 year, 6 months ago
Selected Answer: D
words that are used in similar contexts will have vectors that are close in the embedding space
upvoted 1 times
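[Editor's note] A toy illustration of this point; the three-dimensional vectors below are invented for readability (real embeddings are typically 100-300 dimensions), but the mechanic, cosine similarity in the embedding space, is the same.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity: values near 1.0 mean the vectors point the same way."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up 3-d "embeddings"; not real vectors, just for illustration.
physician = np.array([0.8, 0.1, 0.3])
doctor    = np.array([0.7, 0.2, 0.4])
banana    = np.array([0.0, 0.9, 0.1])

print(cosine(physician, doctor))  # high: used in similar contexts
print(cosine(physician, banana))  # low: unrelated contexts
```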
...
Mickey321
1 year, 8 months ago
Selected Answer: D
D is correct
upvoted 1 times
...
DavidRou
1 year, 9 months ago
I also believe that D is the correct answer. No reason to create word embeddings from scratch
upvoted 1 times
...
ortamina
1 year, 9 months ago
Selected Answer: D
  1. One-hot encoding will blow up the feature space - it is not recommended for a high-cardinality problem domain.
  2. One still needs to train the word features on large bodies of text to map context to each word.
upvoted 1 times
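[Editor's note] A small sketch of point 1; the vocabulary size below is an assumed figure, but it shows why one-hot vectors are a poor fit for a nearest-neighbor widget: every pair of distinct words is equally (and maximally) dissimilar.

```python
import numpy as np

vocab_size, embedding_dim = 170_000, 100   # assumed dictionary size vs. typical embedding size

cat = np.zeros(vocab_size); cat[0] = 1.0   # one-hot vector for "cat"
dog = np.zeros(vocab_size); dog[1] = 1.0   # one-hot vector for "dog"

# The dot product (and hence cosine similarity) between any two distinct
# one-hot vectors is always 0, so nearest-neighbor search learns nothing
# about context from them.
print(float(cat @ dog))                    # 0.0
print(f"{vocab_size} one-hot dims vs {embedding_dim} embedding dims")
```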
...
Shailendraa
2 years, 7 months ago
12-sep exam
upvoted 1 times
...
helpaws
2 years, 8 months ago
Selected Answer: D
DDDDDDDDDDDDD
upvoted 3 times
...
engomaradel
3 years, 5 months ago
D for sure
upvoted 2 times
...
yeetusdeleetus
3 years, 5 months ago
Definitely D.
upvoted 3 times
...
weslleylc
3 years, 5 months ago
A) It requires that document text be cleaned and prepared such that each word is one-hot encoded. Ref: https://machinelearningmastery.com/what-are-word-embeddings/
upvoted 1 times
...