Exam AWS Certified Machine Learning - Specialty topic 1 question 207 discussion

A company is building an application that can predict spam email messages based on email text. The company can generate a few thousand human-labeled examples, each consisting of an email message and a label of "spam" or "not spam". A machine learning (ML) specialist wants to use transfer learning with a Bidirectional Encoder Representations from Transformers (BERT) model that is trained on English Wikipedia text data.

What should the ML specialist do to initialize the model so that it can be fine-tuned with the custom data?

  • A. Initialize the model with pretrained weights in all layers except the last fully connected layer.
  • B. Initialize the model with pretrained weights in all layers. Stack a classifier on top of the first output position. Train the classifier with the labeled data.
  • C. Initialize the model with random weights in all layers. Replace the last fully connected layer with a classifier. Train the classifier with the labeled data.
  • D. Initialize the model with pretrained weights in all layers. Replace the last fully connected layer with a classifier. Train the classifier with the labeled data.
Suggested Answer: D
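For concreteness, here is a minimal sketch of the suggested answer's approach using the Hugging Face Transformers library. The "bert-base-uncased" checkpoint, the sample text, and the label mapping are illustrative assumptions, not part of the question: all layers are initialized with pretrained weights, the pretraining head is replaced by a randomly initialized two-class classifier, and that classifier is trained on the labeled data.

```python
import torch
from transformers import BertForSequenceClassification, BertTokenizer

# Pretrained weights in all layers; the masked-language-modeling head is
# dropped and a randomly initialized 2-class classification head is attached.
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased",  # assumed checkpoint; the question only says Wikipedia-trained BERT
    num_labels=2,         # "spam" / "not spam"
)
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# One labeled example (text and label mapping are illustrative).
inputs = tokenizer("Win a free prize now!!!", return_tensors="pt")
labels = torch.tensor([1])  # assume 1 = spam
loss = model(**inputs, labels=labels).loss
loss.backward()  # fine-tune on the custom labeled data
```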

Comments

2eb8df0
1 week ago
Selected Answer: B
It's B. The [CLS] token (first position) represents the embedding of the entire sentence; doing classification on top of this token makes the most sense.
upvoted 1 times
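Answer B's approach, stacking a classifier on the first output position (the [CLS] token) while keeping pretrained weights in all layers, can be sketched as follows. This is an illustrative sketch with Hugging Face Transformers; the checkpoint name, head size, and sample text are assumptions.

```python
import torch
from transformers import BertModel, BertTokenizer

bert = BertModel.from_pretrained("bert-base-uncased")      # pretrained weights in all layers
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
classifier = torch.nn.Linear(bert.config.hidden_size, 2)   # new head: spam / not spam

inputs = tokenizer("Claim your reward today", return_tensors="pt")
hidden = bert(**inputs).last_hidden_state  # shape: (batch, seq_len, hidden_size)
cls_vector = hidden[:, 0, :]               # first output position = [CLS] token
logits = classifier(cls_vector)            # train this head on the labeled data
```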
giustino98
4 months, 2 weeks ago
Selected Answer: B
I don't see why everyone is voting for D. To fine-tune BERT you should add a classifier on top of the [CLS] token's hidden state, so it's not clear to me what the question means by "last fully connected layer".
upvoted 3 times
teka112233
6 months ago
Selected Answer: D
D is the right option. By initializing the model with pretrained weights, the model can leverage the knowledge learned from a large corpus of text data, such as English Wikipedia, to improve its performance on a specific task such as spam email classification. Replacing the last fully connected layer with a classifier is necessary because the last layer of BERT is designed for predicting masked words in a sentence, which is a different task from spam email classification.
upvoted 2 times
loict
6 months, 1 week ago
Selected Answer: B
A. NO - the last fully connected layer will not do softmax classification.
B. YES - the output of BERT (the embeddings) can be used as input to a classifier.
C. NO - random weights would discard what was learned in pretraining.
D. NO - we don't want to lose the pretrained embeddings; "cutting the head off" (replacing the last layer) is for learning different classes than the model was trained on, but here we want to augment the model.
upvoted 1 times
teka112233
6 months ago
You should consider that stacking a classifier on top of the first output position and training it with labeled data is not recommended, because it does not take advantage of the knowledge learned from pretraining on a large corpus of text data.
upvoted 3 times
Mickey321
7 months ago
Selected Answer: D
D, although I was leaning towards B.
upvoted 1 times
Mickey321
7 months ago
On second thought, I'm going for B.
upvoted 2 times
kaike_reis
7 months, 1 week ago
Selected Answer: D
Cut the Head Off
upvoted 1 times
blanco750
1 year ago
Selected Answer: D
D seems correct
upvoted 1 times
rrshah83
1 year, 2 months ago
Selected Answer: D
D is a best practice
upvoted 4 times
BoroJohn
1 year, 3 months ago
Is B correct? See https://www.analyticsvidhya.com/blog/2020/07/transfer-learning-for-nlp-fine-tuning-bert-for-text-classification/ : "Freeze the entire architecture: we can even freeze all the layers of the model, attach a few neural network layers of our own, and train this new model. Note that the weights of only the attached layers will be updated during model training."
upvoted 2 times
kaike_reis
7 months, 1 week ago
You would have two classifiers stacked, so your predictions would be based on the other classifier.
upvoted 1 times
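The freezing strategy quoted above (train only the attached layers) would look roughly like this in PyTorch with Hugging Face Transformers; the checkpoint name and learning rate are assumptions.

```python
import torch
from transformers import BertModel

bert = BertModel.from_pretrained("bert-base-uncased")
for param in bert.parameters():
    param.requires_grad = False  # freeze every pretrained layer

classifier = torch.nn.Linear(bert.config.hidden_size, 2)
# Only the attached classifier's weights receive gradient updates.
optimizer = torch.optim.AdamW(classifier.parameters(), lr=1e-3)
```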
dunhill
1 year, 3 months ago
I think the answer is D.
upvoted 4 times
Community vote distribution: A (35%), C (25%), B (20%), Other