Welcome to ExamTopics

Exam AWS Certified Machine Learning - Specialty All Questions


Exam AWS Certified Machine Learning - Specialty topic 1 question 25 discussion

A company wants to classify user behavior as either fraudulent or normal. Based on internal research, a Machine Learning Specialist would like to build a binary classifier based on two features: age of account and transaction month. The class distribution for these features is illustrated in the figure provided.

Based on this information, which model would have the HIGHEST recall with respect to the fraudulent class?

  • A. Decision tree
  • B. Linear support vector machine (SVM)
  • C. Naive Bayesian classifier
  • D. Single Perceptron with sigmoidal activation function
Suggested Answer: A
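For reference, recall with respect to the fraudulent class is TP / (TP + FN): the fraction of actual fraud cases the model catches, regardless of how many normal cases it mislabels. A minimal sketch, using invented labels (not taken from the question's figure):

```python
# Recall = TP / (TP + FN) for the positive ("fraud") class.
# The labels and predictions below are made-up examples for illustration.

def recall(y_true, y_pred, positive="fraud"):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0

y_true = ["fraud", "fraud", "fraud", "normal", "normal"]
y_pred = ["fraud", "fraud", "normal", "fraud", "normal"]
print(recall(y_true, y_pred))  # 2 of 3 fraud cases caught -> 0.666...
```

Note that the false positive on the fourth example does not affect recall at all; it would only lower precision.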

Comments

E_aws
Highly Voted 3 years ago
C is the correct answer because Gaussian naive Bayes can do this nicely.
upvoted 12 times
E_aws
3 years ago
Of course the question doesn't mention Gaussian here and refers to naive Bayes in general, but I'm still confident in C.
upvoted 1 times
blubb
Highly Voted 3 years ago
The answer should be A. B (linear SVM), C (naive Bayes), and D (single perceptron) all have a linear decision boundary (just a line y = mx + b). That leads to poor recall here, so A must be the right choice.
upvoted 9 times
MintTeaClarity
Most Recent 4 days, 3 hours ago
Selected Answer: A
A non-linear problem is one where linear classifiers, such as naive Bayes, are not suitable because the classes are not linearly separable. In such a scenario, non-linear classifiers (e.g., instance-based nearest-neighbour classifiers) should be preferred.
upvoted 1 times
egorkrash
3 weeks, 5 days ago
Selected Answer: A
A decision tree can effectively maximize recall by drawing a square (3 <= month <= 7, 3 <= age <= 7).
upvoted 2 times
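The axis-aligned rectangle rule described above can be sketched in pure Python. The points below are invented for illustration, since the question's figure isn't reproduced here:

```python
# Hypothetical sketch of the axis-aligned rule a decision tree could learn:
# predict "fraud" when 3 <= month <= 7 and 3 <= age <= 7.
# The points are invented; the real class distribution is in the exam's figure.

def tree_rule(month, age):
    return "fraud" if 3 <= month <= 7 and 3 <= age <= 7 else "normal"

# invented fraud cluster inside the square, normal points outside it
fraud = [(4, 4), (5, 6), (6, 5), (3, 7)]
normal = [(1, 1), (9, 9), (2, 8), (10, 2)]

caught = sum(tree_rule(m, a) == "fraud" for m, a in fraud)
print(f"recall on fraud cluster: {caught}/{len(fraud)}")  # all 4 inside -> 4/4
```

A decision tree builds exactly this kind of boundary from nested axis-parallel splits, which is why it can wrap a central cluster that no single straight line can isolate.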
MultiCloudIronMan
1 month ago
Selected Answer: A
Option C, the Naive Bayesian classifier, is not the best choice for achieving the highest recall on the fraudulent class because it makes strong assumptions about the independence of features. In many real-world scenarios, especially with complex data like user behavior, these assumptions do not hold, which can lead to suboptimal performance. In contrast, a decision tree (Option A) can handle feature interactions and is more flexible in capturing the relationships between features, making it more effective at identifying fraudulent behavior and achieving higher recall.
upvoted 1 times
ML_2
3 months, 1 week ago
Selected Answer: A
In my opinion the answer is A. A Decision Tree Classifier can handle complex decision boundaries and does not assume any particular distribution of the data. It is well suited for cases like this, where the decision boundary is non-linear, as seen in the clear separation between the normal and fraudulent transactions. A Naive Bayesian classifier, on the other hand, assumes independence among features and typically performs better when the data is normally distributed, which might not be the case here given the data's clustering pattern.
upvoted 1 times
ninomfr64
5 months ago
Selected Answer: C
From Claude 3 Haiku: A. NO, decision trees may struggle to capture the linear separability of the classes. B. NO, a linear SVM may not be able to fully exploit the class separation due to its linear decision boundary. C. YES, the Naive Bayesian classifier tends to perform well in situations where the classes are linearly separable; this model requires the features to be independent, which is the case here. D. NO, a single perceptron with a sigmoidal activation function may not capture the complex class distributions as effectively as the Naive Bayesian classifier.
upvoted 1 times
GrumpyApple
23 hours, 47 minutes ago
Funny that if you ask Haiku to explain its reasoning step by step, it will choose A instead of C: ``` Based on the information provided, the model that is likely to have the highest recall with respect to the fraudulent class is the **Decision Tree**. ```
upvoted 1 times
iambasspaul
7 months, 1 week ago
Selected Answer: C
Answer by Claude3: In contrast, the Decision Tree (A) and Linear SVM (B) models are generally more robust to overfitting and can achieve a better balance between recall and precision, but they may not necessarily have the highest recall for the minority class. Considering the importance of maximizing recall for the fraudulent class in this use case, the Naive Bayesian Classifier (C) could be a valid choice, although it may come with the trade-off of lower precision and potentially higher false positive rates.
upvoted 1 times
rav009
8 months, 3 weeks ago
highest recall. So A
upvoted 1 times
notbother123
9 months, 1 week ago
Selected Answer: A
Only A (DT) is non-linear among the mentioned algorithms.
upvoted 1 times
kyuhuck
9 months, 2 weeks ago
Selected Answer: A
Given the visualized data, the Decision tree (Option A) is likely the best model to achieve the highest recall for the fraudulent class. It can handle complex patterns and create rules that are more suited for clustered and potentially non-linearly separable classes. Recall is a measure of a model's ability to capture all actual positives, and a decision tree can be tuned to prioritize capturing more of the fraudulent cases at the expense of making more false-positive errors on the normal cases.
upvoted 1 times
phdykd
10 months, 2 weeks ago
If it were highest precision: given these considerations, the best model for precision would likely be a support vector machine with a non-linear kernel, such as the RBF (radial basis function) kernel. This model can tightly fit the boundary around the fraudulent class, minimizing the inclusion of normal transactions in the fraudulent prediction space, and thus potentially achieving high precision. Precision is sensitive to false positives, and the flexibility of SVMs with non-linear kernels to create a tight, precise boundary helps minimize these.
upvoted 1 times
phdykd
10 months, 2 weeks ago
GPT 4 Answer is Decision Tree. Considering the goal is to achieve the highest recall for the fraudulent class, which means we aim to capture as many fraudulent cases as possible even if it means getting more false positives, a Decision Tree would likely be the best option. This is because it can adapt to the complex shape of the class distribution and encapsulate the majority of the fraudulent class within its decision boundaries. Recall is a measure of a model's ability to capture all actual positives, and the decision tree's complex boundary setting capabilities make it well-suited for maximizing recall in this case.
upvoted 2 times
taustin2
11 months, 3 weeks ago
Selected Answer: A
I'm going with A. As pointed out in this article, Naive Bayes performs poorly with non-linear classification problems. The picture shows a case where the classes are not linearly separable. Decision Tree will probably give better results. https://sebastianraschka.com/Articles/2014_naive_bayes_1.html
upvoted 3 times
akgarg00
1 year ago
Selected Answer: A
Highest recall for the fraudulent class means that precision for fraudulent predictions can be low. So basically just two conditions (transaction month greater than about 8 and age of account greater than about 8) can identify the fraudulent class, but they will classify most of the non-fraudulent cases as fraudulent.
upvoted 2 times
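The trade-off described above, where recall is driven up at the cost of precision, can be sketched with a deliberately loose rule on invented points (again, not the question's actual data):

```python
# Hypothetical sketch: a deliberately loose rule flags every transaction as
# fraud. Recall hits 1.0, but precision collapses. All points are invented.

def loose_rule(month, age):
    return "fraud"  # flag everything as fraudulent

data = [((4, 4), "fraud"), ((5, 6), "fraud"),
        ((1, 1), "normal"), ((9, 9), "normal"), ((2, 8), "normal")]

tp = sum(loose_rule(*x) == "fraud" and y == "fraud" for x, y in data)
fn = sum(loose_rule(*x) != "fraud" and y == "fraud" for x, y in data)
fp = sum(loose_rule(*x) == "fraud" and y == "normal" for x, y in data)

print("recall:", tp / (tp + fn))     # 1.0 -> every fraud case caught
print("precision:", tp / (tp + fp))  # 0.4 -> most flags are false alarms
```

This is why "highest recall" questions reward models that can over-cover the positive class; the penalty for false positives shows up only in precision.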
windy9
1 year, 1 month ago
Here it's the recall. If it were for precision, I would have gone with SVM. So the correct answer is C, naive Bayes.
upvoted 1 times
loict
1 year, 2 months ago
Selected Answer: C
A. NO - a decision tree would create boundaries perpendicular to the axes; not great for an oval. B. NO - there is no linear separation here, unless we increase the dimensionality of the space. C. YES - a Naive Bayesian classifier creates clean boundaries (https://martin-thoma.com/comparing-classifiers/). D. NO - it would need many hidden layers (https://medium.com/@amanatulla1606/unraveling-the-magic-how-multilayer-perceptron-captures-non-linearity-in-data-6a4d385f7592)
upvoted 2 times
Community vote distribution: A (35%), C (25%), B (20%), Other