Actual exam question from Google's Professional Machine Learning Engineer
Question #: 153
Topic #: 1

You work for a bank and are building a random forest model for fraud detection. You have a dataset that includes transactions, of which 1% are identified as fraudulent. Which data transformation strategy would likely improve the performance of your classifier?

  • A. Modify the target variable using the Box-Cox transformation.
  • B. Z-normalize all the numeric features.
  • C. Oversample the fraudulent transaction 10 times.
  • D. Log transform all numeric features.
Suggested Answer: C

Comments

Scipione_
Highly Voted 1 year, 8 months ago
Selected Answer: C
The answer is C because it is the only option that addresses the class imbalance. The other options only transform the values:
Box-Cox transformation: transforms values so that their distribution is closer to normal.
Z-normalization: transforms feature values according to x_new = (x - μ) / σ, so {x_new} has mean 0 and standard deviation 1.
Log transform: replaces each value with its logarithm.
Also, random forest is a tree-based model, not a distance-based one, so there is no need for normalization.
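To illustrate the point about tree-based models, here is a minimal sketch in plain Python. It assumes a single threshold split chosen by Gini impurity as a stand-in for one node of a random forest; the transaction amounts, labels, and helper names (`z_normalize`, `gini`, `best_split`) are made up for illustration. Because z-normalization preserves the ordering of the values, the split separates the same rows before and after scaling.

```python
import statistics

def z_normalize(values):
    """x_new = (x - mu) / sigma, giving mean 0 and standard deviation 1."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # population standard deviation
    return [(x - mu) / sigma for x in values]

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(values, labels):
    """Return the row indices sent left by the lowest-impurity threshold split."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    best_score, best_left = None, None
    for k in range(1, len(order)):
        if values[order[k - 1]] == values[order[k]]:
            continue  # no threshold fits between equal values
        left, right = order[:k], order[k:]
        score = (len(left) * gini([labels[i] for i in left])
                 + len(right) * gini([labels[i] for i in right])) / len(order)
        if best_score is None or score < best_score:
            best_score, best_left = score, frozenset(left)
    return best_left

# Made-up transaction amounts and fraud labels, for illustration only.
amounts = [5.0, 12.0, 20.0, 35.0, 950.0, 1200.0]
is_fraud = [0, 0, 0, 0, 1, 1]

scaled = z_normalize(amounts)
same_partition = best_split(amounts, is_fraud) == best_split(scaled, is_fraud)
print(same_partition)
```

The threshold value itself changes after scaling, but the rows on each side of it do not, which is why options B and D would not be expected to change what the trees learn.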
upvoted 5 times
fitri001
Most Recent 6 months, 1 week ago
Selected Answer: C
Oversampling is a common technique to address class imbalance and can significantly improve the performance of the random forest model in fraud detection. It's important to note that oversampling can lead to overfitting, so monitoring the model's performance on unseen data (validation set) is crucial. You might also consider exploring other techniques like undersampling the majority class or using SMOTE (Synthetic Minority Oversampling Technique) for a more balanced approach.
upvoted 3 times
fitri001
6 months, 1 week ago
Class imbalance: The dataset is heavily imbalanced, with only 1% of transactions being fraudulent (the minority class). Random forest models can be biased towards the majority class during training.
Oversampling: Oversampling replicates instances from the minority class, in this case the fraudulent transactions. Increasing the representation of the fraudulent class (10 times in this scenario) exposes the model to more examples of fraud, improving its ability to learn and detect fraudulent patterns.
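As a rough sketch of what "oversample 10 times" means in practice, the plain-Python example below repeats every fraudulent row 10 times. The toy data and the helper name `oversample_minority` are assumptions for illustration, not part of the question. In a real pipeline this would normally be applied to the training split only, so that duplicated rows do not leak into the validation set.

```python
import random

def oversample_minority(rows, labels, minority_label=1, factor=10):
    """Return new lists in which each minority row appears `factor` times."""
    out_rows, out_labels = [], []
    for row, label in zip(rows, labels):
        repeats = factor if label == minority_label else 1
        out_rows.extend([row] * repeats)
        out_labels.extend([label] * repeats)
    return out_rows, out_labels

# Toy training set (made up): 1,000 transactions, 1% fraudulent.
rng = random.Random(0)
rows = [[rng.random(), rng.random()] for _ in range(1000)]
labels = [1] * 10 + [0] * 990

new_rows, new_labels = oversample_minority(rows, labels)
fraud_share = sum(new_labels) / len(new_labels)
print(len(new_labels), round(fraud_share, 4))  # 1090 0.0917
```

With 10 fraudulent rows out of 1,000, replication gives 100 fraudulent rows out of 1,090, so the fraud share rises from 1% to about 9.2%. The classes are still not balanced, but the minority class carries far more weight during training.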
upvoted 1 times
pinimichele01
6 months, 3 weeks ago
Selected Answer: C
See #60!
upvoted 1 times
M25
1 year, 5 months ago
Selected Answer: C
See #60! The End. Good luck everyone!!!
upvoted 2 times
TNT87
1 year, 8 months ago
Selected Answer: C
https://towardsdatascience.com/how-to-build-a-machine-learning-model-to-identify-credit-card-fraud-in-5-stepsa-hands-on-modeling-5140b3bd19f1
upvoted 1 times