A data scientist wants to parallelize the training of trees in a gradient boosted tree to speed up the training process. A colleague suggests that parallelizing a boosted tree algorithm can be difficult. Which of the following describes why?
A. Gradient boosting is not a linear algebra-based algorithm, which is required for parallelization.
B. Gradient boosting requires access to all data at once, which cannot happen during parallelization.
C. Gradient boosting calculates gradients in evaluation metrics using all cores, which prevents parallelization.
D. Gradient boosting is an iterative algorithm that requires information from the previous iteration to perform the next step.
E. Gradient boosting uses decision trees in each iteration, which cannot be parallelized.
D: Gradient boosting is an iterative, sequential algorithm in which each tree is trained to correct the errors of the trees before it. Because each step relies on the output of the previous step, the trees cannot be trained independently in parallel.
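To make the dependency concrete, here is a minimal sketch of boosting under squared error, using one-dimensional decision stumps in plain Python. It is not how any particular library implements it; the names `fit_stump`, `predict_stump`, and `boost`, the toy data, and the learning rate are all assumptions made for illustration. The point is that the residuals for round k can only be computed after round k-1 has updated the predictions.

```python
# Toy gradient boosting for squared error with 1-D decision stumps.
# All names and data here are hypothetical, for illustration only.

def fit_stump(xs, residuals):
    """Pick the threshold on xs that minimizes squared error on the residuals."""
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue  # skip splits that leave one side empty
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # all xs identical: fall back to predicting the mean
        m = sum(residuals) / len(residuals)
        return (xs[0], m, m)
    return best[1:]  # (threshold, left value, right value)

def predict_stump(stump, x):
    t, lm, rm = stump
    return lm if x <= t else rm

def boost(xs, ys, n_rounds=20, learning_rate=0.5):
    preds = [0.0] * len(xs)
    stumps, losses = [], []
    for _ in range(n_rounds):
        # The residuals depend on preds, which every earlier round has updated.
        # This is the sequential dependency: round k cannot start before k-1 ends.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
        stumps.append(stump)
        losses.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))
    return stumps, losses

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.0, 1.5, 3.0, 3.5, 6.0, 6.5]
stumps, losses = boost(xs, ys)
```

Note that the loop over candidate thresholds inside `fit_stump` has no such dependency, so the work within a single tree can be spread across cores. That is the kind of parallelism libraries such as XGBoost exploit, rather than training different trees at the same time.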
Deuterium44
2 weeks, 3 days ago