You are creating a deep neural network classification model using a dataset with categorical input values. Certain columns have a cardinality greater than 10,000 unique values. How should you encode these categorical values as input into the model?
A. Convert each categorical value into an integer value.
B. Convert the categorical string data to one-hot hash buckets.
C. Map the categorical variables into a vector of boolean values.
D. Convert each categorical value into a run-length encoded string.
https://cloud.google.com/ai-platform/training/docs/algorithms/wide-and-deep
If the column is categorical with high cardinality, then the column is treated with hashing, where the number of hash buckets equals the square root of the number of unique values in the column.
https://towardsdatascience.com/getting-deeper-into-categorical-encodings-for-machine-learning-2312acd347c8
When you have millions of unique values, try hash encoding.
B, unconditionally.
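As a rough illustration of what "one-hot hash buckets" means, here is a minimal sketch in plain Python. The function names `hash_bucket` and `one_hot_hashed`, the choice of MD5 as the hash, and the bucket count of 100 are assumptions made for the example, not something taken from the linked articles.

```python
import hashlib


def hash_bucket(value: str, num_buckets: int) -> int:
    """Map a categorical string to a bucket index in [0, num_buckets).

    MD5 is used only because it is deterministic across runs; any stable
    hash would do (this choice is an assumption for the sketch).
    """
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets


def one_hot_hashed(value: str, num_buckets: int) -> list[int]:
    """Return a one-hot vector whose length is the bucket count,
    not the number of unique category values."""
    vec = [0] * num_buckets
    vec[hash_bucket(value, num_buckets)] = 1
    return vec


# Hypothetical category value; the vector has 100 entries no matter
# how many distinct values the column contains.
encoded = one_hot_hashed("user_12345", 100)
```

Different values can collide in the same bucket, which is the trade-off for keeping the input width fixed.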
https://cloud.google.com/ai-platform/training/docs/algorithms/xgboost#analysis
If the column is categorical with high cardinality, then the column is treated with hashing, where the number of hash buckets equals the square root of the number of unique values in the column.
A categorical column is considered to have high cardinality if the number of unique values is greater than the square root of the number of rows in the dataset.
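The two rules quoted above can be sketched in a few lines of Python. The helper names are hypothetical, and rounding the square root to the nearest integer is an assumption, since the quoted text does not say how a fractional result is handled.

```python
import math


def is_high_cardinality(num_unique: int, num_rows: int) -> bool:
    """Quoted rule: high cardinality if the number of unique values
    exceeds the square root of the number of rows."""
    return num_unique > math.sqrt(num_rows)


def num_hash_buckets(num_unique: int) -> int:
    """Quoted rule: bucket count equals the square root of the number of
    unique values. Rounding (and the floor of 1) is an assumption."""
    return max(1, round(math.sqrt(num_unique)))


# Example with the question's 10,000 unique values and an assumed
# dataset of 1,000,000 rows.
high = is_high_cardinality(10_000, 1_000_000)
buckets = num_hash_buckets(10_000)
```

With these assumed numbers, the column counts as high cardinality (10,000 is greater than 1,000) and would be hashed into 100 buckets.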
I think B is correct
Ref.:
- https://cloud.google.com/ai-platform/training/docs/algorithms/xgboost
- https://stackoverflow.com/questions/26473233/in-preprocessing-data-with-high-cardinality-do-you-hash-first-or-one-hot-encode
Answer A, since with 10,000 unique values one-hot encoding would not be a good solution.
https://machinelearningmastery.com/how-to-prepare-categorical-data-for-deep-learning-in-python/