Snowflake provides the SYSTEM$CLUSTERING_INFORMATION function to help you assess how well a table is clustered. Among other things, it reports the average depth of overlapping micro-partitions for the specified clustering key. A high average depth suggests the data is not well clustered, in which case you may need to refine the clustering key or recluster the table. Note that reclustering is normally handled by Snowflake's Automatic Clustering service for tables with a defined clustering key; the manual RECLUSTER command is deprecated.
https://docs.snowflake.com/en/user-guide/tables-clustering-keys#what-is-a-clustering-key
https://docs.snowflake.com/en/user-guide/tables-clustering-keys#calculating-the-clustering-information-for-a-table
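To make "average depth" a bit more concrete, here is a toy sketch in Python. It is not Snowflake's actual algorithm, which is not spelled out in this thread. It assumes each micro-partition can be summarized by the (min, max) range of the clustering column, and it defines depth as the number of partitions whose ranges overlap a given partition, counting itself. The names `toy_average_depth` and `overlaps` are made up for illustration.

```python
from typing import List, Tuple

# Hypothetical summary of one micro-partition: (min_value, max_value)
# of the clustering column. This is an assumption of the toy model.
Range = Tuple[int, int]


def overlaps(a: Range, b: Range) -> bool:
    """Return True if two closed ranges share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


def toy_average_depth(partitions: List[Range]) -> float:
    """Average, over all partitions, of how many partitions overlap each one.

    Every partition overlaps itself, so the minimum result is 1.0.
    """
    depths = [
        sum(1 for other in partitions if overlaps(p, other))
        for p in partitions
    ]
    return sum(depths) / len(depths)


# Disjoint ranges: each partition overlaps only itself.
well_clustered = [(0, 9), (10, 19), (20, 29), (30, 39)]
# Every partition spans the full value range: all of them overlap.
poorly_clustered = [(0, 39), (0, 39), (0, 39), (0, 39)]

print(toy_average_depth(well_clustered))    # 1.0
print(toy_average_depth(poorly_clustered))  # 4.0
```

In this simplified model a lower number means a filter on the column touches fewer partitions, which matches the intuition that a high depth signals poor clustering.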
It's C: https://docs.snowflake.com/en/user-guide/tables-clustering-keys#strategies-for-selecting-clustering-keys
"Selecting the right columns/expressions for a clustering key can dramatically impact query performance. Analysis of your workload will usually yield good clustering key candidates.
Snowflake recommends prioritizing keys in the order below:
Cluster columns that are most actively used in selective filters"
C
Snowflake recommends prioritizing keys in the order below:
Cluster columns that are most actively used in selective filters. For many fact tables involved in date-based queries (for example “WHERE invoice_date > x AND invoice_date <= y”), choosing the date column is a good idea. For event tables, event type might be a good choice, if there are a large number of different event types. (If your table has only a small number of different event types, then see the comments on cardinality below before choosing an event column as a clustering key.)
If there is room for additional cluster keys, then consider columns frequently used in join predicates, for example “FROM table1 JOIN table2 ON table2.column_A = table1.column_B”.
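A rough way to see why the selective-filter column matters: the toy Python sketch below counts how many micro-partitions a range filter like the invoice_date example would have to scan. It assumes pruning works purely off each partition's (min, max) range for the filtered column, with dates simplified to integers. The function name `partitions_scanned` and the sample layouts are hypothetical, not anything from Snowflake.

```python
from typing import List, Tuple

# Hypothetical per-partition metadata: (min, max) of the filtered column.
Range = Tuple[int, int]


def partitions_scanned(partitions: List[Range], lo: int, hi: int) -> int:
    """Count partitions that could hold a value v with lo < v <= hi.

    Mirrors the filter "WHERE invoice_date > x AND invoice_date <= y".
    A partition can be skipped only if its range cannot contain a match.
    """
    return sum(1 for (pmin, pmax) in partitions if pmax > lo and pmin <= hi)


# Table clustered on the date column: ranges do not overlap.
clustered_on_date = [(1, 10), (11, 20), (21, 30), (31, 40)]
# Table clustered on some other column: every partition spans all dates.
clustered_elsewhere = [(1, 40), (1, 40), (1, 40), (1, 40)]

print(partitions_scanned(clustered_on_date, 10, 20))    # 1
print(partitions_scanned(clustered_elsewhere, 10, 20))  # 4
```

Same data and same query, but clustering on the filtered column lets three of the four partitions be skipped in this model.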
C
https://docs.snowflake.com/en/user-guide/tables-clustering-keys#:~:text=Cluster%20columns%20that%20are%20most%20actively%20used%20in%20selective%20filters
I think it is D, per the doc you listed:
The number of distinct values (i.e. cardinality) in a column/expression is a critical aspect of selecting it as a clustering key. It is important to choose a clustering key that has:
A large enough number of distinct values to enable effective pruning on the table.
A small enough number of distinct values to allow Snowflake to effectively group rows in the same micro-partitions.
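The "large enough number of distinct values" point can be sketched with a toy model too. The Python below assumes rows are sorted by the key and cut into fixed-size partitions, then checks what fraction of partitions an equality filter has to scan. It only illustrates the lower bound (too few distinct values limits pruning); it does not model the upper bound about grouping rows effectively. All names here are made up for the example.

```python
from typing import List, Tuple


def build_partitions(values: List[int], rows_per_partition: int) -> List[Tuple[int, int]]:
    """Sort the key values, cut them into fixed-size chunks, and keep
    only each chunk's (min, max), as a stand-in for partition metadata."""
    ordered = sorted(values)
    chunks = [
        ordered[i:i + rows_per_partition]
        for i in range(0, len(ordered), rows_per_partition)
    ]
    return [(chunk[0], chunk[-1]) for chunk in chunks]


def fraction_scanned(partitions: List[Tuple[int, int]], target: int) -> float:
    """Fraction of partitions whose range could contain `target`."""
    hits = sum(1 for (pmin, pmax) in partitions if pmin <= target <= pmax)
    return hits / len(partitions)


low_cardinality = [i % 2 for i in range(32)]     # 2 distinct values
higher_cardinality = [i % 8 for i in range(32)]  # 8 distinct values

low_parts = build_partitions(low_cardinality, 4)
high_parts = build_partitions(higher_cardinality, 4)

print(fraction_scanned(low_parts, 0))   # 0.5
print(fraction_scanned(high_parts, 0))  # 0.125
```

With only two distinct values, an equality filter can never skip more than half the table in this model, which is the docs' point about needing enough distinct values for effective pruning.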