Statistical queries are queries that use statistical properties of a data set, rather than individual examples. They can support rich analysis of the data, such as histograms, marginals, distributions, and machine learning models. However, they can also pose a risk of revealing confidential data if not properly controlled.
One technique to prevent users from obtaining confidential data through statistical queries is to prohibit sequences of queries that refer repeatedly to the same population. This limits how much can be inferred by comparing the results of overlapping queries. Therefore, option A is the most useful thing to prohibit. Options B, C, and D are not specific to statistical queries and may not prevent users from obtaining confidential data by other methods.
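To make the inference risk concrete, here is a minimal sketch of a differencing attack against an aggregate-only interface. The records, names, and the `sum_salary` helper are hypothetical and exist only for illustration; they do not come from any real system.

```python
# Hypothetical toy data: (name, department, salary). Purely illustrative.
RECORDS = [
    ("alice", "cardiology", 91000),
    ("bob", "cardiology", 78000),
    ("carol", "cardiology", 85000),
    ("dave", "oncology", 99000),
]


def sum_salary(predicate):
    """Aggregate-only interface: returns a total, never an individual row."""
    return sum(salary for name, dept, salary in RECORDS if predicate(name, dept))


# Query 1: everyone in cardiology.
total_all = sum_salary(lambda name, dept: dept == "cardiology")

# Query 2: the same population with one person excluded.
total_without_bob = sum_salary(
    lambda name, dept: dept == "cardiology" and name != "bob"
)

# Neither query returned an individual record, but the difference does.
inferred_bob_salary = total_all - total_without_bob
print(inferred_bob_salary)  # 78000
```

Each query on its own looks harmless; it is the pair, run against nearly the same population, that exposes one person's value.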
I worked in analytics at a hospital system for almost 12 years. If HIPAA is involved, users can only access data if they have a reason to, such as a nurse, a doctor, or a researcher with an IRB-approved study; otherwise, running queries that access confidential data is completely prohibited. Also, one query can pull all the information for a population into a local dataset for statistical analysis, so multiple queries against the same population aren't necessary; you only need one query to pull the data. You can also prevent users from obtaining confidential data by using data masking, which would also fall under answer D. For those reasons I'm going with ANSWER D: running queries that access sensitive data. But actually NONE of these answers is correct as written, since all of them should be prefixed with "prevent running..." and none of them are. https://learn.microsoft.com/en-us/sql/relational-databases/security/dynamic-data-masking?view=sql-server-ver16
Answer A) Sequences of queries that refer repeatedly to the same population
Of course, preventing C and D stops sensitive data from spilling, but that is not the point of Statistical Queries.
B is wrong because querying different databases doesn't narrow down any information in one particular database.
A. Sequences of queries that refer repeatedly to the same population.
Prohibiting users from performing sequences of queries that refer repeatedly to the same population, sometimes described as iterative queries, can help prevent users from obtaining confidential data through statistical queries. This is because such queries allow users to identify sensitive data by iteratively refining the query based on previous query results. Prohibiting these sequences makes it more difficult for users to extract sensitive data through statistical queries.
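One way such a prohibition could be enforced is an overlap check on the sets of rows each query touches. The sketch below is an assumption-laden illustration, not a reference to any real product: the `QueryAuditor` class, its thresholds, and the row IDs are all hypothetical, and a check this simple is generally not considered sufficient on its own.

```python
class QueryAuditor:
    """Hypothetical overlap control for an aggregate-only query interface."""

    def __init__(self, min_set_size=3, min_difference=2):
        self.min_set_size = min_set_size      # refuse very small populations
        self.min_difference = min_difference  # refuse near-duplicate populations
        self.history = []                     # row-ID sets already answered

    def allow(self, row_ids):
        query_set = frozenset(row_ids)
        if len(query_set) < self.min_set_size:
            return False
        for previous in self.history:
            # Rows that appear in exactly one of the two query sets.
            differing = len(query_set ^ previous)
            # An identical repeat (0 differing rows) reveals nothing new;
            # a population that differs by only a row or so is the risk.
            if 0 < differing < self.min_difference:
                return False
        self.history.append(query_set)
        return True


auditor = QueryAuditor()
first = auditor.allow({1, 2, 3, 4})  # new population
second = auditor.allow({1, 2, 3})    # same population minus one row
third = auditor.allow({5, 6, 7})     # disjoint population
print(first, second, third)  # True False True
```

The second query is refused because it differs from an already-answered population by a single row, which is exactly the pattern that lets a user isolate one individual.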
l00t (Highly Voted, 1 year, 9 months ago)
jackdryan (1 year, 6 months ago)
GuardianAngel (Most Recent, 9 months, 3 weeks ago)
YesPlease (11 months, 1 week ago)
Alex71 (1 year, 9 months ago)
Rollingalx (1 year, 9 months ago)