Exam AI-100 topic 2 question 27 discussion

Actual exam question from Microsoft's AI-100
Question #: 27
Topic #: 2

HOTSPOT -
You are designing a solution that will ingest data from an Azure IoT Edge device, preprocess the data in Azure Machine Learning, and then move the data to
Azure HDInsight for further processing.
What should you include in the solution? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.

Suggested Answer:
Box 1: Export Data -
Use the Export data to Hive option in the Export Data module in Azure Machine Learning Studio. This option is useful when you are working with very large datasets and want to save your machine learning experiment data to a Hadoop cluster or HDInsight distributed storage.

Box 2: Apache Hive -
Apache Hive is a data warehouse system for Apache Hadoop. Hive enables data summarization, querying, and analysis. Hive queries are written in
HiveQL, which is a query language similar to SQL.
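
To illustrate how close HiveQL is to SQL, here is a minimal sketch of a summarization query. The table name `iot_readings` and the columns `device_id` and `temperature` are hypothetical and do not come from the question. Because no Hive cluster is assumed, the query is run against an in-memory SQLite table as a stand-in; this particular statement uses only syntax that both dialects share.

```python
import sqlite3

# Hypothetical table and column names, used only for illustration.
# The statement below is plain SELECT / AVG / GROUP BY, which is valid
# in HiveQL as well as in SQLite.
QUERY = """
SELECT device_id, AVG(temperature) AS avg_temp
FROM iot_readings
GROUP BY device_id
ORDER BY device_id
"""


def summarize(rows):
    """Run QUERY against an in-memory SQLite table (a stand-in for Hive)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE iot_readings (device_id TEXT, temperature REAL)"
        )
        conn.executemany("INSERT INTO iot_readings VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


# Made-up sample readings from two devices.
sample = [("edge-1", 20.0), ("edge-1", 22.0), ("edge-2", 30.0)]
print(summarize(sample))
```

On a real HDInsight cluster the same query text would be submitted through a Hive client rather than SQLite; that part is not shown here.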

Box 3: Azure Data Lake -
The default storage for the HDFS file system of an HDInsight cluster can be associated with either an Azure Storage account or Azure Data Lake Storage.
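
As a rough sketch of what this means in practice, the storage backing an HDInsight cluster is addressed through a URI whose scheme depends on the storage type. The helper below is hypothetical (it is not part of any Azure SDK), and the account and container names are made up; only the URI schemes themselves (`wasbs`, `abfss`, `adl`) are the ones HDInsight uses.

```python
def default_storage_uri(kind: str, account: str,
                        container: str = "", path: str = "") -> str:
    """Build a storage URI for an HDInsight cluster (hypothetical helper).

    kind is one of:
      "blob"      - Azure Storage (Blob), wasbs:// scheme
      "adls_gen2" - Azure Data Lake Storage Gen2, abfss:// scheme
      "adls_gen1" - Azure Data Lake Storage Gen1, adl:// scheme
    """
    path = path.lstrip("/")
    if kind == "blob":
        return f"wasbs://{container}@{account}.blob.core.windows.net/{path}"
    if kind == "adls_gen2":
        return f"abfss://{container}@{account}.dfs.core.windows.net/{path}"
    if kind == "adls_gen1":
        # Gen1 has no container component in its URI.
        return f"adl://{account}.azuredatalakestore.net/{path}"
    raise ValueError(f"unknown storage kind: {kind!r}")


# Made-up account and container names, for illustration only.
for kind in ("blob", "adls_gen2", "adls_gen1"):
    print(default_storage_uri(kind, "myaccount", "iotdata", "/raw/readings"))
```

Hive tables on the cluster can then point at such a location, which is why the storage choice in Box 3 matters for the Hive step in Box 2.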
References:
https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/export-to-hive-query
https://docs.microsoft.com/en-us/azure/hdinsight/hadoop/hdinsight-use-hive

Comments

troj
Highly Voted 4 years, 11 months ago
The answer is right. Apache Hive is the better choice, since Apache Spark is geared more toward in-memory data processing: https://www.quora.com/What-is-the-difference-between-Apache-Hive-and-Apache-Spark
upvoted 6 times
Shrek4u
4 years, 10 months ago
But there is no mention of cost implications. Given that, Apache Spark seems the more valid choice.
upvoted 1 times
fred777
Highly Voted 4 years, 10 months ago
From my perspective, the output should be HDFS, since it is going to be used by HDInsight.
upvoted 5 times
varga123akos
4 years, 8 months ago
HDInsight storage options are listed here: https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-compare-storage-options Based on this, the correct answer should be Azure Data Lake.
upvoted 1 times
Nova077
4 years, 7 months ago
I think HDFS is not necessarily a storage option. It's a file system within Hadoop. The storage should be something like Cosmos DB or Data Lake. Data Lake is the better option, as it's IoT data.
upvoted 2 times
sayak17
Most Recent 4 years, 7 months ago
Here is the link for the last one: https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-use-blob-storage
upvoted 1 times
sayak17
4 years, 7 months ago
Is this question still relevant? The question and the solution links point to the old Azure ML Studio (classic). However, the current version has no mention of exporting data to Apache Hive: https://docs.microsoft.com/en-us/azure/machine-learning/algorithm-module-reference/export-data These are the destinations supported now: Azure Blob Container, Azure File Share, Azure Data Lake Storage Gen1, Azure Data Lake Storage Gen2, and Azure SQL Database. So can this question still come up, or is it outdated?
upvoted 4 times