Exam AWS Certified Machine Learning - Specialty topic 1 question 4 discussion

A city wants to monitor its air quality to address the consequences of air pollution. A Machine Learning Specialist needs to forecast the air quality in parts per million of contaminants for the next 2 days in the city. As this is a prototype, only daily data from the last year is available.
Which model is MOST likely to provide the best results in Amazon SageMaker?

  • A. Use the Amazon SageMaker k-Nearest-Neighbors (kNN) algorithm on the single time series consisting of the full year of data with a predictor_type of regressor.
  • B. Use Amazon SageMaker Random Cut Forest (RCF) on the single time series consisting of the full year of data.
  • C. Use the Amazon SageMaker Linear Learner algorithm on the single time series consisting of the full year of data with a predictor_type of regressor.
  • D. Use the Amazon SageMaker Linear Learner algorithm on the single time series consisting of the full year of data with a predictor_type of classifier.
Suggested Answer: C

Comments

ozan11
Highly Voted 3 years, 7 months ago
answer should be C
upvoted 16 times
...
roytruong
Highly Voted 3 years, 6 months ago
go for C
upvoted 6 times
...
JonSno
Most Recent 2 months, 1 week ago
Selected Answer: C
Amazon SageMaker Linear Learner (regressor). Why? The Linear Learner algorithm can be used for time series regression. With predictor_type=regressor, it learns trends and patterns in historical data and extrapolates future values. Given the limited historical data (only 1 year), a simple linear regression model might perform well as a baseline. While deep learning models (like Amazon Forecast) may be more advanced, Linear Learner is easier to implement and train for a prototype.
upvoted 1 times
...
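To make the "learns a trend and extrapolates" idea concrete, here is a minimal sketch in plain Python. It is not SageMaker code and not how Linear Learner is implemented: it assumes the only feature is the day index, uses made-up synthetic readings, and the helper names `fit_line` and `forecast` are hypothetical.

```python
# Hypothetical illustration: ordinary least squares on a single daily series,
# with the day index (0, 1, 2, ...) as the only feature.

def fit_line(values):
    """Return (slope, intercept) of the least-squares line through (i, values[i])."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def forecast(values, horizon=2):
    """Extrapolate the fitted line `horizon` days past the end of the series."""
    slope, intercept = fit_line(values)
    n = len(values)
    return [intercept + slope * (n + h) for h in range(horizon)]

# Synthetic data (made up): 365 daily readings with a slight upward trend.
history = [40.0 + 0.05 * day for day in range(365)]
next_two_days = forecast(history, horizon=2)
```

On real data you would likely add more features (weather, lagged readings) rather than rely on the day index alone.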
loict
7 months ago
Selected Answer: C
A. NO - kNN is a similarity-based method, not a forecasting one
B. NO - RCF is for anomaly detection
C. YES - linear regression is good for forecasting
D. NO - we don't want to classify
upvoted 3 times
...
Mickey321
7 months ago
Selected Answer: C
The reason for this choice is that the Linear Learner algorithm is a versatile algorithm that can be used for both regression and classification tasks. Regression is a type of supervised learning that predicts a continuous numeric value, such as the air quality in parts per million. The predictor_type parameter specifies whether the algorithm should perform regression or classification. Since the goal is to forecast a numeric value, the predictor_type should be set to regressor.
upvoted 3 times
...
ninomfr64
10 months, 2 weeks ago
Selected Answer: D
A. Managing Kafka on EC2 is not compatible with the least-effort requirement. B. Doable (in 2024), as Glue supports streaming ETL to consume streams and supports CSV records -> https://docs.aws.amazon.com/glue/latest/dg/add-job-streaming.html C. Managing an EMR cluster is, imo, not compatible with the least-effort requirement. D. Firehose supports a Kinesis data stream as a source and can use Lambda to convert CSV records into Parquet -> https://docs.aws.amazon.com/firehose/latest/dev/record-format-conversion.html I guess this is a bit of an old question, from before Glue streaming ETL support (2023) -> https://aws.amazon.com/about-aws/whats-new/2023/03/aws-glue-4-0-streaming-etl/ Thus I'll go for D.
upvoted 1 times
...
LocalHero
1 year, 5 months ago
This blog is written in Japanese, but it describes using Linear Learner for air pollution prediction. https://aws.amazon.com/jp/blogs/news/build-a-model-to-predict-the-impact-of-weather-on-urban-air-quality-using-amazon-sagemaker/
upvoted 2 times
...
jyrajan69
1 year, 9 months ago
The predictor_type hyperparameter must be one of “binary_classifier”, “multiclass_classifier”, or “regressor”. There is no “classifier” value, so the answer is C.
upvoted 1 times
...
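The point above can be checked mechanically. The sketch below is only an illustration: the set of allowed values is the one quoted in the comment, while the helper `is_valid_predictor_type` and the two option dictionaries are hypothetical names, not part of any SageMaker SDK.

```python
# Allowed values for Linear Learner's predictor_type, as quoted in the comment above.
LINEAR_LEARNER_PREDICTOR_TYPES = {
    "binary_classifier",
    "multiclass_classifier",
    "regressor",
}

def is_valid_predictor_type(value):
    """Hypothetical helper: True if `value` is one of the quoted allowed values."""
    return value in LINEAR_LEARNER_PREDICTOR_TYPES

# Hyperparameters as worded in options C and D of the question.
option_c = {"predictor_type": "regressor"}
option_d = {"predictor_type": "classifier"}

c_is_valid = is_valid_predictor_type(option_c["predictor_type"])
d_is_valid = is_valid_predictor_type(option_d["predictor_type"])
```

Under that assumption, option D fails before the regression-versus-classification argument even comes up.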
Venkatesh_Babu
1 year, 9 months ago
Selected Answer: C
Answer should be C
upvoted 1 times
...
ortamina
1 year, 9 months ago
A kNN will require a large value of k to avoid overfitting, and we only have 1 year's worth of data. kNNs also have a difficult time extrapolating if the air quality series contains a trend. If we had assurances that there is no trend in the air quality series (no extrapolation), and we had enough data, then kNN should beat a linear model ... I am inclined to go for C just going off of the cue that "only daily data from last year is available".
upvoted 1 times
ninomfr64
10 months ago
Agree with your analysis. To expand on it further: we don't have any info about the dataset features. "Only daily data from last year is available" makes me think we could be in a situation where our dataset is made up of just timestamp and pollution_value, so KNN would be pretty useless here.
upvoted 1 times
...
...
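A small sketch of the extrapolation issue discussed in this thread, in plain Python. It assumes a dataset of only (day index, reading) pairs with made-up trending values; `knn_predict` and `linear_predict` are hypothetical helpers, not SageMaker algorithms.

```python
# Hypothetical illustration: kNN regression averages nearby training targets,
# so it cannot predict above the largest value it has seen; a fitted line can.

def knn_predict(values, query_day, k=5):
    """Average the readings of the k training days closest to query_day."""
    nearest = sorted(range(len(values)), key=lambda d: abs(d - query_day))[:k]
    return sum(values[d] for d in nearest) / k

def linear_predict(values, query_day):
    """Least-squares line over (day index, reading), evaluated at query_day."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return mean_y + slope * (query_day - mean_x)

# Synthetic data (made up): one year of daily readings with an upward trend.
history = [40.0 + 0.05 * day for day in range(365)]
two_days_ahead = 366  # day indices 365 and 366 are the two forecast days

knn_forecast = knn_predict(history, two_days_ahead)
linear_forecast = linear_predict(history, two_days_ahead)
```

With a trending series like this one, the kNN forecast stays at or below the last observed readings while the linear forecast continues the trend.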
brunokiyoshi
2 years, 1 month ago
Selected Answer: C
Random cut forests in timeseries are used for anomaly detection, and not for forecasting. KNN's are classification algorithms. You would use the Linear Learner as a regressor, since forecasting falls into the domain of regression.
upvoted 3 times
brunokiyoshi
2 years, 1 month ago
I mean, you could use KNN's for regression, but for forecasting I don't think so
upvoted 1 times
...
...
Valcilio
2 years, 1 month ago
Selected Answer: C
KNN isn't for time series predicting, go for A!
upvoted 2 times
Valcilio
2 years, 1 month ago
I'm sorry, I wanted to say go for C!
upvoted 2 times
...
...
rockyykrish
2 years, 1 month ago
Creating a machine learning model to predict air quality
To start small, we will follow the second approach, where we will build a model that will predict the NO2 concentration of any given day based on wind speed, wind direction, maximum temperature, pressure values of that day, and the NO2 concentration of the previous day. For this we will use the Linear Learner algorithm provided in Amazon SageMaker, enabling us to quickly build a model with minimal work. Our model will consist of taking all of the variables in our dataset and using them as features of the Linear Learner algorithm available in Amazon SageMaker.
upvoted 1 times
...
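The feature layout described in the quoted passage (same-day weather plus the previous day's NO2) could be assembled roughly as follows. This is a sketch under assumptions: the dictionary keys, the helper `build_rows`, and the three sample days are all made up for illustration and do not come from the blog post.

```python
# Hypothetical illustration: turn daily records into (features, target) rows
# where the target is today's NO2 and the last feature is yesterday's NO2.

def build_rows(days):
    """days: list of dicts with keys wind_speed, wind_direction, max_temp, pressure, no2."""
    rows = []
    for previous, today in zip(days, days[1:]):
        features = [
            today["wind_speed"],
            today["wind_direction"],
            today["max_temp"],
            today["pressure"],
            previous["no2"],  # lagged target: NO2 concentration of the previous day
        ]
        rows.append((features, today["no2"]))
    return rows

# Made-up sample records, in chronological order.
sample_days = [
    {"wind_speed": 3.1, "wind_direction": 180, "max_temp": 21.0, "pressure": 1012, "no2": 24.0},
    {"wind_speed": 4.0, "wind_direction": 200, "max_temp": 22.5, "pressure": 1010, "no2": 22.0},
    {"wind_speed": 2.2, "wind_direction": 170, "max_temp": 23.0, "pressure": 1011, "no2": 27.0},
]
training_rows = build_rows(sample_days)
```

The first day has no previous reading, so it contributes only as a lag value and not as a training row.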
AjoseO
2 years, 2 months ago
Selected Answer: A
Answer should be A. The k-Nearest-Neighbors (kNN) algorithm will provide the best results for this use case, as it is a good fit for time series data, especially for predicting continuous values. The predictor_type of regressor is also appropriate for this task, as the goal is to forecast a continuous value (air quality in parts per million of contaminants). The other options are also viable, but may not provide results as good as the kNN algorithm, especially with limited data. Using the Amazon SageMaker Linear Learner algorithm with a predictor_type of regressor may still provide reasonable results, but it assumes a linear relationship between the input features and the target variable (air quality), which may not always hold in practice, especially with complex time series data. In such cases, non-linear models like kNN may perform better. Furthermore, the kNN algorithm can handle irregular patterns in the data, which may be present in the air quality data, and provide more accurate predictions.
upvoted 3 times
...
ryuhei
2 years, 7 months ago
Selected Answer: C
Answer is "C" !!!
upvoted 1 times
...
yemauricio
2 years, 7 months ago
answer C
upvoted 1 times
...
Huy
3 years, 5 months ago
I go with A. Linear regression is not suitable for time series data. There is a library that implements kNN for time series: https://cran.r-project.org/web/packages/tsfknn/vignettes/tsfknn.html
upvoted 1 times
Huy
3 years, 5 months ago
I mean that air quality has many feature correlations that are not linear.
upvoted 1 times
...
...