A Data Scientist is developing a machine learning model to predict future patient outcomes based on information collected about each patient and their treatment plans. The model should output a continuous value as its prediction. The data available includes labeled outcomes for a set of 4,000 patients. The study was conducted on a group of individuals over the age of 65 who have a particular disease that is known to worsen with age.
Initial models have performed poorly. While reviewing the underlying data, the Data Scientist notices that, out of 4,000 patient observations, there are 450 where the patient age has been entered as 0. The other features for these observations appear normal compared to the rest of the sample population.
How should the Data Scientist correct this issue?
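One common way to handle this kind of entry error is to treat the 0 ages as missing values and impute them from the valid ages, since dropping 450 of 4,000 rows would discard about 11% of otherwise normal data. The sketch below assumes that 0 is a placeholder rather than a real measurement and uses the median as the fill value; the function name `impute_zero_ages` and the sample list are hypothetical and not taken from the study.

```python
from statistics import median

def impute_zero_ages(ages, invalid_value=0):
    """Replace placeholder ages with the median of the remaining ages.

    Assumption: `invalid_value` (0 here) marks a missing entry, not a real age.
    """
    valid = [a for a in ages if a != invalid_value]
    if not valid:
        raise ValueError("no valid ages available to compute a median")
    fill = median(valid)
    # Keep valid ages unchanged; substitute the median for placeholders.
    return [fill if a == invalid_value else a for a in ages]

# Toy data for illustration only (not from the study).
sample_ages = [70, 0, 82, 66, 0, 75]
cleaned_ages = impute_zero_ages(sample_ages)
print(cleaned_ages)
```

The mean would work the same way; the median is used here only because it is less sensitive to outliers. A model-based imputation (for example, predicting age from the other features) is another option when those features carry enough signal.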