We have built a machine learning model (random forest) that classifies extremely well (accuracy above 0.9). It was trained with 10-fold cross-validation and out-of-bag estimation to guard against overfitting. A 0.8/0.2 train/test split was used (0.2 test set).
The data set originally consisted of 81 test persons (healthy/ill). Each test person
contributes 90 lines, so we have a total of about 7200 observations. If we use a classic
train/test split and guard against overfitting with out-of-bag estimation and 10-fold
cross-validation, we reach an accuracy of over 0.9 on the randomized samples.
Then we decided to set aside some test persons (e.g. the first 20),
that is, 1800 lines. We trained again on the remaining data set
and again reached an accuracy of 0.9 or more on the 0.2 test set.
However, if the model is now applied to the 1800 omitted lines,
it predicts only one class and thus has an accuracy of 0.5 and
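The two evaluations described above can be sketched as follows. This is a minimal illustration on synthetic stand-in data, not the original notebook: the feature matrix, the number of features, the forest settings and all variable names are assumptions. One possible explanation for the gap, which the post itself does not confirm, is that a row-wise random split puts lines of the same test person into both the training and the test set, so the test score partly measures how well the model recognizes known persons. Splitting by test person (here with scikit-learn's `GroupKFold`) keeps each person's lines together.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold, cross_val_score, train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): 81 test persons, 90 lines each, 5 features.
n_subjects, rows_per_subject, n_features = 81, 90, 5
subject_ids = np.repeat(np.arange(n_subjects), rows_per_subject)
y = rng.integers(0, 2, size=n_subjects)[subject_ids]  # one label per person
# Each person gets an individual feature offset that is unrelated to the label.
offsets = rng.normal(size=(n_subjects, n_features))
X = offsets[subject_ids] + 0.1 * rng.normal(size=(len(y), n_features))

# 1) Row-wise 0.8/0.2 split: lines of one person can land on both sides.
X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(
    X, y, subject_ids, test_size=0.2, random_state=0
)
shared = np.intersect1d(g_tr, g_te)  # persons present in train AND test
rf_row = RandomForestClassifier(n_estimators=50, oob_score=True, random_state=0)
rf_row.fit(X_tr, y_tr)
row_acc = rf_row.score(X_te, y_te)

# 2) Hold out the first 20 persons (1800 lines) completely.
held_out = subject_ids < 20
rf_subj = RandomForestClassifier(n_estimators=50, random_state=0)
rf_subj.fit(X[~held_out], y[~held_out])
held_acc = rf_subj.score(X[held_out], y[held_out])

# 3) 10-fold cross-validation that never splits a person across folds.
group_cv_acc = cross_val_score(
    RandomForestClassifier(n_estimators=50, random_state=0),
    X, y, groups=subject_ids, cv=GroupKFold(n_splits=10),
).mean()

print("persons shared between row-wise train and test:", len(shared))
print("row-wise test accuracy:", row_acc, "| OOB score:", rf_row.oob_score_)
print("accuracy on held-out persons:", held_acc)
print("person-wise 10-fold CV accuracy:", group_cv_acc)
```

If the person-wise scores on the real data stay close to the accuracy on the omitted lines rather than to the row-wise score, that would point towards the split strategy and not the forest itself.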
Hey! I'm Helmi. I have very good experience in machine learning, and in random forests specifically, so I think I am the best fit for your job. Don't hesitate to text me for details, and have a great day!
I am good at tuning machine learning models. Can I have access to the notebook so that I can have a look at it? If I am unable to do the work, there is no need to pay me. I will be waiting for your reply. Thank you.