Data set of 2000 rows by approximately 97 variables, used to find the best prediction algorithms (logistic regression, KNN, SVM, random forest classifier, etc.) for one target (household income quantile). This analysis should be coded at an easy-to-understand level for entry-level students. See attached files.
Beware that the 2014 and 2017 files use different labels for variables that mean the same thing. E.g. the 2014 column for emergency funds has 4 values: "1 very possible", "2 somewhat possible", "3 not very possible" and "4 not at all possible", while the same column in the 2017 data is either "possible" or "not possible". Therefore please merge values 1 and 2 of the 2014 data into "possible", and 3 and 4 into "not possible". Same for "1 Female" and "Female".
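A minimal sketch of this recoding in pandas, to show the level of code expected. The column names (`emergency_funds`, `gender`), the toy rows and the "2 Male" label are assumptions for illustration only; the real files may use different names and labels.

```python
import pandas as pd

# Toy stand-ins for the two survey files (column names are hypothetical).
df_2014 = pd.DataFrame({
    "emergency_funds": ["1 very possible", "2 somewhat possible",
                        "3 not very possible", "4 not at all possible"],
    "gender": ["1 Female", "2 Male", "1 Female", "2 Male"],  # "2 Male" is assumed
})
df_2017 = pd.DataFrame({
    "emergency_funds": ["possible", "not possible"],
    "gender": ["Female", "Male"],
})

# Collapse the four 2014 answers into the two 2017 answers.
emergency_map = {
    "1 very possible": "possible",
    "2 somewhat possible": "possible",
    "3 not very possible": "not possible",
    "4 not at all possible": "not possible",
}
df_2014["emergency_funds"] = df_2014["emergency_funds"].map(emergency_map)


def strip_code(value):
    """Turn a label like '1 Female' into 'Female'; leave other values alone."""
    if isinstance(value, str):
        parts = value.split(" ", 1)
        if len(parts) == 2 and parts[0].isdigit():
            return parts[1]
    return value


df_2014["gender"] = df_2014["gender"].map(strip_code)

# Keep track of the survey year, then stack both years into one table.
df_2014["year"] = 2014
df_2017["year"] = 2017
combined = pd.concat([df_2014, df_2017], ignore_index=True)
print(combined)
```

After this step both years share the same labels, so the combined table can be cleaned and modelled as one data set.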
Explain why certain columns with many NA values should be disregarded, and where filling NA with the most frequent value etc. makes sense. OK to work with the 30 most meaningful columns. Data cleansing is therefore 40% of the work.
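One possible way to handle the NA values is sketched below. The 50% cut-off, the column names and the choice of median for numeric columns are assumptions for illustration, not requirements; the notebook should explain whichever rule is actually used.

```python
import numpy as np
import pandas as pd

# Toy data with hypothetical column names.
df = pd.DataFrame({
    "age": [25, 40, np.nan, 31, 58],
    "education": ["primary", np.nan, "secondary", "secondary", "tertiary"],
    "mostly_empty": [np.nan, np.nan, np.nan, 1.0, np.nan],
})

# Share of missing values per column (0.0 = complete, 1.0 = all missing).
missing_share = df.isna().mean()
print(missing_share)

# Drop columns that are mostly empty: filling them would mean inventing
# most of the data. The 0.5 threshold is an assumed rule of thumb.
threshold = 0.5
keep = missing_share[missing_share <= threshold].index
df = df[keep].copy()

# Fill the remaining gaps: median for numbers, most frequent value for text.
for col in df.columns:
    if pd.api.types.is_numeric_dtype(df[col]):
        df[col] = df[col].fillna(df[col].median())
    else:
        df[col] = df[col].fillna(df[col].mode()[0])

print(df)
```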
30% of the work is a description of the variables with the highest correlations to the target (household income quartile), with histograms and key statistics/distplots, graphs and short explanations. Please label the X and Y axes of all graphs.
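A small sketch of the kind of correlation ranking and labelled histogram meant here. The data and column names are made up for illustration, and the default Pearson correlation is used only to keep the code simple; a rank-based correlation may suit an ordinal target better.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Toy data with hypothetical column names; the target is coded 1 to 4.
df = pd.DataFrame({
    "income_quartile": [1, 1, 2, 2, 3, 3, 4, 4],
    "years_of_education": [4, 6, 8, 9, 11, 12, 15, 16],
    "household_size": [6, 4, 5, 5, 3, 4, 3, 2],
    "random_noise": [3, 1, 4, 1, 5, 9, 2, 6],
})

# Correlation of every column with the target (Pearson by default).
corr = df.corr()["income_quartile"].drop("income_quartile")

# Rank by absolute value so strong negative correlations count too.
ranked = corr.abs().sort_values(ascending=False)
print(ranked)

# Key statistics (count, mean, std, min, quartiles, max) for each column.
print(df.describe())

# Histogram of the most correlated variable, with both axes labelled.
top_feature = ranked.index[0]
fig, ax = plt.subplots()
ax.hist(df[top_feature], bins=5)
ax.set_xlabel(top_feature)
ax.set_ylabel("Number of households")
ax.set_title("Distribution of " + top_feature)
# In a notebook the figure is displayed automatically below the cell.
```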
The last 30% of the work is finding the best prediction algorithms (logistic regression, KNN, SVM, random forest classifier, etc.) for one target (household income quantile) by splitting the data into test and train sets.
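The comparison could look roughly like the sketch below, which uses scikit-learn. Synthetic data from `make_classification` stands in for the cleaned survey data, and the 80/20 split, the settings and the use of accuracy as the score are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in: 2000 rows, 30 columns, 4 classes (like 4 income groups).
X, y = make_classification(n_samples=2000, n_features=30, n_informative=10,
                           n_classes=4, random_state=42)

# Hold back 20% of the rows for testing; stratify keeps class shares equal.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# Scaling matters for logistic regression, KNN and SVM, so those three
# are wrapped in a pipeline that scales first. Random forest does not need it.
models = {
    "Logistic regression": make_pipeline(StandardScaler(),
                                         LogisticRegression(max_iter=1000)),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Random forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

# Train each model on the training rows and score it on the unseen test rows.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = model.score(X_test, y_test)  # accuracy between 0 and 1

for name, accuracy in sorted(results.items(), key=lambda item: -item[1]):
    print(f"{name}: {accuracy:.3f}")
```

Scores on synthetic data say nothing about the survey data; the point is only the structure: one split, one loop, one comparable score per model.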
The result needs to be in .ipynb notebook format and Excel xls. The notebook needs to explain in plain English, using # comments, why steps were taken and what the results are.
If you have experience teaching ML to entry-level students without overcomplicating the code, you are the person I am looking for.
23 freelancers are bidding an average of $177 for this job
I DO NOT OUTSOURCE. I have been a freelancer for the past 8 years, and I believe that my experience and skill in this area will be of great help to you. Contact me to discuss the details.
Hi, I am a PhD candidate and a Python and ML expert. I can teach courses in ML and data science, as I have done before, and I have also done similar analyses (EDA, SVM, ...) on the Titanic dataset. Contact me for details.
I have good skills in ML, and by profession I am an assistant professor. I have been teaching this subject for the past 2 years. For more discussion, you can message me back.