You are given a dataset of car attributes and their gas consumption in MPG (Miles Per Gallon). The task is to build a regression model that can predict a car’s MPG given its attributes.
Car MPG dataset:
The dataset consists of 393 car models, their attributes, and their MPG. The columns in the dataset are as follows:
1. Car Model Name
2. MPG - Miles Per Gallon. This is the value that we want to predict
3. Number of cylinders
4. Engine Displacement
5. Engine Horse Power
6. Car Weight
7. Acceleration (time needed to reach a speed of 60 miles/hour)
8. Model Year
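The column layout above maps onto a pandas DataFrame roughly as sketched below. This is only an illustration: the column names (model_name, mpg, cylinders and so on) and the two sample rows are placeholders invented for the example, not values from the actual CSV file, whose headers may differ.

```python
import pandas as pd

# ASSUMPTION: placeholder column names; the real CSV headers may differ.
COLUMNS = ["model_name", "mpg", "cylinders", "displacement",
           "horsepower", "weight", "acceleration", "model_year"]

# Two made-up rows, used only to illustrate the layout (not real data).
df = pd.DataFrame(
    [["example car a", 20.0, 6, 200.0, 100.0, 3000.0, 15.0, 75],
     ["example car b", 30.0, 4, 100.0, 70.0, 2000.0, 17.0, 80]],
    columns=COLUMNS,
)

# Features: every column except the model name and the prediction target.
X = df.drop(columns=["model_name", "mpg"])
# Labels: the value to predict.
y = df["mpg"]

print(df.head(10))       # preview (up to) the first 10 rows
print(X.shape, y.shape)  # features matrix and labels vector sizes
```

With the real file, the DataFrame would come from pd.read_csv instead of the inline rows. Dropping the two non-feature columns keeps every remaining attribute, however many the file contains.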
Create a Jupyter Notebook that shows how you do the following in Python:
1. Load the data from the csv file using Pandas
2. Preview/print the top 10 rows of the data
3. Create the Features matrix (columns 3-9 above, i.e. exclude the model_name and the mpg columns)
4. Create the Labels vector (the mpg column)
5. Plot the relationship between each of the features and the label mpg on a scatter chart. This will be a total of 7 charts.
6. Normalize the features using the StandardScaler class of the sklearn.preprocessing package
7. Split the data into training and test data using the train_test_split function of sklearn.model_selection (older sklearn releases provided it in the cross_validation module)
8. Train a regression model on the training subset using the SGDRegressor class of the sklearn.linear_model package. Set the number of iterations of the learner to 500 (the max_iter parameter in current sklearn releases).
Perform the training as follows:
Train a model using one feature at a time. For example, train a model using the cylinders feature only, then train a model using the displacement feature only, and so on.
Then, train a model using all the features together.
9. For each of the models trained in step 8, apply the model to the test subset and then compute the r2_score, the mean_squared_error, and the mean_absolute_error scores for the predictions of each model trained above.
10. Train a model using all features for 500 iterations while setting the regularization type (penalty) to ‘l1’ instead of the default ‘l2’. Apply the model to the test data and compute the evaluation metrics as in step 9.
11. Train a model using all features for 500 iterations with ‘l2’ regularization and an initial learning rate (eta0) set to 10.0. Compute the evaluation metrics as in step 9.
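
Steps 6 to 11 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a reference solution: it runs on synthetic stand-in data rather than the car dataset, the feature names f0, f1, f2 and the helper fit_and_score are hypothetical, and the 80/20 split and the random_state values are arbitrary choices.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: synthetic stand-in data, NOT the real car dataset.
rng = np.random.default_rng(0)
feature_names = ["f0", "f1", "f2"]  # hypothetical feature names
X = rng.normal(size=(200, len(feature_names)))
y = 25.0 - 4.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

# Step 6: standardize each feature to zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(X)

# Step 7: hold out a test subset (the 80/20 split is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.2, random_state=0
)

def fit_and_score(cols, **sgd_kwargs):
    """Hypothetical helper: train on the given columns, score on the test set."""
    # max_iter caps training at 500 epochs; with the default tol, training
    # may stop earlier once the loss stops improving.
    model = SGDRegressor(max_iter=500, random_state=0, **sgd_kwargs)
    model.fit(X_train[:, cols], y_train)
    pred = model.predict(X_test[:, cols])
    return {
        "r2": r2_score(y_test, pred),
        "mse": mean_squared_error(y_test, pred),
        "mae": mean_absolute_error(y_test, pred),
    }

all_cols = list(range(len(feature_names)))
results = {}

# Steps 8 and 9: one model per single feature, then one with all features.
for i, name in enumerate(feature_names):
    results[name] = fit_and_score([i])
results["all, l2"] = fit_and_score(all_cols)

# Step 10: same model with l1 regularization instead of the default l2.
results["all, l1"] = fit_and_score(all_cols, penalty="l1")

# Step 11: a learning rate this large may make SGD diverge, which can
# surface as a ValueError (overflowing weights or non-finite predictions).
try:
    results["all, l2, eta0=10"] = fit_and_score(all_cols, penalty="l2", eta0=10.0)
except ValueError as exc:
    results["all, l2, eta0=10"] = {"error": str(exc)}

for name, scores in results.items():
    print(name, scores)
```

The sketch follows the order given in the steps (normalize, then split). Fitting the scaler on the training subset only and then applying it to the test subset is a common alternative that keeps test-set statistics out of training.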