Use the pandas library to load and clean the CSV data as follows:
- Load the CSV data.
- Get a quick overview by printing the first 10 rows of the DataFrame, the last 5 rows, and then all the rows.
- View the number of rows and columns in the DataFrame.
- Get summary information about the DataFrame.
- Check for empty (missing) values.
- Replace empty values with the value that appears most frequently (the mode).
- Check for duplicate rows.
- Remove the duplicates.
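The steps above can be sketched as follows. This is a minimal illustration, not the required solution: the CSV content, the column names, and the choice to fill each column with its own mode are all assumptions made so the example is self-contained.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for the assignment's data file.
csv_text = """size,rooms,city,price
50,2,Leeds,100
80,3,,150
50,2,Leeds,100
120,,York,220
65,2,Leeds,130
"""

# With a real file this would be pd.read_csv("your_file.csv").
df = pd.read_csv(io.StringIO(csv_text))

print(df.head(10))       # first 10 rows (fewer if the data is shorter)
print(df.tail(5))        # last 5 rows
print(df.to_string())    # all rows, without truncation
print(df.shape)          # (number of rows, number of columns)
df.info()                # column types and non-null counts

# Count the empty (missing) values in each column.
print(df.isnull().sum())

# Assumption: fill each column with that column's most frequent value.
for col in df.columns:
    df[col] = df[col].fillna(df[col].mode()[0])

# Count fully duplicated rows, then drop them.
print(df.duplicated().sum())
df = df.drop_duplicates().reset_index(drop=True)
```

Filling with the mode is done after inspecting the missing values so that the counts printed by `isnull().sum()` reflect the raw data.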
Use the train_test_split helper function from scikit-learn to split the data into training and test sets as follows:
- Create the data features (X) and target labels (y).
- Randomly split the features and labels into training and test sets, holding out 30% of the data for testing.
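A minimal sketch of the split, assuming a cleaned DataFrame whose label column is named `target` (a hypothetical name, as are the feature columns and the synthetic values). The `random_state` argument is optional and is set here only to make the split repeatable.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical cleaned data: two features and one numeric target.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "feature_1": rng.normal(size=100),
    "feature_2": rng.normal(size=100),
})
df["target"] = 3 * df["feature_1"] - 2 * df["feature_2"]

# Features (X) are every column except the target; labels (y) are the target.
X = df.drop(columns=["target"])
y = df["target"]

# Hold out 30% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

print(X_train.shape, X_test.shape)
```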
Use the TensorFlow and Keras libraries to build a neural network model as follows:
- Two hidden layers, each with 10 nodes and the ReLU activation function.
- Plot the model architecture.
- Compile the model using Adam as the optimizer and mean squared error as the loss function.
- Train the model on the training data using 100 epochs.
- Evaluate the model on the test data using model.evaluate.
- Make predictions on the test data using model.predict.
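One way these steps might look in code is sketched below. The synthetic arrays stand in for the real training and test sets, and the single linear output unit is an assumption that the task is a regression, which is consistent with the mean squared error loss but is not stated in the brief. Plotting with `plot_model` needs the optional `pydot` and Graphviz packages, so the call is guarded.

```python
import numpy as np
from tensorflow import keras

# Hypothetical data standing in for the real training and test sets.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(70, 2)).astype("float32")
y_train = (3 * X_train[:, 0] - 2 * X_train[:, 1]).astype("float32")
X_test = rng.normal(size=(30, 2)).astype("float32")
y_test = (3 * X_test[:, 0] - 2 * X_test[:, 1]).astype("float32")

# Two hidden layers of 10 nodes with ReLU; one linear output (assumed regression).
model = keras.Sequential([
    keras.Input(shape=(X_train.shape[1],)),
    keras.layers.Dense(10, activation="relu"),
    keras.layers.Dense(10, activation="relu"),
    keras.layers.Dense(1),
])

# Show the architecture as text, then try to save a diagram of it.
model.summary()
try:
    keras.utils.plot_model(model, to_file="model.png", show_shapes=True)
except Exception as err:  # pydot or Graphviz may not be installed
    print("plot_model skipped:", err)

# Adam optimizer with mean squared error loss.
model.compile(optimizer="adam", loss="mse")

# Train for 100 epochs on the training data.
history = model.fit(X_train, y_train, epochs=100, verbose=0)

# Evaluate on the held-out test data, then predict.
test_loss = model.evaluate(X_test, y_test, verbose=0)
predictions = model.predict(X_test, verbose=0)
print("Test MSE:", test_loss)
```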
Repeat Part C but use a normalized version of the data. Recall that one way to normalize the data is by subtracting the mean from the individual predictors and dividing by the standard deviation.
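The normalization described above can be sketched with NumPy as follows. The arrays are hypothetical placeholders, and computing the mean and standard deviation on the training set only, then reusing them for the test set, is a common convention assumed here rather than something the brief specifies.

```python
import numpy as np

# Hypothetical predictors on an arbitrary scale.
rng = np.random.default_rng(0)
X_train = rng.normal(loc=50.0, scale=10.0, size=(70, 2))
X_test = rng.normal(loc=50.0, scale=10.0, size=(30, 2))

# Per-column mean and standard deviation of the training predictors.
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)

# Subtract the mean and divide by the standard deviation.
X_train_norm = (X_train - mean) / std
X_test_norm = (X_test - mean) / std  # assumption: reuse training statistics

print(X_train_norm.mean(axis=0), X_train_norm.std(axis=0))
```

The normalized arrays would then replace the raw features when the Part C model is rebuilt and retrained.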
- Repeat Part C but use 250 epochs for training this time.
- Prevent overfitting using tensorflow.keras.callbacks.EarlyStopping(): monitor the validation loss and stop training if it shows no improvement for 30 epochs.
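A minimal sketch of training with early stopping is given below. The data is synthetic, and holding out 20% of the training data through `validation_split` is an assumption: the brief asks for the validation loss to be monitored but does not say where the validation data comes from.

```python
import numpy as np
from tensorflow import keras

# Hypothetical training data.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 2)).astype("float32")
y_train = (3 * X_train[:, 0] - 2 * X_train[:, 1]).astype("float32")

# Same architecture as Part C (single linear output assumed).
model = keras.Sequential([
    keras.Input(shape=(X_train.shape[1],)),
    keras.layers.Dense(10, activation="relu"),
    keras.layers.Dense(10, activation="relu"),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Stop when the validation loss has not improved for 30 consecutive epochs.
early_stop = keras.callbacks.EarlyStopping(monitor="val_loss", patience=30)

history = model.fit(
    X_train,
    y_train,
    validation_split=0.2,  # assumption: validate on 20% of the training data
    epochs=250,
    callbacks=[early_stop],
    verbose=0,
)

# The number of epochs actually run may be lower than 250.
epochs_run = len(history.history["val_loss"])
print("Epochs run:", epochs_run)
```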
F. Increase the number of hidden layers (5 marks)
Repeat Part C but use a neural network with the following instead:
- Four hidden layers, each with 10 nodes and the ReLU activation function.
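The deeper network could be built as in the sketch below. The helper `build_model` and its parameter names are hypothetical, introduced only for illustration, and the single linear output unit again assumes a regression task. Training, evaluation, and prediction would then proceed as in Part C.

```python
from tensorflow import keras


def build_model(n_features, n_hidden_layers=4, n_nodes=10):
    """Hypothetical helper: stack equal-width ReLU hidden layers."""
    model = keras.Sequential()
    model.add(keras.Input(shape=(n_features,)))
    for _ in range(n_hidden_layers):
        model.add(keras.layers.Dense(n_nodes, activation="relu"))
    model.add(keras.layers.Dense(1))  # assumed regression output
    model.compile(optimizer="adam", loss="mse")
    return model


# Two input features are assumed here purely for the example.
model = build_model(n_features=2)
model.summary()
```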