Predictions of Human Activity Recognition

Ehsan Shaghaei

Nov 2022

We train several classical machine learning algorithms and neural networks and observe how each model performs.

We will implement the following models:

  1. Logistic Regression with Grid Search
  2. Linear SVC with GridSearch
  3. Decision Trees with GridSearchCV
  4. Multi-Layer Perceptron
  5. Convolutional Neural Network
  6. LSTM + Convolutional Neural Network

Loading the dataset

In [129]:
! wget "https://storage.googleapis.com/kaggle-data-sets/226/793070/bundle/archive.zip?X-Goog-Algorithm=GOOG4-RSA-SHA256&X-Goog-Credential=gcp-kaggle-com%40kaggle-161607.iam.gserviceaccount.com%2F20221117%2Fauto%2Fstorage%2Fgoog4_request&X-Goog-Date=20221117T164256Z&X-Goog-Expires=259200&X-Goog-SignedHeaders=host&X-Goog-Signature=0e80df558006d914dfd1209c0a55a3989a810932b88c2184537afe63a66403544f329f790b41da5d8ad8c7d346826212266dd3419f1da06656b9141c87e720829e522d30f4a2ad36c72347cddf3fb1715940d90061dbdfb804c97f5f1f1e55ddb09da31ff4bf33e316abbcf241f7767d1e837a28b61843fddfd59fb71f0193ea407a134612255ac98e78d704705bd4a869d0728700a0f57c6ff864821091f1ad494ac39d26ef4b9f5ce5afc3b6a3d8acd53762da779ce4914df099a67d2925552e46a2571617857c4d92c0ae8988953c3afd548ea2aa90f02140208ccfa14b0823063ade4c1f9f6e4e748bc35770d01c71c263de6c91bbc7591f79b7bef9a596" -O data.zip
! unzip ./data.zip
--2022-11-17 16:48:09--  https://storage.googleapis.com/kaggle-data-sets/226/793070/bundle/archive.zip?X-Goog-Algorithm=GOOG4-RSA-SHA256&X-Goog-Credential=gcp-kaggle-com%40kaggle-161607.iam.gserviceaccount.com%2F20221117%2Fauto%2Fstorage%2Fgoog4_request&X-Goog-Date=20221117T164256Z&X-Goog-Expires=259200&X-Goog-SignedHeaders=host&X-Goog-Signature=0e80df558006d914dfd1209c0a55a3989a810932b88c2184537afe63a66403544f329f790b41da5d8ad8c7d346826212266dd3419f1da06656b9141c87e720829e522d30f4a2ad36c72347cddf3fb1715940d90061dbdfb804c97f5f1f1e55ddb09da31ff4bf33e316abbcf241f7767d1e837a28b61843fddfd59fb71f0193ea407a134612255ac98e78d704705bd4a869d0728700a0f57c6ff864821091f1ad494ac39d26ef4b9f5ce5afc3b6a3d8acd53762da779ce4914df099a67d2925552e46a2571617857c4d92c0ae8988953c3afd548ea2aa90f02140208ccfa14b0823063ade4c1f9f6e4e748bc35770d01c71c263de6c91bbc7591f79b7bef9a596
Resolving storage.googleapis.com (storage.googleapis.com)... 142.251.2.128, 2607:f8b0:4023:c0d::80, 2607:f8b0:4023:c0b::80
Connecting to storage.googleapis.com (storage.googleapis.com)|142.251.2.128|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 25693584 (25M) [application/zip]
Saving to: ‘data.zip’

data.zip            100%[===================>]  24.50M   143MB/s    in 0.2s    

2022-11-17 16:48:10 (143 MB/s) - ‘data.zip’ saved [25693584/25693584]

Archive:  ./data.zip
replace test.csv? [y]es, [n]o, [A]ll, [N]one, [r]ename: 
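
The `replace test.csv?` prompt above appears because `unzip` found files left over from an earlier run and waits for interactive input. Note also that the signed download URL carries `X-Goog-Expires=259200` (three days), so it will not stay valid. A minimal sketch of a non-interactive extraction with Python's standard `zipfile` module, assuming the archive has already been saved as `data.zip`; the helper name `extract_archive` is hypothetical and not part of the original notebook:

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path="data.zip", dest="."):
    """Extract every member of zip_path into dest and return the member names."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # extractall overwrites existing files without asking
        archive.extractall(dest)
        return sorted(archive.namelist())
```

Calling `extract_archive("data.zip")` in place of the `unzip` line should leave `train.csv` and `test.csv` in the working directory, assuming the archive holds those two files as the cells below expect.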

Importing libraries

In [ ]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt  # needed by the plotting cells further down

Obtain the train and test data

In [ ]:
train = pd.read_csv("train.csv")
test = pd.read_csv("test.csv")
In [ ]:
columns = train.columns

# Remove '()', '-' and ',' from the column names.
# regex=True is passed explicitly because the patterns are character classes.
columns = columns.str.replace('[()]', '', regex=True)
columns = columns.str.replace('[-]', '', regex=True)
columns = columns.str.replace('[,]', '', regex=True)

train.columns = columns
test.columns = columns

test.columns
Out[ ]:
Index(['tBodyAccmeanX', 'tBodyAccmeanY', 'tBodyAccmeanZ', 'tBodyAccstdX',
       'tBodyAccstdY', 'tBodyAccstdZ', 'tBodyAccmadX', 'tBodyAccmadY',
       'tBodyAccmadZ', 'tBodyAccmaxX',
       ...
       'fBodyBodyGyroJerkMagkurtosis', 'angletBodyAccMeangravity',
       'angletBodyAccJerkMeangravityMean', 'angletBodyGyroMeangravityMean',
       'angletBodyGyroJerkMeangravityMean', 'angleXgravityMean',
       'angleYgravityMean', 'angleZgravityMean', 'subject', 'Activity'],
      dtype='object', length=563)
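
To make the renaming concrete, here is a small standalone sketch of the same cleanup using Python's `re` module. The two raw names are assumptions about what the CSV header looks like before cleaning (they are not shown in this notebook), and `clean_column_name` is a hypothetical helper:

```python
import re


def clean_column_name(name):
    """Drop parentheses, hyphens and commas from a feature name."""
    return re.sub(r"[()\-,]", "", name)


# Assumed raw header names, for illustration only
raw_names = ["tBodyAcc-mean()-X", "angle(tBodyAccMean,gravity)"]
cleaned = [clean_column_name(n) for n in raw_names]
print(cleaned)  # ['tBodyAccmeanX', 'angletBodyAccMeangravity']
```

Both results match names that appear in the `Index` output above.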
In [ ]:
y_train = train.Activity
X_train = train.drop(['subject', 'Activity'], axis=1)
y_test = test.Activity
X_test = test.drop(['subject', 'Activity'], axis=1)

print('Training data size : ', X_train.shape)
print('Test data size : ', X_test.shape)
Training data size :  (7352, 561)
Test data size :  (2947, 561)
In [ ]:
train.head(3)
Out[ ]:
tBodyAccmeanX tBodyAccmeanY tBodyAccmeanZ tBodyAccstdX tBodyAccstdY tBodyAccstdZ tBodyAccmadX tBodyAccmadY tBodyAccmadZ tBodyAccmaxX ... fBodyBodyGyroJerkMagkurtosis angletBodyAccMeangravity angletBodyAccJerkMeangravityMean angletBodyGyroMeangravityMean angletBodyGyroJerkMeangravityMean angleXgravityMean angleYgravityMean angleZgravityMean subject Activity
0 0.288585 -0.020294 -0.132905 -0.995279 -0.983111 -0.913526 -0.995112 -0.983185 -0.923527 -0.934724 ... -0.710304 -0.112754 0.030400 -0.464761 -0.018446 -0.841247 0.179941 -0.058627 1 STANDING
1 0.278419 -0.016411 -0.123520 -0.998245 -0.975300 -0.960322 -0.998807 -0.974914 -0.957686 -0.943068 ... -0.861499 0.053477 -0.007435 -0.732626 0.703511 -0.844788 0.180289 -0.054317 1 STANDING
2 0.279653 -0.019467 -0.113462 -0.995380 -0.967187 -0.978944 -0.996520 -0.963668 -0.977469 -0.938692 ... -0.760104 -0.118559 0.177899 0.100699 0.808529 -0.848933 0.180637 -0.049118 1 STANDING

3 rows × 563 columns

Let's model our data

Labels used when plotting the confusion matrix

In [ ]:
labels=['LAYING', 'SITTING','STANDING','WALKING','WALKING_DOWNSTAIRS','WALKING_UPSTAIRS']
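
The cells below call a helper named `perform_model` whose definition does not appear in this notebook. The following is a hedged reconstruction based only on how it is called and on what it prints; the dictionary keys other than `'model'` and `'confusion_matrix'` (the two used later) are assumptions:

```python
from datetime import datetime

from sklearn import metrics


def perform_model(model, X_train, y_train, X_test, y_test, class_labels):
    """Fit model, predict on the test set, then print and return the main metrics."""
    results = {}

    print("training the model..")
    start = datetime.now()
    model.fit(X_train, y_train)
    results["training_time"] = datetime.now() - start
    print("Done\n\ntraining_time(HH:MM:SS.ms) - {}\n".format(results["training_time"]))

    print("Predicting test data")
    start = datetime.now()
    y_pred = model.predict(X_test)
    results["testing_time"] = datetime.now() - start
    results["predicted"] = y_pred
    print("Done\n\ntesting time(HH:MM:SS.ms) - {}\n".format(results["testing_time"]))

    # Rows of the confusion matrix follow the order given in class_labels
    results["accuracy"] = metrics.accuracy_score(y_test, y_pred)
    results["confusion_matrix"] = metrics.confusion_matrix(
        y_test, y_pred, labels=class_labels)
    results["classification_report"] = metrics.classification_report(y_test, y_pred)

    print("Accuracy\n\n    {}\n".format(results["accuracy"]))
    print("Confusion Matrix\n\n {}\n".format(results["confusion_matrix"]))
    print("Classification Report\n{}".format(results["classification_report"]))

    results["model"] = model
    return results
```

Any estimator with `fit` and `predict` can be passed in, including a `GridSearchCV` wrapper as in the sections that follow.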

1. Logistic Regression with Grid Search

In [ ]:
from sklearn import linear_model
from sklearn import metrics

from sklearn.model_selection import GridSearchCV
In [ ]:
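# Grid search over the regularization strength C and the penalty type.
# NOTE: the default lbfgs solver supports only 'l2' (or no penalty),
# so every 'l1' candidate fails to fit and scores nan (see the warnings below).
parameters = {'C':[0.01, 0.1, 1, 10, 20, 30], 'penalty':['l2','l1']}
log_reg = linear_model.LogisticRegression()
log_reg_grid = GridSearchCV(log_reg, param_grid=parameters, cv=3, verbose=1, n_jobs=-1)
log_reg_grid_results =  perform_model(log_reg_grid, X_train, y_train, X_test, y_test, class_labels=labels)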
training the model..
Fitting 3 folds for each of 12 candidates, totalling 36 fits
/usr/local/lib/python3.7/dist-packages/sklearn/model_selection/_validation.py:372: FitFailedWarning: 
18 fits failed out of a total of 36.
The score on these train-test partitions for these parameters will be set to nan.
If these failures are not expected, you can try to debug them by setting error_score='raise'.

Below are more details about the failures:
--------------------------------------------------------------------------------
18 fits failed with the following error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/sklearn/model_selection/_validation.py", line 680, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "/usr/local/lib/python3.7/dist-packages/sklearn/linear_model/_logistic.py", line 1461, in fit
    solver = _check_solver(self.solver, self.penalty, self.dual)
  File "/usr/local/lib/python3.7/dist-packages/sklearn/linear_model/_logistic.py", line 449, in _check_solver
    % (solver, penalty)
ValueError: Solver lbfgs supports only 'l2' or 'none' penalties, got l1 penalty.

  warnings.warn(some_fits_failed_message, FitFailedWarning)
/usr/local/lib/python3.7/dist-packages/sklearn/model_selection/_search.py:972: UserWarning: One or more of the test scores are non-finite: [0.91458247        nan 0.93416876        nan 0.93675357        nan
 0.93171989        nan 0.93403342        nan 0.93199272        nan]
  category=UserWarning,
/usr/local/lib/python3.7/dist-packages/sklearn/linear_model/_logistic.py:818: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  extra_warning_msg=_LOGISTIC_SOLVER_CONVERGENCE_MSG,
Done 
 

training_time(HH:MM:SS.ms) - 0:00:41.126928


Predicting test data
Done 
 

testing time(HH:MM:SS:ms) - 0:00:00.031748


---------------------
|      Accuracy      |
---------------------

    0.9582626399728538


--------------------
| Confusion Matrix |
--------------------

 [[537   0   0   0   0   0]
 [  0 429  59   0   0   3]
 [  0  16 516   0   0   0]
 [  0   0   0 492   3   1]
 [  0   0   0   4 403  13]
 [  0   0   0  23   1 447]]
-------------------------
| Classifiction Report |
-------------------------
                    precision    recall  f1-score   support

            LAYING       1.00      1.00      1.00       537
           SITTING       0.96      0.87      0.92       491
          STANDING       0.90      0.97      0.93       532
           WALKING       0.95      0.99      0.97       496
WALKING_DOWNSTAIRS       0.99      0.96      0.97       420
  WALKING_UPSTAIRS       0.96      0.95      0.96       471

          accuracy                           0.96      2947
         macro avg       0.96      0.96      0.96      2947
      weighted avg       0.96      0.96      0.96      2947
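
The numbers in the classification report can be recomputed directly from the confusion matrix printed above. This sketch assumes the usual scikit-learn orientation (rows are true labels, columns are predictions), which agrees with the row sums matching the `support` column:

```python
import numpy as np

labels = ["LAYING", "SITTING", "STANDING", "WALKING",
          "WALKING_DOWNSTAIRS", "WALKING_UPSTAIRS"]

# Confusion matrix copied from the logistic regression output above
cm = np.array([
    [537,   0,   0,   0,   0,   0],
    [  0, 429,  59,   0,   0,   3],
    [  0,  16, 516,   0,   0,   0],
    [  0,   0,   0, 492,   3,   1],
    [  0,   0,   0,   4, 403,  13],
    [  0,   0,   0,  23,   1, 447],
])

recall = np.diag(cm) / cm.sum(axis=1)      # correct / number of true samples
precision = np.diag(cm) / cm.sum(axis=0)   # correct / number of predictions
accuracy = np.trace(cm) / cm.sum()

for name, p, r in zip(labels, precision, recall):
    print(f"{name:<20s} precision={p:.2f} recall={r:.2f}")
print(f"accuracy={accuracy:.4f}")
```

For example, SITTING has 429 correct predictions out of 491 true samples, which gives the recall of 0.87 shown in the report; most of the misses (59) are predicted as STANDING.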

In [ ]:
plt.figure(figsize=(8,8))
plt.grid(False)  # the 'b' keyword is deprecated in recent matplotlib releases
plot_confusion_matrix(log_reg_grid_results['confusion_matrix'], classes=labels, cmap=plt.cm.Greens)
plt.show()
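
`plot_confusion_matrix` as called above takes a precomputed matrix plus `classes` and `cmap`, so it is a custom helper rather than a scikit-learn function, and its definition is not shown in this notebook. A minimal sketch of what such a helper could look like; the `normalize` and `title` parameters are assumptions added for illustration:

```python
import itertools

import matplotlib.pyplot as plt
import numpy as np


def plot_confusion_matrix(cm, classes, normalize=False,
                          title="Confusion matrix", cmap=plt.cm.Blues):
    """Draw cm as a colored grid with one annotated cell per (true, predicted) pair."""
    cm = np.asarray(cm)
    if normalize:
        # Convert counts to per-row (per true class) fractions
        cm = cm.astype(float) / cm.sum(axis=1, keepdims=True)

    plt.imshow(cm, interpolation="nearest", cmap=cmap)
    plt.title(title)
    plt.colorbar()
    ticks = np.arange(len(classes))
    plt.xticks(ticks, classes, rotation=90)
    plt.yticks(ticks, classes)

    fmt = ".2f" if normalize else "d"
    threshold = cm.max() / 2.0
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        # White text on dark cells, black text on light cells
        plt.text(j, i, format(cm[i, j], fmt), ha="center",
                 color="white" if cm[i, j] > threshold else "black")

    plt.ylabel("True label")
    plt.xlabel("Predicted label")
    plt.tight_layout()
```

The function draws on the current figure, so the `plt.figure(...)` and `plt.show()` calls stay in the calling cell.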
In [ ]:
# observe the attributes of the model 
print_grid_search_attributes(log_reg_grid_results['model'])
--------------------------
|      Best Estimator     |
--------------------------

	LogisticRegression(C=1)

--------------------------
|     Best parameters     |
--------------------------
	Parameters of best estimator : 

	{'C': 1, 'penalty': 'l2'}

---------------------------------
|   No of CrossValidation sets   |
--------------------------------

	Total numbre of cross validation sets: 3

--------------------------
|        Best Score       |
--------------------------

	Average Cross Validate scores of best estimator : 

	0.9367535671959523
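
Half of the fits in this section failed because the default `lbfgs` solver does not support the `l1` penalty. One way to avoid this, sketched here on synthetic stand-in data rather than the notebook's `X_train`, is to pass `param_grid` as a list of dictionaries so that each solver is only paired with penalties it supports. The choice of `saga` for the `l1` candidates and the `max_iter` value are assumptions, not settings from the original run:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data (an assumption); the notebook would use X_train, y_train
X_demo, y_demo = make_classification(n_samples=300, n_features=20, n_informative=6,
                                     n_classes=3, random_state=0)

C_values = [0.01, 0.1, 1, 10]
param_grid = [
    {"solver": ["lbfgs"], "penalty": ["l2"], "C": C_values},
    {"solver": ["saga"], "penalty": ["l1", "l2"], "C": C_values},
]

# A larger max_iter also reduces the chance of a ConvergenceWarning
search = GridSearchCV(LogisticRegression(max_iter=5000), param_grid=param_grid, cv=3)
search.fit(X_demo, y_demo)
print(search.best_params_)
```

With this grid every candidate is a valid combination, so none of the cross-validation scores should come back as `nan`.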

2. Linear SVC with GridSearch

In [ ]:
from sklearn.svm import LinearSVC
In [ ]:
parameters = {'C':[0.125, 0.5, 1, 2, 8, 16]}
lr_svc = LinearSVC(tol=0.00005)
lr_svc_grid = GridSearchCV(lr_svc, param_grid=parameters, n_jobs=-1, verbose=1)
lr_svc_grid_results = perform_model(lr_svc_grid, X_train, y_train, X_test, y_test, class_labels=labels)
training the model..
Fitting 5 folds for each of 6 candidates, totalling 30 fits
/usr/local/lib/python3.7/dist-packages/sklearn/svm/_base.py:1208: ConvergenceWarning: Liblinear failed to converge, increase the number of iterations.
  ConvergenceWarning,
Done 
 

training_time(HH:MM:SS.ms) - 0:02:05.641782


Predicting test data
Done 
 

testing time(HH:MM:SS:ms) - 0:00:00.017514


---------------------
|      Accuracy      |
---------------------

    0.9667458432304038


--------------------
| Confusion Matrix |
--------------------

 [[537   0   0   0   0   0]
 [  2 428  58   0   0   3]
 [  0   9 522   1   0   0]
 [  0   0   0 496   0   0]
 [  0   0   0   3 412   5]
 [  0   0   0  17   0 454]]
-------------------------
| Classifiction Report |
-------------------------
                    precision    recall  f1-score   support

            LAYING       1.00      1.00      1.00       537
           SITTING       0.98      0.87      0.92       491
          STANDING       0.90      0.98      0.94       532
           WALKING       0.96      1.00      0.98       496
WALKING_DOWNSTAIRS       1.00      0.98      0.99       420
  WALKING_UPSTAIRS       0.98      0.96      0.97       471

          accuracy                           0.97      2947
         macro avg       0.97      0.97      0.97      2947
      weighted avg       0.97      0.97      0.97      2947

In [ ]:
print_grid_search_attributes(lr_svc_grid_results['model'])
--------------------------
|      Best Estimator     |
--------------------------

	LinearSVC(C=0.5, tol=5e-05)

--------------------------
|     Best parameters     |
--------------------------
	Parameters of best estimator : 

	{'C': 0.5}

---------------------------------
|   No of CrossValidation sets   |
--------------------------------

	Total numbre of cross validation sets: 5

--------------------------
|        Best Score       |
--------------------------

	Average Cross Validate scores of best estimator : 

	0.941792385206972
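
`print_grid_search_attributes` is another helper that is used but not defined in this notebook. Judging from its output, it reports four attributes of a fitted `GridSearchCV` object. A hedged sketch with simplified formatting (the banner layout of the original output is not reproduced):

```python
def print_grid_search_attributes(model):
    """Print the key attributes of a fitted GridSearchCV object."""
    print("Best estimator:\n\t{}\n".format(model.best_estimator_))
    print("Best parameters:\n\t{}\n".format(model.best_params_))
    print("Number of cross-validation splits:\n\t{}\n".format(model.n_splits_))
    print("Mean cross-validated score of the best estimator:\n\t{}\n".format(
        model.best_score_))
```

Note that `best_score_` is the mean score over the validation folds of the training data, which is why it differs from the test accuracy printed earlier.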

3. Decision Trees with GridSearchCV

In [ ]:
from sklearn.tree import DecisionTreeClassifier
parameters = {'max_depth':np.arange(3,10,2)}
dt = DecisionTreeClassifier()
dt_grid = GridSearchCV(dt,param_grid=parameters, n_jobs=-1)
dt_grid_results = perform_model(dt_grid, X_train, y_train, X_test, y_test, class_labels=labels)
print_grid_search_attributes(dt_grid_results['model'])
training the model..
Done 
 

training_time(HH:MM:SS.ms) - 0:00:35.095267


Predicting test data
Done 
 

testing time(HH:MM:SS:ms) - 0:00:00.012556


---------------------
|      Accuracy      |
---------------------

    0.8707159823549372


--------------------
| Confusion Matrix |
--------------------

 [[537   0   0   0   0   0]
 [  0 373 118   0   0   0]
 [  0  60 472   0   0   0]
 [  0   0   0 469  21   6]
 [  0   0   0  24 351  45]
 [  0   0   0  60  47 364]]
-------------------------
| Classifiction Report |
-------------------------
                    precision    recall  f1-score   support

            LAYING       1.00      1.00      1.00       537
           SITTING       0.86      0.76      0.81       491
          STANDING       0.80      0.89      0.84       532
           WALKING       0.85      0.95      0.89       496
WALKING_DOWNSTAIRS       0.84      0.84      0.84       420
  WALKING_UPSTAIRS       0.88      0.77      0.82       471

          accuracy                           0.87      2947
         macro avg       0.87      0.87      0.87      2947
      weighted avg       0.87      0.87      0.87      2947

--------------------------
|      Best Estimator     |
--------------------------

	DecisionTreeClassifier(max_depth=9)

--------------------------
|     Best parameters     |
--------------------------
	Parameters of best estimator : 

	{'max_depth': 9}

---------------------------------
|   No of CrossValidation sets   |
--------------------------------

	Total numbre of cross validation sets: 5

--------------------------
|        Best Score       |
--------------------------

	Average Cross Validate scores of best estimator : 

	0.85011824988323

4. MLP

In [ ]:
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
y_train = le.fit_transform(y_train)
y_test = le.transform(y_test)
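
The neural networks below use `sparse_categorical_crossentropy`, which expects integer class indices, hence the encoding step. `LabelEncoder` assigns codes in sorted order of the class names, so the integers line up with the `labels` list defined earlier. A small standalone illustration on a hand-written sample (not the notebook's data):

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hand-written sample of activity names, for illustration only
activities = np.array(["WALKING", "LAYING", "SITTING", "STANDING",
                       "WALKING_UPSTAIRS", "WALKING_DOWNSTAIRS", "LAYING"])

encoder = LabelEncoder()
codes = encoder.fit_transform(activities)

# classes_ is sorted, so code i corresponds to encoder.classes_[i]
print(dict(zip(encoder.classes_, range(len(encoder.classes_)))))
print(codes)
print(encoder.inverse_transform(codes[:3]))
```

`inverse_transform` maps predicted indices back to activity names, which is useful when labelling the axes of a confusion matrix.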
In [ ]:
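import tensorflow as tf
import keras
from keras import layers
from keras.callbacks import EarlyStopping

def MLP_Model(data):
    """Build and compile a fully connected network for the 6 activity classes."""
    model = keras.Sequential([
        layers.Input(shape=data.shape[1:]),
        layers.Dense(1000, activation="relu"),
        layers.Dense(500, activation="relu"),
        # NOTE: softmax is the conventional output activation for this loss;
        # sigmoid is kept here so the code matches the recorded run below.
        layers.Dense(6, activation="sigmoid"),
    ], name="MLP_Sequential")

    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

    return model

mlp_model = MLP_Model(X_train)
mlp_model.summary()

# Stop once val_accuracy has not improved for 30 epochs and restore the best weights.
# NOTE: the test set doubles as validation data here, so early stopping is tuned on it.
earlystopping = EarlyStopping(patience=30, monitor="val_accuracy", restore_best_weights=True)
mlp_history = mlp_model.fit( X_train, y_train, epochs=100, callbacks = [earlystopping],
         validation_data=( X_test, y_test) )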
Model: "MLP_Sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 dense_9 (Dense)             (None, 1000)              562000    
                                                                 
 dense_10 (Dense)            (None, 500)               500500    
                                                                 
 dense_11 (Dense)            (None, 6)                 3006      
                                                                 
=================================================================
Total params: 1,065,506
Trainable params: 1,065,506
Non-trainable params: 0
_________________________________________________________________
Epoch 1/100
230/230 [==============================] - 12s 52ms/step - loss: 0.3395 - accuracy: 0.8579 - val_loss: 0.1847 - val_accuracy: 0.9338
Epoch 2/100
230/230 [==============================] - 4s 16ms/step - loss: 0.1311 - accuracy: 0.9444 - val_loss: 0.1849 - val_accuracy: 0.9308
Epoch 3/100
230/230 [==============================] - 4s 16ms/step - loss: 0.1240 - accuracy: 0.9489 - val_loss: 0.2870 - val_accuracy: 0.9053
Epoch 4/100
230/230 [==============================] - 3s 15ms/step - loss: 0.0914 - accuracy: 0.9637 - val_loss: 0.1379 - val_accuracy: 0.9491
Epoch 5/100
230/230 [==============================] - 4s 15ms/step - loss: 0.0662 - accuracy: 0.9746 - val_loss: 0.2974 - val_accuracy: 0.9019
Epoch 6/100
230/230 [==============================] - 3s 15ms/step - loss: 0.0869 - accuracy: 0.9641 - val_loss: 0.2495 - val_accuracy: 0.9179
Epoch 7/100
230/230 [==============================] - 3s 15ms/step - loss: 0.0973 - accuracy: 0.9627 - val_loss: 0.1743 - val_accuracy: 0.9413
Epoch 8/100
230/230 [==============================] - 4s 15ms/step - loss: 0.0695 - accuracy: 0.9728 - val_loss: 0.1397 - val_accuracy: 0.9511
Epoch 9/100
230/230 [==============================] - 4s 16ms/step - loss: 0.0515 - accuracy: 0.9795 - val_loss: 0.1362 - val_accuracy: 0.9549
Epoch 10/100
230/230 [==============================] - 4s 15ms/step - loss: 0.0746 - accuracy: 0.9717 - val_loss: 0.2879 - val_accuracy: 0.9114
Epoch 11/100
230/230 [==============================] - 3s 15ms/step - loss: 0.0653 - accuracy: 0.9757 - val_loss: 0.2240 - val_accuracy: 0.9338
Epoch 12/100
230/230 [==============================] - 4s 15ms/step - loss: 0.0586 - accuracy: 0.9763 - val_loss: 0.3272 - val_accuracy: 0.9138
Epoch 13/100
230/230 [==============================] - 5s 20ms/step - loss: 0.0568 - accuracy: 0.9778 - val_loss: 0.2031 - val_accuracy: 0.9379
Epoch 14/100
230/230 [==============================] - 4s 17ms/step - loss: 0.0634 - accuracy: 0.9761 - val_loss: 0.1687 - val_accuracy: 0.9440
Epoch 15/100
230/230 [==============================] - 4s 16ms/step - loss: 0.0480 - accuracy: 0.9814 - val_loss: 0.1604 - val_accuracy: 0.9535
Epoch 16/100
230/230 [==============================] - 3s 15ms/step - loss: 0.0471 - accuracy: 0.9800 - val_loss: 0.1693 - val_accuracy: 0.9501
Epoch 17/100
230/230 [==============================] - 4s 15ms/step - loss: 0.0421 - accuracy: 0.9838 - val_loss: 0.3548 - val_accuracy: 0.9192
Epoch 18/100
230/230 [==============================] - 4s 16ms/step - loss: 0.0568 - accuracy: 0.9782 - val_loss: 0.2503 - val_accuracy: 0.9274
Epoch 19/100
230/230 [==============================] - 4s 15ms/step - loss: 0.0503 - accuracy: 0.9801 - val_loss: 0.1812 - val_accuracy: 0.9511
Epoch 20/100
230/230 [==============================] - 4s 16ms/step - loss: 0.0350 - accuracy: 0.9863 - val_loss: 0.1510 - val_accuracy: 0.9525
Epoch 21/100
230/230 [==============================] - 4s 15ms/step - loss: 0.0382 - accuracy: 0.9845 - val_loss: 0.1879 - val_accuracy: 0.9525
Epoch 22/100
230/230 [==============================] - 3s 15ms/step - loss: 0.0384 - accuracy: 0.9848 - val_loss: 0.1941 - val_accuracy: 0.9481
Epoch 23/100
230/230 [==============================] - 4s 16ms/step - loss: 0.0375 - accuracy: 0.9853 - val_loss: 0.2624 - val_accuracy: 0.9169
Epoch 24/100
230/230 [==============================] - 4s 16ms/step - loss: 0.0511 - accuracy: 0.9808 - val_loss: 0.1702 - val_accuracy: 0.9488
Epoch 25/100
230/230 [==============================] - 4s 16ms/step - loss: 0.1110 - accuracy: 0.9746 - val_loss: 0.2060 - val_accuracy: 0.9450
Epoch 26/100
230/230 [==============================] - 4s 15ms/step - loss: 0.0382 - accuracy: 0.9846 - val_loss: 0.2139 - val_accuracy: 0.9345
Epoch 27/100
230/230 [==============================] - 4s 15ms/step - loss: 0.0528 - accuracy: 0.9793 - val_loss: 0.2155 - val_accuracy: 0.9440
Epoch 28/100
230/230 [==============================] - 4s 16ms/step - loss: 0.0373 - accuracy: 0.9860 - val_loss: 0.2287 - val_accuracy: 0.9315
Epoch 29/100
230/230 [==============================] - 3s 15ms/step - loss: 0.0350 - accuracy: 0.9860 - val_loss: 0.1975 - val_accuracy: 0.9457
Epoch 30/100
230/230 [==============================] - 4s 16ms/step - loss: 0.0317 - accuracy: 0.9880 - val_loss: 0.2051 - val_accuracy: 0.9447
Epoch 31/100
230/230 [==============================] - 5s 22ms/step - loss: 0.0335 - accuracy: 0.9874 - val_loss: 0.2315 - val_accuracy: 0.9484
Epoch 32/100
230/230 [==============================] - 4s 15ms/step - loss: 0.0260 - accuracy: 0.9893 - val_loss: 0.1805 - val_accuracy: 0.9498
Epoch 33/100
230/230 [==============================] - 3s 15ms/step - loss: 0.0402 - accuracy: 0.9837 - val_loss: 0.4549 - val_accuracy: 0.9050
Epoch 34/100
230/230 [==============================] - 4s 16ms/step - loss: 0.0576 - accuracy: 0.9796 - val_loss: 0.2005 - val_accuracy: 0.9488
Epoch 35/100
230/230 [==============================] - 4s 16ms/step - loss: 0.0247 - accuracy: 0.9905 - val_loss: 0.2001 - val_accuracy: 0.9508
Epoch 36/100
230/230 [==============================] - 4s 16ms/step - loss: 0.0243 - accuracy: 0.9910 - val_loss: 0.1710 - val_accuracy: 0.9572
Epoch 37/100
230/230 [==============================] - 4s 16ms/step - loss: 0.0411 - accuracy: 0.9831 - val_loss: 0.1543 - val_accuracy: 0.9505
Epoch 38/100
230/230 [==============================] - 4s 16ms/step - loss: 0.0314 - accuracy: 0.9876 - val_loss: 0.1771 - val_accuracy: 0.9532
Epoch 39/100
230/230 [==============================] - 3s 15ms/step - loss: 0.0251 - accuracy: 0.9908 - val_loss: 0.2292 - val_accuracy: 0.9511
Epoch 40/100
230/230 [==============================] - 3s 15ms/step - loss: 0.0326 - accuracy: 0.9874 - val_loss: 0.4719 - val_accuracy: 0.9189
Epoch 41/100
230/230 [==============================] - 3s 15ms/step - loss: 0.0348 - accuracy: 0.9880 - val_loss: 0.2484 - val_accuracy: 0.9450
Epoch 42/100
230/230 [==============================] - 4s 16ms/step - loss: 0.0231 - accuracy: 0.9909 - val_loss: 0.2228 - val_accuracy: 0.9515
Epoch 43/100
230/230 [==============================] - 4s 16ms/step - loss: 0.0293 - accuracy: 0.9884 - val_loss: 0.1999 - val_accuracy: 0.9508
Epoch 44/100
230/230 [==============================] - 4s 16ms/step - loss: 0.0239 - accuracy: 0.9913 - val_loss: 0.2548 - val_accuracy: 0.9386
Epoch 45/100
230/230 [==============================] - 3s 15ms/step - loss: 0.0258 - accuracy: 0.9897 - val_loss: 0.4789 - val_accuracy: 0.9131
Epoch 46/100
230/230 [==============================] - 3s 15ms/step - loss: 0.0371 - accuracy: 0.9859 - val_loss: 0.1644 - val_accuracy: 0.9566
Epoch 47/100
230/230 [==============================] - 4s 16ms/step - loss: 0.0169 - accuracy: 0.9939 - val_loss: 0.2256 - val_accuracy: 0.9522
Epoch 48/100
230/230 [==============================] - 3s 15ms/step - loss: 0.0230 - accuracy: 0.9905 - val_loss: 0.1892 - val_accuracy: 0.9515
Epoch 49/100
230/230 [==============================] - 5s 21ms/step - loss: 0.0167 - accuracy: 0.9940 - val_loss: 0.2360 - val_accuracy: 0.9515
Epoch 50/100
230/230 [==============================] - 3s 15ms/step - loss: 0.0155 - accuracy: 0.9943 - val_loss: 0.2821 - val_accuracy: 0.9464
Epoch 51/100
230/230 [==============================] - 4s 15ms/step - loss: 0.0471 - accuracy: 0.9833 - val_loss: 0.5423 - val_accuracy: 0.8741
Epoch 52/100
230/230 [==============================] - 3s 15ms/step - loss: 0.0243 - accuracy: 0.9913 - val_loss: 0.2577 - val_accuracy: 0.9447
Epoch 53/100
230/230 [==============================] - 4s 15ms/step - loss: 0.0278 - accuracy: 0.9914 - val_loss: 0.1767 - val_accuracy: 0.9522
Epoch 54/100
230/230 [==============================] - 4s 15ms/step - loss: 0.0218 - accuracy: 0.9922 - val_loss: 0.1978 - val_accuracy: 0.9498
Epoch 55/100
230/230 [==============================] - 4s 16ms/step - loss: 0.0204 - accuracy: 0.9918 - val_loss: 0.2966 - val_accuracy: 0.9393
Epoch 56/100
230/230 [==============================] - 5s 20ms/step - loss: 0.0194 - accuracy: 0.9918 - val_loss: 0.2628 - val_accuracy: 0.9454
Epoch 57/100
230/230 [==============================] - 4s 17ms/step - loss: 0.0305 - accuracy: 0.9891 - val_loss: 0.2200 - val_accuracy: 0.9471
Epoch 58/100
230/230 [==============================] - 3s 15ms/step - loss: 0.0157 - accuracy: 0.9950 - val_loss: 0.2317 - val_accuracy: 0.9498
Epoch 59/100
230/230 [==============================] - 4s 16ms/step - loss: 0.0113 - accuracy: 0.9958 - val_loss: 0.2379 - val_accuracy: 0.9542
Epoch 60/100
230/230 [==============================] - 4s 15ms/step - loss: 0.0126 - accuracy: 0.9954 - val_loss: 0.2996 - val_accuracy: 0.9464
Epoch 61/100
230/230 [==============================] - 3s 15ms/step - loss: 0.0210 - accuracy: 0.9932 - val_loss: 0.2176 - val_accuracy: 0.9569
Epoch 62/100
230/230 [==============================] - 3s 15ms/step - loss: 0.0169 - accuracy: 0.9937 - val_loss: 0.2815 - val_accuracy: 0.9457
Epoch 63/100
230/230 [==============================] - 3s 15ms/step - loss: 0.0093 - accuracy: 0.9962 - val_loss: 0.2647 - val_accuracy: 0.9525
Epoch 64/100
230/230 [==============================] - 3s 15ms/step - loss: 0.0189 - accuracy: 0.9925 - val_loss: 0.2187 - val_accuracy: 0.9491
Epoch 65/100
230/230 [==============================] - 4s 16ms/step - loss: 0.0741 - accuracy: 0.9830 - val_loss: 0.1727 - val_accuracy: 0.9484
Epoch 66/100
230/230 [==============================] - 4s 19ms/step - loss: 0.0141 - accuracy: 0.9944 - val_loss: 0.3030 - val_accuracy: 0.9416
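
Training stopped at epoch 66 of 100 because of the `EarlyStopping` callback: the best `val_accuracy` (0.9572) was reached at epoch 36, and with `patience=30` the run ends after 30 further epochs without improvement. The sketch below replays that rule on the `val_accuracy` values copied from the log above. It is a simplified model of the callback's patience logic, not Keras code, and `simulate_early_stopping` is a hypothetical helper:

```python
def simulate_early_stopping(values, patience):
    """Return (stop_epoch, best_epoch, best_value) for a metric that is maximized.

    Epochs are 1-based. stop_epoch is None if patience is never exhausted.
    """
    best_value = float("-inf")
    best_epoch = None
    wait = 0
    for epoch, value in enumerate(values, start=1):
        if value > best_value:
            best_value, best_epoch, wait = value, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch, best_value
    return None, best_epoch, best_value


# val_accuracy per epoch, copied from the training log above
val_accuracy = [
    0.9338, 0.9308, 0.9053, 0.9491, 0.9019, 0.9179, 0.9413, 0.9511, 0.9549, 0.9114,
    0.9338, 0.9138, 0.9379, 0.9440, 0.9535, 0.9501, 0.9192, 0.9274, 0.9511, 0.9525,
    0.9525, 0.9481, 0.9169, 0.9488, 0.9450, 0.9345, 0.9440, 0.9315, 0.9457, 0.9447,
    0.9484, 0.9498, 0.9050, 0.9488, 0.9508, 0.9572, 0.9505, 0.9532, 0.9511, 0.9189,
    0.9450, 0.9515, 0.9508, 0.9386, 0.9131, 0.9566, 0.9522, 0.9515, 0.9515, 0.9464,
    0.8741, 0.9447, 0.9522, 0.9498, 0.9393, 0.9454, 0.9471, 0.9498, 0.9542, 0.9464,
    0.9569, 0.9457, 0.9525, 0.9491, 0.9484, 0.9416,
]

stop_epoch, best_epoch, best_value = simulate_early_stopping(val_accuracy, patience=30)
print(stop_epoch, best_epoch, best_value)  # 66 36 0.9572
```

Because `restore_best_weights=True`, the model used for the predictions below carries the epoch 36 weights rather than those from the final epoch.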
In [ ]:
import seaborn as sns

def plot_metrics(history, title):
    x = np.arange(len(history.history["val_accuracy"]))

    plt.plot(x, history.history["loss"], label="loss")
    plt.plot(x, history.history["val_loss"], label="val_loss")

    plt.plot(x, history.history["accuracy"], label="accuracy")
    plt.plot(x, history.history["val_accuracy"], label="val_accuracy")
    
    plt.title(title)

    plt.legend()
    plt.show()
plot_metrics(mlp_history, "MLP History")

mlp_pred = mlp_model.predict( X_test )

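# Pick the most probable class for each sample
mlp_pred = np.argmax(mlp_pred, axis=1)
# confusion_matrix expects (y_true, y_pred): rows are true labels, columns are predictions
cf = metrics.confusion_matrix(y_test, mlp_pred)

# fmt="d" prints the counts as integers instead of scientific notation
sns.heatmap(cf, annot=True, fmt="d")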
plt.show()
93/93 [==============================] - 1s 5ms/step
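
The orientation of the heatmap depends on the argument order of `metrics.confusion_matrix`, which takes the true labels first and the predictions second. A tiny standalone example with made-up labels shows that swapping the two arguments transposes the matrix:

```python
import numpy as np
from sklearn import metrics

# Made-up labels, for illustration only
y_true = np.array([0, 0, 1, 1, 1, 2])
y_pred = np.array([0, 1, 1, 1, 2, 2])

cm = metrics.confusion_matrix(y_true, y_pred)          # rows = true, columns = predicted
cm_swapped = metrics.confusion_matrix(y_pred, y_true)  # rows = predicted, columns = true

print(cm)
print(np.array_equal(cm_swapped, cm.T))  # True
```

The diagonal, and therefore the accuracy, is the same either way; only the reading of the off-diagonal cells (and of per-class precision versus recall) changes.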

5. CNN

In [ ]:
def CNN_Model(data):
    model = keras.Sequential(name="CNN_Sequential")
    print(data.shape[1:])
    model.add( layers.Input(shape=data.shape[1:]) )
    model.add( layers.Conv1D(64, 3, activation="relu", name="conv_1") )
    model.add( layers.Conv1D(32, 3, activation="relu", name="conv_2") )
    model.add(layers.Flatten())
    model.add( layers.Dense(64, activation="relu", name="dense_1") )
    model.add( layers.Dense(128, activation="relu", name="dense_2") )
    model.add( layers.Dense(6, activation="softmax", name="output") )
    
    
    
    model.compile(optimizer=tf.keras.optimizers.SGD(), 
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    

    
    return model


X_train = X_train.values.reshape(X_train.shape[0], X_train.shape[1], 1)
X_test = X_test.values.reshape(X_test.shape[0], X_test.shape[1], 1)

cnn_model = CNN_Model(X_train)
cnn_model.summary()

# patience=30 exceeds epochs=20, so early stopping cannot trigger in this run
earlystopping = EarlyStopping(patience=30, monitor="val_accuracy", restore_best_weights=True)
cnn_history = cnn_model.fit( X_train,
                            y_train, epochs=20, callbacks = [earlystopping],
                            validation_data=(X_test, y_test) )
(561, 1)
Model: "CNN_Sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv_1 (Conv1D)             (None, 559, 64)           256       
                                                                 
 conv_2 (Conv1D)             (None, 557, 32)           6176      
                                                                 
 flatten_4 (Flatten)         (None, 17824)             0         
                                                                 
 dense_1 (Dense)             (None, 64)                1140800   
                                                                 
 dense_2 (Dense)             (None, 128)               8320      
                                                                 
 output (Dense)              (None, 6)                 774       
                                                                 
=================================================================
Total params: 1,156,326
Trainable params: 1,156,326
Non-trainable params: 0
_________________________________________________________________
Epoch 1/20
230/230 [==============================] - 12s 49ms/step - loss: 1.2333 - accuracy: 0.4664 - val_loss: 0.8231 - val_accuracy: 0.7353
Epoch 2/20
230/230 [==============================] - 11s 49ms/step - loss: 0.6345 - accuracy: 0.7391 - val_loss: 0.5253 - val_accuracy: 0.7682
Epoch 3/20
230/230 [==============================] - 11s 50ms/step - loss: 0.3708 - accuracy: 0.8360 - val_loss: 0.3378 - val_accuracy: 0.8738
Epoch 4/20
230/230 [==============================] - 11s 50ms/step - loss: 0.2779 - accuracy: 0.8783 - val_loss: 0.3671 - val_accuracy: 0.8266
Epoch 5/20
230/230 [==============================] - 13s 55ms/step - loss: 0.2202 - accuracy: 0.9079 - val_loss: 0.2075 - val_accuracy: 0.9260
Epoch 6/20
230/230 [==============================] - 11s 49ms/step - loss: 0.1831 - accuracy: 0.9279 - val_loss: 0.5156 - val_accuracy: 0.8171
Epoch 7/20
230/230 [==============================] - 11s 48ms/step - loss: 0.1529 - accuracy: 0.9378 - val_loss: 0.1600 - val_accuracy: 0.9450
Epoch 8/20
230/230 [==============================] - 11s 48ms/step - loss: 0.1325 - accuracy: 0.9482 - val_loss: 0.1709 - val_accuracy: 0.9393
Epoch 9/20
230/230 [==============================] - 11s 48ms/step - loss: 0.1161 - accuracy: 0.9557 - val_loss: 0.1650 - val_accuracy: 0.9369
Epoch 10/20
230/230 [==============================] - 12s 54ms/step - loss: 0.1020 - accuracy: 0.9612 - val_loss: 0.1496 - val_accuracy: 0.9427
Epoch 11/20
230/230 [==============================] - 11s 48ms/step - loss: 0.0818 - accuracy: 0.9708 - val_loss: 0.1302 - val_accuracy: 0.9484
Epoch 12/20
230/230 [==============================] - 11s 48ms/step - loss: 0.0760 - accuracy: 0.9721 - val_loss: 0.1678 - val_accuracy: 0.9332
Epoch 13/20
230/230 [==============================] - 11s 48ms/step - loss: 0.0722 - accuracy: 0.9759 - val_loss: 0.1263 - val_accuracy: 0.9522
Epoch 14/20
230/230 [==============================] - 11s 48ms/step - loss: 0.0584 - accuracy: 0.9797 - val_loss: 0.1120 - val_accuracy: 0.9583
Epoch 15/20
230/230 [==============================] - 11s 48ms/step - loss: 0.0620 - accuracy: 0.9769 - val_loss: 0.1273 - val_accuracy: 0.9528
Epoch 16/20
230/230 [==============================] - 12s 54ms/step - loss: 0.0512 - accuracy: 0.9830 - val_loss: 0.1149 - val_accuracy: 0.9549
Epoch 17/20
230/230 [==============================] - 11s 49ms/step - loss: 0.0482 - accuracy: 0.9823 - val_loss: 0.2100 - val_accuracy: 0.9203
Epoch 18/20
230/230 [==============================] - 11s 49ms/step - loss: 0.0507 - accuracy: 0.9818 - val_loss: 0.1307 - val_accuracy: 0.9542
Epoch 19/20
230/230 [==============================] - 12s 54ms/step - loss: 0.0413 - accuracy: 0.9860 - val_loss: 0.1088 - val_accuracy: 0.9576
Epoch 20/20
230/230 [==============================] - 11s 49ms/step - loss: 0.0425 - accuracy: 0.9864 - val_loss: 0.1349 - val_accuracy: 0.9532
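
The shapes and parameter counts in the CNN summary above can be checked by hand. A `Conv1D` layer with the default `valid` padding shortens the sequence by `kernel_size - 1` and has `(kernel_size * in_channels + 1) * filters` parameters; a `Dense` layer has `(in_units + 1) * out_units`. The helper names below are hypothetical:

```python
def conv1d_params(kernel_size, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel_size * in_channels + 1) * filters


def dense_params(in_units, out_units):
    """Weights plus one bias per output unit."""
    return (in_units + 1) * out_units


seq_len, channels = 561, 1            # input shape printed by the model function

conv_1 = conv1d_params(3, channels, 64)
len_1 = seq_len - 3 + 1               # 559 steps after conv_1
conv_2 = conv1d_params(3, 64, 32)
len_2 = len_1 - 3 + 1                 # 557 steps after conv_2
flat = len_2 * 32                     # 17824 features after Flatten

dense_1 = dense_params(flat, 64)
dense_2 = dense_params(64, 128)
output = dense_params(128, 6)

total = conv_1 + conv_2 + dense_1 + dense_2 + output
print(conv_1, conv_2, dense_1, dense_2, output, total)
# 256 6176 1140800 8320 774 1156326
```

Almost all of the parameters sit in `dense_1`, because the flattened convolution output is 17824 values wide.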
In [ ]:
plot_metrics(cnn_history, "CNN History")
cnn_pred = cnn_model.predict( X_test )
cnn_pred = np.argmax(cnn_pred, axis=1)
cf = metrics.confusion_matrix(y_test, cnn_pred)  # (y_true, y_pred): rows are true labels
sns.heatmap(cf, annot=True)
plt.show()
93/93 [==============================] - 1s 13ms/step
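
scikit-learn documents `confusion_matrix(y_true, y_pred)` with true labels on the rows and predicted labels on the columns. The pure-Python sketch below illustrates that convention and shows that swapping the two arguments transposes the matrix, which matters when reading the heatmap. The helper `confusion_counts` is hypothetical, and the toy labels are made up for illustration; they are not the notebook's data.

```python
def confusion_counts(y_true, y_pred, n_classes):
    """Count matrix with rows = true class, columns = predicted class."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

# Toy labels for illustration only (assumed, not taken from the notebook).
toy_true = [0, 0, 1, 1, 2, 2, 2]
toy_pred = [0, 1, 1, 1, 2, 0, 2]

cm = confusion_counts(toy_true, toy_pred, n_classes=3)
cm_swapped = confusion_counts(toy_pred, toy_true, n_classes=3)

# Swapping the arguments yields the transpose of the original matrix.
transposed = [list(row) for row in zip(*cm)]
print(cm)
print(cm_swapped == transposed)
```

The diagonal (correct predictions) is the same either way; only the off-diagonal cells change sides, so the direction of each misclassification is what gets misread.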

6. CNN + LSTM¶

In [ ]:
def CNN_LSTM_Model(data):
    """Build a Conv1D + LSTM classifier for inputs shaped like data.shape[1:]."""
    model = keras.Sequential(name="CNN_Sequential")
    print(data.shape[1:])
    model.add( layers.Input(shape=data.shape[1:]) )
    # Two 1D convolutions extract local patterns along the 561-feature axis.
    model.add( layers.Conv1D(64, 3, activation="relu", name="conv_1") )
    model.add( layers.Conv1D(32, 3, activation="relu", name="conv_2") )
    # The LSTM reads the convolved sequence and returns only its final state.
    model.add( layers.LSTM(10, activation="tanh", name="LSTM") )
    model.add( layers.Dense(64, activation="relu", name="dense_1") )
    model.add( layers.Dense(128, activation="relu", name="dense_2") )
    # Six output units, one per activity class.
    model.add( layers.Dense(6, activation="softmax", name="output") )

    model.compile(optimizer=tf.keras.optimizers.SGD(),
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    return model

cnn_lstm_model = CNN_LSTM_Model(X_train)
cnn_lstm_model.summary()

# patience=30 exceeds epochs=20, so early stopping cannot trigger in this run.
# X_test is also used as validation data, so val_accuracy here is test accuracy.
earlystopping = EarlyStopping(patience=30, monitor="val_accuracy", restore_best_weights=True)
cnn_lstm_history = cnn_lstm_model.fit( X_train,
                            y_train, epochs=20, callbacks = [earlystopping],
                            validation_data=(X_test, y_test) )
(561, 1)
Model: "CNN_Sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv_1 (Conv1D)             (None, 559, 64)           256       
                                                                 
 conv_2 (Conv1D)             (None, 557, 32)           6176      
                                                                 
 LSTM (LSTM)                 (None, 10)                1720      
                                                                 
 dense_1 (Dense)             (None, 64)                704       
                                                                 
 dense_2 (Dense)             (None, 128)               8320      
                                                                 
 output (Dense)              (None, 6)                 774       
                                                                 
=================================================================
Total params: 17,950
Trainable params: 17,950
Non-trainable params: 0
_________________________________________________________________
Epoch 1/20
230/230 [==============================] - 56s 234ms/step - loss: 1.7846 - accuracy: 0.1991 - val_loss: 1.7814 - val_accuracy: 0.2260
Epoch 2/20
230/230 [==============================] - 54s 233ms/step - loss: 1.7749 - accuracy: 0.2624 - val_loss: 1.7735 - val_accuracy: 0.2623
Epoch 3/20
230/230 [==============================] - 54s 233ms/step - loss: 1.7591 - accuracy: 0.2896 - val_loss: 1.7455 - val_accuracy: 0.2898
Epoch 4/20
230/230 [==============================] - 53s 231ms/step - loss: 1.7086 - accuracy: 0.3966 - val_loss: 1.7994 - val_accuracy: 0.1822
Epoch 5/20
230/230 [==============================] - 53s 229ms/step - loss: 1.7059 - accuracy: 0.3026 - val_loss: 1.7963 - val_accuracy: 0.1683
Epoch 6/20
230/230 [==============================] - 52s 228ms/step - loss: 1.7036 - accuracy: 0.2998 - val_loss: 1.7843 - val_accuracy: 0.1829
Epoch 7/20
230/230 [==============================] - 53s 229ms/step - loss: 1.7750 - accuracy: 0.1998 - val_loss: 1.7746 - val_accuracy: 0.1822
Epoch 8/20
230/230 [==============================] - 56s 245ms/step - loss: 1.7621 - accuracy: 0.2924 - val_loss: 1.7592 - val_accuracy: 0.2243
Epoch 9/20
230/230 [==============================] - 54s 234ms/step - loss: 1.7289 - accuracy: 0.3032 - val_loss: 1.7014 - val_accuracy: 0.4248
Epoch 10/20
230/230 [==============================] - 54s 237ms/step - loss: 1.6056 - accuracy: 0.4684 - val_loss: 1.4610 - val_accuracy: 0.5019
Epoch 11/20
230/230 [==============================] - 52s 228ms/step - loss: 1.3879 - accuracy: 0.4607 - val_loss: 1.2089 - val_accuracy: 0.5120
Epoch 12/20
230/230 [==============================] - 53s 231ms/step - loss: 1.0420 - accuracy: 0.5408 - val_loss: 0.9084 - val_accuracy: 0.5572
Epoch 13/20
230/230 [==============================] - 54s 233ms/step - loss: 0.9623 - accuracy: 0.5424 - val_loss: 0.8397 - val_accuracy: 0.5477
Epoch 14/20
230/230 [==============================] - 54s 233ms/step - loss: 0.8908 - accuracy: 0.5577 - val_loss: 1.5905 - val_accuracy: 0.3492
Epoch 15/20
230/230 [==============================] - 55s 240ms/step - loss: 1.1010 - accuracy: 0.5122 - val_loss: 1.2621 - val_accuracy: 0.4211
Epoch 16/20
230/230 [==============================] - 52s 227ms/step - loss: 0.8940 - accuracy: 0.5639 - val_loss: 0.7996 - val_accuracy: 0.5881
Epoch 17/20
230/230 [==============================] - 53s 232ms/step - loss: 0.8072 - accuracy: 0.5854 - val_loss: 0.7685 - val_accuracy: 0.6023
Epoch 18/20
230/230 [==============================] - 53s 232ms/step - loss: 0.7594 - accuracy: 0.6046 - val_loss: 0.7550 - val_accuracy: 0.5938
Epoch 19/20
230/230 [==============================] - 53s 231ms/step - loss: 0.7445 - accuracy: 0.6159 - val_loss: 0.7323 - val_accuracy: 0.6196
Epoch 20/20
230/230 [==============================] - 54s 237ms/step - loss: 0.7242 - accuracy: 0.6269 - val_loss: 0.7296 - val_accuracy: 0.6464
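
With `patience=30` and only 20 epochs, the `EarlyStopping` callback cannot trigger, so all 20 epochs run. The sketch below replays the `val_accuracy` values from the log above in plain Python to see where a smaller patience would have stopped. The helper `stop_epoch` is hypothetical and only approximates the Keras rule (stop once the monitored value has not improved for `patience` consecutive epochs, assuming the default `min_delta=0`); it is not the library's implementation.

```python
# val_accuracy per epoch, copied from the CNN + LSTM training log above.
val_accuracy = [
    0.2260, 0.2623, 0.2898, 0.1822, 0.1683, 0.1829, 0.1822, 0.2243, 0.4248, 0.5019,
    0.5120, 0.5572, 0.5477, 0.3492, 0.4211, 0.5881, 0.6023, 0.5938, 0.6196, 0.6464,
]

def stop_epoch(history, patience):
    """Return (epoch_stopped, best_epoch), both 1-based; hypothetical helper."""
    best, best_epoch, wait = float("-inf"), 0, 0
    for epoch, value in enumerate(history, start=1):
        if value > best:
            # Improvement: remember it and reset the patience counter.
            best, best_epoch, wait = value, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Ran out of epochs before patience was exhausted.
    return len(history), best_epoch

print(stop_epoch(val_accuracy, patience=30))
print(stop_epoch(val_accuracy, patience=5))
```

In this replay, `patience=30` runs all 20 epochs and the best epoch is the last one, whereas `patience=5` would stop at epoch 8, before validation accuracy starts to climb. Since accuracy was still rising at epoch 20, more epochs may help this model.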
In [ ]:
plot_metrics(cnn_lstm_history, "CNN+LSTM History")
lstm_pred = cnn_lstm_model.predict( X_test )
lstm_pred = np.argmax(lstm_pred, axis=1)
cf = metrics.confusion_matrix(y_test, lstm_pred)  # (y_true, y_pred): rows are true labels
sns.heatmap(cf, annot=True)
plt.show()
93/93 [==============================] - 4s 39ms/step
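
The parameter counts in the `CNN_Sequential` summary above can be reproduced by hand from the standard formulas for `Conv1D`, `LSTM`, and `Dense` layers with biases. The sketch below is plain Python arithmetic; the helper names are hypothetical, and the single input channel is taken from the printed input shape `(561, 1)`.

```python
def conv1d_params(kernel_size, in_channels, filters):
    # One kernel of size kernel_size x in_channels per filter, plus one bias per filter.
    return kernel_size * in_channels * filters + filters

def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights, and a bias.
    return 4 * ((input_dim + units) * units + units)

def dense_params(in_features, units):
    # Weight matrix plus one bias per unit.
    return in_features * units + units

layer_params = {
    "conv_1": conv1d_params(3, 1, 64),
    "conv_2": conv1d_params(3, 64, 32),
    "LSTM": lstm_params(32, 10),
    "dense_1": dense_params(10, 64),
    "dense_2": dense_params(64, 128),
    "output": dense_params(128, 6),
}
for name, count in layer_params.items():
    print(f"{name:8s} {count:6d}")
print("total", sum(layer_params.values()))
```

The per-layer values and the total of 17,950 match the summary. The output shapes also show that the LSTM steps through a 557-long sequence with 32 features per step, which is consistent with the much longer epoch time compared with the plain CNN.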

7. Comparing all models¶

In [ ]:
print('\n                     Accuracy     Error')
print('                     ----------   --------')
print('Logistic Regression : {:.04}%       {:.04}%'.format(log_reg_grid_results['accuracy'] * 100,\
                                                  100-(log_reg_grid_results['accuracy'] * 100)))

print('Linear SVC          : {:.04}%       {:.04}% '.format(lr_svc_grid_results['accuracy'] * 100,\
                                                        100-(lr_svc_grid_results['accuracy'] * 100)))

print('DecisionTree        : {:.04}%        {:.04}% '.format(dt_grid_results['accuracy'] * 100,\
                                                        100-(dt_grid_results['accuracy'] * 100)))

print('MLP                 : {:.04}%        {:.04}% '.format(metrics.accuracy_score(y_test,mlp_pred) * 100,\
                                                        100-(metrics.accuracy_score(y_test,mlp_pred) * 100)))
print('CNN                 : {:.04}%        {:.04}% '.format(metrics.accuracy_score(y_test,cnn_pred) * 100,\
                                                        100-(metrics.accuracy_score(y_test,cnn_pred) * 100)))
print('LSTM+CNN            : {:.04}%        {:.04}% '.format(metrics.accuracy_score(y_test,lstm_pred) * 100,\
                                                        100-(metrics.accuracy_score(y_test,lstm_pred) * 100)))
                                                             
                     Accuracy     Error
                     ----------   --------
Logistic Regression : 95.83%       4.174%
Linear SVC          : 96.67%       3.325% 
DecisionTree        : 87.07%        12.93% 
MLP                 : 95.72%        4.276% 
CNN                 : 95.32%        4.683% 
LSTM+CNN            : 64.64%        35.36% 
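
The six near-identical `print` calls above could be collapsed into a loop. A minimal sketch, assuming the accuracies have already been collected into a dictionary: `accuracies` and `format_row` are hypothetical names, and the values are the rounded percentages copied from the printed table rather than recomputed results.

```python
# Accuracies (in percent) copied from the printed table above.
accuracies = {
    "Logistic Regression": 95.83,
    "Linear SVC": 96.67,
    "DecisionTree": 87.07,
    "MLP": 95.72,
    "CNN": 95.32,
    "LSTM+CNN": 64.64,
}

def format_row(name, accuracy):
    """Format one table row with accuracy and error as percentages."""
    return f"{name:<20}: {accuracy:6.2f}%     {100 - accuracy:6.2f}%"

print(f"{'':<20}  Accuracy    Error")
# Sort from the most to the least accurate model.
for name, accuracy in sorted(accuracies.items(), key=lambda kv: kv[1], reverse=True):
    print(format_row(name, accuracy))

best_model = max(accuracies, key=accuracies.get)
print("Best:", best_model)
```

Fixed-width format specifiers keep the columns aligned regardless of the model name length.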

Conclusion:¶

We can choose Logistic Regression, Linear SVC, or MLP for this classification task. Linear SVC gives the highest test accuracy (96.67%), followed by Logistic Regression (95.83%) and MLP (95.72%).¶