Convolutional Neural Network (CNN) Project

Topic: Human Activity Recognition Using Smartphones

  • Project Team
    • MD Sarwar Zahan
    • Gernot Manfred Rischner

START

(Reset: deletes all variables; only run this cell if necessary.)

In [1]:
%reset
Once deleted, variables cannot be recovered. Proceed (y/[n])? y
In [2]:
import pandas as pd #pandas package to read data
import tensorflow as tf  #tensorflow
import numpy as np #numpy package
from scipy import stats
import matplotlib.pyplot as plt

Read training and test data

In [3]:
train_x=pd.read_csv("C:\MLProject\X_train.csv"); 
train_y=pd.read_csv(r'C:\MLProject\train\y_train.txt', delim_whitespace=True, header=None);
test_x=pd.read_csv("C:\MLProject\X_test.csv");
test_y=pd.read_csv(r'C:\MLProject\test\y_test.txt', delim_whitespace=True, header=None);
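
A quick sanity check after loading can catch path or delimiter problems early. This is a minimal sketch; with the standard UCI HAR split there are 7352 training and 2947 test samples (the exact frame shapes depend on how the CSV files were exported):

In [ ]:
print(train_x.shape, train_y.shape) #standard UCI HAR split: 7352 training rows
print(test_x.shape, test_y.shape)   #and 2947 test rows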

Select features

In [4]:
train_x=train_x.iloc[:,0:-2]; #select features (drop the last two columns) from the training data set
test_x=test_x.iloc[:,0:-2]; #select features (drop the last two columns) from the test data set
In [5]:
numberOfChannels=train_x.shape[1]
In [6]:
train_x=np.array(train_x); #convert training data to numpy array format for easier handling
train_y=np.array(train_y); #convert training labels to numpy array format for easier handling

test_x=np.array(test_x); #convert test data to numpy array format for easier handling
test_y=np.array(test_y); #convert test labels to numpy array format for easier handling

Convert the data set into a representation (format) that can be fed to the CNN. Here, the data is split into blocks of 30 instances each (30 was chosen because, analysing the dataset, the average number of instances per label is around 26.24, so we take a value slightly above that average). Since there are 561 features (561 channels), each carrying a 1-dimensional signal, each stacked block has the shape (1, 561, 30). These blocks are then stacked on top of each other. To assign an appropriate label to each block, the most frequent label within the block is chosen.

In [7]:
blockSize=30; #number of instances per block
blockTR=np.empty((0,numberOfChannels,blockSize)); #init empty block
blockTRLabel=np.empty((0)); #init empty label of block

start=0; #temporary starting index
end=0; #temporary ending index

for i in range(1,train_x.shape[0]): #loop through the data set (trailing rows that do not fill a complete block are discarded)
    if(i%blockSize==0): #if blockSize is reached, assign indices
        end=i;
        tempTR=train_x[start:end]; #temporary array with corresponding block size
        blockTR=np.vstack([blockTR,np.dstack(tempTR)]) #stack blocks
        blockTRLabel=np.append(blockTRLabel,stats.mode(train_y[start:end])[0][0]) #assign the most frequent
                                                                                  #label within the
                                                                                  #temporary block as the
                                                                                  #block label
        start=i;   
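
As an aside, the same blocking can be expressed without the explicit loop by trimming the row count to a multiple of blockSize and reshaping; np.dstack stacks each block to shape (1, numberOfChannels, blockSize), which the transpose below reproduces. This is only a sketch of an equivalent variant (blocksAlt/labelsAlt are hypothetical names; the loop above is what is actually used):

In [ ]:
nFull=(train_x.shape[0]//blockSize)*blockSize #drop the trailing partial block, as the loop does
blocksAlt=train_x[:nFull].reshape(-1,blockSize,numberOfChannels).transpose(0,2,1) #(nBlocks, 561, 30)
labelsAlt=np.array([stats.mode(train_y[s:s+blockSize])[0][0] for s in range(0,nFull,blockSize)]).ravel()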

The same procedure as above, applied to the test data set.

In [8]:
blockTS=np.empty((0,numberOfChannels,blockSize)); #init empty block
blockTSLabel=np.empty((0)); #init empty label of block

start=0; #temporary starting index
end=0; #temporary ending index

for i in range(1,test_x.shape[0]): #loop through the data set (trailing rows that do not fill a complete block are discarded)
    if(i%blockSize==0): #if blockSize is reached, assign indices
        end=i;
        tempTS=test_x[start:end]; #temporary array with corresponding block size
        blockTS=np.vstack([blockTS,np.dstack(tempTS)]) #stack blocks
        blockTSLabel=np.append(blockTSLabel,stats.mode(test_y[start:end])[0][0]) #assign the most frequent
                                                                                 #label within the
                                                                                 #temporary block as the
                                                                                 #block label
        start=i;   

Using get_dummies to obtain a proper one-hot label vector for each block. E.g., a block assigned label 1 becomes (1,0,0,0,0,0), since there are 6 different labels; likewise label 2 --> (0,1,0,0,0,0), label 3 --> (0,0,1,0,0,0), and so on. The blocks are also reshaped into the input format expected by the network.

In [9]:
blockTRLabel=np.asarray(pd.get_dummies(blockTRLabel), dtype = np.int8)
blockTR=blockTR.reshape(len(blockTR),1,blockSize,numberOfChannels)

blockTSLabel=np.asarray(pd.get_dummies(blockTSLabel), dtype = np.int8)
blockTS=blockTS.reshape(len(blockTS),1,blockSize,numberOfChannels)
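
As a toy illustration of the encoding (not part of the pipeline). One caveat worth noting: get_dummies only creates columns for labels that actually occur, so the one-hot width could differ between training and test sets if a label were missing from all blocks of one set.

In [ ]:
demo=np.asarray(pd.get_dummies(np.array([1.,2.,3.,1.])), dtype=np.int8)
print(demo) #[[1 0 0]
            # [0 1 0]
            # [0 0 1]
            # [1 0 0]]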

Assign the reformatted blocks to the training and test data set variables

In [10]:
train_x = blockTR
train_y = blockTRLabel
test_x = blockTS
test_y = blockTSLabel
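
A quick check of the resulting shapes. The run shown here ends up with 245 training and 98 test blocks (the test confusion matrix at the end of the notebook sums to 98):

In [ ]:
print(train_x.shape, train_y.shape) #expected: (245, 1, 30, 561) (245, 6)
print(test_x.shape, test_y.shape)   #expected: (98, 1, 30, 561) (98, 6)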

Define network parameters

In [11]:
kernelSize=10
numberOfLabels=6

batch_size = 15

depth=10
num_hidden=300

dropout=0.6

Define placeholders and the convolution and max pooling functions

Note: Execute the cell below when training the network again from scratch; otherwise a "variable already exists" error occurs. This resets the graph.

In [12]:
tf.reset_default_graph() #reset, if necessary
In [13]:
#Placeholders
x=tf.placeholder(tf.float32,shape=[None,1,blockSize,numberOfChannels]);
y=tf.placeholder(tf.float32,shape=[None,numberOfLabels]);
keep_prob=tf.placeholder(tf.float32);

def conv(x,W):
    return tf.nn.depthwise_conv2d(x,W,strides=[1,1,1,1], padding='VALID')

def max_pool(x):
    return tf.nn.max_pool(x, ksize=[1,1,4,1], strides=[1,1,2,1], padding='VALID')
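
Note that tf.nn.depthwise_conv2d convolves each input channel with its own set of filters instead of mixing channels: the output channel count is the input channel count times the channel multiplier (the last dimension of W). A minimal shape demonstration with dummy tensors (the demo names are hypothetical; this only adds constant nodes to the graph):

In [ ]:
xDemo=tf.zeros([2,1,blockSize,numberOfChannels])      #[batch, 1, width, channels]
wDemo=tf.zeros([1,kernelSize,numberOfChannels,depth]) #[1, kernel width, channels, multiplier]
print(conv(xDemo,wDemo).shape) #(2, 1, 21, 5610): width 30-10+1=21, channels 561*10=5610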

Building the Convolutional Neural Network Model

In [14]:
def cnn(x,dropout):
        
    #Define weights for the two convolutional layers using Xavier initializer
    weights={'w_convolution1': tf.get_variable(name='w_convolution1', shape=[1,kernelSize,numberOfChannels,depth],initializer=tf.contrib.layers.xavier_initializer()),
             'w_convolution2': tf.get_variable(name='w_convolution2', shape=[1,6,depth*numberOfChannels,depth//10],initializer=tf.contrib.layers.xavier_initializer())}
    
    
    #Define biases for the two convolutional layers using Zero initializer
    biases={'b_convolution1':tf.get_variable(name='b_convolution1', shape=[numberOfChannels*depth], initializer=tf.zeros_initializer()),
            'b_convolution2':tf.get_variable(name='b_convolution2', shape=[numberOfChannels*depth*(depth//10)], initializer=tf.zeros_initializer())}
    
    #Convolutional layer 1 with max pooling
    conv1=conv(x,weights['w_convolution1'])+biases['b_convolution1']
    conv1=max_pool(conv1)
    
    #Convolutional layer 2
    conv2=conv(conv1,weights['w_convolution2'])+biases['b_convolution2']
    
    #Get shape of convolutional layer 2 to correctly size the following layers (fully connected and output)
    shape=conv2.get_shape().as_list()
    
    #Define weights for the fully connected and the output layer
    weights={'w_fullyConnected': tf.get_variable(name='w_fullyConnected', shape=[shape[1]*shape[2]*depth*numberOfChannels*(depth//10),num_hidden],initializer=tf.contrib.layers.xavier_initializer()),
             'w_output': tf.get_variable(name='w_output', shape=[num_hidden,numberOfLabels],initializer=tf.contrib.layers.xavier_initializer())}
    
    
    #Define biases for the fully connected and the output layer
    biases={'b_fullyConnected':tf.get_variable(name='b_fullyConnected', shape=[num_hidden], initializer=tf.zeros_initializer()),
            'b_output':tf.get_variable(name='b_output', shape=[numberOfLabels], initializer=tf.zeros_initializer())}
    
    #Fully connected 
    fullyC=tf.reshape(conv2,[-1,shape[1]*shape[2]*shape[3]])
    fullyC=tf.nn.relu(tf.matmul(fullyC,weights['w_fullyConnected'])+biases['b_fullyConnected'])
    #DropOut
    fullyC=tf.nn.dropout(fullyC,dropout)
    
    #Output
    output=tf.matmul(fullyC,weights['w_output'])+biases['b_output']
    
    return output
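
For reference, tracing the tensor widths through the network with the parameters above: conv1 (VALID, kernel width 10) gives 30-10+1 = 21; max pooling (ksize 4, stride 2) gives (21-4)//2+1 = 9; conv2 (kernel width 6, channel multiplier depth//10 = 1) gives 9-6+1 = 4. The flattened input to the fully connected layer is therefore 1*4*561*10*1 = 22440, matching the w_fullyConnected shape. A quick arithmetic check (pure Python, no graph ops):

In [ ]:
w1=blockSize-kernelSize+1                    #conv1 output width: 21
w2=(w1-4)//2+1                               #after max pooling (ksize 4, stride 2): 9
w3=w2-6+1                                    #conv2 output width (kernel width 6): 4
flat=1*w3*numberOfChannels*depth*(depth//10) #flattened size: 22440
print(w1,w2,w3,flat)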
    

Train Neural Network

Note: the train_neural_network() function is adapted from:

pythonprogramming.net/tensorflow-neural-network-session-machine-learning-tutorial

In [15]:
lr=0.0001 #learning rate
In [16]:
def train_neural_network(x):
    
    prediction = cnn(x,keep_prob)
    cost = tf.reduce_mean( tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=y))
    optimizer = tf.train.AdamOptimizer(learning_rate=lr).minimize(cost)

    hm_epochs = 100
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())

        for epoch in range(hm_epochs):
            epoch_loss = 0
            for b in range(int(len(train_y)/batch_size)):

                offset = (b * batch_size) % (train_y.shape[0] - batch_size)
                epoch_x = train_x[offset:(offset + batch_size), :, :, :]
                epoch_y = train_y[offset:(offset + batch_size), :]

                _, c = sess.run([optimizer, cost], feed_dict={x: epoch_x, y: epoch_y, keep_prob: dropout})
                epoch_loss += c

            correct = tf.equal(tf.argmax(prediction, 1), tf.argmax(y, 1))
            accuracyTR = tf.reduce_mean(tf.cast(correct, 'float'))  
            
            print('Epoch', epoch+1, 'completed out of',hm_epochs,'loss:',epoch_loss, 'training accuracy:',accuracyTR.eval({x:train_x, y:train_y, keep_prob: 1.0}))
        
        accuracyTR = tf.reduce_mean(tf.cast(correct, 'float'))
        accuracyTR = accuracyTR.eval({x:train_x, y:train_y, keep_prob: 1.0})
        
        accuracyTS = tf.reduce_mean(tf.cast(correct, 'float'))
        accuracyTS = accuracyTS.eval({x:test_x, y:test_y, keep_prob: 1.0})   
        
        confusionMatrix=tf.contrib.metrics.confusion_matrix(tf.argmax(y, 1),tf.argmax(prediction, 1),dtype=tf.float64)
        confusionMatrix=confusionMatrix.eval({x:test_x, y:test_y, keep_prob: 1.0})
        
        print('Training finished')
        
    return [confusionMatrix, accuracyTR, accuracyTS]
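
One possible refinement, not used for the results below: the batch loop always visits the blocks in the same order, so one could shuffle the block order at the start of each epoch. A sketch of the permutation step (it would go at the top of the epoch loop; the variable names are hypothetical):

In [ ]:
perm=np.random.permutation(len(train_y)) #random block order for one epoch
shuffled_x=train_x[perm]                 #slice epoch_x/epoch_y from these
shuffled_y=train_y[perm]                 #instead of from train_x/train_y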

Start training

(The function prints the loss and training accuracy for each epoch, and returns the training accuracy, the test accuracy, and the confusion matrix for the test data set.)

In [17]:
[cm, acctr, accts]=train_neural_network(x)
Epoch 1 completed out of 100 loss: 28.5563571453 training accuracy: 0.338776
Epoch 2 completed out of 100 loss: 28.2576886415 training accuracy: 0.37551
Epoch 3 completed out of 100 loss: 27.846467495 training accuracy: 0.363265
Epoch 4 completed out of 100 loss: 27.1061726809 training accuracy: 0.404082
Epoch 5 completed out of 100 loss: 26.1132080555 training accuracy: 0.412245
Epoch 6 completed out of 100 loss: 24.6861559153 training accuracy: 0.408163
Epoch 7 completed out of 100 loss: 22.8742803335 training accuracy: 0.436735
Epoch 8 completed out of 100 loss: 21.5710002184 training accuracy: 0.538776
Epoch 9 completed out of 100 loss: 20.3828172684 training accuracy: 0.563265
Epoch 10 completed out of 100 loss: 19.2065564394 training accuracy: 0.530612
Epoch 11 completed out of 100 loss: 18.5754448175 training accuracy: 0.616327
Epoch 12 completed out of 100 loss: 17.4660608768 training accuracy: 0.522449
Epoch 13 completed out of 100 loss: 17.7705532312 training accuracy: 0.677551
Epoch 14 completed out of 100 loss: 16.8120619059 training accuracy: 0.673469
Epoch 15 completed out of 100 loss: 15.8460396528 training accuracy: 0.710204
Epoch 16 completed out of 100 loss: 15.5749290586 training accuracy: 0.722449
Epoch 17 completed out of 100 loss: 15.4605789781 training accuracy: 0.730612
Epoch 18 completed out of 100 loss: 15.125351727 training accuracy: 0.742857
Epoch 19 completed out of 100 loss: 14.6230607033 training accuracy: 0.734694
Epoch 20 completed out of 100 loss: 13.8114423156 training accuracy: 0.75102
Epoch 21 completed out of 100 loss: 12.8250427842 training accuracy: 0.759184
Epoch 22 completed out of 100 loss: 13.1212670803 training accuracy: 0.771429
Epoch 23 completed out of 100 loss: 12.9606990218 training accuracy: 0.763265
Epoch 24 completed out of 100 loss: 12.6447166204 training accuracy: 0.795918
Epoch 25 completed out of 100 loss: 11.8704676628 training accuracy: 0.836735
Epoch 26 completed out of 100 loss: 11.6300948262 training accuracy: 0.795918
Epoch 27 completed out of 100 loss: 11.7334757447 training accuracy: 0.808163
Epoch 28 completed out of 100 loss: 11.1273963451 training accuracy: 0.832653
Epoch 29 completed out of 100 loss: 10.7012423873 training accuracy: 0.844898
Epoch 30 completed out of 100 loss: 10.2186184824 training accuracy: 0.861224
Epoch 31 completed out of 100 loss: 10.1265770197 training accuracy: 0.861224
Epoch 32 completed out of 100 loss: 10.013394773 training accuracy: 0.84898
Epoch 33 completed out of 100 loss: 9.3038610518 training accuracy: 0.861224
Epoch 34 completed out of 100 loss: 9.3766593039 training accuracy: 0.869388
Epoch 35 completed out of 100 loss: 9.09638950229 training accuracy: 0.865306
Epoch 36 completed out of 100 loss: 7.97212132812 training accuracy: 0.881633
Epoch 37 completed out of 100 loss: 8.53071698546 training accuracy: 0.889796
Epoch 38 completed out of 100 loss: 8.10107675195 training accuracy: 0.877551
Epoch 39 completed out of 100 loss: 7.7358109653 training accuracy: 0.893878
Epoch 40 completed out of 100 loss: 7.5514164865 training accuracy: 0.893878
Epoch 41 completed out of 100 loss: 7.90095111728 training accuracy: 0.889796
Epoch 42 completed out of 100 loss: 7.03844767809 training accuracy: 0.893878
Epoch 43 completed out of 100 loss: 6.8074709177 training accuracy: 0.922449
Epoch 44 completed out of 100 loss: 6.65762844682 training accuracy: 0.893878
Epoch 45 completed out of 100 loss: 6.3584318459 training accuracy: 0.906122
Epoch 46 completed out of 100 loss: 6.72536215186 training accuracy: 0.914286
Epoch 47 completed out of 100 loss: 6.21539211273 training accuracy: 0.906122
Epoch 48 completed out of 100 loss: 5.73747712374 training accuracy: 0.934694
Epoch 49 completed out of 100 loss: 5.68550211191 training accuracy: 0.926531
Epoch 50 completed out of 100 loss: 5.27773764729 training accuracy: 0.922449
Epoch 51 completed out of 100 loss: 5.49270182848 training accuracy: 0.926531
Epoch 52 completed out of 100 loss: 5.51763266325 training accuracy: 0.934694
Epoch 53 completed out of 100 loss: 5.40744170547 training accuracy: 0.934694
Epoch 54 completed out of 100 loss: 4.95277041197 training accuracy: 0.946939
Epoch 55 completed out of 100 loss: 5.50052534044 training accuracy: 0.934694
Epoch 56 completed out of 100 loss: 4.87795056403 training accuracy: 0.955102
Epoch 57 completed out of 100 loss: 4.34008590877 training accuracy: 0.938776
Epoch 58 completed out of 100 loss: 4.76628628373 training accuracy: 0.963265
Epoch 59 completed out of 100 loss: 4.19583055377 training accuracy: 0.946939
Epoch 60 completed out of 100 loss: 4.09219875932 training accuracy: 0.959184
Epoch 61 completed out of 100 loss: 4.68889035285 training accuracy: 0.959184
Epoch 62 completed out of 100 loss: 4.40831747651 training accuracy: 0.955102
Epoch 63 completed out of 100 loss: 3.85540105402 training accuracy: 0.95102
Epoch 64 completed out of 100 loss: 3.81341065466 training accuracy: 0.963265
Epoch 65 completed out of 100 loss: 3.99766527116 training accuracy: 0.967347
Epoch 66 completed out of 100 loss: 3.79050111771 training accuracy: 0.963265
Epoch 67 completed out of 100 loss: 3.66843527555 training accuracy: 0.963265
Epoch 68 completed out of 100 loss: 3.54980930686 training accuracy: 0.967347
Epoch 69 completed out of 100 loss: 3.06769222766 training accuracy: 0.967347
Epoch 70 completed out of 100 loss: 3.22222816199 training accuracy: 0.967347
Epoch 71 completed out of 100 loss: 3.21366889775 training accuracy: 0.97551
Epoch 72 completed out of 100 loss: 2.92249038815 training accuracy: 0.979592
Epoch 73 completed out of 100 loss: 2.93818256259 training accuracy: 0.97551
Epoch 74 completed out of 100 loss: 2.84807192534 training accuracy: 0.971429
Epoch 75 completed out of 100 loss: 2.63349810988 training accuracy: 0.979592
Epoch 76 completed out of 100 loss: 2.85458848625 training accuracy: 0.979592
Epoch 77 completed out of 100 loss: 2.8957022205 training accuracy: 0.987755
Epoch 78 completed out of 100 loss: 2.47353527695 training accuracy: 0.971429
Epoch 79 completed out of 100 loss: 2.51578572392 training accuracy: 0.991837
Epoch 80 completed out of 100 loss: 2.51080146432 training accuracy: 0.987755
Epoch 81 completed out of 100 loss: 2.06942618638 training accuracy: 0.987755
Epoch 82 completed out of 100 loss: 2.23707829788 training accuracy: 0.987755
Epoch 83 completed out of 100 loss: 2.31471212208 training accuracy: 0.991837
Epoch 84 completed out of 100 loss: 2.32782707363 training accuracy: 0.991837
Epoch 85 completed out of 100 loss: 2.11623847485 training accuracy: 0.995918
Epoch 86 completed out of 100 loss: 2.11636010185 training accuracy: 0.991837
Epoch 87 completed out of 100 loss: 1.77151628211 training accuracy: 0.995918
Epoch 88 completed out of 100 loss: 1.79175199568 training accuracy: 0.995918
Epoch 89 completed out of 100 loss: 1.96249528974 training accuracy: 0.995918
Epoch 90 completed out of 100 loss: 2.02624561638 training accuracy: 0.991837
Epoch 91 completed out of 100 loss: 1.83186802268 training accuracy: 0.995918
Epoch 92 completed out of 100 loss: 1.85338064656 training accuracy: 0.995918
Epoch 93 completed out of 100 loss: 1.58550003916 training accuracy: 0.995918
Epoch 94 completed out of 100 loss: 1.58713048697 training accuracy: 0.995918
Epoch 95 completed out of 100 loss: 1.49772232026 training accuracy: 0.995918
Epoch 96 completed out of 100 loss: 1.48469721153 training accuracy: 0.995918
Epoch 97 completed out of 100 loss: 1.53948107734 training accuracy: 0.995918
Epoch 98 completed out of 100 loss: 1.49332809076 training accuracy: 0.995918
Epoch 99 completed out of 100 loss: 1.31786047667 training accuracy: 0.995918
Epoch 100 completed out of 100 loss: 1.33247478679 training accuracy: 0.995918
Training finished

Results for the testing data

Accuracy and Confusion Matrix

In [18]:
print('Training Accuracy:',acctr)
print('')
print('Testing Accuracy:',accts)
print('')
print("Confusion Matrix:")
print('')
print(cm)
print('')
normalised_confusion_matrix = (np.array(cm, dtype=np.float32)/(np.sum(cm)))*100

print("Confusion matrix (normalised to % of total test data):")
print('')
print(normalised_confusion_matrix)

LABELS = [
    "WALKING", 
    "WALKING_UPSTAIRS", 
    "WALKING_DOWNSTAIRS", 
    "SITTING", 
    "STANDING", 
    "LAYING"
] 

# Plot Results: 
width = 10
height = 10
plt.figure(figsize=(width, height))

plt.imshow(
    normalised_confusion_matrix, 
    interpolation='nearest', 
    cmap=plt.cm.Blues
)

plt.title("Confusion matrix \n(normalised to % of total test data)")
plt.colorbar()

tick_marks = np.arange(numberOfLabels)

plt.xticks(tick_marks, LABELS, rotation=90)
plt.yticks(tick_marks, LABELS)

plt.ylabel('True label')
plt.xlabel('Predicted label')
plt.show()
Training Accuracy: 0.995918

Testing Accuracy: 0.959184

Confusion Matrix:

[[ 18.   0.   1.   0.   0.   0.]
 [  1.  18.   0.   0.   0.   0.]
 [  1.   1.   7.   0.   0.   0.]
 [  0.   0.   0.  16.   0.   0.]
 [  0.   0.   0.   0.  17.   0.]
 [  0.   0.   0.   0.   0.  18.]]

Confusion matrix (normalised to % of total test data):

[[ 18.36734772   0.           1.02040815   0.           0.           0.        ]
 [  1.02040815  18.36734772   0.           0.           0.           0.        ]
 [  1.02040815   1.02040815   7.14285755   0.           0.           0.        ]
 [  0.           0.           0.          16.32653046   0.           0.        ]
 [  0.           0.           0.           0.          17.34693909   0.        ]
 [  0.           0.           0.           0.           0.          18.36734772]]
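
The normalisation above divides by the total number of test blocks. An alternative view normalises each row by its class count, giving the per-class recall on the diagonal. A sketch using the cm returned above (every class has at least one test block here, so there is no division by zero):

In [ ]:
row_sums=cm.sum(axis=1,keepdims=True) #number of test blocks per true class
per_class=cm/row_sums                 #entry (i,j): fraction of true class i predicted as j
print(per_class.diagonal())           #per-class recall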