Work in progress
1.0 Titanic Dataset [Code]
2.0 Women Chess Dataset [Code]
3.0 German Credit Dataset: Overfitted Model [Code]
Learning goal:
- Prepare the data and NN model in such a way that the model is prone to overfitting (a sketch follows this list).
- One-hot encode all categorical columns to increase the number of feature columns.
- Increase model complexity (more dense layers, more nodes in each layer).
- Train for a large number of epochs.
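A minimal sketch of this setup, matching the summary below (the label column name `risk`, the activations, and the epoch count are assumptions):

```python
import pandas as pd
import tensorflow as tf

# `df` is assumed to be the German Credit DataFrame with a label column
# named "risk" (column names here are illustrative).
X = pd.get_dummies(df.drop(columns=["risk"])).astype("float32")  # one-hot encode all categoricals
y = pd.get_dummies(df["risk"]).astype("float32")                 # 2-class one-hot target

inputs = tf.keras.Input(shape=(X.shape[1],))  # 61 features after encoding
x = tf.keras.layers.Dense(64, activation="relu")(inputs)
x = tf.keras.layers.Dense(32, activation="relu")(x)
x = tf.keras.layers.Dense(16, activation="relu")(x)
x = tf.keras.layers.Dense(8, activation="relu")(x)
outputs = tf.keras.layers.Dense(2, activation="softmax")(x)

model = tf.keras.Model(inputs, outputs, name="german_credit_model_overfit")
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=500, validation_split=0.2)  # many epochs to force overfitting
```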
Model: "german_credit_model_overfit"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 61)] 0
dense (Dense) (None, 64) 3968
dense_1 (Dense) (None, 32) 2080
dense_2 (Dense) (None, 16) 528
dense_3 (Dense) (None, 8) 136
dense_4 (Dense) (None, 2) 18
=================================================================
Total params: 6,730
Trainable params: 6,730
Non-trainable params: 0
_________________________________________________________________
1.0 CIFAR10 [Code]
Model: "Model4"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_12 (Conv2D)             (None, 32, 32, 64)       1792
conv2d_13 (Conv2D)             (None, 32, 32, 64)       36928
max_pooling2d_7 (MaxPooling2D) (None, 16, 16, 64)       0
conv2d_14 (Conv2D)             (None, 16, 16, 32)       18464
conv2d_15 (Conv2D)             (None, 16, 16, 32)       9248
max_pooling2d_8 (MaxPooling2D) (None, 8, 8, 32)         0
conv2d_16 (Conv2D)             (None, 8, 8, 16)         4624
conv2d_17 (Conv2D)             (None, 8, 8, 16)         2320
max_pooling2d_9 (MaxPooling2D) (None, 4, 4, 16)         0
flatten_3 (Flatten)            (None, 256)              0
dense_9 (Dense)                (None, 32)               8224
dropout (Dropout)              (None, 32)               0
dense_10 (Dense)               (None, 10)               330
=================================================================
Total params: 81,930
Trainable params: 81,930
Non-trainable params: 0
_________________________________________________________________
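For reference, a Sequential sketch that reproduces the summary above (3×3 kernels with `padding='same'` are inferred from the unchanged spatial dimensions; the activations and dropout rate are assumptions):

```python
import tensorflow as tf

# Reconstruction of "Model4" from the summary; hyperparameters not visible
# in the summary (activations, dropout rate) are assumptions.
model4 = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(64, 3, padding="same", activation="relu"),
    tf.keras.layers.Conv2D(64, 3, padding="same", activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu"),
    tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(16, 3, padding="same", activation="relu"),
    tf.keras.layers.Conv2D(16, 3, padding="same", activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dropout(0.5),  # rate assumed
    tf.keras.layers.Dense(10, activation="softmax"),
], name="Model4")
```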
2.0 Bread [Code] ![Open in Streamlit](https://camo.githubusercontent.com/492dc2cc5c894cc5c245075202d323f30d70821f2106b40543be4a2cff98d347/68747470733a2f2f7374617469632e73747265616d6c69742e696f2f6261646765732f73747265616d6c69745f62616467655f626c61636b5f77686974652e737667)
Learning goal:
Create and load a dataset and build a CNN model for "good" vs. "moldy" bread image classification.
- 300 images per class were bulk-downloaded using the imageye Google Chrome extension.
- Three baseline models were built with the following architectures:
  - Model 1: Input + Batch Normalization layer + 3 convolutional layers + 1 dense layer
  - Model 2: Input + Batch Normalization layer + 4 convolutional layers + 1 dense layer
  - Model 3: Input + Batch Normalization layer + 5 convolutional layers + 1 dense layer
- All models were trained for 10 epochs.
- The best accuracy score (88%) was obtained by Model 3.
- Model 3's summary is shown below, followed by a code sketch.
Model: "sequential_2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
batch_normalization_2 (BatchNormalization) (None, 180, 180, 3) 12
conv2d_7 (Conv2D)               (None, 178, 178, 128)   3584
max_pooling2d_7 (MaxPooling2D)  (None, 89, 89, 128)     0
conv2d_8 (Conv2D)               (None, 87, 87, 64)      73792
max_pooling2d_8 (MaxPooling2D)  (None, 43, 43, 64)      0
conv2d_9 (Conv2D)               (None, 41, 41, 32)      18464
max_pooling2d_9 (MaxPooling2D)  (None, 20, 20, 32)      0
conv2d_10 (Conv2D)              (None, 18, 18, 16)      4624
max_pooling2d_10 (MaxPooling2D) (None, 9, 9, 16)        0
conv2d_11 (Conv2D)              (None, 7, 7, 8)         1160
max_pooling2d_11 (MaxPooling2D) (None, 3, 3, 8)         0
flatten_2 (Flatten) (None, 72) 0
dense_4 (Dense) (None, 8) 584
dense_5 (Dense) (None, 2) 18
=================================================================
Total params: 102,238
Trainable params: 102,232
Non-trainable params: 6
_________________________________________________________________
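A minimal sketch of Model 3, matching the summary above (3×3 kernels with default `'valid'` padding are inferred from the shrinking spatial dimensions; the activations and the data-loading call are assumptions):

```python
import tensorflow as tf

# Assumed directory layout: bread/{good,moldy}/*.jpg
train_ds = tf.keras.utils.image_dataset_from_directory(
    "bread", image_size=(180, 180), label_mode="categorical")

model3 = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(180, 180, 3)),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Conv2D(128, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),
])
model3.compile(optimizer="adam", loss="categorical_crossentropy",
               metrics=["accuracy"])
model3.fit(train_ds, epochs=10)
```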
1.0 Text Generation [Code]
Learning goal:
Create a text generation model using LSTMs. Several LSTM architectures were explored; the best model can be observed in the model summary below and in the attached [code].
- Dataset: Peter Pan [Source]
- Model architecture: 3 stacked LSTM layers with 0.1 dropout (a sketch follows this list)
- Preprocessing techniques:
  - Remove special characters and whitespace
  - Encode characters to a numerical representation
  - Normalize (rescale the encoded values to the range 0 to 1)
- Best accuracy score obtained was 71%.
- Possible improvement:
  - The model currently produces repetitive, gibberish text output. A possible improvement would be to reduce model complexity and lower the learning rate.
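A minimal sketch of the preprocessing and model, assuming the cleaned book text is in a string `text` (the window length of 200, the 255 LSTM units, and the 34-character vocabulary are read from the summary; everything else is an assumption):

```python
import numpy as np
import tensorflow as tf

chars = sorted(set(text))                        # 34 distinct characters per the summary
char_to_idx = {c: i for i, c in enumerate(chars)}
encoded = np.array([char_to_idx[c] for c in text], dtype=np.float32)
encoded /= len(chars) - 1                        # normalize to the range [0, 1]

seq_len = 200
X = np.array([encoded[i:i + seq_len] for i in range(len(text) - seq_len)])
y = np.array([char_to_idx[text[i + seq_len]] for i in range(len(text) - seq_len)])
X = X[..., np.newaxis]                           # shape: (samples, 200, 1)

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(255, return_sequences=True, dropout=0.1,
                         input_shape=(seq_len, 1)),
    tf.keras.layers.LSTM(255, return_sequences=True, dropout=0.1),
    tf.keras.layers.LSTM(255, dropout=0.1),
    tf.keras.layers.Dense(len(chars), activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```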
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm (LSTM) (None, 200, 255) 262140
_________________________________________________________________
lstm_1 (LSTM) (None, 200, 255) 521220
_________________________________________________________________
lstm_2 (LSTM) (None, 255) 521220
_________________________________________________________________
dense (Dense) (None, 34) 8704
=================================================================
Total params: 1,313,284
Trainable params: 1,313,284
Non-trainable params: 0
_________________________________________________________________
2.0 Text Classification [Code]
Learning goal:
Classify intents from text input into 4 categories [BookRestaurant, GetWeather, PlayMusic, RateBook]. For a chatbot to give an appropriate response, it first needs to correctly classify the user's intent, which is why an intent classification model is usually implemented as the first stage of most chatbot pipelines.
- Preprocessing methods: 5 models were built to investigate 5 different preprocessing methods, all using the same model architecture (shown in the summary below); a sketch of the variants follows this list:
  - `ori`: Training set with original input
  - `norm`: Training set with normalized input
  - `norm_lemma`: Training set with normalized + lemmatized input
  - `norm_stem`: Training set with normalized + stemmed input
  - `norm_stopword`: Training set with normalized + stopword-removed input
- Interestingly, all models achieved 99% accuracy after the 2nd epoch. This is possibly because of the limited size of the dataset.
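A sketch of the five preprocessing variants using NLTK (the library choice and function names here are assumptions, not necessarily what the original code uses):

```python
import re
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer

nltk.download("stopwords")
nltk.download("wordnet")
stemmer, lemmatizer = PorterStemmer(), WordNetLemmatizer()
stop_words = set(stopwords.words("english"))

def norm(text):
    # Lowercase and strip non-alphanumeric characters.
    return re.sub(r"[^a-z0-9 ]", "", text.lower())

def norm_lemma(text):
    return " ".join(lemmatizer.lemmatize(w) for w in norm(text).split())

def norm_stem(text):
    return " ".join(stemmer.stem(w) for w in norm(text).split())

def norm_stopword(text):
    return " ".join(w for w in norm(text).split() if w not in stop_words)
```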
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) (None, 1186, 32) 240704
lstm (LSTM) (None, 100) 53200
dropout (Dropout) (None, 100) 0
dense (Dense) (None, 4) 404
=================================================================
Total params: 294,308
Trainable params: 294,308
Non-trainable params: 0
_________________________________________________________________
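For reference, a reconstruction of the shared architecture (the vocabulary size of 7,522 and sequence length of 1,186 are implied by the embedding layer's output shape and parameter count; the dropout rate and loss are assumptions):

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=7522, output_dim=32, input_length=1186),
    tf.keras.layers.LSTM(100),
    tf.keras.layers.Dropout(0.5),                     # rate assumed
    tf.keras.layers.Dense(4, activation="softmax"),   # one unit per intent
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```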
1.0 VGG16 [Code]
Learning goal:
Utilize and fine-tune an existing pre-trained model on a different dataset.
- Base model: VGG16 from TensorFlow.
- Dataset: Multiclass image classification dataset from Kaggle.
- `tf.keras.layers.GlobalAveragePooling2D()` and `tf.keras.layers.Dense(len(class_names), activation='softmax')` layers were added after the base model layer.
- The model is first fitted with the base model frozen (`base_model.trainable = False`), then unfrozen during fine-tuning (see the sketch after this list).
- Final validation accuracy during initial training: 0.8842; validation accuracy during fine-tuning: 0.8924.
- Test accuracy: 0.9196.
- Model summary is shown below.
- Possible improvement:
  - Add more data augmentation layers for image preprocessing (e.g. cropping, resizing, zoom), since the model is most likely to mispredict on images where the animal is small or far away in the background.
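A minimal sketch of the two training phases, assuming `train_ds`/`val_ds` were created with `tf.keras.utils.image_dataset_from_directory` (with one-hot labels) and that `class_names` holds the 4 class labels; the augmentation transforms, dropout rate, epochs, and learning rates are assumptions:

```python
import tensorflow as tf

base_model = tf.keras.applications.VGG16(
    input_shape=(160, 160, 3), include_top=False, weights="imagenet")
base_model.trainable = False  # freeze the base for the initial training phase

augment = tf.keras.Sequential([        # the "sequential" layer in the summary;
    tf.keras.layers.RandomFlip("horizontal"),  # exact transforms assumed
    tf.keras.layers.RandomRotation(0.1),
])

inputs = tf.keras.Input(shape=(160, 160, 3))
x = augment(inputs)
x = tf.keras.applications.vgg16.preprocess_input(x)  # appears as the
# getitem/bias_add lambda layers in the summary
x = base_model(x, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dropout(0.2)(x)  # rate assumed
outputs = tf.keras.layers.Dense(len(class_names), activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)

# Phase 1: train only the new head.
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=10)

# Phase 2: unfreeze the base and fine-tune with a low learning rate.
base_model.trainable = True
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=10)
```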
Model: "model_4"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_10 (InputLayer) [(None, 160, 160, 3)] 0
_________________________________________________________________
sequential (Sequential) (None, 160, 160, 3) 0
_________________________________________________________________
tf.__operators__.getitem_4 (SlicingOpLambda) (None, 160, 160, 3) 0
_________________________________________________________________
tf.nn.bias_add_4 (TFOpLambda) (None, 160, 160, 3)      0
_________________________________________________________________
vgg16 (Functional)           (None, 5, 5, 512)         14714688
_________________________________________________________________
global_average_pooling2d_5 (GlobalAveragePooling2D) (None, 512) 0
_________________________________________________________________
dropout_4 (Dropout) (None, 512) 0
_________________________________________________________________
dense_6 (Dense) (None, 4) 2052
=================================================================
Total params: 14,716,740
Trainable params: 2,052
Non-trainable params: 14,714,688
_________________________________________________________________