Submitted by mikef0x t3_ylrngf in MachineLearning
[removed]
I am a beginner at ML. So it would be like this?
    from tensorflow.keras import layers, models

    model = models.Sequential()
    model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(64, (3, 3), activation='relu'))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(64, (3, 3), activation='relu'))
The general idea is to start with wider layers and proceed to narrower ones, so the network starts off wide and the size of the tensors gradually reduces.
So maybe you can do:
Conv2D(64) -> MaxPooling -> Conv2D(32) -> MaxPooling -> Conv2D(8) -> MaxPooling -> Dense(relu) [Optional] -> Dense(softmax)
Start with one Conv layer first and see if there are any improvements. Gradually add layers. If adding layers isn't giving you much of an improvement in accuracy, I'd recommend checking your data.
Making the network too deep (too many layers) might result in overfitting. So the network architecture/size is also a hyperparameter that has an optimal value for the dataset. Do post an update on how it goes.
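For illustration, here is a rough Keras sketch of treating depth as a hyperparameter, so you can start with one Conv block and add more. The build_model helper, filter counts, and compile settings are made-up examples (the 124x124x3 input shape and 4 classes just mirror the model posted below), not something prescribed in this thread:

    import tensorflow as tf

    def build_model(num_blocks, input_shape=(124, 124, 3), num_classes=4):
        # Hypothetical helper: stack num_blocks Conv+MaxPool blocks, widest first.
        model = tf.keras.models.Sequential()
        model.add(tf.keras.layers.Input(shape=input_shape))
        filters = 64
        for _ in range(num_blocks):
            model.add(tf.keras.layers.Conv2D(filters, (3, 3), activation='relu'))
            model.add(tf.keras.layers.MaxPooling2D((2, 2)))
            filters = max(filters // 2, 8)  # narrow the layers as the network gets deeper
        model.add(tf.keras.layers.Flatten())
        model.add(tf.keras.layers.Dense(num_classes, activation='softmax'))
        return model

    # Compare 1, 2, 3... blocks and keep the depth with the best validation accuracy.
    model = build_model(num_blocks=3)
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',  # assumes integer class labels
                  metrics=['accuracy'])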
    import tensorflow as tf

    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.Conv2D(64, (3, 3), input_shape=(124, 124, 3)))
    model.add(tf.keras.layers.MaxPooling2D((2, 2)))
    model.add(tf.keras.layers.Conv2D(32, (3, 3)))
    model.add(tf.keras.layers.MaxPooling2D((2, 2)))
    model.add(tf.keras.layers.Conv2D(8, (3, 3)))
    model.add(tf.keras.layers.Flatten())
    model.add(tf.keras.layers.Dense(4, activation='softmax'))
So, I've done it like this. The loss is low now but the accuracy is too high :D I mean, on epoch 10 it's around 0.99.

Update: at epoch 20, the accuracy is 1.
Haha. It might be overfitting now. How does it perform on the test set? If the accuracy is >90% on the test set, I would think it's a good model.
How does it perform on the test set? If accuracy is bad on the test data, you would have to reduce the Conv layers and see if you can get more data.
Can you post the train vs test loss and accuracy here?
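In case it's useful, a minimal sketch of how to pull those numbers out of Keras, assuming the model was compiled with metrics=['accuracy']; the x_train/y_train/x_test/y_test names and the validation split are just placeholders for your own data:

    # Track train vs validation metrics during training.
    history = model.fit(x_train, y_train,
                        epochs=20,
                        validation_split=0.2)  # or validation_data=(x_val, y_val)

    print("final train loss/acc:", history.history['loss'][-1], history.history['accuracy'][-1])
    print("final val loss/acc:", history.history['val_loss'][-1], history.history['val_accuracy'][-1])

    # Evaluate once on the held-out test set.
    test_loss, test_acc = model.evaluate(x_test, y_test)
    print("test loss/acc:", test_loss, test_acc)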
Okay thanks
Blaze is right, CNNs are the way to go. You might even get away with a fully connected model, but you would need billions of images and quite a large model to reach results similar to what a much smaller CNN with a much smaller dataset can achieve.
Btw, during the last epochs the val loss is oscillating, meaning the learning rate is too large at that point.
Can it also mean the learning rate is too small, because it could be trapped in a local minimum?
If it were trapped in a local minimum (vanishing gradient), the loss would hardly change at all rather than oscillate.
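If it helps, one common way to handle the oscillation in Keras is to shrink the learning rate automatically when the val loss stops improving. A minimal sketch (the factor/patience values are arbitrary examples, and x_train/y_train are placeholders):

    import tensorflow as tf

    # Halve the learning rate whenever val_loss has not improved for 3 epochs.
    reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss',
                                                     factor=0.5,
                                                     patience=3,
                                                     min_lr=1e-6)

    history = model.fit(x_train, y_train,
                        epochs=20,
                        validation_split=0.2,
                        callbacks=[reduce_lr])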
BlazeObsidian t1_iv006oo wrote
The network is a little simple for the task at hand. Consider using Conv layers and a deeper network. Doesn't have to be very large.
Conv -> Conv -> Flatten -> Dense (relu) -> Dense (softmax) might outperform the current network.
As a general approach, CNNs are very effective with image data. Passing the CNN output to Dense layers helps adapt the image understanding to the task at hand.