like_a_tensor t1_j6q5cs1 wrote on February 1, 2023 at 2:32 AM

Are you implementing the CNN from scratch? If so, the problem might be in your implementation.

Play with the batch size and batch norm. Try different optimizers. Your learning rate might also be too large; experiment with smaller learning rates or something like torch's ReduceLROnPlateau.

5500 samples is also pretty small, so maybe try a shallower network.
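For reference, the ReduceLROnPlateau suggestion could be wired in roughly as follows. This is a minimal sketch, not the OP's code: the tiny `nn.Linear` model, the random data, the `factor` and `patience` values, and the constant `val_loss` are all placeholders chosen only to show the mechanics. The scheduler is stepped once per epoch with the monitored metric and lowers the learning rate when that metric stops improving.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Linear(10, 1)  # placeholder model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# Halve the learning rate once the monitored loss has not improved
# for more than `patience` consecutive epochs.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=2
)

x = torch.randn(8, 10)  # placeholder batch
y = torch.randn(8, 1)

lr_history = []
for epoch in range(6):
    # One (dummy) training step per epoch
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()

    # Placeholder: a validation loss that never improves
    val_loss = 1.0
    scheduler.step(val_loss)
    lr_history.append(optimizer.param_groups[0]["lr"])

print(lr_history)
```

In a real run, `val_loss` would be the loss measured on a held-out set after each epoch.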

International_Deer27 OP t1_j6rxo78 wrote on February 1, 2023 at 1:54 PM

Yes, I am. I also uploaded the code below in case you can have a look. I'll look into ReduceLROnPlateau.

sulpha1 t1_j6qyygr wrote on February 1, 2023 at 7:06 AM

You can also post the code for help. I'd also suggest looking at the PyTorch forums if you haven't already.

International_Deer27 OP t1_j6rxjtd wrote on February 1, 2023 at 1:54 PM

    import torch
    import torch.nn as nn
    from torch.utils.data import Dataset, DataLoader
    from sklearn.model_selection import train_test_split
    import numpy as np

    # df_X_MACE and df_Y_MACE are defined outside this snippet
    df_Y_MACE = np.array(df_Y_MACE)
    df_X_MACE = np.array(df_X_MACE)

    X = torch.from_numpy(df_X_MACE).float()
    Y = torch.from_numpy(df_Y_MACE).float()

    # Define the dataset
    class ECGDataset(Dataset):
        def __init__(self, data, labels):
            self.data = data
            self.labels = labels

        def __len__(self):
            return len(self.data)

        def __getitem__(self, idx):
            return self.data[idx], self.labels[idx]

    # Split the data into training and testing sets
    train_data, test_data, train_labels, test_labels = train_test_split(X, Y, test_size=0.2)

    # Create the dataset and data loader
    train_dataset = ECGDataset(train_data, train_labels)
    test_dataset = ECGDataset(test_data, test_labels)
    train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
    test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)

    # Define the CNN
    class ECGClassifier(nn.Module):
        def __init__(self):
            super(ECGClassifier, self).__init__()
            self.fc = nn.Linear(128*5, 1)
            self.act = nn.ReLU()
            self.sigmoid = nn.Sigmoid()
            self.dropout = nn.Dropout(0.5)
            # One stack of layers per input column. nn.ModuleList (unlike a plain
            # Python list) registers the layers, so model.parameters() and
            # model.to(device) include them.
            self.layers = nn.ModuleList([nn.ModuleList() for _ in range(5)])
            for i in range(5):
                self.layers[i].append(nn.Conv1d(1, 32, kernel_size=20, stride=5))
                self.layers[i].append(nn.BatchNorm1d(32))
                self.layers[i].append(nn.MaxPool1d(7,2))
                self.layers[i].append(nn.Conv1d(32, 64, kernel_size=16, stride=5))
                self.layers[i].append(nn.BatchNorm1d(64))
                self.layers[i].append(nn.MaxPool1d(7,3))
                self.layers[i].append(nn.Conv1d(64, 128, kernel_size=2, stride=3))
                self.layers[i].append(nn.BatchNorm1d(128))
                self.layers[i].append(nn.Linear(4, 1))
                self.layers[i].append(nn.BatchNorm1d(128))
                self.layers[i].append(nn.Dropout(0.5))

        def forward(self, x):
            x_cols = [[], [], [], [], []]
            for i in range(5):
                # Column i of the input: [batch, length] -> [batch, 1, length]
                x_cols[i] = x[:,:,i].unsqueeze(1)
                # Conv -> BatchNorm -> ReLU -> MaxPool
                x_cols[i] = self.layers[i][0](x_cols[i])
                x_cols[i] = self.layers[i][1](x_cols[i])
                x_cols[i] = self.act(x_cols[i])
                x_cols[i] = self.layers[i][2](x_cols[i])
                # Conv -> BatchNorm -> ReLU -> MaxPool
                x_cols[i] = self.layers[i][3](x_cols[i])
                x_cols[i] = self.layers[i][4](x_cols[i])
                x_cols[i] = self.act(x_cols[i])
                x_cols[i] = self.layers[i][5](x_cols[i])
                # Conv -> BatchNorm -> ReLU
                x_cols[i] = self.layers[i][6](x_cols[i])
                x_cols[i] = self.layers[i][7](x_cols[i])
                x_cols[i] = self.act(x_cols[i])
                # Linear -> BatchNorm -> Dropout
                x_cols[i] = self.layers[i][8](x_cols[i])
                x_cols[i] = self.layers[i][9](x_cols[i])
                x_cols[i] = self.layers[i][10](x_cols[i])
            # Concatenate the five columns along the channel axis
            x = torch.cat((*x_cols, ), 1)
            x = x.view(-1, 128*5)
            x = self.fc(x)
            x = self.sigmoid(x)
            return x

    # Define the model and move it to the device
    device = torch.device('cpu')
    model = ECGClassifier()
    model = model.to(device)
    model = model.float()

    # Define the loss function and optimizer
    criterion = nn.BCELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001, weight_decay=0.001)

    # Train the model
    for epoch in range(5):
        for i, (data, labels) in enumerate(train_loader):
            data, labels = data.to(device), labels.to(device)

            # Forward pass
            with torch.set_grad_enabled(True):
                outputs = model(data)
                labels = labels.unsqueeze(1)
                loss = criterion(outputs, labels)

            # Backward and optimize
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        # Loss of the last batch in this epoch
        print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch+1, 5, loss.item()))
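One thing worth checking whenever layers are created in a loop: PyTorch only registers sub-modules that are assigned as attributes or held in containers such as `nn.ModuleList`. Layers kept in a plain Python list are invisible to `model.parameters()`, so the optimizer never updates them. The sketch below uses two hypothetical toy classes (`PlainListNet` and `ModuleListNet`, made up here) purely to show the difference in parameter counts.

```python
import torch.nn as nn


class PlainListNet(nn.Module):
    """Toy module that keeps its layer in a plain Python list."""

    def __init__(self):
        super().__init__()
        self.layers = [nn.Conv1d(1, 32, kernel_size=20, stride=5)]


class ModuleListNet(nn.Module):
    """Same layer, but held in an nn.ModuleList so it gets registered."""

    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList([nn.Conv1d(1, 32, kernel_size=20, stride=5)])


def count_params(module):
    # Total number of scalar parameters visible to an optimizer
    return sum(p.numel() for p in module.parameters())


plain_count = count_params(PlainListNet())
registered_count = count_params(ModuleListNet())
print(plain_count, registered_count)  # 0 672
```

The 672 is 32 * 1 * 20 convolution weights plus 32 biases. A quick `count_params(model)` on any model is a cheap way to confirm that all layers are actually being trained.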

BlacksmithNo4415 t1_j6tfpkg wrote on February 1, 2023 at 7:39 PM

try using markdown:

        plotter = DLPlotter()     # add this line
        model = MyModel()
        ...
        total_loss = 0
        for epoch in range(5):
            for step, (x, y) in enumerate(loader):
                ...
                output = model(x)
                loss = loss_func(output, y)
                total_loss += loss.item()
                ...
        config = dict(lr=0.001, batch_size=64, ...)
        plotter.collect_parameter("exp001", config, total_loss / (5 * len(loader)))     # add this line
        plotter.construct()     # add this line

International_Deer27 OP t1_j6usb0i wrote on February 2, 2023 at 12:57 AM

I’m not sure about DLPlotter. Which library did you get it from? I can’t seem to find it. I’m using Python 3.

BlacksmithNo4415 t1_j6usty4 wrote on February 2, 2023 at 1:01 AM

no, that was just example code to show you how much more readable code is when you use markdown..

DLPlotter is a library i am building at the moment.. :)

International_Deer27 OP t1_j6uufti wrote on February 2, 2023 at 1:13 AM

Ah alright, thanks, I’ll try and see how else I can modify the code and get it working. Good luck with the library!

International_Deer27 OP t1_j6x0tpy wrote on February 2, 2023 at 2:27 PM

I've simplified my model a lot so that it only takes 2000x1 tensors as input for X, and the prediction is either 0 or 1 as before. I built it with nn.Sequential and only a few layers so it's easier to follow:

    import torch
    import torch.nn as nn
    from torch.utils.data import Dataset, DataLoader
    import torch.nn.functional as F
    from sklearn.model_selection import train_test_split
    import numpy as np
    import matplotlib.pyplot as plt

    # df_X_MACE and df_Y_MACE are defined outside this snippet
    df_Y_MACE = np.array(df_Y_MACE)

    # Keep only element [0] of each sample
    df_X_MACE1 = []
    for i in range(len(df_X_MACE)):
        df_X_MACE1.append(df_X_MACE[i][0])
    df_X_MACE1 = np.array(df_X_MACE1)

    X = torch.from_numpy(df_X_MACE1).float()
    Y = torch.from_numpy(df_Y_MACE).float()

    # Define the dataset
    class ECGDataset(Dataset):
        def __init__(self, data, labels):
            self.data = data
            self.labels = labels

        def __len__(self):
            return len(self.data)

        def __getitem__(self, idx):
            return self.data[idx], self.labels[idx]

    # Split the data into training and testing sets
    # (test_size=0.8 holds out 80% of the data for testing)
    train_data, test_data, train_labels, test_labels = train_test_split(X, Y, test_size=0.8)

    # Create the dataset and data loader
    train_dataset = ECGDataset(train_data, train_labels)
    test_dataset = ECGDataset(test_data, test_labels)
    train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
    test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)

    # Define the CNN
    class ECGClassifier(nn.Module):
        def __init__(self):
            super(ECGClassifier, self).__init__()
            self.ECG_seq = nn.Sequential(
                nn.Conv1d(1, 32, kernel_size = 50, stride = 5),
                nn.ReLU(),
                nn.MaxPool1d(7,2),
                nn.Linear(193,1),
            )
            self.fc = nn.Linear(32, 1)
            self.sigmoid = nn.Sigmoid()

        def forward(self, x):
            # [batch, length] -> [batch, 1, length]
            x = x.unsqueeze(1)
            out = self.ECG_seq(x)
            out = self.fc(out.view(-1,32))
            out = self.sigmoid(out)
            return out

    # Define the model and move it to the device
    device = torch.device('cpu')
    model = ECGClassifier()
    model = model.to(device)
    model = model.float()

    # Define the loss function and optimizer
    criterion = nn.BCELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001, weight_decay=0.01)

    total_loss = []

    # Train the model
    for epoch in range(5):
        for i, (data, labels) in enumerate(train_loader):
            data, labels = data.to(device), labels.to(device)

            # Forward pass
            with torch.set_grad_enabled(True):
                outputs = model(data)
                labels = labels.unsqueeze(1)
                loss = criterion(outputs, labels)
                # Store a Python float so the autograd graph is not kept alive
                total_loss.append(loss.item())

            # Backward and optimize
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        # Loss of the last batch in this epoch
        print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch+1, 5, loss.item()))
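The `nn.Linear(193, 1)` at the end of `ECG_seq` only fits if the sequence length after the convolution and pooling is exactly 193. As a sanity check, the standard output-length formula for an unpadded, undilated 1-D convolution or pooling layer, floor((L - kernel) / stride) + 1, can be applied by hand. The helper name `out_len` is made up for this sketch, and the input length of 2000 is taken from the post.

```python
def out_len(length, kernel_size, stride):
    """Output length of a 1-D conv/pool layer with no padding or dilation."""
    return (length - kernel_size) // stride + 1


length = 2000                                            # samples per input
after_conv = out_len(length, kernel_size=50, stride=5)   # nn.Conv1d(1, 32, 50, 5)
after_pool = out_len(after_conv, kernel_size=7, stride=2)  # nn.MaxPool1d(7, 2)

print(after_conv, after_pool)  # 391 193
```

So a 2000-sample input does arrive at the linear layer with length 193; any other input length would need a different `in_features`.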

BlacksmithNo4415 t1_j6x1ugb wrote on February 2, 2023 at 2:34 PM

https://www.freecodecamp.org/news/how-to-format-code-in-markdown/

International_Deer27 OP t1_j6x0yff wrote on February 2, 2023 at 2:28 PM

For this new model the loss looks pretty much the same:

Epoch [1/5], Loss: 0.8073

Epoch [2/5], Loss: 0.8680

Epoch [3/5], Loss: 0.5826

Epoch [4/5], Loss: 0.7626

Epoch [5/5], Loss: 0.6099
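Worth noting: the training loop prints `loss.item()` once per epoch, after the inner loop, so each of these numbers is the loss of a single batch (at most 32 samples) and will be noisy by construction. Averaging over all batches in an epoch gives a steadier picture. A minimal sketch, using a hypothetical `RunningMean` helper and made-up batch losses:

```python
class RunningMean:
    """Accumulates scalar values and reports their mean."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, value):
        self.total += float(value)
        self.count += 1

    @property
    def mean(self):
        return self.total / self.count if self.count else float("nan")


# Made-up per-batch losses standing in for one epoch of training
batch_losses = [0.81, 0.64, 0.72, 0.58, 0.69]

epoch_loss = RunningMean()
for batch_loss in batch_losses:
    # In a real training loop this would be: epoch_loss.update(loss.item())
    epoch_loss.update(batch_loss)

print("mean loss over epoch: {:.4f}".format(epoch_loss.mean))
```

Creating a fresh `RunningMean` at the start of every epoch and printing its mean at the end makes the epoch-to-epoch trend much easier to read than last-batch values.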

BlacksmithNo4415 t1_j6x2xia wrote on February 2, 2023 at 2:42 PM

i've checked for papers that do exactly what you want.

so, as I assumed, this data is time-sensitive and therefore you need an additional temporal dimension.

this model needs to be more complex in order to solve this problem.

i suggest reading this:

https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-021-01736-y

BTW: have you tried grid search for finding the right hyperparameters?

oh and your model does improve..

have you increased the data set size?

Etodmitry22 t1_j6rpice wrote on February 1, 2023 at 12:45 PM

The loss will always fluctuate, especially for complex networks/tasks. What you should care about is whether the loss decreases overall and the metrics improve on the test set. A loss with no fluctuation and perfect convergence is rare; you mostly see it in ML tutorials, not in real-world cases.

If you do not see any improvement overall, try to overfit on a small subset of the training data. If your model cannot overfit a small subset, there is a bug in your model or your data.
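The overfitting check described above could be sketched like this. Everything here is a stand-in: the data is random noise rather than ECG signals, the tiny model is not the OP's architecture, and `nn.BCEWithLogitsLoss` is swapped in for the thread's `Sigmoid` plus `BCELoss` (it applies the sigmoid internally). The point is only the procedure: take a handful of samples, train on them repeatedly, and confirm the loss collapses.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic stand-in for a small subset: 16 signals of length 200, binary labels
x_small = torch.randn(16, 1, 200)
y_small = torch.randint(0, 2, (16, 1)).float()

model = nn.Sequential(
    nn.Conv1d(1, 8, kernel_size=10, stride=5),  # length 200 -> (200 - 10) // 5 + 1 = 39
    nn.ReLU(),
    nn.Flatten(),                               # [16, 8, 39] -> [16, 312]
    nn.Linear(8 * 39, 1),                       # one logit per sample
)
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

losses = []
for step in range(200):
    optimizer.zero_grad()
    loss = criterion(model(x_small), y_small)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print("first loss: {:.4f}, last loss: {:.4f}".format(losses[0], losses[-1]))
```

If the loss on such a tiny subset refuses to drop well below its starting value, the problem is more likely in the model or data pipeline than in the hyperparameters.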

BlacksmithNo4415 t1_j6tfgmy wrote on February 1, 2023 at 7:37 PM

to me it also sounds like a bad learning rate. have you checked the distribution of your weights for each layer in each step?

P.S.: try hyperparameter optimization methods like grid search or Bayesian optimization. that way you get an answer to your question faster..
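A plain grid search needs nothing beyond the standard library. In the sketch below, `train_and_evaluate` is a hypothetical stand-in that returns a made-up score so the example runs; in practice it would train the model with the given settings and return the validation loss. The grid values are also just examples.

```python
import itertools


def train_and_evaluate(lr, batch_size):
    """Hypothetical stand-in: train a model and return its validation loss."""
    # Toy formula so the sketch runs; replace with a real training run.
    return abs(lr - 1e-3) * 100 + abs(batch_size - 64) / 1000


grid = {
    "lr": [1e-4, 1e-3, 1e-2],
    "batch_size": [16, 32, 64],
}

results = []
# Try every combination of the listed values
for values in itertools.product(*grid.values()):
    config = dict(zip(grid.keys(), values))
    results.append((train_and_evaluate(**config), config))

# Keep the configuration with the lowest validation loss
best_loss, best_config = min(results, key=lambda item: item[0])
print(best_config, best_loss)
```

The cost grows multiplicatively with each added hyperparameter, which is the usual reason to move to random or Bayesian search once the grid gets large.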

BlacksmithNo4415 t1_j6uwn1n wrote on February 2, 2023 at 1:29 AM

i can try to help you though, i worked as a deep learning engineer in computer vision:

do you mean the dimension of 1 sample is [2000, 5]? that is a very weird shape for an image. usually images have a shape of [h, w, 3], and video data is 4-dimensional because a temporal dimension is added.
what do you want this model to classify? so far it sounds fairly trivial, but depending on the object it might be a bit more complex.
the more complex your task -> the more complex your model must be -> the larger the data set you will need
how are the labels distributed in your data set?
do you use adversarial attacks for robustness? don't do that at the beginning.
are you sure that a cnn is the proper model for signal classification?
how do you want to represent your dataset? what information should the 3rd axis represent?
btw dropout also makes it more difficult for the model to overfit. you use it so the model learns to generalize.
i think the model is way too complex when the task is actually trivial. but i never did any signal classification.
the use of sigmoid can lead to vanishing gradients when its inputs saturate
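On the label-distribution question in the list above: with binary 0/1 labels, a quick count shows how imbalanced the classes are and what accuracy a model would reach by always predicting the majority class. The label values below are made up for illustration; in the thread's code they would come from `df_Y_MACE`.

```python
from collections import Counter

# Made-up binary labels standing in for the real ones
labels = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]

counts = Counter(labels)
total = sum(counts.values())

# Share of each class in the data set
fractions = {label: n / total for label, n in counts.items()}

# Accuracy obtained by always predicting the most common class
majority_baseline = max(counts.values()) / total

print(fractions, majority_baseline)
```

If the real labels turn out to be heavily skewed, a loss hovering around a fixed value may simply reflect the model predicting the majority class.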

Loss function fluctuating
