Submitted by viertys t3_125xdrq in MachineLearning
Hello,
I am working on a project in which I'm detecting cavities in X-rays.
The dataset I have is pretty limited (~100 images). Each X-ray has a black-and-white mask that shows where the cavities are in the image.
I'm trying to improve my results.
What I've tried so far:
- different loss functions: BCE, dice loss, bce+dice, tversky loss, focal tversky loss
- modifying the images' gamma to make the cavities more visible
- trying out different U-Nets: U-Net, V-Net, U-Net++, UNet 3+, Attention U-Net, R2U-Net, ResUNet-a, U^2-Net, TransUNet, and Swin-UNet
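For anyone curious about the gamma step in the second bullet, here's a minimal sketch of how gamma correction is typically applied to a normalized image (the function name and the choice of gamma value are just for illustration):

```python
import numpy as np

def adjust_gamma(image, gamma):
    """Apply gamma correction to an image with values in [0, 1].

    gamma < 1 brightens dark regions (can make faint cavities
    more visible); gamma > 1 darkens them.
    """
    return np.power(image, gamma)

# example: brighten a dark pixel
adjust_gamma(np.array([0.25]), 0.5)  # -> array([0.5])
```

In practice you'd tune gamma per dataset (or sample it randomly as augmentation) rather than fix a single value.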
None of the newer U-Net variants I've tried improved the results, probably because they're better suited to larger datasets.
I'm now looking for other things to try to improve my results. Currently my network is detecting cavities, but it has trouble with the smaller ones.
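Since small cavities are the pain point, the focal Tversky loss from the list above is worth revisiting: with alpha > beta it penalizes false negatives (missed cavities) more than false positives, and the gamma exponent focuses training on hard examples. A minimal NumPy sketch (the default hyperparameters here are common choices, not values from the post):

```python
import numpy as np

def focal_tversky_loss(pred, target, alpha=0.7, beta=0.3, gamma=0.75, eps=1e-6):
    """Focal Tversky loss for binary segmentation.

    pred:   predicted probabilities in [0, 1]
    target: binary ground-truth mask, same shape
    alpha > beta weights false negatives more heavily,
    which helps with small, easily-missed structures.
    """
    tp = np.sum(pred * target)          # true positives (soft)
    fn = np.sum((1 - pred) * target)    # false negatives
    fp = np.sum(pred * (1 - target))    # false positives
    tversky = (tp + eps) / (tp + alpha * fn + beta * fp + eps)
    return (1.0 - tversky) ** gamma
```

A perfect prediction drives the loss to ~0, while missing the mask entirely pushes it toward 1; in a real pipeline you'd implement the same formula with your framework's tensors so it's differentiable.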
azorsenpai t1_je6hjpu wrote
Is there any reason you're restricting yourself to U-Net-based models? I'd recommend testing different architectures such as DeepLabV3 or FPN and seeing whether results improve. If they don't, I'd recommend looking at your data and the quality of the ground truth, since with only 100 data points you're very much limited by the information contained in your data.
If the data is clean, I'd recommend some kind of ensemble method. This might be overkill, especially with heavy models, but having multiple models with random initializations infer on the same input generally gains a few points of accuracy/Dice, so if you really need it, this is an option.
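The ensembling idea above boils down to averaging the probability maps from independently trained models before thresholding. A hedged sketch, assuming each model is a callable that returns per-pixel probabilities:

```python
import numpy as np

def ensemble_predict(models, x):
    """Average the probability maps of several models on the same input.

    models: list of callables, each mapping an input image to an
            array of per-pixel cavity probabilities (assumed interface)
    """
    probs = [model(x) for model in models]
    return np.mean(probs, axis=0)

def to_mask(prob_map, threshold=0.5):
    """Binarize an averaged probability map into a segmentation mask."""
    return (prob_map >= threshold).astype(np.uint8)
```

Averaging probabilities (rather than voting on binarized masks) tends to be smoother, and the same pattern also covers test-time augmentation if you average predictions over flipped/rotated copies of the input.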