
t1_itsddz6 wrote

Note from the Darknet + YOLO FAQ: "Can I train a neural network using synthetic images?"

>No.
>
>Or, to be more precise, you'll probably end up with a neural network that is great at detecting your synthetic images, but unable to detect much in real-world images.

Source: https://www.ccoderun.ca/programming/darknet_faq/#synthetic_images

I made that statement several years ago, and after all this time, I still think the correct answer is "no". Every time I have tried to use synthetic images, the results have not worked out as I had planned.

Looking at your "Link1" and "Link2", it is immediately obvious this is not going to work. You cannot crop your objects: https://www.ccoderun.ca/programming/darknet_faq/#crop_training_images

Darknet/YOLO (and under the covers, I believe that Ultralytics is using Darknet) learns from context, not only from what is inside the bounding boxes. So if you are trying to detect snowboarders with those symbols, then you'll do OK. But if you are expecting to pass in images or video frames of clothes, then that snowboarder and bus are doing nothing to help you.
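To make the "do not crop" point concrete, here is a minimal sketch of annotating an object inside the full, uncropped image. It assumes the usual Darknet/YOLO text annotation format (class index followed by the normalized centre x, centre y, width and height of the box). The function name, the box coordinates and the frame size are hypothetical and only for illustration.

```python
def yolo_annotation_line(class_id, box, image_size):
    """Build one Darknet/YOLO-style annotation line for an object in a full image.

    box is (left, top, right, bottom) in pixels.
    image_size is (width, height) of the whole, uncropped image.
    """
    left, top, right, bottom = box
    img_w, img_h = image_size

    # The box must lie inside the image and have a positive area.
    if not (0 <= left < right <= img_w and 0 <= top < bottom <= img_h):
        raise ValueError("box must lie inside the image")

    # Normalize everything by the size of the full image, not a crop,
    # so the surrounding context stays in the training image.
    x_center = (left + right) / 2 / img_w
    y_center = (top + bottom) / 2 / img_h
    width = (right - left) / img_w
    height = (bottom - top) / img_h

    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical example: a 200x400 px object inside a 1280x720 video frame.
line = yolo_annotation_line(0, (500, 100, 700, 500), (1280, 720))
print(line)  # 0 0.468750 0.416667 0.156250 0.555556
```

The training image stays at its full size; only the annotation says where the object is, so the network still sees everything around the box.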

Want proof? Here is a YOLO neural network video I happened to upload to YouTube today: https://www.youtube.com/watch?v=m3Trxxt9RzE

Note the "6" and "9" on those cards. They are correctly recognized, with no confusion, even though the font used makes those two numbers look identical when rotated 180 degrees. YOLO really does look at much more than just the bounding box.


OP t1_ituyep7 wrote

Thank you very much for this answer!

I understand now that I should not select my model architecture based only on the performance figures and reviews I read in blog posts. Finding the right one for the use case requires digging deeper into the architecture and understanding how it works.
