Viewing a single comment thread. View all comments

chatterbox272 t1_iw5jfwn wrote

I'll regularly write custom components, but pretty rarely write whole custom networks. Writing custom prediction heads that capture the task more specifically can improve training efficiency and performance (e.g. doing hierarchical classification where it makes sense, customized suppression postprocessing based on domain knowledge, etc.).

When I do write networks from scratch, they're usually variations on existing architectures anyway. E.g. implementing a 1D or 3D CNN using the same approach as existing 2D CNNs like ResNet or ConvNext. I usually find I'm doing this when I'm in a domain task and don't already have access to pretrained networks that are likely to be reasonable initialisation.

15