pm_me_your_ensembles
pm_me_your_ensembles t1_j01xzcw wrote
Reply to [P] Are probabilities from multi-label image classification networks calibrated? by alkaway
The two are not directly comparable. In a multi-class, single-label problem you compute K distinct projections, one per class, but they are then combined via a softmax to give you something that resembles a probability distribution. In the multi-label case no such normalizing function is applied, so the per-label outputs don't influence each other and can't be compared in the same way.
However, you shouldn't treat whatever a NN outputs as a probability just because it falls within [0, 1]; NNs are known to be overconfident.
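Rough sketch of the difference (the logits, the temperature value, and the use of jax here are purely illustrative):

```python
import jax.numpy as jnp
from jax import nn

logits = jnp.array([2.0, -1.0, 0.5])   # K = 3 raw scores from the network

# Multi-class, single-label: softmax couples the classes; the outputs sum to 1.
p_multiclass = nn.softmax(logits)       # ~[0.79, 0.04, 0.17]

# Multi-label: an independent sigmoid per label; no coupling, no joint normalization.
p_multilabel = nn.sigmoid(logits)       # ~[0.88, 0.27, 0.62], does not sum to 1

# A common post-hoc calibration fix is temperature scaling, with T fit on held-out data.
T = 1.5                                 # placeholder value
p_scaled = nn.sigmoid(logits / T)
```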
pm_me_your_ensembles t1_ivf2wkb wrote
Reply to [D] Do you think there is a competitive future for smaller, locally trained/served models? by naequs
Network distillation and transfer learning are both reasonable approaches to constructing high-quality "compressed" models.
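For the distillation route, a minimal sketch of the usual soft-target loss (names and the T/alpha defaults are just placeholders, not a recommendation):

```python
import jax.numpy as jnp
from jax import nn

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend a softened-teacher term with the usual hard-label cross-entropy.

    student_logits, teacher_logits: (batch, num_classes); labels: (batch,) int class ids.
    """
    # Soft targets: cross-entropy against the teacher's temperature-softened outputs.
    soft_teacher = nn.softmax(teacher_logits / T)
    log_soft_student = nn.log_softmax(student_logits / T)
    kd_term = -(soft_teacher * log_soft_student).sum(axis=-1).mean() * (T ** 2)

    # Hard targets: standard cross-entropy on the ground-truth labels.
    log_probs = nn.log_softmax(student_logits)
    ce_term = -jnp.take_along_axis(log_probs, labels[:, None], axis=-1).mean()

    return alpha * kd_term + (1.0 - alpha) * ce_term
```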
pm_me_your_ensembles t1_iv0ea9x wrote
Reply to comment by pseudorandom_user in [N] Class-action lawsuit filed against GitHub, Microsoft, and OpenAI regarding the legality of GitHub Copilot, an AI-using tool for programmers by Wiskkey
You can opt out of GitHub's source-collecting program.
pm_me_your_ensembles t1_iuk6950 wrote
Reply to comment by boyetosekuji in [News] The Stack: 3 TB of permissively licensed source code - Hugging Face and ServiceNow Research Denis Kocetkov et al 2022 by Singularian2501
If you have to ask :D
pm_me_your_ensembles t1_itw6sti wrote
Reply to [P] Up to 12X faster GPU inference on Bert, T5 and other transformers with OpenAI Triton kernels by pommedeterresautee
Bless you, I needed this :D
pm_me_your_ensembles t1_is6ttve wrote
Reply to comment by Atom_101 in [D] Are GAN(s) still relevant as a research topic? or is there any idea regarding research on generative modeling? by aozorahime
AFAIK self-conditioning helps with the process, and there has been a lot of work on reducing the number of steps through distillation and quantization.
pm_me_your_ensembles t1_is5ppy3 wrote
Reply to comment by Atom_101 in [D] Are GAN(s) still relevant as a research topic? or is there any idea regarding research on generative modeling? by aozorahime
It's entirely possible to do a lot better than N forward passes.
pm_me_your_ensembles t1_is1d3un wrote
Reply to comment by shahaff32 in [R] Wavelet Feature Maps Compression for Image-to-Image CNNs by shahaff32
Very cool, will take a look, thanks! :D
pm_me_your_ensembles t1_is1busc wrote
Could this work with 1d convolutions?
pm_me_your_ensembles t1_irt5y8h wrote
Phil Wang (lucidrains) has phenomenal implementations of stuff; I'd recommend checking them out and reading their code.
Furthermore, I'd recommend simply reading more code and tackling complex problems, e.g. try building a DL framework from "scratch" on top of jax. Read the Haiku codebase and compare it to, say, Equinox (I am a big fan of that one). Go through the Hugging Face code bases, e.g. transformers. Choose a model, build it from scratch, and make it compatible with their API.
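To give a flavour of the "from scratch on top of jax" exercise, a toy sketch (all names here are made up for illustration; real frameworks like Haiku or Equinox handle this far more carefully):

```python
import jax
import jax.numpy as jnp

def init_linear(key, in_dim, out_dim):
    # Parameters are just a pytree (here a dict) of arrays.
    w_key, _ = jax.random.split(key)
    return {
        "w": jax.random.normal(w_key, (in_dim, out_dim)) * 0.02,
        "b": jnp.zeros(out_dim),
    }

def linear(params, x):
    # Layers are pure functions of (params, inputs).
    return x @ params["w"] + params["b"]

def mse_loss(params, x, y):
    return jnp.mean((linear(params, x) - y) ** 2)

params = init_linear(jax.random.PRNGKey(0), 4, 1)
x, y = jnp.ones((8, 4)), jnp.zeros((8, 1))
grads = jax.grad(mse_loss)(params, x, y)                      # pytree of gradients
params = jax.tree_util.tree_map(lambda p, g: p - 0.1 * g, params, grads)  # SGD step
```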
pm_me_your_ensembles t1_irjlb0n wrote
Reply to comment by itsstylepoint in [N] I Have Released the YouTube Series Discussing and Implementing Activation Functions by itsstylepoint
Great! Thank you for sharing! :D
pm_me_your_ensembles t1_irjbb1t wrote
Reply to [N] I Have Released the YouTube Series Discussing and Implementing Activation Functions by itsstylepoint
Do you go over numerical stability issues?
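To be concrete about the kind of issue I mean, softplus is a small example (the input values are chosen just to trigger overflow in the naive form):

```python
import jax.numpy as jnp
from jax import nn

x = jnp.array([-50.0, 0.0, 50.0, 500.0])

naive = jnp.log(1.0 + jnp.exp(x))        # exp(500.) overflows -> inf
stable = jnp.maximum(x, 0.0) + jnp.log1p(jnp.exp(-jnp.abs(x)))  # safe rewrite
library = nn.softplus(x)                 # library version, also numerically safe
```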
pm_me_your_ensembles t1_j02eijz wrote
Reply to comment by alkaway in [P] Are probabilities from multi-label image classification networks calibrated? by alkaway
Never mind my previous comment.
You could normalize both channels, i.e. for label 1 normalize its NxN tensor, and do the same for label 2.
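A rough sketch of one reading of that (output of shape (num_labels, N, N), one channel per label, each map normalized independently; the min-max choice here is an assumption, not the only option):

```python
import jax.numpy as jnp

def normalize_per_label(scores):
    """scores: (num_labels, N, N) raw outputs -> each label's map rescaled to [0, 1]."""
    lo = scores.min(axis=(1, 2), keepdims=True)
    hi = scores.max(axis=(1, 2), keepdims=True)
    return (scores - lo) / (hi - lo + 1e-8)

# Toy example: two 4x4 label maps, normalized independently of each other.
maps = jnp.stack([jnp.linspace(-3.0, 3.0, 16).reshape(4, 4),
                  jnp.linspace(0.0, 10.0, 16).reshape(4, 4)])
normalized = normalize_per_label(maps)   # each channel now spans [0, 1]
```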