Erosis
Erosis t1_j72rzdl wrote
Reply to comment by SAbdusSamad in [D] Understanding Vision Transformer (ViT) - What are the prerequisites? by SAbdusSamad
You'll probably be fine learning transformers directly, but a better understanding of RNNs might make some of the NLP tutorials/papers containing transformers more easily comprehensible.
Attention is a very important component of transformers, but attention can be applied to RNNs, too.
Erosis t1_j07aho6 wrote
Reply to comment by Deep-Station-1746 in [P] Implemented Vision Transformers 🚀 from scratch using TensorFlow 2.x by TensorDudee
Yet people here praise Torch when TensorFlow equivalents are often faster in production. TensorFlow still has relevance and gets a bit too much hate here (and I personally prefer PyTorch).
Erosis t1_ivwar5d wrote
Reply to comment by DreamyPen in [Discussion] Can we train with multiple sources of data, some very reliable, others less so? by DreamyPen
Yes, this is 'instance' or 'sample' weighting. You can choose to apply this weight to the loss or the gradients before your parameter update.
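A minimal sketch of the loss-side version, assuming a binary classifier that outputs probabilities. The function name `weighted_bce` and the choice to normalize by the sum of the weights are illustrative assumptions, not something from a specific library:

```python
import math

def weighted_bce(probs, targets, weights):
    """Weighted mean binary cross-entropy over a batch (hypothetical helper)."""
    eps = 1e-12  # keeps log() away from 0
    total = 0.0
    for p, y, w in zip(probs, targets, weights):
        p = min(max(p, eps), 1.0 - eps)
        per_sample = -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
        total += w * per_sample  # each sample's loss is scaled by its weight
    return total / sum(weights)  # assumption: normalize by total weight

probs = [0.9, 0.2, 0.6]
targets = [1.0, 0.0, 0.0]  # third label comes from the less reliable source

uniform = weighted_bce(probs, targets, [1.0, 1.0, 1.0])
downweighted = weighted_bce(probs, targets, [1.0, 1.0, 0.25])
```

Because the loss is a weighted sum, scaling a sample's loss by `w` also scales its gradient by `w`, so weighting the loss and weighting the gradients come to the same thing here.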
Erosis t1_ivu2gnv wrote
Reply to comment by DreamyPen in [Discussion] Can we train with multiple sources of data, some very reliable, others less so? by DreamyPen
Trees complicate it a bit more. I've never done it for something like that, but the instance weight input to xgboost is one example: the xgboost fit function has an input for sample_weight.
I know that tensorflow has a new-ish library for trees. You could manually write a gradient descent loop with modified minibatch gradients there, potentially.
Erosis t1_ivtyokc wrote
Reply to comment by DreamyPen in [Discussion] Can we train with multiple sources of data, some very reliable, others less so? by DreamyPen
You could use a custom training loop where you down-weight the gradients of the unreliable samples before you do parameter updates.
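As a rough sketch of what such a loop might look like, here is a tiny 1-D logistic regression in plain Python. The data, the function name `train_logistic`, and the weight of 0.1 for the unreliable sample are all made up for illustration:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(xs, ys, sample_weights, lr=0.5, epochs=200):
    """Full-batch gradient descent where each sample's gradient is scaled
    by its weight before the parameter update (hypothetical helper)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w, grad_b = 0.0, 0.0
        for x, y, s in zip(xs, ys, sample_weights):
            err = sigmoid(w * x + b) - y  # d(BCE)/d(logit) for one sample
            grad_w += s * err * x         # down-weighted contribution
            grad_b += s * err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

# Four reliable points plus one unreliable point (x=1.5 labeled 0).
xs = [-2.0, -1.0, 1.0, 2.0, 1.5]
ys = [0.0, 0.0, 1.0, 1.0, 0.0]
uniform_weights = [1.0, 1.0, 1.0, 1.0, 1.0]
reliability_weights = [1.0, 1.0, 1.0, 1.0, 0.1]  # assumed trust levels

w_u, b_u = train_logistic(xs, ys, uniform_weights)
w_r, b_r = train_logistic(xs, ys, reliability_weights)

# Predicted probability of class 1 at the disputed input.
p_uniform = sigmoid(w_u * 1.5 + b_u)
p_downweighted = sigmoid(w_r * 1.5 + b_r)
```

With the unreliable sample down-weighted, the model's prediction at x=1.5 follows the reliable neighbors more closely than it does with uniform weights.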
Erosis t1_ivtxuwn wrote
Reply to [Discussion] Can we train with multiple sources of data, some very reliable, others less so? by DreamyPen
Are the outputs of your model binary? You could instead set the targets of your uncertain data points somewhere closer to the middle (e.g., 0.2 and 0.8) instead of 0 and 1.
If you are training in batches, you could reduce the size of the gradient updates coming from the unreliable data.
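For the first suggestion (moving targets toward the middle), a small sketch, assuming binary labels and a per-sample reliability flag. The helper name `soften_targets` and the `trust` parameter are hypothetical:

```python
def soften_targets(labels, reliable, trust=0.6):
    """Pull unreliable binary labels toward 0.5 (hypothetical helper).

    trust=1.0 leaves the label unchanged; trust=0.0 maps it to 0.5.
    """
    softened = []
    for y, is_reliable in zip(labels, reliable):
        if is_reliable:
            softened.append(float(y))
        else:
            softened.append(0.5 + trust * (y - 0.5))
    return softened

labels = [1, 0, 1]
reliable = [True, False, False]
soft = soften_targets(labels, reliable, trust=0.6)  # roughly [1.0, 0.2, 0.8]
```

With cross-entropy on a sigmoid output, the per-sample gradient with respect to the logit is `p - y`, so a soft target of 0.8 stops pushing the prediction once it reaches 0.8 instead of driving it all the way to 1.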
Erosis t1_irewcom wrote
Reply to [N] I Have Released the YouTube Series Discussing and Implementing Activation Functions by itsstylepoint
Your videos have been great so far! Can't wait for more modeling content.
Erosis OP t1_ir9kj9k wrote
Reply to comment by KeikakuAccelerator in [R] Google announces Imagen Video, a model that generates videos from text by Erosis
I'm referring to their new Make-A-Video model, but I suppose they just announced that a few days ago. Hopefully they fully release that model.
Erosis OP t1_ir8cdlx wrote
Reply to comment by IntelArtiGen in [R] Google announces Imagen Video, a model that generates videos from text by Erosis
It seems that Google is being very conservative with the release of their diffusion models, even compared to the closed-source approach of Meta and OpenAI.
Luckily, Stability AI seems to be working on a video generating diffusion model.
Erosis t1_ir0tjwg wrote
I believe there is a small typo in figure 3.8.
> g-h) The clipped planes are then weighted
should be:
> g-i) The clipped planes are then weighted
Let me know if I'm mistaken here. Good stuff so far!
Edit: Added a GitHub issue regarding this.
Erosis t1_jegj2l9 wrote
Reply to comment by Educational-Net303 in [News] Twitter algorithm now open source by John-The-Bomb-2
Twitter is already established as a brand to near saturation and Elon has more money than god. It's the perfect combo for ML philanthropy. Now waiting for that Tesla vision algorithm...