C0hentheBarbarian
C0hentheBarbarian t1_j4ug85t wrote
Reply to comment by FastestLearner in [D] Idea: SponsorBlock with a neural net as backend by FastestLearner
Training isn’t the main issue wrt cost. Inference is.
C0hentheBarbarian t1_j468jz4 wrote
It's pretty old in the context of deep learning, but OpenAI's Jukebox uses them for audio, if I remember correctly.
C0hentheBarbarian t1_j2sl0n3 wrote
Reply to comment by Purplekeyboard in [R] Massive Language Models Can Be Accurately Pruned in One-Shot by starstruckmon
What about BLOOMZ? Isn't it fine-tuned in a similar way to GPT-3, i.e. instruction fine-tuned?
C0hentheBarbarian t1_iyh4kex wrote
Reply to OpenAI ChatGPT [R] by Sea-Photo5230
Results like this make me seriously question whether I'll have a job in the future as an ML person. I understand the nature of the job will change, etc., but I can see myself becoming an overqualified prompt engineer.
C0hentheBarbarian t1_iy73sx4 wrote
Reply to comment by ProfessionalShame900 in [D] Simple Questions Thread by AutoModerator
> How to visualize the cluster in high-dimensional space?
t-SNE could work for this
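For a rough starting point, here's a minimal sketch using scikit-learn's TSNE; the arrays X and labels are placeholders for your own data and cluster assignments:

```python
import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

# Placeholder high-dimensional data and cluster labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 50))
labels = rng.integers(0, 3, size=300)

# Project to 2D; perplexity is worth tuning for your dataset size.
X_2d = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

plt.scatter(X_2d[:, 0], X_2d[:, 1], c=labels, cmap="tab10", s=10)
plt.title("t-SNE projection of clusters")
plt.show()
```

Keep in mind t-SNE distances are only locally meaningful, so treat the plot as a qualitative check rather than a faithful geometry of the original space.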
C0hentheBarbarian t1_ixgr1dg wrote
Reply to [P] BetterTransformer: PyTorch-native free-lunch speedups for Transformer-based models by fxmarty
Is it possible to use sentence-transformers models with BetterTransformer?
C0hentheBarbarian t1_ixbjesg wrote
Reply to Suggestions for a socially valuable project that would welcome an unpaid contributor [D] by AnthonysEye
Hugging Face is running an open-source community sprint to train Whisper on various low-resource languages. Take a look at their Discord to find out more.
C0hentheBarbarian t1_iutruwo wrote
Reply to comment by 5death2moderation in [D] Machine learning prototyping on Apple silicon? by laprika0
Hey, I was facing issues with sentence-transformers on M1 (some layers aren't implemented for MPS). Could you tell me how you're getting around that?
C0hentheBarbarian t1_iuqlo3z wrote
I've been using an M1 for prototyping and found a couple of issues with some PyTorch models. It can be a buggy mess, and even the CPU fallback doesn't always work. Here are a few things not implemented yet; these show up fairly often, as you can see in that GitHub issue.
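For context, here's a minimal sketch of the fallback I mean; the env var has to be set before torch is imported, and the Linear model below is just a placeholder:

```python
import os

# Must be set before torch is imported; routes unimplemented MPS ops to the CPU.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import torch

device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

model = torch.nn.Linear(16, 4).to(device)   # placeholder model
x = torch.randn(8, 16, device=device)
print(model(x).shape)
```

Even with the fallback enabled, some ops still error out or fall back with a noticeable performance hit, which is the flakiness I'm referring to.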
C0hentheBarbarian t1_iss3b1t wrote
Reply to comment by Sbadabam278 in [D] Simple Questions Thread by AutoModerator
I suggest you look at some of the links in the article; some discuss the math behind diffusion models in detail, which should let you understand the paper.
C0hentheBarbarian t1_is96iqz wrote
Reply to comment by Sbadabam278 in [D] Simple Questions Thread by AutoModerator
Highly recommend this post by Jay Alammar. He has one of the best tutorials on how transformers work (IMO), and this one is up there. I've only worked with CV sporadically recently, but his post, along with some of the links in it, explained things to me pretty well. The only math background I can recommend off the top of my head is the probability calculation for the lower/upper bounds: you can look up how VAEs work, or the post I linked has resources for the same.
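If it helps, the bound in question is the standard evidence lower bound (ELBO) from the VAE setup:

```latex
\log p_\theta(x) \;\ge\; \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] \;-\; D_{\mathrm{KL}}\big(q_\phi(z \mid x)\,\|\,p(z)\big)
```

Once that decomposition is comfortable, the diffusion-model objective reads as the same idea applied across many noising steps.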
C0hentheBarbarian t1_j9o5068 wrote
Reply to [D] Simple Questions Thread by AutoModerator
I work in NLP. My work mainly consists of fine-tuning NLP models, and with the rise of LLMs I'm seeing a lot of it become prompt engineering. I'm happy to pick up the new skill, but I'd like to know what avenues I have to upskill beyond being a prompt engineer without a PhD. It feels like all the learning I did on model architectures etc. is going to waste. There are still a few projects that need me to fine-tune a model for text classification and the like, but as LLMs get better I suspect I'll need stronger skills to go beyond being a prompt engineer.

For anyone else in NLP who doesn't have a PhD and doesn't have experience building model architectures or training from scratch: how are you trying to upskill in these times?

EDIT: I worded the question to ask only people who don't have a PhD, but I would actually like to know everyone's perspective on this.