C0hentheBarbarian
C0hentheBarbarian t1_j4ug85t wrote
Reply to comment by FastestLearner in [D] Idea: SponsorBlock with a neural net as backend by FastestLearner
Training isn’t the main issue wrt cost. Inference is.
C0hentheBarbarian t1_j468jz4 wrote
It's pretty old in the context of deep learning, but OpenAI's Jukebox uses them for audio, if I remember correctly.
C0hentheBarbarian t1_j2sl0n3 wrote
Reply to comment by Purplekeyboard in [R] Massive Language Models Can Be Accurately Pruned in One-Shot by starstruckmon
What about BLOOMZ? Isn't it fine-tuned in a similar way to GPT-3, i.e. instruction fine-tuned?
C0hentheBarbarian t1_iyh4kex wrote
Reply to OpenAI ChatGPT [R] by Sea-Photo5230
Results like this make me seriously question whether I'll have a job in the future as an ML person. I understand the nature of the job will change, etc., but I can see myself becoming an overqualified prompt engineer.
C0hentheBarbarian t1_iy73sx4 wrote
Reply to comment by ProfessionalShame900 in [D] Simple Questions Thread by AutoModerator
> How to visualize the cluster in high-dimensional space?
t-SNE could work for this
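For a rough starting point, here's a minimal sketch using scikit-learn's TSNE; the arrays X and labels are placeholders for your own data and cluster assignments:

```python
import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

# Placeholder high-dimensional data and cluster labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 50))
labels = rng.integers(0, 3, size=300)

# Project to 2D; perplexity is worth tuning for your dataset size.
X_2d = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

plt.scatter(X_2d[:, 0], X_2d[:, 1], c=labels, cmap="tab10", s=10)
plt.title("t-SNE projection of clusters")
plt.show()
```

Keep in mind t-SNE distances are only locally meaningful, so treat the plot as a qualitative check rather than a faithful geometry of the original space.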
C0hentheBarbarian t1_ixgr1dg wrote
Reply to [P] BetterTransformer: PyTorch-native free-lunch speedups for Transformer-based models by fxmarty
Is it possible to use sentence-transformers models with BetterTransformer?
C0hentheBarbarian t1_ixbjesg wrote
Reply to Suggestions for a socially valuable project that would welcome an unpaid contributor [D] by AnthonysEye
Hugging Face is running an open-source community sprint to train Whisper on various low-resource languages. Take a look at their Discord to find out more.
C0hentheBarbarian t1_iutruwo wrote
Reply to comment by 5death2moderation in [D] Machine learning prototyping on Apple silicon? by laprika0
Hey, I was facing issues with sentence-transformers on M1 (some layers aren't implemented for MPS). Could you tell me how you're getting around that?
C0hentheBarbarian t1_iuqlo3z wrote
I've been using an M1 for prototyping and found a couple of issues with some PyTorch models. It can be a buggy mess, and even the CPU fallback doesn't always work. Here are a few things not implemented yet; these show up fairly often, as you can see in that GitHub issue.
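For context, here's a minimal sketch of the fallback I mean; the env var has to be set before torch is imported, and the Linear model below is just a placeholder:

```python
import os

# Must be set before torch is imported; routes unimplemented MPS ops to the CPU.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import torch

device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

model = torch.nn.Linear(16, 4).to(device)   # placeholder model
x = torch.randn(8, 16, device=device)
print(model(x).shape)
```

Even with the fallback enabled, some ops still error out or fall back with a noticeable performance hit, which is the flakiness I'm referring to.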
C0hentheBarbarian t1_iss3b1t wrote
Reply to comment by Sbadabam278 in [D] Simple Questions Thread by AutoModerator
I suggest you look at some of the links in the article; some discuss the math behind diffusion models in detail, which should let you understand the paper.
C0hentheBarbarian t1_is96iqz wrote
Reply to comment by Sbadabam278 in [D] Simple Questions Thread by AutoModerator
Highly recommend this post by Jay Alammar. He has one of the best tutorials on how transformers work (IMO), and this one is up there. I've only worked with CV sporadically recently, but his post, along with some of the links in it, explained things to me pretty well. The only math background I can recommend off the top of my head is the probability calculation for the lower/upper bounds: you can look up how VAEs work, or the post I linked has resources for the same.
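If it helps, the bound in question is the standard evidence lower bound (ELBO) from the VAE setup:

```latex
\log p_\theta(x) \;\ge\; \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] \;-\; D_{\mathrm{KL}}\big(q_\phi(z \mid x)\,\|\,p(z)\big)
```

Once that decomposition is comfortable, the diffusion-model objective reads as the same idea applied across many noising steps.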
C0hentheBarbarian t1_j9o5068 wrote
Reply to [D] Simple Questions Thread by AutoModerator
I work in NLP. My work mainly consists of fine-tuning NLP models, and with the rise of LLMs I'm seeing a lot of it become prompt engineering. I'm happy to pick up the new skill, but I'd like to know what avenues I have to upskill beyond being a prompt engineer without a PhD. It feels like all the learning I did on model architectures etc. is going to waste. There are still a few projects that need me to fine-tune a model for text classification and the like, but as LLMs get better I suspect I'll need stronger skills to go beyond being a prompt engineer.

For anyone else in NLP who doesn't have a PhD and doesn't have experience building model architectures or training from scratch: how are you trying to upskill in these times?

EDIT: I worded the question to ask only people who don't have a PhD, but I would actually like to know everyone's perspective on this.