2blazen t1_je6axjp wrote on March 29, 2023 at 6:45 PM

Reply to comment by gigglegenius in [D] With ML tools progressing so fast, what are some ways you've taken advantage of them personally? by RedditLovingSun

>- Creative brainstorming for professional work

I struggle with this. I was trying to get it to help me come up with interesting thesis research questions in a very specific audio ML field, but it failed to come up with anything original, and I don't know whether there's a certain way I should have phrased my questions or it's just a creative limitation.

2blazen t1_je69rl1 wrote on March 29, 2023 at 6:37 PM

Reply to [D] The best way to train an LLM on company data by jaxolingo

I read about this tool on this sub, and it looks like what you're looking for: https://lm-code-binder.github.io/

2blazen t1_jb1jk5h wrote on March 5, 2023 at 6:53 PM

Reply to comment by Quazar_omega in [P] LazyShell - GPT based autocomplete for zsh by rumovoice

And lazy

Like come on at least have a landing page

2blazen t1_jazyryq wrote on March 5, 2023 at 10:54 AM

Reply to comment by rpnewc in [D] The Sentences Computers Can't Understand, But Humans Can by New_Computer3619

Do you think an LLM can be taught to recognize when a question would require advanced reasoning to answer, or is it inherently impossible?

2blazen t1_jadrdgu wrote on February 28, 2023 at 6:31 PM

Reply to comment by bluebolt789 in [Discussion] Can you use a model trained on tweets/product reviews to do sentiment analysis on IT support tickets? by [deleted]

I think what he means is your question is beneath the sub's standards lol

You may have more luck googling specifically about cross-domain sentiment analysis, asking ChatGPT, or asking on r/MLQuestions or r/learndatascience

2blazen t1_j8r3le4 wrote on February 16, 2023 at 10:14 AM

Reply to [P] Struggling with thesis idea and implementation by mems_m

You'd want to find a more in-depth topic for a master's thesis; Reddit scraping and sentiment analysis sounds more like an assignment. Ask your supervisor if they have a topic they're researching and whether you can join. Check whether your university has example projects or, even better, open projects. Look through past years' theses to see if you can continue working on any of them (hint: the future work section). Once you find a topic that interests you and is niche enough, it will still be too broad, so you have to narrow it down to research questions, which means doing in-depth research into the challenges of the topic.

Don't panic, there are many topics that need research. I'm starting my thesis in audio processing - health AI / speaker embeddings / impaired speech / diagnosis assistance and it's the wild west over here, though that's partly because the data is not publicly accessible

2blazen t1_j8i5fyx wrote on February 14, 2023 at 2:17 PM

Reply to comment by NoLifeGamer2 in [Discussion] The need for noise in stable diffusion by AdministrationOk2735

That was my understanding as well, noise ensures "randomness"

2blazen t1_j8378vr wrote on February 11, 2023 at 8:01 AM

Reply to comment by goj-145 in [D] Is it legal to use images or videos with copyright to train a model? by Tlaloc-Es

So you're saying Stability wouldn't have issues if they hired an intern to git clone a watermark remover and put the images through it first?

2blazen t1_j7kc4t6 wrote on February 7, 2023 at 12:40 PM

Reply to [D] Python vs Swift vs Julia, what should I learn? (Any benchmarks?) by lukinhasb

Definitely Python, that's what all major companies support too. However, it's not the bytecode cache that makes the difference but the fact that machine learning libraries are written in C++, so you're not sacrificing performance by scripting in it.

These kinds of questions are more suitable for r/learndatascience, though
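
To make the point about compiled code concrete, here's a rough sketch using only the standard library. It's an analogy, not an ML benchmark: CPython's built-in `sum` is implemented in C, so it stands in for a compiled library call, while `python_loop_sum` is a hypothetical helper that does the same work in interpreted Python. Timings will vary by machine.

```python
import timeit


def python_loop_sum(values):
    """Add up the values one by one in interpreted Python."""
    total = 0
    for v in values:
        total += v
    return total


values = list(range(1_000_000))

# Both approaches compute the same result.
loop_result = python_loop_sum(values)
builtin_result = sum(values)

# Time five repetitions of each; the built-in runs in compiled C code.
loop_time = timeit.timeit(lambda: python_loop_sum(values), number=5)
builtin_time = timeit.timeit(lambda: sum(values), number=5)

print(f"pure Python loop: {loop_time:.3f}s")
print(f"built-in sum:     {builtin_time:.3f}s")
```

The same idea applies when a Python script hands tensors to a compiled backend: the interpreter only orchestrates, and the heavy loops run outside it.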

2blazen t1_j70vh2g wrote on February 3, 2023 at 7:03 AM

Reply to comment by CowardlyVelociraptor in [N] OpenAI starts selling subscriptions to its ChatGPT bot by bikeskata

Might be just me, but I really hate how the reply is returned in the UI. Even if the subscription solves the random interruptions during generation, the word-by-word printing kills me. I'd rather wait a bit and receive my answer in one piece

2blazen t1_j70ux9o wrote on February 3, 2023 at 6:56 AM

Reply to comment by arhetorical in [N] OpenAI starts selling subscriptions to its ChatGPT bot by bikeskata

I thought so too, but haven't actually noticed any difference, other than how the davinci models don't have the extensive content filters.

>if you use it for work, $20 is negligible

If my company pays for it, sure, otherwise I'll always prefer the request-based pricing with a nice API that I can just call from my terminal

2blazen t1_j6yluho wrote on February 2, 2023 at 8:29 PM

Reply to comment by TrevorIRL in [N] OpenAI starts selling subscriptions to its ChatGPT bot by bikeskata

>that’s some pretty amazing margins

That's just the (estimated) hardware uptime cost, you haven't mentioned the wages or the R&D investment

2blazen t1_j6ykrcq wrote on February 2, 2023 at 8:22 PM

Reply to comment by arhetorical in [N] OpenAI starts selling subscriptions to its ChatGPT bot by bikeskata

I've been using the GPT3 API for around 0.4c per request with zero downtime. With my current usage this adds up to around 10c a day, or 3 USD per month. I don't see how 20 USD is reasonable
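
The arithmetic behind those numbers, as a quick sketch. The per-request cost, daily spend, and subscription price come from the comment; the 30-day month and the variable names are my assumptions.

```python
# Figures quoted in the comment above.
COST_PER_REQUEST_USD = 0.004   # "around 0.4c per request"
DAILY_SPEND_USD = 0.10         # "around 10c a day"
SUBSCRIPTION_USD = 20.0        # monthly subscription price

# Assumption: a 30-day month.
DAYS_PER_MONTH = 30

# How many requests per day the quoted daily spend implies.
requests_per_day = DAILY_SPEND_USD / COST_PER_REQUEST_USD

# Monthly cost of the pay-per-request usage.
monthly_api_cost = DAILY_SPEND_USD * DAYS_PER_MONTH

# Daily request count at which the API would cost as much as the subscription.
break_even_requests_per_day = SUBSCRIPTION_USD / (COST_PER_REQUEST_USD * DAYS_PER_MONTH)

print(f"requests per day:      {requests_per_day:.0f}")
print(f"monthly API cost:      ${monthly_api_cost:.2f}")
print(f"break-even per day:    {break_even_requests_per_day:.0f} requests")
```

Under these assumptions the subscription only breaks even at roughly 167 requests a day, against about 25 a day at the usage described.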

2blazen OP t1_j4u5jf7 wrote on January 18, 2023 at 7:16 AM

Reply to comment by hayder978 in [D] Speaker diarization: reusing fitted speaker embedding clusters? by 2blazen

With my RTX 3060 it takes 3m50s to diarize 1 hour and 20m to do 3 hours (this can be reduced to 16m by presetting the number of speakers; I didn't check a 1h segment this way, and keep in mind that loading the models into VRAM also takes time). However, 5-hour episodes keep getting my process killed after around 40m. It's probably a memory issue, and it could even happen during segmentation, but reusing clusters is a commonly raised issue on GitHub, so it wouldn't just be for my use case
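
For comparison across lengths, the timings above can be turned into real-time factors (processing time divided by audio duration). This is just a sketch over the three measurements quoted; `real_time_factor` and the labels are names I made up, and the figures include model loading time.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """Processing time divided by audio duration; lower is faster."""
    return processing_seconds / audio_seconds


# (processing seconds, audio seconds) taken from the timings quoted above.
runs = {
    "1 h audio": (3 * 60 + 50, 1 * 3600),
    "3 h audio": (20 * 60, 3 * 3600),
    "3 h audio, speaker count preset": (16 * 60, 3 * 3600),
}

rtf = {label: real_time_factor(p, a) for label, (p, a) in runs.items()}

for label, value in rtf.items():
    print(f"{label}: RTF = {value:.3f}")
```

That works out to about 0.064 for 1 hour versus 0.111 for 3 hours (0.089 with the speaker count preset), so the cost per hour of audio goes up with episode length rather than staying flat.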

2blazen t1_j0zq28h wrote on December 20, 2022 at 4:52 PM

Reply to comment by RageOnGoneDo in Sarcasm Detection model [R]. by Business-Ad6451

To be fair, ChatGPT very confidently bullshits about everything, even about 2+2 being equal to 3. But I agree, AI being able to detect sarcasm shouldn't be far away, however, it definitely won't be solved by BERT

2blazen t1_j0ym9sj wrote on December 20, 2022 at 11:22 AM

Reply to comment by Business-Ad6451 in Sarcasm Detection model [R]. by Business-Ad6451

I think he's just a tiny bit skeptical considering how that's like the biggest challenge of NLP. Probably thousands of people tried it already, but even GPT3 doesn't seem to ace sarcasm yet