2blazen t1_je69rl1 wrote
I read about this tool on this sub, and it looks like what you're looking for: https://lm-code-binder.github.io/
2blazen t1_jb1jk5h wrote
Reply to comment by Quazar_omega in [P] LazyShell - GPT based autocomplete for zsh by rumovoice
And lazy
Like come on at least have a landing page
2blazen t1_jazyryq wrote
Reply to comment by rpnewc in [D] The Sentences Computers Can't Understand, But Humans Can by New_Computer3619
Do you think an LLM can be taught to recognize when a question would require advanced reasoning to answer, or is it inherently impossible?
2blazen t1_jadrdgu wrote
Reply to comment by bluebolt789 in [Discussion] Can you use a model trained on tweets/product reviews to do sentiment analysis on IT support tickets? by [deleted]
I think what he means is your question is beneath the sub's standards lol
You may have more luck googling specifically for cross-domain sentiment analysis, asking ChatGPT, or posting it on r/MLQuestions or r/learndatascience
2blazen t1_j8r3le4 wrote
You'd want to find a more in-depth topic for a master's thesis; Reddit scraping and sentiment analysis sounds more like an assignment. Ask your supervisor if they have a topic they're researching and whether you can join. Look around to see if your university has example projects or, even better, open projects. Look through past years' theses to see if you can continue working on any of them (hint: the future work section). Once you find a topic you're interested in that is niche enough, it will still be too broad, so you have to narrow it down to research questions, which means doing in-depth research into the challenges of the topic.
Don't panic, there are many topics that need research. I'm starting my thesis in audio processing - health AI / speaker embeddings / impaired speech / diagnosis assistance - and it's the wild west over here, partly because the data isn't publicly accessible
2blazen t1_j8i5fyx wrote
Reply to comment by NoLifeGamer2 in [Discussion] The need for noise in stable diffusion by AdministrationOk2735
That was my understanding as well, noise ensures "randomness"
2blazen t1_j8378vr wrote
Reply to comment by goj-145 in [D] Is it legal to use images or videos with copyright to train a model? by Tlaloc-Es
So you're saying Stability wouldn't have issues if they hired an intern to git clone a watermark remover and put the images through it first?
2blazen t1_j7kc4t6 wrote
Definitely Python, that's what all major companies support too. However, it's not the bytecode cache that makes the difference but the fact that machine learning libraries are written in C++, so you're not sacrificing performance by scripting in Python.
These kinds of questions are more suitable for r/learndatascience though
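To illustrate the point (a rough sketch, using NumPy as a stand-in for any ML library with a compiled backend): the heavy numeric work is delegated to C/C++ code, so the Python layer adds little overhead.

```python
import time
import numpy as np

n = 1_000_000
xs = list(range(n))
arr = np.arange(n, dtype=np.int64)

# Pure-Python loop: every multiply/add goes through the interpreter
t0 = time.perf_counter()
slow = sum(x * x for x in xs)
t1 = time.perf_counter()

# Same computation delegated to NumPy's compiled C code
fast = int(np.dot(arr, arr))
t2 = time.perf_counter()

assert slow == fast
print(f"pure Python: {t1 - t0:.3f}s, numpy: {t2 - t1:.3f}s")
```

On a typical machine the NumPy call is orders of magnitude faster, even though both are "written in Python".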
2blazen t1_j70vh2g wrote
Reply to comment by CowardlyVelociraptor in [N] OpenAI starts selling subscriptions to its ChatGPT bot by bikeskata
Might be just me, but I really hate how the reply is returned in the UI. Even if the subscription solves the random interruptions during generation, the word-by-word printing kills me; I'd rather wait a bit but receive my answer in one piece
2blazen t1_j70ux9o wrote
Reply to comment by arhetorical in [N] OpenAI starts selling subscriptions to its ChatGPT bot by bikeskata
I thought so too, but I haven't actually noticed any difference, other than that the davinci models don't have the extensive content filters.
>if you use it for work, $20 is negligible
If my company pays for it, sure, otherwise I'll always prefer the request-based pricing with a nice API that I can just call from my terminal
2blazen t1_j6yluho wrote
Reply to comment by TrevorIRL in [N] OpenAI starts selling subscriptions to its ChatGPT bot by bikeskata
>that’s some pretty amazing margins
That's just the (estimated) hardware uptime cost; that doesn't account for wages or the R&D investment
2blazen t1_j6ykrcq wrote
Reply to comment by arhetorical in [N] OpenAI starts selling subscriptions to its ChatGPT bot by bikeskata
I've been using the GPT-3 API for around 0.4c per request with zero downtime. With my current usage this sums up to around 10c a day, or about $3 per month. I don't see how $20 is reasonable
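As a back-of-envelope sketch of that comparison (the per-request cost and daily request count are my own usage estimates, not official figures):

```python
# Rough API-vs-subscription cost comparison
cost_per_request = 0.004   # ~0.4 cents per GPT-3 API call (my estimate)
requests_per_day = 25      # my approximate daily usage (~10c/day)
days_per_month = 30

api_monthly = cost_per_request * requests_per_day * days_per_month
subscription_monthly = 20.00

print(f"API: ${api_monthly:.2f}/mo vs subscription: ${subscription_monthly:.2f}/mo")
```

At this usage level the pay-per-request route comes out to roughly $3/month, a fraction of the flat subscription.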
2blazen OP t1_j4u5jf7 wrote
Reply to comment by hayder978 in [D] Speaker diarization: reusing fitted speaker embedding clusters? by 2blazen
With my RTX 3060 it takes 3m50s to diarize 1 hour of audio and 20m for 3 hours (this can be reduced to 16m by presetting the number of speakers - I didn't test a 1h segment that way; also keep in mind it takes time to load the models into VRAM). However, 5-hour episodes keep getting my process killed after around 40m. It's probably a memory issue, and it could even happen during segmentation, but reusing clusters is a common request on GitHub, so it wouldn't just be for my use case
Submitted by 2blazen t3_10bsef1 in MachineLearning
2blazen t1_j0zq28h wrote
Reply to comment by RageOnGoneDo in Sarcasm Detection model [R]. by Business-Ad6451
To be fair, ChatGPT very confidently bullshits about everything, even about 2+2 being equal to 3. But I agree, AI being able to detect sarcasm shouldn't be far away; however, it definitely won't be solved by BERT
2blazen t1_j0ym9sj wrote
Reply to comment by Business-Ad6451 in Sarcasm Detection model [R]. by Business-Ad6451
I think he's just a tiny bit skeptical, considering that's arguably the biggest challenge in NLP. Thousands of people have probably tried it already, but even GPT-3 doesn't seem to ace sarcasm yet
2blazen t1_je6axjp wrote
Reply to comment by gigglegenius in [D] With ML tools progressing so fast, what are some ways you've taken advantage of them personally? by RedditLovingSun
>- Creative brainstorming for professional work
I struggle with this. I was trying to get it to help me come up with interesting thesis research questions in a very specific audio ML field, but it failed to come up with anything original, and I don't know if there's a certain way I should have phrased my questions or if it's just a creative limitation