Craksy
Craksy t1_jdzbgzj wrote
Reply to comment by kalakau in [D] FOMO on the rapid pace of LLMs by 00001746
Not at all.
While it doesn't make a world of difference to the point I was trying to make, it does change the meaning quite a bit.
Thank you for the correction.
Craksy t1_jdywiwi wrote
Reply to comment by antonivs in [D] FOMO on the rapid pace of LLMs by 00001746
Well, that doesn't really contradict the previous comment. They did mention fine-tuning as an exception. GPT even stands for Generative Pre-trained Transformer. I'm sure some people like to draw hard lines between transfer learning, specialisation, and fine-tuning (a different task vs. just different data), but at any rate, what you're describing can hardly be considered "training from scratch".
Indeed, very few will need to train models at that scale. In fact, that was the whole motivation behind GPT: training LLMs from scratch consumes a tremendous amount of resources, and 99% of that work goes into building a foundation that happens to generalize very well across many different tasks.
Craksy t1_j1alea2 wrote
Reply to comment by ktpr in [D] Different types of pooling in Neural Nets by Difficult-Race-1188
I think that's something mods should handle on a post-to-post basis. Scrolling through the comments here for instance, people don't seem to mind. If the community is interacting with the content and it can bring people together and spark interesting discussion, then it's a positive contribution in my book.
If it should get out of hand and the sub starts getting flooded with low-effort content and self-promos, then it might make sense to ban it or restrict it to "Medium Mondays" or w/e. It's definitely something to watch out for. Lots of tech-related subs just turn into link dumps that people use to promote their blogs.
Anyway, I don't think we're anywhere near that point. The content here is generally pretty high quality; I don't know if that's due to the community or good moderation.
I just personally dislike Medium. The content is generally low effort, and it bothers me that lately the first two pages of search engine results are mostly behind a paywall.
Craksy t1_j195my5 wrote
Reply to comment by ktpr in [D] Different types of pooling in Neural Nets by Difficult-Race-1188
At this point, whenever I see a Medium link, I kind of just expect it to be some half-assed content meant to get search engine coverage or w/e.
Granted, I did not read the article. It's entirely possible that this is the exception to the rule, and OP really just wanted to share a good read.
But the answer to your question may very well be "because it was never meant to educate or inform; it was made just to drive clicks".
It's honestly like the OnlyFans of tech news. "Here's a little tease. For only $9.99 a month, you get exclusive access to my entire disappointfolio"
Craksy t1_iz7b0ol wrote
Reply to comment by SeaMeasurement9 in [D] What Image Labelling Services Allow Labelling of NSFW Images? by AviatorPrints
Yeah, I would also be willing to watch po... ehm, label data for you.
In the name of science.
Seriously though, I wonder how the NSFW filters on SD etc. work, and how much hand-labeled adult material had to be gathered before the process could mostly be automated.
I'm kind of amused by the idea of researchers sitting in their office, all sciency, browsing pornhub and taking notes.
Craksy t1_iypi2h6 wrote
Reply to comment by Desperate-Whereas50 in [D] PyTorch 2.0 Announcement by joshadel
I'm no expert either, but you're right that using CUDA requires use of unsafe. I believe kernels are even written in C through macros.
However, using unsafe does not necessarily mean UB; you want to avoid UB regardless. And UB is not the only thing a compiler can optimize around. Unsafe code simply means that you are responsible for upholding memory safety, not that memory safety gets ignored.
I don't know, you're talking about UB as if it were a feature and not an unfortunate development of compilers over the years.
In fact, Rust made it very clear that if you rely on UB, that's your pain. Don't come crying in a week when your shit breaks under a new compiler release. No guarantees are made, and no consideration is given to maintaining compatibility with programs that make up their own rules.
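To make that distinction concrete, here's a minimal sketch (a toy example of my own, not anything from the PyTorch codebase): the unsafe block below is perfectly sound, because the invariants the compiler can't verify are upheld by hand. UB would only enter the picture if they weren't.

```rust
fn main() {
    let values = [10u32, 20, 30];
    let ptr = values.as_ptr();

    // SAFETY: `ptr` points into `values`, which is live for this whole scope,
    // and index 1 is in bounds, so this raw read is sound. No UB involved.
    let second = unsafe { *ptr.add(1) };
    assert_eq!(second, 20);

    // By contrast, `unsafe { *ptr.add(3) }` would read out of bounds. That IS
    // UB, and the compiler makes no promises about what such a program does,
    // today or under any future release.
}
```

In other words, unsafe shifts the burden of proof onto you; it doesn't switch the rules off.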
Craksy t1_iuo8wnw wrote
Reply to comment by imnitwit in This data is not beautiful: Iranian regime's recent murders [OC] by imnitwit
Whoa, don't do that to me! Now I just feel like a dick.
Honestly, I was just having myself a bit of a grumpy grandpa moment. Although that is my opinion, I wasn't being very nice, and I could have expressed myself differently.
Again, presentation doesn't have to mean pretty. To me it's just about attention to the actual data, more than what you can conclude or learn from it.
It's perfectly fine to have a message, but consider if there are clever ways to present your data to express it more clearly, or if you can combine different types of visualization to convey more information from the same data. Stuff like that.
...Or just make it fancy. That works for me too!
Anyway, don't apologize. You didn't do anything wrong.
Craksy t1_iunf0ig wrote
Reply to comment by arash2003 in This data is not beautiful: Iranian regime's recent murders [OC] by imnitwit
I'm not saying it has to be pretty. I'm just saying I care about presentation, and I believe that is the spirit of this sub.
Presentation can also mean applying some technique to make a relationship more clear, or picking just the right kind of chart for your particular dataset.
But some lazy-ass, dead-boring bar chart? Fuck off.
Craksy t1_iun4jiu wrote
I hate to say it, but you're right. In every sense. I'm sure there are subs for posting data just for the sake of data.
I follow this sub for the presentation. I don't care how interesting or thought-provoking your numbers are: if you didn't give a single fuck about the visuals and just took some default template from Excel, your post doesn't belong here.
Craksy t1_je3tzt3 wrote
Reply to comment by antonivs in [D] FOMO on the rapid pace of LLMs by 00001746
Aah, got you. My bad. Well, I suppose most people mainly think of NLP in these kinds of contexts. That's where my mind went, anyway.
Training from scratch on a DSL is indeed an entirely different scale of problem (assuming it's not some enormous, complex DSL that relies heavily on context and thousands of years of culture to make sense of).
Sounds very interesting, though. If you're allowed to share more information, I'd love to hear about it.