starfries t1_jdz0f1p wrote
Reply to comment by ReasonablyBadass in [D] FOMO on the rapid pace of LLMs by 00001746
Huh, I could have sworn it was a lot older.
starfries t1_jdyz458 wrote
Reply to comment by sdmat in [D] FOMO on the rapid pace of LLMs by 00001746
No, I mean you don't need anything special or to follow a conventional path.
starfries t1_jdyx0xh wrote
Reply to comment by nxqv in [D] FOMO on the rapid pace of LLMs by 00001746
I feel like Eliezer Yudkowsky proves that everyone can be Eliezer Yudkowsky, going from a crazy guy with a Harry Potter fanfic and a blog to being mentioned in your post alongside those other two names.
starfries t1_j9v9x20 wrote
Reply to [D] Got invited to an ML final interview - have zero statistics/math background by [deleted]
Of course you should go. And depending on the company and the role they actually have in mind, that amount of math background could be plenty.
starfries t1_j9ufa2h wrote
Reply to comment by Mefaso in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt
Unfortunately there are just too many crackpots in that space. It's like bringing up GUTs (grand unified theories) in physics - a worthwhile goal, but you're sharing the bus with too many crazies.
starfries t1_j8gcrzo wrote
Reply to comment by mindmech in [D] Quality of posts in this sub going down by MurlocXYZ
Me too. There are a lot of great people I want to hear from, but only when they post about ML, not politics.
starfries t1_j8fp30e wrote
Reply to comment by tysam_and_co in [D] Quality of posts in this sub going down by MurlocXYZ
What's the current understanding of why/when batch norm works? I haven't kept up with the literature, but I had the impression there was no real consensus.
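For anyone following along, this is the operation in question (a minimal numpy sketch of training-time batch norm; gamma, beta, and eps are the usual learnable scale/shift and numerical stabilizer):

    import numpy as np

    def batch_norm(x, gamma, beta, eps=1e-5):
        # x has shape (batch, features); normalize each feature
        # using statistics computed over the current batch
        mean = x.mean(axis=0)
        var = x.var(axis=0)
        x_hat = (x - mean) / np.sqrt(var + eps)
        # learnable scale and shift restore representational capacity
        return gamma * x_hat + beta

The mechanics are simple; the debate is over why it helps optimization - the original internal covariate shift story, loss-landscape smoothing, and implicit regularization have all been proposed.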
starfries t1_j87ypnt wrote
Reply to comment by A_Light_Spark in [P] Introducing arxivGPT: chrome extension that summarizes arxived research papers using chatGPT by _sshin_
Maybe it's a difference in fields. I rarely see people do meta-analysis in ML so it didn't strike me as odd. Most of the reviews are just "here's what people are trying" with some attempt at categorization. But I see what you mean now, it makes sense that having a meta-analysis is important in medical fields where you want to aggregate studies.
starfries t1_j87r1js wrote
Reply to comment by A_Light_Spark in [P] Introducing arxivGPT: chrome extension that summarizes arxived research papers using chatGPT by _sshin_
I have definitely seen the kind of papers you're talking about, but this one seems fine to me? Granted, I skimmed it really quickly, but the title says it's a review article and the abstract reflects that.
As an aside: I really like the format I see in bio fields (and maybe others, but this is where I've encountered it) of putting the results before the detailed methodology. It doesn't always make sense for a lot of CS papers, where the results are the most boring part (essentially being "it works better"), but where it does, it leads to a much better paper in my opinion.
starfries t1_j6l0aeq wrote
Reply to comment by anony_sci_guy in [R] SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot by Secure-Technology-78
Thanks for that resource - I've been experimenting with the lottery ticket method, but that's a lot of papers I haven't seen! Did you initialize the weights as if training from scratch, or did you do something like trying to match the variance of the old and new weights? I'm intrigued that your method didn't hurt performance - most of the things I've tested were detrimental to the network. I have seen some performance improvements under different conditions, but I'm still trying to rule out any confounding factors.
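To make the two options concrete, here's a rough sketch of what I mean (PyTorch-style; the variance-matching variant is my own guess at your setup, not something from a paper):

    import torch

    def reinit_surviving_weights(weight, mask, init_weight=None):
        # mask is 1 where a weight survives pruning, 0 where it's removed
        if init_weight is not None:
            # lottery-ticket style: rewind survivors to their original
            # initialization values
            new_w = init_weight.clone()
        else:
            # variance-matching style: redraw survivors from a normal
            # distribution scaled to the surviving weights' std
            std = weight[mask.bool()].std()
            new_w = torch.randn_like(weight) * std
        return new_w * mask

Either way the mask stays fixed; the question is just where the surviving weights start from.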
starfries t1_j6ec8f7 wrote
Reply to comment by Narfi1 in [Pro/Chef] Orange Cheesecake on Cookie Crumble soil by So6oring
Holy shit, that's incredible.
starfries t1_j64qhqa wrote
Reply to comment by anony_sci_guy in [R] SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot by Secure-Technology-78
Can you elaborate on this? I'm trying something similar, so I'm curious what your results were and if you ran across any literature about this idea.
starfries t1_j63pp6f wrote
Reply to comment by Vivid-Ad6077 in [Discussion] Github like alternative for ML? by angkhandelwal749
wandb is great, but I had no idea it also versioned code - I'm still using git for that.
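For anyone else who missed this, the relevant call is apparently log_code (a minimal sketch; the project name is made up, and I'd double-check the wandb docs for the exact API):

    import wandb

    # start a run as usual; metrics and config get tracked automatically
    run = wandb.init(project="my-project")

    # snapshot the source files in the current directory with this run,
    # so every run is tied to the exact code that produced it
    run.log_code(".")

That still doesn't replace git for branching and review, but it does pin each experiment to a code state.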
starfries t1_j5uhnv9 wrote
Reply to comment by bitchslayer78 in [D]Are there any known AI systems today that are significantly more advanced than chatGPT ? by Xeiristotle
Wait, has someone actually integrated Wolfram with ChatGPT? I thought it was still in the "would be cool" stage.
starfries t1_iw90r3p wrote
Reply to comment by PredictorX1 in [R] ZerO Initialization: Initializing Neural Networks with only Zeros and Ones by hardmaru
What is that? I can't find a copy online.
starfries t1_jdz0q2b wrote
Reply to comment by sdmat in [D] FOMO on the rapid pace of LLMs by 00001746
That's not what I meant, so no offense taken.