MrFlufypants t1_jdpt0vm wrote
Reply to [D] Keeping track of ML advancements by Anis_Mekacher
We do a journal series at work. The rule is that every engineer has to present a paper before anyone gets to present a second one. It builds presentation skills and forces us to hear new material, since we all have different preferences.
The big issue is that recently many of the coolest advancements have come from Facebook, OpenAI, and Google, and they are increasingly releasing “reports” instead of “papers”. We’re getting a lot more of “and then they did this incredibly revolutionary thing, but all they’ll say is that they used a ‘model’”. They aren’t giving details because they want to keep their work private. Big bummer.
I also read any papers that make the top of this sub, and I’ll usually read a couple of the best-performing papers from the big conferences.
MrFlufypants t1_j0atiwu wrote
Reply to comment by veb101 in [D] Tensorflow vs. PyTorch Memory Usage by Oceanboi
There are a couple of ways to do it; that’s the one I normally use. Sometimes it doesn’t work, though, and I can’t quite remember the use case where it failed.
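For reference, a minimal sketch of one of the other ways in TF 2.x: giving TensorFlow a hard per-GPU memory cap through a logical device. The 4096 MB figure is just a placeholder, and it has to run before anything initializes the GPU:

```python
import tensorflow as tf

# Cap TensorFlow at a fixed slice of the first GPU instead of the whole card.
# Must run before any op initializes the GPU, or TF raises a RuntimeError.
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    tf.config.set_logical_device_configuration(
        gpus[0],
        [tf.config.LogicalDeviceConfiguration(memory_limit=4096)]  # MB, placeholder value
    )
```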
MrFlufypants t1_j0ao1ei wrote
Reply to [D] Tensorflow vs. PyTorch Memory Usage by Oceanboi
I’ve had issues where TensorFlow automatically grabs the whole GPU while PyTorch only allocates what the model asks for. That may well not be your problem, but if you’re running multiple models on one card, it could be.
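If that’s what’s going on, a minimal sketch of the usual TF 2.x workaround: switching to on-demand allocation so TensorFlow only grabs what it actually needs (again, this has to run before the GPU is initialized):

```python
import tensorflow as tf

# By default TF reserves (nearly) all GPU memory at startup; memory growth
# makes it allocate on demand instead, closer to PyTorch's behavior.
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)
```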
MrFlufypants t1_iqx6bhc wrote
Reply to comment by ZestyData in [D] Why restrict to using a linear function to represent neurons? by MLNoober
The activation functions are key. A composition of linear layers is itself just a single linear (affine) map, so stacking 10 of them collapses to one layer, which can only represent linear functions. The nonlinear activation functions break that linearity, and that’s the key ingredient.
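A quick NumPy sketch of that collapse, with arbitrary shapes and random weights: two stacked linear layers with no activation in between reduce exactly to a single affine map.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two stacked "linear layers" (weights + biases), no activation in between.
W1, b1 = rng.normal(size=(5, 10)), rng.normal(size=5)
W2, b2 = rng.normal(size=(3, 5)), rng.normal(size=3)

x = rng.normal(size=10)

# Layer-by-layer forward pass.
two_layers = W2 @ (W1 @ x + b1) + b2

# The same computation collapsed into a single affine map:
# W2(W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2)
W, b = W2 @ W1, W2 @ b1 + b2
one_layer = W @ x + b

assert np.allclose(two_layers, one_layer)  # identical: the extra depth bought nothing
```

Insert a nonlinearity like ReLU between the two layers and the equivalence breaks, which is exactly why deep networks need it.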
MrFlufypants t1_jedtiuv wrote
Reply to comment by ReasonablyBadass in [D][N] LAION Launches Petition to Establish an International Publicly Funded Supercomputing Facility for Open Source Large-scale AI Research and its Safety by stringShuffle
My first question too. What’s to stop OpenAI from “partnering with” a small startup they “definitely don’t own”, then handing it the money and S-tier research to monopolize the facility by hitting the priority matrix correctly? Stick said company in Ghana and they can play the third-world card too. And if you make that impossible by sharing access widely, I doubt anybody will get enough of a time share to train a large model. I hope I’m wrong, but I’ve become a bit cynical lately about companies not being greedy bastards.