trendymoniker
trendymoniker t1_izckgr1 wrote
Reply to [D] If you had to pick 10-20 significant papers that summarize the research trajectory of AI from the past 100 years what would they be by versaceblues
Don’t forget Latent Dirichlet Allocation by Blei, Ng, and Jordan (2003). Deep learning has far surpassed probabilistic models for its simple scalability, but they were ascendant throughout the 2000s, with LDA being probably the most impressively complex yet practical of the lot.
Plus: 45,000 citations earned mostly in the era before machine learning was everywhere and every thing.
trendymoniker t1_iz6ruwq wrote
Reply to [D] Can AI Music Tools Compete with Artists? by Oblipher
No. But just look two papers down the line …
trendymoniker t1_ivf84sd wrote
Reply to comment by IndieAIResearcher in [D] Do you think there is a competitive future for smaller, locally trained/served models? by naequs
Easy answer is distillations like EfficientNet or DistilBERT. You can also get an intuition for the process by taking a small, easy dataset — like MNIST or CIFAR — and running a big hyperparameter search over models. There will be small models which perform close to the best models.
These days nobody uses ResNet or Inception but there was a time they were the bleeding edge. Now it’s all smaller more precise stuff.
The other dimension where you can win over big models is hardcoding in your priors.
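To make the distillation idea concrete, here's a minimal sketch of the soft-target loss a student model trains on: the KL divergence between the teacher's temperature-softened outputs and the student's. All names and the temperature value are illustrative, not from any particular paper's setup.

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; higher T softens the distribution.
    z = np.asarray(logits, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # KL divergence KL(teacher || student) on softened outputs — the
    # "dark knowledge" signal; scaled by T^2 to keep gradient magnitudes
    # comparable across temperatures.
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(np.sum(p * (np.log(p) - np.log(q)))) * T * T

teacher = [3.0, 1.0, 0.2]
print(distillation_loss(teacher, teacher))  # 0.0 — matching logits pay no loss
print(distillation_loss([0.0, 0.0, 0.0], teacher))  # > 0
```

A small student trained against these soft targets often recovers most of the teacher's accuracy at a fraction of the size.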
trendymoniker t1_ivf3lp5 wrote
Reply to [D] Are Intelligent Systems and Calculus 3 worth doing under these circumstances for someone who wants a career in A.I.? by uluzg
You absolutely need calc 3 and linear algebra for AI. Backprop is nothing but partial derivatives + ordered bookkeeping. And matrix math is the computational heart of neural networks.
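"Partial derivatives + ordered bookkeeping" can be shown in a few lines. Here's a toy one-hidden-unit network where backprop is just the chain rule applied by hand (names and values are made up for illustration); you can sanity-check the gradients against finite differences.

```python
import math

def forward(x, w1, w2):
    # y = w2 * tanh(w1 * x)
    h = math.tanh(w1 * x)
    return w2 * h

def grads(x, w1, w2):
    # Backprop = chain rule, bookkeeping the intermediates.
    h = math.tanh(w1 * x)
    dy_dw2 = h                    # ∂y/∂w2
    dh_dw1 = (1.0 - h * h) * x    # tanh' = 1 - tanh^2
    dy_dw1 = w2 * dh_dw1          # ∂y/∂w1 via the chain rule
    return dy_dw1, dy_dw2
```

Autodiff frameworks do exactly this, just over millions of parameters with the bookkeeping automated.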
trendymoniker t1_ivf0kjq wrote
Reply to [D] Do you think there is a competitive future for smaller, locally trained/served models? by naequs
I think you’ve got the right idea. It makes sense that big companies are developing and pushing big models. They’ve got the resources to train them. But you can often get a lot done with a much smaller, boutique model — that’s one of the next frontiers.
trendymoniker t1_iunwsjv wrote
It all comes down to what you’re optimizing for. If you want to optimize for likes, count likes. If you want to optimize for shares or watch time, count those. If you’re interested in some sort of omnibus “popularity”, count whatever weighted sum of those makes sense.
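The weighted-sum version is a one-liner. Here's a sketch with made-up weights (the coefficients are purely hypothetical — in practice you'd fit or tune them for your platform):

```python
def popularity(likes, shares, watch_seconds,
               w_likes=1.0, w_shares=5.0, w_watch=0.01):
    # Hypothetical weights: shares are rarer and stronger signals than
    # likes, so they get a bigger coefficient; watch time is per second.
    return w_likes * likes + w_shares * shares + w_watch * watch_seconds

print(popularity(likes=120, shares=8, watch_seconds=3600))  # 120 + 40 + 36 = 196.0
```

The point isn't the specific weights — it's that you have to pick *some* explicit definition of "popular" before you can optimize for it.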
trendymoniker t1_iswaasf wrote
What about full time?
trendymoniker t1_irl6lhy wrote
Reply to [D] How can a prospective PhD applicant gain ML-related research experience beforehand? by pumpkinsmasher76
What are you interested in?
trendymoniker t1_ir3ofzx wrote
Reply to [R] Self-Programming Artificial Intelligence Using Code-Generating Language Models by Ash3nBlue
Do you want Skynet!? Cause this is how you get Skynet!
trendymoniker t1_iqsl2og wrote
Reply to comment by Even_Information4853 in [D] Types of Machine Learning Papers by Lost-Parfait568
Bengio - Daphne Koller - Fei Fei Li?
Howard - Jeff Dean - ??
trendymoniker t1_j0acn6e wrote
Reply to comment by Far-Butterscotch-436 in [D] Dealing with extremely imbalanced dataset by hopedallas
👆
1e6:1 is extreme. 1e3:1 is often realistic (think views to shares on social media). 18:1 is actually a pretty good real-world ratio.
If it were me, I’d just change the weights for each class in the loss function to get them more or less equal.
190m examples isn’t that many either — don’t worry about it. Compute is cheap — it’s ok if it takes more than one machine and/or more time.
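Reweighting the loss is a few lines. Here's a minimal numpy sketch using inverse-frequency class weights on an 18:1 split (function names are mine, and most frameworks have this built in, e.g. a class-weights argument on the loss):

```python
import numpy as np

def class_weights(labels):
    # Inverse-frequency weights, normalized so the rare class contributes
    # about as much total loss as the common one.
    counts = np.bincount(labels)
    return counts.sum() / (len(counts) * counts)

def weighted_ce(probs, labels, weights):
    # Per-example cross-entropy, scaled by each example's class weight.
    p = probs[np.arange(len(labels)), labels]
    return float(np.mean(weights[labels] * -np.log(p)))

# 18:1 imbalance: 18 negatives, 1 positive.
labels = np.array([0] * 18 + [1])
w = class_weights(labels)
print(w)  # rare class gets ~18x the weight of the common one
```

With this scheme the per-example weights sum back to the dataset size, so the overall loss scale stays comparable to the unweighted case.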