Screye
Screye t1_jcl549n wrote
Reply to comment by Spiritual-Reply5896 in [D] PyTorch 2.0 Native Flash Attention 32k Context Window by super_deap
Context length also puts a hard limit on how many logical hops the model can make.
If each back-and-forth exchange takes roughly 500 tokens, then the model can only reason over 16 hops within 8k tokens. With 32k tokens, it can reason over 64 hops. This might allow emergent behavior on tasks previously deemed impossible because they need at least a minimum number of hops to reason through.
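The back-of-the-envelope math above can be sketched as follows (the 500-tokens-per-hop figure is the same rough assumption as in the comment):

```python
# Rough upper bound on reasoning hops for a given context window,
# assuming each back-and-forth exchange costs ~500 tokens.
def max_hops(context_tokens: int, tokens_per_hop: int = 500) -> int:
    return context_tokens // tokens_per_hop

print(max_hops(8_000))   # 16 hops
print(max_hops(32_000))  # 64 hops
```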
For what it's worth, I think memory retrieval will work just fine for 90% of scenarios and will stay relevant even at 32k tokens. Especially if the wiki you are retrieving from is millions of lines long.
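A minimal sketch of why retrieval stays relevant: even a 32k window forces you to rank and filter a large corpus down to a token budget. The word-overlap scoring below is purely illustrative; a real system would use embeddings and a proper tokenizer.

```python
# Toy memory retrieval: rank stored chunks by word overlap with the
# query, then pack the best ones into a fixed token budget.
def score(query: str, chunk: str) -> int:
    # Illustrative relevance score: count of shared words.
    return len(set(query.lower().split()) & set(chunk.lower().split()))

def retrieve(query: str, chunks: list[str], token_budget: int = 32_000) -> list[str]:
    ranked = sorted(chunks, key=lambda c: score(query, c), reverse=True)
    picked, used = [], 0
    for c in ranked:
        cost = len(c.split())  # crude stand-in for a token count
        if used + cost <= token_budget:
            picked.append(c)
            used += cost
    return picked
```

Even a "wiki" of millions of lines reduces to whatever survives this ranking step, regardless of how big the context window gets.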
Screye t1_j6tu8mc wrote
> in ten years?
10 years ago was 2012. Deep learning didn't even exist as a field back then.
Tempting as it might be, I'd recommend caution in predicting the future of a field that went from non-existence to near-dominance within its profession in the last 10 years.
Screye t1_j6688sf wrote
Reply to [D] MusicLM: Generating Music From Text by carlthome
I am done, man. How is someone supposed to keep up with this pace of research?
Screye t1_izauw5w wrote
Reply to comment by Tejas_Garhewal in [D] If you had to pick 10-20 significant papers that summarize the research trajectory of AI from the past 100 years what would they be by versaceblues
He is the UIUC of deep learning's Mount Rushmore.
Just as people think of Stanford, MIT, CMU, and Berkeley as the big CS universities and forget that UIUC is almost just as good, people take the names of Hinton, LeCun, and Bengio and forget that Schmidhuber('s lab) did a lot of important foundational work in deep learning.
Sadly, he is a curmudgeon who complains a lot and claims even more than he has actually achieved, so people have kind of soured on him lately.
Screye t1_ivzs8xs wrote
Reply to comment by bumbo-pa in [D] Current Job Market in ML by diffusion-xgb
AFAIK, there aren't a lot of Series A or seed rounds happening.
But pre-established startups like Jasper are getting funded because premier investors have already invested a ton into them. In for a penny, in for a pound.
Screye t1_ivzcjb3 wrote
Reply to [D] Current Job Market in ML by diffusion-xgb
Big companies are firing more non-essential members of the team. Also, research & unreliable money makers get cut first.
So it makes sense that SWEs don't get fired, because they maintain the systems. On the other hand, a lot of AI products are not making a shit ton of money just yet, the research costs are very high, and AI scientists don't usually do the job of maintaining an AI service.
So they get fired before SWEs do.
Now, in a downturn, cost cutting takes major priority.
- AI tools allow expensive humans to be replaced with cheaper algorithms
- 3rd party startups can sell their AI toolkit for lower prices than Azure AI / Google AI
- If you didn't expect the startup to make money for 3-5 years anyway, then the market conditions don't really matter that much
- All other startup industries are in the dumpster. Gig-economy startups burn too much money. End users stop using convenience-based startups in times of high inflation. And don't even get me started on crypto. So really, health-tech and ML are the only 2 startup sectors where it still makes some sense to invest.
Those 4 things have made it a rather decent time to be in an ML startup, but not so great time to be in ML at a bigtech company.
Screye t1_jcmpd5i wrote
Reply to comment by VarietyElderberry in [D] PyTorch 2.0 Native Flash Attention 32k Context Window by super_deap
This is derived more from extensive personal experience with prompt engineering / fine-tuning over the last 2 years.
Simply put:
Idk if that makes sense. Our field keeps moving away from math, and as embarrassing as it is to anthropomorphize the model, it does make it easier to get the point across.