limpbizkit4prez t1_jclabza wrote
What's the value in using this instead of "jax.numpy as np"?
limpbizkit4prez t1_jai7l96 wrote
Reply to comment by _Arsenie_Boca_ in [R] EvoPrompting: Language models can create novel and effective deep neural architectures. These architectures are also able to outperform those designed by human experts (with few-shot prompting) by MysteryInc152
It matters because the authors just keep increasing model capacity to do better on a single task, and that's it. That scaling strategy was also chosen by the authors, not the LLM. It would be way cooler if they constrained the problem to roughly the same number of parameters and showed generalization across multiple tasks. Again, it's neat, just not innovative or sexy.
limpbizkit4prez t1_jahhmhd wrote
Reply to comment by MysteryInc152 in [R] EvoPrompting: Language models can create novel and effective deep neural architectures. These architectures are also able to outperform those designed by human experts (with few-shot prompting) by MysteryInc152
Lol, I strongly disagree. There are already methods out there that provide architecture design. This is a "that's neat" type of project, but I'd be really disappointed to see this anywhere other than arxiv.
limpbizkit4prez t1_jahaq8v wrote
Reply to [R] EvoPrompting: Language models can create novel and effective deep neural architectures. These architectures are also able to outperform those designed by human experts (with few-shot prompting) by MysteryInc152
The authors kept increasing model size until the model overfit the task. I'm not sure that's high impact. It's cool and everything, but overfitting a dataset is never really valuable.
limpbizkit4prez t1_j9h3nbm wrote
Reply to comment by blueSGL in [R] ChatGPT for Robotics: Design Principles and Model Abilities by CheapBreakfast9
If you don't know how to code, then regardless of how you interface it's going to be difficult to execute. If you do know how to code, then you'll probably want better encapsulation. I guess what I'm most curious about is whether the code examples they give in the paper can actually be run. Are those libraries really that easy to use?
limpbizkit4prez t1_j9gmme3 wrote
If there are existing APIs that make these tasks so simple, what's the point of using ChatGPT? Why not just write the 5-10 lines of code?
limpbizkit4prez t1_j9644lp wrote
Reply to comment by [deleted] in [D] Lack of influence in modern AI by I_like_sources
What is your deal? Why are you being such a dick to everyone? It seems like you just want to yell at people, not have a discussion.
limpbizkit4prez t1_j5t7cl5 wrote
Reply to comment by NadaBrothers in [R] Easiest way to train RNN's in MATLAB or Julia? by NadaBrothers
Ok, yeah that's what I was thinking. That totally makes sense. Good luck!
limpbizkit4prez t1_j5rmvr6 wrote
I know you said you're interested in MATLAB or Julia, but I'm curious why not a Python library? A simple Google search turns up lots of PyTorch HFO solutions.
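Just to show what I mean by simple, a bare-bones PyTorch RNN is only a handful of lines. This is a rough sketch with made-up sizes, not an HFO-specific model:

```python
import torch
import torch.nn as nn

class SimpleRNN(nn.Module):
    """Minimal RNN classifier; all dimensions are placeholder values."""
    def __init__(self, input_size=16, hidden_size=32, num_classes=2):
        super().__init__()
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        # x: (batch, seq_len, input_size)
        _, h = self.rnn(x)            # h: (1, batch, hidden_size)
        return self.fc(h.squeeze(0))

model = SimpleRNN()
x = torch.randn(8, 100, 16)                      # fake batch of 8 sequences
targets = torch.randint(0, 2, (8,))              # fake labels
loss = nn.CrossEntropyLoss()(model(x), targets)
loss.backward()
```

Swap the fake tensors for a real dataloader and add an optimizer and you're most of the way there.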
limpbizkit4prez t1_j4sarps wrote
Reply to [P] RWKV 14B Language Model & ChatRWKV : pure RNN (attention-free), scalable and parallelizable like Transformers by bo_peng
What does RWKV stand for?
limpbizkit4prez t1_iuy9ivn wrote
Reply to comment by ojiber in [P] How to reverse engineer a neural network to get inputs from the outputs by ojiber
I think this is it, but if not I'm guessing your googling skillz can take it from there: https://arxiv.org/abs/1711.01768
limpbizkit4prez t1_iux01hn wrote
There was a paper I read a few years ago about a group of researchers estimating the architecture and parameters of a NN just by querying it a bunch. If I get the time I'll try to find and share it. I know it's not exactly what you're looking for, but it might be a step in the right direction.
limpbizkit4prez t1_iud9604 wrote
Reply to comment by harishprab in [R] Open source inference acceleration library - voltaML by harishprab
Oh wow, I have no idea how I missed the other parts of the readme that show other types of applications. Do you plan on showing any benchmarks against other frameworks?
limpbizkit4prez t1_iu965oz wrote
Do you have any benchmarks against other frameworks? And have you benchmarked other types of models, or are you doing something specific for NLP?
limpbizkit4prez t1_itnoeqq wrote
I've always applied an annealing schedule like that to LMs. Imo, it works incredibly well and generalizes nicely.
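For reference, what I do is usually something simple like cosine annealing in PyTorch. This is just a sketch, assuming a schedule along those lines; the model, optimizer, and hyperparameters here are placeholders:

```python
import torch

model = torch.nn.Linear(768, 768)        # stand-in for the actual LM
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10_000)

for step in range(10_000):
    # ... forward pass and loss.backward() on an LM batch would go here ...
    optimizer.step()
    scheduler.step()   # anneal the learning rate toward zero over training
```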
limpbizkit4prez t1_jcld70o wrote
Reply to comment by r_linux_mod_isahoe in [N] Jumpy 1.0 has now been released by the Farama Foundation by jkterry1
Yeah, but if I have a code base written in numpy and want to use jax, wouldn't I need to do the same amount of refactoring to integrate this as I would with regular jax? Are there a lot of functions in numpy that don't exist in jax.numpy?
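To be concrete about the refactoring I mean, the swap looks roughly like this (quick untested sketch; `normalize` is just a made-up example function):

```python
import jax.numpy as jnp   # the numpy version would be: import numpy as np

def normalize(x):
    # same call signature as the numpy version; jnp mirrors most of the np API
    return (x - jnp.mean(x)) / (jnp.std(x) + 1e-8)

x = jnp.arange(10.0)
print(normalize(x))
```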