limpbizkit4prez t1_jclabza wrote
What's the value in using this instead of "jax.numpy as np"?
limpbizkit4prez t1_jai7l96 wrote
Reply to comment by _Arsenie_Boca_ in [R] EvoPrompting: Language models can create novel and effective deep neural architectures. These architectures are also able to outperform those designed by human experts (with few-shot prompting) by MysteryInc152
It matters because the authors just keep increasing model capacity to do better on a single task, and that's it. That scaling strategy was also chosen by the authors, not the LLM. It would be way cooler if they constrained the problem to roughly the same number of parameters and showed generalization across multiple tasks. Again, it's neat, just not innovative or sexy.
limpbizkit4prez t1_jahhmhd wrote
Reply to comment by MysteryInc152 in [R] EvoPrompting: Language models can create novel and effective deep neural architectures. These architectures are also able to outperform those designed by human experts (with few-shot prompting) by MysteryInc152
Lol, I strongly disagree. There are already methods out there that provide architecture design. This is a "that's neat" type of project, but I'd be really disappointed to see this anywhere other than arxiv.
limpbizkit4prez t1_jahaq8v wrote
Reply to [R] EvoPrompting: Language models can create novel and effective deep neural architectures. These architectures are also able to outperform those designed by human experts (with few-shot prompting) by MysteryInc152
The authors kept increasing model size until the model overfit the task. I'm not sure that's high impact. It's cool and everything, but overfitting a dataset is never really valuable.
limpbizkit4prez t1_j9h3nbm wrote
Reply to comment by blueSGL in [R] ChatGPT for Robotics: Design Principles and Model Abilities by CheapBreakfast9
If you don't know how to code, then regardless of how you interface it's going to be difficult to execute. If you do know how to code, then you'll probably want better encapsulation. I guess what I'm most curious about is whether the code examples they give in the paper can actually be run. Are those libraries really that easy to use?
limpbizkit4prez t1_j9gmme3 wrote
If there are existing APIs that make these tasks so simple, what's the point of using ChatGPT? Why not just write the 5-10 lines of code?
limpbizkit4prez t1_j9644lp wrote
Reply to comment by [deleted] in [D] Lack of influence in modern AI by I_like_sources
What is your deal? Why are you being such a dick to everyone? It seems like you just want to yell at people, not have a discussion.
limpbizkit4prez t1_j5t7cl5 wrote
Reply to comment by NadaBrothers in [R] Easiest way to train RNN's in MATLAB or Julia? by NadaBrothers
Ok, yeah that's what I was thinking. That totally makes sense. Good luck!
limpbizkit4prez t1_j5rmvr6 wrote
I know you said you're interested in MATLAB or Julia, but I'm curious why not a Python library? A simple Google search turns up lots of PyTorch HFO solutions.
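Just to show what I mean by simple, a bare-bones PyTorch RNN is only a handful of lines. This is a rough sketch with made-up sizes, not an HFO-specific model:

```python
import torch
import torch.nn as nn

class SimpleRNN(nn.Module):
    """Minimal RNN classifier; all dimensions are placeholder values."""
    def __init__(self, input_size=16, hidden_size=32, num_classes=2):
        super().__init__()
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        # x: (batch, seq_len, input_size)
        _, h = self.rnn(x)            # h: (1, batch, hidden_size)
        return self.fc(h.squeeze(0))

model = SimpleRNN()
x = torch.randn(8, 100, 16)                      # fake batch of 8 sequences
targets = torch.randint(0, 2, (8,))              # fake labels
loss = nn.CrossEntropyLoss()(model(x), targets)
loss.backward()
```

Swap the fake tensors for a real dataloader and add an optimizer and you're most of the way there.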
limpbizkit4prez t1_j4sarps wrote
Reply to [P] RWKV 14B Language Model & ChatRWKV : pure RNN (attention-free), scalable and parallelizable like Transformers by bo_peng
What does RWKV stand for?
limpbizkit4prez t1_iuy9ivn wrote
Reply to comment by ojiber in [P] How to reverse engineer a neural network to get inputs from the outputs by ojiber
I think this is it, but if not I'm guessing your googling skillz can take it from there: https://arxiv.org/abs/1711.01768
limpbizkit4prez t1_iux01hn wrote
There was a paper I read a few years ago about a group of researchers estimating the architecture and parameters of a NN just by querying it a bunch. If I get the time I'll try to find and share it. I know it's not exactly what you're looking for, but it might be a step in the right direction.
limpbizkit4prez t1_iud9604 wrote
Reply to comment by harishprab in [R] Open source inference acceleration library - voltaML by harishprab
Oh wow, I have no idea how I missed the other parts of the readme that show other types of applications. Do you plan on showing any benchmarks against other frameworks?
limpbizkit4prez t1_iu965oz wrote
Do you have any benchmarks against other frameworks? And have you benchmarked other types of models, or are you doing something specific for NLP?
limpbizkit4prez t1_itnoeqq wrote
I've always applied an annealing schedule like that to LMs. Imo, it works incredibly well and generalizes nicely.
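For reference, what I do is usually something simple like cosine annealing in PyTorch. This is just a sketch, assuming a schedule along those lines; the model, optimizer, and hyperparameters here are placeholders:

```python
import torch

model = torch.nn.Linear(768, 768)        # stand-in for the actual LM
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10_000)

for step in range(10_000):
    # ... forward pass and loss.backward() on an LM batch would go here ...
    optimizer.step()
    scheduler.step()   # anneal the learning rate toward zero over training
```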
limpbizkit4prez t1_jcld70o wrote
Reply to comment by r_linux_mod_isahoe in [N] Jumpy 1.0 has now been released by the Farama Foundation by jkterry1
Yeah, but if I have a code base written in numpy and want to use jax, wouldn't I need to do the same amount of refactoring to integrate this as I would with regular jax? Are there a lot of functions in numpy that don't exist in jax.numpy?
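To be concrete about the refactoring I mean, the swap looks roughly like this (quick untested sketch; `normalize` is just a made-up example function):

```python
import jax.numpy as jnp   # the numpy version would be: import numpy as np

def normalize(x):
    # same call signature as the numpy version; jnp mirrors most of the np API
    return (x - jnp.mean(x)) / (jnp.std(x) + 1e-8)

x = jnp.arange(10.0)
print(normalize(x))
```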