mgostIH t1_j02vbuy wrote
Reply to comment by gwern in [D] G. Hinton proposes FF – an alternative to Backprop by mrx-ai
All the layers are trained independently and at the same time; you can use gradients, but you don't need backprop, because each layer has an explicit objective: maximize ||W * x||^2 for good samples and minimize it for bad samples (each layer receives a normalized version of the previous layer's output).
The issue I find with this (besides generating good contrastive examples) is that I don't understand how it would lead a big network to discover interesting structure: circuits require multiple layers to do something interesting, but here each layer greedily optimizes its own objective. In some sense we are hoping that the output of the earlier layers will orient things in a way that isn't too hard for the later layers, which have only linear dynamics.
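The layer-local objective described above can be sketched as follows. This is an illustrative toy in NumPy (class and parameter names are my own, not Hinton's implementation): goodness is ||relu(W x)||^2, and each layer updates its own weights from a local gradient, with no backward pass through other layers.

```python
import numpy as np

def normalize(x, eps=1e-8):
    # each layer receives a length-normalized copy of the previous layer's output
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

class FFLayer:
    """One layer trained with a local goodness objective: ||relu(W x)||^2
    is pushed up for good (positive) data and down for bad (negative) data."""
    def __init__(self, d_in, d_out, lr=0.03, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((d_out, d_in)) / np.sqrt(d_in)
        self.lr = lr

    def forward(self, x):
        return np.maximum(0.0, x @ self.W.T)

    def train_step(self, x, positive):
        x = normalize(x)
        y = self.forward(x)                    # (batch, d_out)
        # d||y||^2/dW = 2 y^T x (the relu gate is implicit: y is 0 where inactive)
        grad = 2.0 * y.T @ x / len(x)
        sign = 1.0 if positive else -1.0
        self.W += sign * self.lr * grad        # ascent on good data, descent on bad
        return y                               # next layer normalizes this itself
```

Stacking several of these layers trains them all simultaneously, each on its own local signal, which is exactly why no global backward pass is needed.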
mgostIH t1_izkzqdm wrote
The paper Epistemic Neural Networks does this formally and efficiently: far cheaper than Bayesian networks, at the cost of only slightly more compute than a standard forward pass.
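A rough sketch of the epinet idea behind that cost claim (dimensions and names are hypothetical, not the paper's code): a small extra head takes the base network's features plus a random "epistemic index" z, so sampling several z values yields an ensemble-like spread of predictions from a single base forward pass.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hid, d_out, d_index = 8, 32, 4, 5

# base network parameters (one hidden layer)
W1 = rng.standard_normal((d_hid, d_in)) / np.sqrt(d_in)
W2 = rng.standard_normal((d_out, d_hid)) / np.sqrt(d_hid)
# small epinet head: takes hidden features plus an epistemic index z
We = rng.standard_normal((d_out, d_hid + d_index)) / np.sqrt(d_hid + d_index)

def predict(x, z):
    h = np.maximum(0.0, W1 @ x)                # shared features, computed once
    base_out = W2 @ h                          # standard prediction
    epi_out = We @ np.concatenate([h, z])      # cheap extra head, varies with z
    return base_out + epi_out

x = rng.standard_normal(d_in)
# the spread over sampled indices acts as an epistemic-uncertainty signal
samples = np.stack([predict(x, rng.standard_normal(d_index)) for _ in range(10)])
uncertainty = samples.std(axis=0)
```

The extra cost is just the small head's matmul per sampled index, which is where the "slightly more than a standard forward pass" comes from.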
mgostIH t1_iw4dks1 wrote
As a reviewer noted, the "zero shot" part is a bit overclaimed, given that one of the models has to already be trained with these relative encodings, but the concept of the paper is an interesting phenomenon that points to there being a "true layout" of concepts in latent space that different types of models end up discovering.
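The relative encodings in question represent each embedding by its cosine similarities to a set of anchor points. A minimal sketch (shapes and names are my own) of why this makes latent spaces comparable: the representation is unchanged by any orthogonal re-orientation of the space, which models two networks finding the same layout up to rotation.

```python
import numpy as np

def relative_representation(X, anchors):
    """Represent each embedding by its cosine similarities to the anchors.
    Invariant to orthogonal transforms applied to the whole latent space."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    An = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    return Xn @ An.T

rng = np.random.default_rng(0)
X = rng.standard_normal((10, 16))      # 10 embeddings in a 16-d latent space
anchors = X[:4]                        # 4 anchor embeddings
# a random rotation of the latent space: a different "model", same layout
Q, _ = np.linalg.qr(rng.standard_normal((16, 16)))
R1 = relative_representation(X, anchors)
R2 = relative_representation(X @ Q, anchors @ Q)
```

Since (XQ)(AQ)^T = X Q Q^T A^T = X A^T and rotations preserve norms, R1 and R2 coincide exactly.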
mgostIH t1_iv1mz53 wrote
Reply to comment by yaosio in [D] DALL·E to be made available as API, OpenAI to give users full ownership rights to generated images by TiredOldCrow
The act of prompting the AI to generate the image is what grants you authorship of it.
mgostIH t1_iv1fosb wrote
Reply to comment by yaosio in [D] DALL·E to be made available as API, OpenAI to give users full ownership rights to generated images by TiredOldCrow
If you looked at any of the articles instead of stopping at the title, you'd understand that what "AI's work can't be copyrighted" refers to is that copyright can't be attributed to the artificial intelligence itself; all of these judgements allow any human who puts minimal effort into the generation (for example, typing the prompt) to own the copyright to the image instead.
mgostIH t1_ir7euf7 wrote
Reply to comment by Ulfgardleo in [R] Discovering Faster Matrix Multiplication Algorithms With Reinforcement Learning by EducationalCicada
You can apply it at the top-level call of your matrix multiplication and do everything inside the standard way; you still gain efficiency, since these algorithms also work in block-matrix form.
mgostIH t1_j46xnpc wrote
Reply to [D] Bitter lesson 2.0? by Tea_Pearce
The real bitter lesson is how Stanford got so many authors cited for introducing nothing but a less descriptive name than "large models".