SuperSpaceEye

SuperSpaceEye t1_jdxhwi6 wrote

  1. Yeah, Moore's law is already ending, but that doesn't really matter for neural networks. Why? Because they are massively parallelizable, GPU makers can just stack more cores on a chip (be it by making chips larger, or thicker via 3D stacking) to speed up training further.
  2. True, but we don't know where that limit is, and it only has to be better than humans.
  3. I really doubt it.
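To illustrate point (1): a neural network layer is mostly matrix multiplication, which splits cleanly across devices. Here is a toy sketch (the sharding scheme and function name are my own illustration, not from any particular framework):

```python
import numpy as np

def parallel_matmul(x, w, n_devices=4):
    # Split the weight matrix column-wise, one shard per "device".
    shards = np.array_split(w, n_devices, axis=1)
    # Each shard's matmul is independent, so on real hardware these
    # run in parallel; more cores means more shards at once.
    parts = [x @ shard for shard in shards]
    # Concatenate the partial outputs to get the full result.
    return np.concatenate(parts, axis=1)

x = np.random.rand(8, 16)
w = np.random.rand(16, 32)
# The sharded result matches a single big matmul.
assert np.allclose(parallel_matmul(x, w), x @ w)
```

This is why adding cores (rather than faster cores) keeps helping: the work divides with essentially no sequential bottleneck per layer.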
2

SuperSpaceEye t1_iwrtt69 wrote

That will depend on what "we" are. If our consciousness arises from the computation of neurons, then it wouldn't matter what device does the computation or in what form it is done. If, however, there is something more to our consciousness (some quantum stuff, maybe even the existence of souls), then I don't think this question can be answered until we learn more about these processes. I'm a materialist myself, but who knows...

2

SuperSpaceEye t1_iwht6hf wrote

Two different tasks. The language model in SD just encodes text into an abstract representation that the diffusion part of the model then uses. A text-to-text model such as GPT-J does a different, much harder task. Also, GPT-J is 6B parameters, which will only take about 12 GB of VRAM, not hundreds.
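The 12 GB figure comes from simple arithmetic, assuming fp16 weights (2 bytes per parameter) and counting weights only, not activations or optimizer state:

```python
def weight_vram_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Rough VRAM needed just to hold model weights for inference.

    Assumes fp16 (2 bytes/parameter); training needs several times more
    for gradients and optimizer state.
    """
    return n_params * bytes_per_param / 1e9

print(weight_vram_gb(6e9))  # GPT-J: 6B params -> 12.0 GB
```

Loading in fp32 would double this, and 8-bit quantization would halve it.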

3

SuperSpaceEye t1_iwhjfk1 wrote

Well, if you want to generate coherent text you need quite a large model, because smaller models produce logical and writing errors that are easy to spot and ruin the quality of the output. The same goes for music, as we are quite perceptive of small inaccuracies. Images, on the other hand, can have "large" errors and still be beautiful to look at. Images can also vary widely in textures, backgrounds, etc., making it easier for a model to produce a "good enough" picture, which allows for much smaller models; that won't work for text or audio.

6

SuperSpaceEye t1_itynygn wrote

GATO is able to retain knowledge across different tasks. It is not, however, able to "generalize" (i.e., improvement in one task did not lead to improvement in a different task), if I remember the paper correctly. So no, AGI was not solved.

1