simmol t1_jdxlcco wrote on March 27, 2023 at 10:48 PM

Gary Marcus is wrong on this. Papers have already been published that train simple machine learning models on publications from before date X and demonstrate that the algorithm can find concepts that only appear in publications after date X. These did not even use LLMs, just simple Word2Vec embeddings: each word in the publications was mapped to a vector, and the ML model learned the relationships between those vectors across all papers published before date X.
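To make the setup concrete, here is a minimal sketch of that kind of backtest. It is not the method from any specific paper: the corpus, the cutoff year, and all function names are made up for illustration, and raw co-occurrence counts stand in for Word2Vec so the example runs with only the standard library. Candidates that were never mentioned alongside the target concept before the cutoff are ranked by how similar their contexts are to the target's, and the top one is then checked against the later papers.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus of (year, abstract) pairs; not real publications.
PAPERS = [
    (2001, "alpha thermoelectric seebeck bandgap"),
    (2003, "alpha seebeck bandgap doped"),
    (2005, "beta seebeck bandgap doped"),
    (2006, "gamma polymer elastic tensile"),
    (2007, "gamma polymer coating"),
    (2008, "thermoelectric seebeck doped"),
    (2012, "beta thermoelectric seebeck"),
]
CUTOFF = 2010  # plays the role of "date X" (assumed value)


def split_by_year(papers, cutoff):
    """Tokenize abstracts and split them into before/after the cutoff."""
    before = [text.split() for year, text in papers if year < cutoff]
    after = [text.split() for year, text in papers if year >= cutoff]
    return before, after


def cooccurrence_vectors(docs):
    """Map each word to counts of the other words sharing a document with it.

    This is a crude stand-in for Word2Vec embeddings.
    """
    vectors = defaultdict(Counter)
    for words in docs:
        for word in words:
            for other in words:
                if other != word:
                    vectors[word][other] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(docs, target, candidates):
    """Rank candidates never seen with the target by context similarity."""
    vectors = cooccurrence_vectors(docs)
    unseen = [c for c in candidates if target not in vectors[c]]
    scored = [(c, cosine(vectors[c], vectors[target])) for c in unseen]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def confirmed_later(docs, candidate, target):
    """True if any later document mentions the candidate and target together."""
    return any(candidate in words and target in words for words in docs)


before, after = split_by_year(PAPERS, CUTOFF)
ranking = rank_candidates(before, "thermoelectric", ["alpha", "beta", "gamma"])
top_candidate = ranking[0][0]
print(ranking)
print(top_candidate, confirmed_later(after, top_candidate, "thermoelectric"))
```

In this toy data "alpha" is dropped because it already appears with the target before the cutoff, and "beta" ranks above "gamma" because it shares context words with the target. A real study would train Word2Vec on a large corpus of abstracts instead of counting co-occurrences in seven made-up ones.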

MattAbrams t1_je07108 wrote on March 28, 2023 at 1:59 PM

This isn't how science works. It's easy to say the machine works when you already have the papers you're looking for.

But this happens all the time in bitcoin trading, which is what I do. A model can predict lots of outcomes with high probability. They are all much more likely than outcomes that make no sense. But just because they make sense doesn't mean you have an easy way to choose which one is actually "correct."

If we ran this machine in year X, it would spit out a large number of candidate papers for year Y, some of which might be correct, but there would still need to be a way to actually test all of them, which would take a huge amount of effort.

My guess is that there will never be an "automatic discoverer" that suddenly jumps 100x in an hour, because the testing process is long and the machines required for testing become significantly more complicated in parallel with the abilities of the computer. Look at the size increases of particle accelerators, for example.