Submitted by Ok_Telephone4183 t3_117pmhr in singularity
SoylentRox t1_j9dq24m wrote
Note: I work in AI, and have friends who work at OpenAI.
Computer science.
The reason the other two subjects don't matter is that they are essentially unused now. Neither neuroscience nor cognitive science is relevant to current AI research. Current methods have long since stopped needing to borrow from nature. The transformer and the activation functions used in ANNs borrow nothing but the vaguest ideas from old neuroscience data.
Current AI research is empirical. We have tasks we want the AI to do, or output we want it to produce, and we will use whatever actually works.
The road to AGI - which may arrive before you graduate; things are moving rapidly - will likely come from recursion: task an existing AI with designing a better AI. By this route, fewer and fewer human ideas and less prior human knowledge will be used, as the AI architectures are evolved in whatever direction maximizes performance.
As an analogy: only during a brief early period in aviation history did anyone study birds. Later aerofoil advances came from building fixed shapes and methodically studying variations on those shapes in a wind tunnel. Eventually control surfaces like flaps and other active wing surfaces were developed - still nothing from birds; the shapes all came from empirical data, and later from CFD data.
Similarly, the other key element of aviation - engines - didn't come from studying nature either. The Krebs cycle was never, ever consulted in the process of making ever more powerful combustion engines. The two are so different there is nothing useful to be learned.
Destiny_Knight t1_j9dxbeq wrote
Aren't there only like 400 employees at OpenAI or something? That's like saying you have a friend who won the lottery. That's pretty amazing. What's their experience like? Anything they can share? Is it all secretive?
SoylentRox t1_j9dytn4 wrote
Several friends. Others at AI startups. Somehow they're self-taught - good at Python, with a framework that uses some cool hacks, including automated function memoization.
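(For anyone unfamiliar, memoization just means caching a function's results so repeated calls with the same arguments are free. A minimal sketch in Python - this is the standard library's built-in version, not their framework:)

```python
import functools

@functools.lru_cache(maxsize=None)
def expensive_call(x: int) -> int:
    # Computed once per distinct argument; repeat calls hit the cache.
    print(f"computing for {x}...")
    return x ** 2  # stand-in for a slow computation

expensive_call(10)  # computed
expensive_call(10)  # served from cache, no print
```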
Note that until very recently - about 2 months ago - OpenAI wasn't really the best option for elite programmers. It was all people on a passion project. The lottery winners were at DeepMind or Meta.
I have several friends there too. The Meta friends all have the usual background: a graduate degree and 15+ years of experience in high-performance GPU work.
fangfried t1_j9e14pq wrote
Why is Python so widely used in AI when it's a really inefficient language under the hood? Wouldn't Rust be better for optimizing models? Or do you only need that optimization at the infrastructure level, while the models are so high-level that it doesn't matter?
Also, it's really cool that there are people at the forefront of AI on this sub. I'm at a big tech company right now, and I want to transfer into AI infrastructure there. Then, hopefully, I'll build a resume that gets me into a top PhD program. After that I could work in AI research.
SoylentRox t1_j9e39rk wrote
>Why is Python so widely used in AI when it's a really inefficient language under the hood? Wouldn't Rust be better for optimizing models? Or do you only need that optimization at the infrastructure level, while the models are so high-level that it doesn't matter?
You make calls to a high-level framework, usually PyTorch, and those calls have the effect of creating a pipeline: "take this shape of input, inference it through this architecture using this activation function, calculate the error, backprop using this optimizer".
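To make that concrete, here's a minimal sketch of what those framework calls look like (the shapes, layers, and hyperparameters are made up for illustration):

```python
import torch
import torch.nn as nn

# Architecture: input shape, layers, activation function.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(32, 784)         # "take this shape of input"
y = torch.randint(0, 10, (32,))  # dummy labels

optimizer.zero_grad()
logits = model(x)                # inference it through the architecture
loss = loss_fn(logits, y)        # calculate the error
loss.backward()                  # backprop
optimizer.step()                 # using this optimizer
```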
The Python calls can be translated into a graph. I usually see these as *.onnx files, though there are several other representations. These describe how the data will flow.
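Serializing that graph is a single call; a sketch using the standard torch.onnx.export API on a toy model:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(784, 10))  # toy model for illustration
dummy_input = torch.randn(1, 784)

# Tracing the Python calls with this input produces a static dataflow
# graph, serialized to an .onnx file other runtimes can consume.
torch.onnx.export(model, dummy_input, "toy.onnx")
```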
In the Python code, you build the model object, then call a function to actually inference it one step.
So internally it takes that graph, creates a GPU kernel specialized for the shapes of your data, compiles it, and then runs it on the target GPU (or, on the project I work on, compiles it for what is essentially a TPU).
The compile step is slow, using a compiler that is itself likely written in C++. The loading step is slow. But once it's all up and running, you get essentially the same performance as if everything were written in C/C++, while all the code you actually need to touch to do AI work is in Python.
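(Newer PyTorch exposes this compile-once-then-run-fast pattern directly; a sketch, assuming PyTorch 2.x's torch.compile:)

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
compiled = torch.compile(model)

# First call triggers graph capture and kernel compilation (slow);
# later calls with the same shapes reuse the compiled kernels (fast).
out = compiled(torch.randn(32, 784))  # slow: compiling
out = compiled(torch.randn(32, 784))  # fast: cached kernels
```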
CertainMiddle2382 t1_j9e2oju wrote
Python is just scripting for whatever talks to the metal…
Destiny_Knight t1_j9dzwf0 wrote
What's your prediction for when a ChatGPT that doesn't make mistakes in its answers and has 10x more memory will arrive? What's your timeline for AGI and the singularity?
SoylentRox t1_j9e2m1x wrote
Mistakes: depends on the outcome of current efforts to reduce answering errors. If self-introspection works, months.
More context memory: weeks to months. There are already papers laying the groundwork: https://arxiv.org/abs/2302.04761 (Toolformer). Searching the past log of the same session (beyond our token window) is easy to integrate with the Toolformer architecture.
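A hand-wavy sketch of what "search the past log" could look like as a tool the model calls - embed() here is a hypothetical stand-in for any real text-embedding model:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Hypothetical stand-in: a real system would call an embedding model.
    rng = np.random.default_rng(abs(hash(text)) % (2 ** 32))
    return rng.standard_normal(64)

def retrieve(query: str, past_log: list[str], k: int = 3) -> list[str]:
    # Rank earlier turns by cosine similarity to the current query and
    # prepend the best matches to the prompt: memory beyond the window.
    q = embed(query)
    def score(turn: str) -> float:
        v = embed(turn)
        return float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v)))
    return sorted(past_log, key=score, reverse=True)[:k]
```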
There are also alternate architectures that may also enormously increase the window.
AGI: possible within a few years. Whether it happens depends on the trajectory of outside investment. If Google and Microsoft go into an all-out AI war where each spends $100B+ annually? A few years. If current approaches "cap out" and the hype dies down? It could take decades.
Singularity: shortly after AGI is good enough to control robotics for most tasks - so shortly after AGI, probably ("shortly" meaning a matter of months to a few years).