Submitted by AutoModerator t3_zcdcoo in MachineLearning
_PYRO42_ t1_iyxivdk wrote
I want to create a new type of neural network, but it might be nothing new. I struggle to find anything about it on Google Scholar. I am missing the nomenclature associated with such a technique.
I want to create a neural network with conditional execution. Instead of executing every neuron layer by layer, I want to build a system where the network can skip a neuron and every path downstream of it. By "not executing" I mean no CPU cycles, no computation, no power consumed.
This non-execution is conditional. Example: IF A > 0.5 THEN execute the LEFT neuron ELSE execute the RIGHT neuron.
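For concreteness, a minimal sketch of the control flow I mean (plain Python; gate, left_net, and right_net are hypothetical placeholder sub-networks):

```python
def forward(x, gate, left_net, right_net):
    """Evaluate only one branch; the other consumes zero compute."""
    a = gate(x)  # scalar gating activation
    if a > 0.5:
        return left_net(x)   # right_net is never touched
    else:
        return right_net(x)  # left_net is never touched
```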
Do such systems already exist? What do we call them? I need a name to search for it! :)
Thank you for your help!
HandSchuhbacca t1_iyxoq9r wrote
Maybe have a look at mixture of experts? That is a popular method where different blocks are executed conditionally.
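A rough, unbatched sketch of top-1 routing (untested; real implementations batch inputs and add load balancing):

```python
import torch
import torch.nn as nn

class Top1MoE(nn.Module):
    """Minimal mixture-of-experts layer: only the selected expert runs."""
    def __init__(self, dim, n_experts):
        super().__init__()
        self.gate = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))

    def forward(self, x):  # x: (dim,) for a single example
        scores = torch.softmax(self.gate(x), dim=-1)
        idx = scores.argmax().item()
        # Only experts[idx] is executed; the rest cost nothing.
        # Scaling by the gate score keeps the router trainable.
        return scores[idx] * self.experts[idx](x)
```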
_PYRO42_ t1_izml827 wrote
Oh lord, that's not a bad one. I love it!
I can keep using the GPU while retaining recursion and conditionality: blocks of GPU-processable neurons, linked by special conditional/recursive neurons.
Superschlenz t1_iyy2jx0 wrote
Normally, compute is saved by pruning away slow-changing weights that are close to zero.
You, on the other hand, seem to want to prune away fast-changing activations.
Don't the machine learning libraries have a dropout mechanism where you can zero out activations with a binary mask? I don't know. You would have to compute the forward activations for the first layer, then compare the activations with a threshold to set the mask bits, then activate the dropout mask for that layer before computing the next layer's activations. Sounds like a lot of overhead instead of a saving.
Edit: You could also manually force low activations to zero. The hardware has built-in energy-saving circuitry that skips multiplications by zero (maybe multiplications by one and additions of zero as well), but it still has to move the data around.
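A sketch of that thresholded-mask idea in PyTorch (0.1 is an arbitrary threshold; note that both layers still do their full matrix multiplies, which is where the overhead comes from):

```python
import torch
import torch.nn as nn

layer1, layer2 = nn.Linear(64, 64), nn.Linear(64, 64)
x = torch.randn(1, 64)

h = torch.relu(layer1(x))
mask = (h.abs() > 0.1).float()  # set mask bits from a threshold
h = h * mask                    # zero out weak activations
y = layer2(h)                   # layer2 still does its full matmul anyway
```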
_PYRO42_ t1_izmknte wrote
I have an intuition: larger models are successful not because of the amount of computation they can take advantage of, but because of the amount of knowledge they can encode. I want to try an ultra-large, ultra-deep neural network with gigabytes of neurons that consumes no more than 50 watts of power. The human brain uses about 20 watts; I feel we are making a mistake when we start pushing 100-200 W into a single network. I want to control machines, not generate pieces of art. I want Factorio not to be a game but a reality of ours.
I will bring edge computing to this world. I will make it a thing you can wear not on your skin but as your skin.
_PYRO42_ t1_iyyyqw6 wrote
That's about what I was looking for: Liu, Lanlan and Deng, Jia. Dynamic deep neural networks: Optimizing accuracy-efficiency trade-offs by selective execution. In: Proceedings of the AAAI Conference on Artificial Intelligence, 2018.
Problem: control nodes prevent the direct application of back-propagation for learning. I have an idea of how we could solve that... >:) A way to remove control nodes while still retaining the concept of control.
I only need to add recursion: a truly Turing-complete NN, with billions of neurons but a small execution path. Encoding knowledge, but using it only when needed!
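(One well-known trick for getting gradients through hard gates, possibly related, is the straight-through estimator: a hard 0/1 decision in the forward pass, an identity gradient in the backward pass. A minimal sketch:)

```python
import torch

def st_gate(a, threshold=0.5):
    """Straight-through gate: hard 0/1 forward, identity gradient backward."""
    hard = (a > threshold).float()
    # (hard - a).detach() carries no gradient, so the backward pass
    # sees d(output)/d(a) = 1 while the forward value is exactly `hard`.
    return a + (hard - a).detach()
```

Multiplying a branch's output by `st_gate(a)` then gives it a trainable on/off switch, though at training time both branches still run; the saving would only appear at inference.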
Brudaks t1_iz04r4f wrote
Thing is, it's generally more compute-efficient to do the exact opposite: replace conditional execution with something that always performs the same operations in parallel and simply multiplies the results by zero (or something like that) when they're not needed. Parallelism and vectorization are how we get efficient execution nowadays.
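Concretely (illustrative PyTorch; tanh and sigmoid stand in for the two branches):

```python
import torch

x = torch.randn(1024)
mask = (x > 0.5).float()
# Both "branches" run on every element, with no data-dependent control
# flow; the mask just zeroes out whichever result is unwanted.
out = mask * torch.tanh(x) + (1 - mask) * torch.sigmoid(x)
```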