Lairv
Lairv t1_ir7qmlc wrote
Reply to [R] Discovering Faster Matrix Multiplication Algorithms With Reinforcement Learning by EducationalCicada
Cool paper. Worth noting that such systems require huge resources to train; they briefly mention in the appendix that it took 1,600 TPUv4 actors to play games and 64 TPUv3 to train the networks, for a week. For reference, AlphaZero for Go was trained with 5,000 TPUv1 actors to generate games and 64 TPUv2 to train the networks, for 8 hours. I still find it unfortunate that not much work has been done to reduce the resources needed to train AlphaZero-like systems, even though AlphaZero is already 5 years old.
Lairv t1_ir7p0xt wrote
Reply to comment by neanderthal_math in [R] Discovering Faster Matrix Multiplication Algorithms With Reinforcement Learning by EducationalCicada
In the article they try two types of reward: minimizing the rank of the tensor decomposition (i.e. minimizing the total number of multiplications), and minimizing the runtime of the algorithm on a given piece of hardware (they tried an Nvidia V100 and a TPUv2).
The latter could actually be useful, since their graphs show that the discovered algorithms perform better than cuBLAS (Fig. 5).
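To make the "rank = number of multiplications" point concrete, here is a minimal sketch using the classical Strassen scheme for 2x2 matrices, which needs 7 scalar multiplications instead of the schoolbook 8. This is the textbook algorithm, not one of the algorithms from the paper, and the function names are just illustrative:

```python
def naive_2x2(A, B):
    """Schoolbook 2x2 product: 8 scalar multiplications."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    return [
        [a11 * b11 + a12 * b21, a11 * b12 + a12 * b22],
        [a21 * b11 + a22 * b21, a21 * b12 + a22 * b22],
    ]


def strassen_2x2(A, B):
    """Strassen's 2x2 product: 7 scalar multiplications (a rank-7 decomposition)."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B

    # Each m_i is exactly one multiplication of two linear combinations.
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)

    # The output entries are recovered using additions and subtractions only.
    return [
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4, m1 - m2 + m3 + m6],
    ]


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    print(naive_2x2(A, B))
    print(strassen_2x2(A, B))
```

Each product m1..m7 corresponds to one rank-one term in a decomposition of the 2x2 matrix multiplication tensor, so a lower rank means fewer multiplications. Applied recursively to block matrices, saving one multiplication per level is what gives the asymptotic speedup.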
Lairv t1_ira9aul wrote
Reply to comment by Thorusss in [R] Discovering Faster Matrix Multiplication Algorithms With Reinforcement Learning by EducationalCicada
My point is that, considering these methods can be applied in just about any scientific field, it would be beneficial if organizations other than Google, Microsoft, Facebook and OpenAI could also train them.