@dair_ai
Training of Physical Neural Networks

Could we train AI models 1000x larger than today's? Could we run them privately on edge devices like smartphones? The answer might be yes, but not with GPUs. This paper argues that the path forward may require physical neural networks.

Physical Neural Networks (PNNs) use the properties of physical systems (optical setups, photonics, analog electronics, even mechanical substrates) to perform computation. Physics can compute certain operations far more efficiently than digital transistors.

The problem isn't inference. The problem is training. Backpropagation has powered deep learning's success, but implementing it in physical hardware faces fundamental obstacles: weight transport, gradient communication across layers, and the need for precise knowledge of each layer's activation function.

This review maps the landscape of PNN training methods:

1) In-silico training: build a digital twin of the physical system, optimize it computationally, then deploy the trained parameters to hardware. Fast iteration, but limited by model fidelity: fabrication imperfections, misalignments, and detection noise break the digital-physical correspondence. (toy sketch below)

2) Physics-aware training: the physical system performs the forward pass while a digital model handles backpropagation. This hybrid approach mitigates experimental noise while keeping gradient-based optimization, and has been demonstrated across optical, mechanical, and electronic systems. (toy sketch below)

3) Equilibrium Propagation: for energy-based systems that naturally minimize a Lyapunov function. Weight updates use local contrastive rules that compare equilibrium states. Implemented on memristor crossbar arrays, with potential energy gains of 4 orders of magnitude versus GPUs. (toy sketch below)

4) Local learning methods: avoid global gradient communication entirely. Physical Local Learning uses forward-mode differentiation through physical perturbations, with no digital model required. Demonstrated on multimode optical fibers with 10,000+ trainable parameters. (toy sketch below)

The emerging hardware spans optical correlators, photonic integrated circuits, spintronic devices, memristor crossbars, exciton-polariton condensates, and quantum circuits.

No method yet matches backpropagation's performance on digital hardware. But the trajectory is clear: diverse training techniques are converging on practical PNN implementations. As AI scaling hits GPU limits, physical computing offers a path to models orders of magnitude larger and more energy-efficient than what's possible today.

Paper: https://t.co/AiTbVWMZSP

Learn to build with LLMs and AI Agents in our academy: https://t.co/zQXQt0PMbG
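
Toy sketch for 1), in-silico training. This is not the paper's setup: the "physical" layer below is a made-up stand-in (an unknown coupling matrix, a saturating nonlinearity, and detection noise), and the digital twin is deliberately crude (a linear fit). It only illustrates the workflow: characterize, fit a twin, optimize on the twin, deploy, and observe the sim-to-real gap.

```python
# In-silico training sketch: (1) probe the physical layer and fit a differentiable
# digital twin, (2) optimize the controllable parameters entirely on the twin,
# (3) deploy to "hardware" and see how model mismatch degrades the result.
# W_DEVICE, physical_layer, and the linear twin are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
DIM = 8
W_DEVICE = rng.standard_normal((DIM, DIM)) / np.sqrt(DIM)   # hidden device coupling

def physical_layer(x, controls):
    """Stand-in for hardware: unknown coupling + saturation + detection noise."""
    y = np.tanh(W_DEVICE @ (controls * x))
    return y + 0.02 * rng.standard_normal(DIM)

# Phase 1: characterize the device, then fit a linear twin  y ~ A_twin @ (c * x).
X = rng.standard_normal((400, DIM))
c_probe = np.ones(DIM)
Y = np.stack([physical_layer(x, c_probe) for x in X])
A_fit, *_ = np.linalg.lstsq(X * c_probe, Y, rcond=None)
A_twin = A_fit.T

# Phase 2: train the controls purely in-silico to hit a hardware-reachable target.
x_in = rng.standard_normal(DIM)
c_true = rng.uniform(0.5, 1.5, DIM)
target = np.tanh(W_DEVICE @ (c_true * x_in))                 # noiseless target

controls = np.ones(DIM)
for _ in range(500):
    pred = A_twin @ (controls * x_in)                        # twin forward pass
    grad = x_in * (A_twin.T @ (pred - target))               # squared-error gradient on the twin
    controls -= 0.05 * grad

# Phase 3: deploy the in-silico-trained controls to the "physical" system.
twin_err = np.mean((A_twin @ (controls * x_in) - target) ** 2)
hw_err = np.mean((physical_layer(x_in, controls) - target) ** 2)
print(f"twin loss {twin_err:.4f} vs hardware loss {hw_err:.4f}  (sim-to-real gap)")
```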
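
Toy sketch for 2), physics-aware training. The idea, as described above, is that the forward pass runs on the physical system while the backward pass runs on an imperfect digital model, anchored at the measured output. The device matrix, its mismatched digital model, and all step sizes are assumptions for illustration, not the paper's experiments.

```python
# Physics-aware training sketch: physical forward pass, digital backward pass.
# The error signal comes from the measured (noisy) output; the Jacobian comes
# from an imperfect digital model. All quantities here are illustrative.
import numpy as np

rng = np.random.default_rng(1)
DIM = 8
W_DEVICE = rng.standard_normal((DIM, DIM)) / np.sqrt(DIM)      # real coupling
W_MODEL = W_DEVICE + 0.1 * rng.standard_normal((DIM, DIM))     # imperfect digital model

def physical_forward(x, theta):
    """Hardware forward pass: true coupling, saturation, detection noise."""
    return np.tanh(W_DEVICE @ (theta * x)) + 0.02 * rng.standard_normal(DIM)

def digital_backward(x, theta, y_phys, target):
    """Backward pass on the digital model, anchored at the measured output."""
    z = W_MODEL @ (theta * x)                  # model pre-activation
    err = y_phys - target                      # MSE gradient wrt the physical output
    dz = err * (1.0 - np.tanh(z) ** 2)         # backprop through the model nonlinearity
    return (W_MODEL.T @ dz) * x                # gradient wrt the trainable controls theta

x_in = rng.standard_normal(DIM)
theta_true = rng.uniform(0.5, 1.5, DIM)
target = np.tanh(W_DEVICE @ (theta_true * x_in))               # reachable target

theta = np.ones(DIM)
for step in range(300):
    y_phys = physical_forward(x_in, theta)                     # forward on "hardware"
    theta -= 0.1 * digital_backward(x_in, theta, y_phys, target)
    if step % 100 == 0:
        print(step, round(np.mean((y_phys - target) ** 2), 4))
```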
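
Toy sketch for 3), Equilibrium Propagation in the style of Scellier and Bengio's formulation: relax to a free equilibrium, nudge the outputs toward the target with strength beta, relax again, and update each weight from a purely local contrast of the two equilibria. Layer sizes, the hard-sigmoid activation, and all step sizes are arbitrary choices for this sketch.

```python
# Equilibrium Propagation sketch on a tiny layered energy-based network.
# Free phase and weakly clamped ("nudged") phase each settle by gradient descent
# on the energy; weight updates use only local pre/post activities at equilibrium.
import numpy as np

rng = np.random.default_rng(2)
N_IN, N_HID, N_OUT = 4, 16, 2
W1 = rng.standard_normal((N_IN, N_HID)) * 0.2    # symmetric couplings, input <-> hidden
W2 = rng.standard_normal((N_HID, N_OUT)) * 0.2   # symmetric couplings, hidden <-> output

rho = lambda s: np.clip(s, 0.0, 1.0)                          # hard-sigmoid activation
drho = lambda s: ((s > 0.0) & (s < 1.0)).astype(float)

def relax(x, y, beta, steps=60, dt=0.2):
    """Settle the state by gradient descent on E + beta * C (C = squared error)."""
    h, o = np.zeros(N_HID), np.zeros(N_OUT)
    for _ in range(steps):
        dh = h - drho(h) * (rho(x) @ W1 + W2 @ rho(o))        # dE/dh
        do = o - drho(o) * (rho(h) @ W2) + beta * (o - y)     # dE/do + beta*dC/do
        h, o = h - dt * dh, o - dt * do
    return h, o

def eqprop_step(x, y, beta=0.5, lr=0.1):
    global W1, W2
    h_f, o_f = relax(x, y, beta=0.0)          # free phase
    h_n, o_n = relax(x, y, beta=beta)         # nudged phase
    # Local contrastive updates: compare activity products at the two equilibria.
    W1 += lr / beta * (np.outer(rho(x), rho(h_n)) - np.outer(rho(x), rho(h_f)))
    W2 += lr / beta * (np.outer(rho(h_n), rho(o_n)) - np.outer(rho(h_f), rho(o_f)))
    return np.mean((o_f - y) ** 2)

x = rng.uniform(0, 1, N_IN)
y = np.array([0.2, 0.8])
for step in range(200):
    loss = eqprop_step(x, y)
    if step % 50 == 0:
        print(step, round(loss, 4))
```

In hardware the "relaxation" would be the physics itself settling to equilibrium; only the two-phase measurement and the local update rule need to be implemented.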
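
Toy sketch for 4), model-free local learning through physical perturbations. Here a simple finite-difference, SPSA-style estimator stands in for forward-mode differentiation: perturb the parameters on the device, measure the loss change, and move along the perturbation direction. The black-box "hardware" is again a made-up stand-in, not the multimode-fiber experiment.

```python
# Perturbative, model-free training sketch: no digital twin and no backpropagation.
# Two physical evaluations per step give a directional derivative estimate, and the
# parameters move along the random perturbation direction scaled by that estimate.
import numpy as np

rng = np.random.default_rng(3)
DIM = 8
W_DEVICE = rng.standard_normal((DIM, DIM)) / np.sqrt(DIM)

def physical_forward(x, theta):
    """Black box: we can only set theta, feed x, and measure a noisy output."""
    return np.tanh(W_DEVICE @ (theta * x)) + 0.01 * rng.standard_normal(DIM)

def measured_loss(theta, x, target):
    return np.mean((physical_forward(x, theta) - target) ** 2)

x_in = rng.standard_normal(DIM)
theta_star = rng.uniform(0.5, 1.5, DIM)
target = np.tanh(W_DEVICE @ (theta_star * x_in))     # reachable, noiseless target

theta = np.ones(DIM)
eps, lr = 0.05, 0.2
for step in range(1500):
    v = rng.standard_normal(DIM)                     # random perturbation direction
    # Finite-difference directional derivative from two hardware measurements.
    dloss = (measured_loss(theta + eps * v, x_in, target)
             - measured_loss(theta - eps * v, x_in, target)) / (2 * eps)
    theta -= lr * dloss * v                          # forward-gradient-style update
    if step % 500 == 0:
        print(step, round(measured_loss(theta, x_in, target), 4))
```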