This is a small, self-contained framework for training and querying neural networks. Most notably, it contains a lightning fast "fully fused" multi-layer perceptron as well as support for various advanced input encodings, losses, and optimizers.
Real-time Neural Radiance Caching for Path Tracing Thomas Müller, Fabrice Rousselle, Jan Novák, Alexander Keller To appear: ACM Transactions on Graphics (SIGGRAPH) 2021
For business inquiries, please contact email@example.com. For press and other inquiries, please contact Hector Marinez at firstname.lastname@example.org.
Fully fused networks vs. TensorFlow v2.5.0 w/ XLA. Measured on 64 (solid line) and 128 (dashed line) neurons wide multi-layer perceptrons on an RTX 3090. Generated by benchmarks/bench_ours.cu and benchmarks/bench_tensorflow.py.
Special thanks go to the NRC authors for helpful discussions and to Nikolaus Binder for providing part of the infrastructure of this framework, as well as for help with utilizing TensorCores from within CUDA.