

Submitted by Style Pass, 2024-11-17 16:30:06

The goal is not to strive for completeness, full maintenance, or abstraction, but to provide a simple, largely static alternative to torch.optim with more and better optimizers.

Currently (2024-11-17, 0.15.0), the recommended stable optimizer is PrecondSchedulePaLMForeachSOAP (see below). The recommended experimental optimizer is ForeachPSGDKron.

Second-order optimizers make it difficult to estimate memory usage, as it depends on parameter shapes and hyperparameters. To estimate your memory usage, you can use test/test_memory.py, which also attempts to ensure there are no regressions. Furthermore, you can find real-world memory usage of a 300M-parameter video diffusion model below:
