

Submitted by Style Pass on 2024-09-20 00:30:02

LeanRL is a lightweight library consisting of single-file, PyTorch-based implementations of popular Reinforcement Learning (RL) algorithms. The primary goal of this library is to inform the RL PyTorch user base of optimization tricks that can cut training time by half or more.

More precisely, LeanRL is a fork of CleanRL in which hand-picked scripts have been rewritten using PyTorch 2 features, mainly torch.compile and CUDA graphs. The goal is to provide guidance on how to run your RL script at full speed with minimal impact on the user experience.

Disclaimer: This repo is a highly simplified version of CleanRL that lacks many features, such as detailed logging or checkpointing. Its only purpose is to provide several versions of similar training scripts so that plain runtime can be measured under various constraints. However, we welcome contributions that re-implement these features.

torch.compile: Introduced in PyTorch 2.0, torch.compile is the primary framework for accelerating PyTorch code during both training and inference. The compiler translates Python code into a series of elementary operations and identifies opportunities to fuse them. A significant advantage of torch.compile is that it reduces the overhead of transitioning between the Python interpreter and the C++ runtime. Unlike PyTorch's eager execution mode, which requires numerous such boundary crossings, torch.compile generates a single C++ executable, thereby minimizing the need to return to Python. Additionally, torch.compile is notably resilient to graph breaks, which occur when an operation is not supported by the compiler (due to design constraints or pending integration of the Python operator). When a graph break occurs, the unsupported portion runs in regular Python and compilation resumes with the code that follows. This robustness means that virtually any Python code can be compiled in principle.
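As a rough illustration of the two points above (fusible elementary operations and resilience to graph breaks), here is a minimal sketch. It is not taken from the LeanRL scripts: `td_loss`, `clipped_or_raw` and the `compile_fn` helper are hypothetical names introduced for this example only. The only PyTorch API assumed is `torch.compile` with its `backend` argument.

```python
import torch


def td_loss(q_values, rewards, next_q_values, gamma=0.99):
    """Mean squared TD error: a chain of elementwise ops that a compiler can fuse.

    Hypothetical example function, not part of LeanRL.
    """
    target = rewards + gamma * next_q_values
    return ((q_values - target) ** 2).mean()


def clipped_or_raw(x):
    """Branches on a tensor's value.

    This data-dependent Python control flow typically causes a graph break
    under torch.compile; the function should still run and return the same
    result as in eager mode.
    """
    if x.sum() > 0:
        return x.clamp(max=1.0)
    return x


def compile_fn(fn, backend="inductor"):
    """Hypothetical convenience wrapper around torch.compile.

    "inductor" is the default backend and generates optimized code;
    "eager" only traces the function and runs it without code generation,
    which is handy for debugging or on machines without a compiler toolchain.
    """
    return torch.compile(fn, backend=backend)
```

Under these assumptions, `compile_fn(td_loss)` returns a callable with the same signature as `td_loss`; compilation happens lazily on the first call, so that call is slower than the ones that follow. Calling `compile_fn(clipped_or_raw)` also works despite the branch, because graph breaks are tolerated by default.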
