@PyTorch
Built on PyTorch, Ray, SGLang, and NVIDIA Megatron-LM, Miles is an open source framework from RadixArk for large-scale LLM reinforcement learning post-training. Miles uses PyTorch for models, numerics, profiling, and extensibility; Ray for orchestration; SGLang for rollout generation; and Megatron-LM for distributed training. The framework supports asynchronous rollout and training, NCCL/RDMA weight synchronization, MoE-aware rollout/training alignment, low-precision recipes, LoRA, fault tolerance, observability, and extension points for custom algorithms and model architectures. Read more in our latest blog from the Miles Team: https://t.co/fekP7rOoH7