Overview
- Added support for training Qwen3.5 series models.
- Upgraded core dependencies: transformers to v5.3.0, verl to v0.7.1, and vLLM to v0.19.0.
- Enabled the use of auxiliary models within the experience pipeline.
- Added support for stream mode in the rollout API.
- Enabled integration with external APIs (e.g., OpenAI) during training and benchmarking.
- Added experience data visualization capabilities.
- Optimized serialization and deserialization of experience data.
What's Changed
- Update Readme and docs by @hiyuchang in #508
- Support vLLM v0.16.0 by @pan-x-c in #510
- [Example] Clip_B and Clip_V from entropy dynamics by @hiyuchang in #509
- Support Transformers V5 by @chenyushuo in #512
- Using auxiliary models in experience pipeline & OpenAI API supports stream mode by @pan-x-c in #513
- Optimize serialize / deserialize of Experience by @pan-x-c in #514
- Support Qwen3.5 by @chenyushuo in #515
- Bug fix: vlm model with seq parallel by @chenyushuo in #517
- Add SQL Buffer Viewer to CLI by @pan-x-c in #520
- Update news in March by @hiyuchang in #522
- Rename TaskSelector to DataSelector. by @chenyushuo in #519
- Add openai_api model for explorer by @hiyuchang in #523
- feat: upgrade veRL to v0.7.1 with trainer file migration by @chenyushuo in #525
- Fix: dummy response text when prompt truncation is activated by @yanxi-chen in #524
- Bumping version to 0.5.2 by @pan-x-c in #526
Full Changelog: v0.5.1...v0.5.2