Weave has released a source-available model router that integrates with coding agents such as Claude Code and Cursor and routes each request to the most cost-effective LLM for the task's complexity. By training an RL model on agent traces, Weave reports a 40% reduction in token costs with no loss in code generation quality or velocity. The router sends simple tasks to cheaper models and reserves frontier models for complex reasoning.
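To make the routing idea concrete, here is a minimal sketch of complexity-based routing. It is not Weave's implementation: the summary says the router is an RL model trained on agent traces, whereas this sketch swaps in a hand-written keyword heuristic. The model names, prices, tiers, and the `estimate_complexity` and `route` helpers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str              # hypothetical tier label, not a real model ID
    cost_per_mtok: float   # illustrative price in USD per million tokens
    capability: int        # illustrative tier: higher handles harder tasks


# Hypothetical catalogue; names and prices are placeholders.
MODELS = [
    Model("small-model", 0.5, 1),
    Model("mid-model", 3.0, 2),
    Model("frontier-model", 15.0, 3),
]

# Hand-written stand-in for a learned policy: keywords hinting at hard tasks.
HARD_HINTS = ("refactor", "architecture", "debug", "race condition", "design")


def estimate_complexity(prompt: str) -> int:
    """Return a required capability tier (1-3) from crude prompt features."""
    text = prompt.lower()
    hints = sum(hint in text for hint in HARD_HINTS)
    if hints >= 2 or len(text) > 2000:
        return 3
    if hints == 1 or len(text) > 400:
        return 2
    return 1


def route(prompt: str) -> Model:
    """Pick the cheapest model whose tier meets the estimated complexity."""
    needed = estimate_complexity(prompt)
    eligible = [m for m in MODELS if m.capability >= needed]
    return min(eligible, key=lambda m: m.cost_per_mtok)
```

Under these assumptions, `route("Rename variable foo to bar")` selects the cheapest tier, while a prompt that mentions both a refactor and a race condition is sent to the frontier tier. A learned router would replace `estimate_complexity` with a policy trained on real agent traces.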
Background
As AI coding assistants become ubiquitous, the high cost of API calls to frontier models like Opus and GPT-5 has become a significant barrier for developers and teams seeking to scale usage efficiently.
- Source: Hacker News (RSS)
- Published: Jun 27, 2026 at 12:40 AM
- Score: 7.0 / 10