Model Routing: Stop Using One Model for Everything

The right model for the right task.

Running a 70B-parameter model to summarize a 200-word email is wasteful. Running a 3B model to review production code is reckless. Most workloads fall somewhere in between, and that’s where model routing comes in.