Use case guide
Model rerouting
Route requests to the cheapest model that meets your reliability threshold. When risk is high, reroute to a stronger model.
Overview
Problem
A single-model policy either wastes money (always using the best model) or ships errors (always using the cheapest model).
Reality Signal
Convert a score into prob_est + uncertainty and a boolean decision.
Policy
Try a fast/cheap model first. If prob_est is below threshold or uncertainty is high, reroute to a stronger model and re-run the task.
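The policy above can be sketched as a small predicate. This is a minimal illustration, not the service's own logic: the function name `needs_reroute` and the `min_prob` default of 0.8 are hypothetical, and the 0.15 uncertainty cutoff is borrowed from the reference implementation later in this guide.

```python
def needs_reroute(
    prob_est: float,
    uncertainty: float,
    min_prob: float = 0.8,          # hypothetical reliability threshold
    max_uncertainty: float = 0.15,  # cutoff used in the reference implementation
) -> bool:
    """Return True when the cheap model's answer should not be trusted as-is."""
    return prob_est < min_prob or uncertainty > max_uncertainty
```

With these assumed thresholds, a result of `prob_est = 0.62` and `uncertainty = 0.08` is rerouted because the probability is too low, even though the uncertainty is acceptable.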
Architecture
- Base model produces a score for the decision you care about.
- Send the score to /decide.
- Use prob_est + uncertainty to route safely.
- When the true outcome is known, call /feedback to improve calibration.
Routing actions: automate, reroute, reprompt, or escalate.
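One way to map a /decide response onto these four actions is sketched below. The meanings and ordering are assumptions, since the guide does not define them precisely: automate accepts the answer, reprompt retries before rerouting, reroute calls the stronger model, and escalate is the last resort once the other fallbacks are used up. The names `Action` and `choose_action` are hypothetical.

```python
from enum import Enum


class Action(Enum):
    AUTOMATE = "automate"
    REPROMPT = "reprompt"
    REROUTE = "reroute"
    ESCALATE = "escalate"


def choose_action(rc: dict, attempt: int, max_uncertainty: float = 0.15) -> Action:
    """Pick the next action from a /decide response.

    rc is the parsed /decide JSON; attempt counts the fallbacks already tried.
    """
    if rc["decision"] and rc["uncertainty"] <= max_uncertainty:
        return Action.AUTOMATE
    # Assumed fallback order: reprompt first, then reroute, then escalate.
    fallbacks = [Action.REPROMPT, Action.REROUTE]
    if attempt < len(fallbacks):
        return fallbacks[attempt]
    return Action.ESCALATE
```

Keeping the choice in a pure function like this makes the routing policy easy to unit test without calling the API.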
1) Decide whether to reroute
Call /decide with your model’s score. Route based on calibrated probability and uncertainty.
Request
bash
curl -X POST https://onprem-api-sowl.jollysand-1b9ed42e.swedencentral.azurecontainerapps.io/decide \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "features": { "score": 0.82 }
  }'
Response
json
{
  "decision_id": 123,
  "prob_est": 0.62,
  "uncertainty": 0.08,
  "decision": false
}
Use prob_est + uncertainty as routing signals.
2) Feedback from final outcome
When ground truth is known, send /feedback with the decision_id so calibration improves over time.
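A minimal Python sketch of reporting an outcome, using only the standard library. Several details are assumptions: the helper names are hypothetical, `feedback: 0` is assumed to mark a negative outcome (the guide only shows `1`), and the function returns the HTTP status code because the shape of the /feedback response is not documented here.

```python
import json
import urllib.request

API_URL = "https://onprem-api-sowl.jollysand-1b9ed42e.swedencentral.azurecontainerapps.io"
API_KEY = "YOUR_API_KEY"


def build_feedback_payload(decision_id: int, outcome: bool, force_retrain: bool = False) -> dict:
    """Build the /feedback request body (0 for a negative outcome is an assumption)."""
    return {
        "decision_id": decision_id,
        "feedback": 1 if outcome else 0,
        "force_retrain": force_retrain,
    }


def send_feedback(decision_id: int, outcome: bool, force_retrain: bool = False) -> int:
    """POST the outcome to /feedback and return the HTTP status code."""
    body = json.dumps(build_feedback_payload(decision_id, outcome, force_retrain))
    req = urllib.request.Request(
        f"{API_URL}/feedback",
        data=body.encode("utf-8"),
        headers={"x-api-key": API_KEY, "Content-Type": "application/json"},
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status
```

Store the `decision_id` returned by /decide alongside each task so that the outcome can be reported later, once it is known.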
Feedback (cURL)
bash
curl -X POST https://onprem-api-sowl.jollysand-1b9ed42e.swedencentral.azurecontainerapps.io/feedback \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "decision_id": 123,
    "feedback": 1,
    "force_retrain": false
  }'
Reference implementation
Python
python
import requests
API_URL = "https://onprem-api-sowl.jollysand-1b9ed42e.swedencentral.azurecontainerapps.io"
HEADERS = {"x-api-key": "YOUR_API_KEY"}
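def rc_decide(score: float) -> dict:
    """Send the base model's score to /decide and return the parsed JSON response."""
    r = requests.post(f"{API_URL}/decide", json={"features": {"score": score}}, headers=HEADERS)
    r.raise_for_status()
    return r.json()

# cheap_model and strong_model are placeholders for your own model calls:
# cheap_model returns (answer, score); strong_model returns the final answer.
answer, score = cheap_model(prompt)
rc = rc_decide(score)
# Keep the cheap answer only when the decision is positive and uncertainty is low.
if rc["decision"] and rc["uncertainty"] <= 0.15:
    final = answer
else:
    final = strong_model(prompt)
If you also reprompt, treat reprompt as an intermediate action before rerouting.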
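A cascade with a reprompt step in the middle could look like the sketch below. It is an illustration under assumptions: `run_cascade` is a hypothetical helper, each stage is assumed to return `(answer, score)` like `cheap_model`, and `decide` is any callable that returns a /decide-style dictionary (for example, a function that posts the score to /decide).

```python
from typing import Callable, Dict, List, Tuple

# A stage takes a prompt and returns (answer, score).
Stage = Callable[[str], Tuple[str, float]]


def run_cascade(
    prompt: str,
    stages: List[Tuple[str, Stage]],
    decide: Callable[[float], Dict],
    max_uncertainty: float = 0.15,
) -> Tuple[str, str]:
    """Try stages in order and return (stage_name, answer) for the first accepted one.

    The last stage is the fallback: its answer is returned without a check.
    """
    if not stages:
        raise ValueError("at least one stage is required")
    for name, stage in stages[:-1]:
        answer, score = stage(prompt)
        rc = decide(score)
        if rc["decision"] and rc["uncertainty"] <= max_uncertainty:
            return name, answer
    name, stage = stages[-1]
    answer, _ = stage(prompt)
    return name, answer
```

Passing stages in the order cheap model, reprompted cheap model, strong model places reprompt between the first attempt and the reroute, so the strong model runs only when both cheaper attempts are rejected.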