<p> The average correctness across all of our dataset's queries.</p>
</div>
<div className="metric-details">
<h4>Definition</h4>
<p>
We calculate accuracy as the average correctness of the answers generated by the router's selected models across all queries in our dataset.
</p>
<p>
<strong>Range:</strong> [0, 100]
</p>
</div>
</div>
<div className="metric-card">
<div className="metric-summary">
<h3>Cost/1k Queries</h3>
<p>Measures the cost incurred by a router’s routing decisions per 1000 queries.</p>
</div>
<div className="metric-details">
<h4>Definition</h4>
<p>
This is the average token cost incurred by the router's selected models per 1000 queries from our dataset.
<br/>
We obtain the per-token cost of each model a router chooses from the official API pricing published by its provider. For less widely used models that no commercial provider serves, we deploy the models ourselves for our experiments.
In such cases, we approximate their costs using the pricing tiers published by commercial hosting