Language Model operational metrics
Operational metrics for LanguageModel runtime counters.
All metrics in this file read counters that the LM populates on each
provider call. Routing across inference, reward, and optimizer
phases follows the synalinks_op_scope global flag set by the trainer.
Class hierarchy:
LMOperationalMetric (base, _phase = "inference")
├── LMRewardsOperationalMetric (_phase = "reward")
└── LMOptimizersOperationalMetric (_phase = "optimizer")
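The hierarchy above can be sketched as three classes that differ only in their _phase attribute. This is a minimal illustration, not the library's implementation: the Metric stand-in, the counter_name helper, the "cost" field, and the unprefixed inference counter prefix are assumptions. Only the reward_cumulated_* and optimizer_cumulated_* prefixes come from this page.

```python
class Metric:
    """Minimal stand-in for the framework's Metric base class (hypothetical)."""


class LMOperationalMetric(Metric):
    # Default phase: counters filled by regular inference calls.
    _phase = "inference"


class LMRewardsOperationalMetric(LMOperationalMetric):
    _phase = "reward"


class LMOptimizersOperationalMetric(LMOperationalMetric):
    _phase = "optimizer"


# Phase -> counter attribute prefix. The reward and optimizer prefixes are
# named on this page; the inference prefix is an assumption.
_PREFIX = {
    "inference": "cumulated_",
    "reward": "reward_cumulated_",
    "optimizer": "optimizer_cumulated_",
}


def counter_name(metric_cls, field):
    """Hypothetical helper: build the counter attribute name for a metric class."""
    return _PREFIX[metric_cls._phase] + field


optimizer_counter = counter_name(LMOptimizersOperationalMetric, "cost")
reward_counter = counter_name(LMRewardsOperationalMetric, "cost")
```

Because the phase is a class attribute, a phase-scoped metric only needs to inherit from the matching base to read from the right counter set.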
AvgCostPerCall
Bases: LMOperationalMetric
Average provider cost per LM call over this run.
Source code in synalinks/src/metrics/lm_metrics.py
AvgInputTokensPerCall
Bases: LMOperationalMetric
Average input tokens per LM call over this run.
AvgOptimizerCostPerCall
Bases: LMOptimizersOperationalMetric
Average LM-call cost during the optimizer step.
AvgOptimizerInputTokensPerCall
Bases: LMOptimizersOperationalMetric
Average input tokens per LM call during the optimizer step.
AvgOptimizerOutputTokensPerCall
Bases: LMOptimizersOperationalMetric
Average output tokens per LM call during the optimizer step.
AvgOutputTokensPerCall
Bases: LMOperationalMetric
Average output tokens per LM call over this run.
AvgRewardCostPerCall
Bases: LMRewardsOperationalMetric
Average LM-call cost during reward computation.
AvgRewardInputTokensPerCall
Bases: LMRewardsOperationalMetric
Average input tokens per LM call during reward computation.
AvgRewardOutputTokensPerCall
Bases: LMRewardsOperationalMetric
Average output tokens per LM call during reward computation.
CacheCreationTokens
Bases: LMOperationalMetric
Tokens written to the prompt cache during this run (reported by Anthropic
as cache_creation_input_tokens). Cache writes are billed at a higher rate
than regular input tokens.
CacheHitRate
Bases: LMOperationalMetric
Fraction of prompt tokens served from cache: cached_tokens / prompt_tokens.
Raising this value is one of the largest cost levers; aim for 0.7 or higher
on a production workload with stable system prompts.
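As a worked example of the ratio above, the following sketch computes the hit rate from two token counts. The function name and the zero-denominator guard are assumptions for illustration; how the library handles a run with no prompt tokens is not stated here.

```python
def cache_hit_rate(cached_tokens, prompt_tokens):
    """Fraction of prompt tokens served from cache.

    Returns 0.0 when no prompt tokens were recorded (an assumed convention).
    """
    if prompt_tokens == 0:
        return 0.0
    return cached_tokens / prompt_tokens


# 8,400 of 12,000 prompt tokens came from the provider-side cache.
rate = cache_hit_rate(cached_tokens=8_400, prompt_tokens=12_000)
meets_target = rate >= 0.7
```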
CachedTokens
Bases: LMOperationalMetric
Prompt tokens served from provider-side prompt cache during this run.
For Anthropic this is reported as cache_read_input_tokens; for OpenAI
as cached_tokens. LiteLLM normalizes both into
usage.prompt_tokens_details.cached_tokens.
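A minimal sketch of reading the normalized field, assuming the usage payload has already been converted to a plain dictionary. The helper name is hypothetical, and real usage objects may expose attributes rather than dictionary keys.

```python
def cached_tokens_from_usage(usage):
    """Return cached prompt tokens from a dict-shaped usage payload.

    Missing or null fields count as zero, since not every provider
    reports cache details.
    """
    details = usage.get("prompt_tokens_details") or {}
    return int(details.get("cached_tokens") or 0)


with_cache = {"prompt_tokens": 1200, "prompt_tokens_details": {"cached_tokens": 1024}}
without_details = {"prompt_tokens": 50}
null_details = {"prompt_tokens": 50, "prompt_tokens_details": None}

cached = [
    cached_tokens_from_usage(u)
    for u in (with_cache, without_details, null_details)
]
```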
Cost
Bases: LMOperationalMetric
Cumulated provider cost (USD, as reported by LiteLLM) for this run.
InputTokens
Bases: LMOperationalMetric
Cumulated input (prompt) tokens consumed during this run.
LMOperationalMetric
Bases: Metric
Base class for LanguageModel runtime-counter metrics.
Subclasses set _phase to one of "inference", "reward", or
"optimizer" to read from the corresponding counter set on each
bound LM. Counters are populated by the LM based on the
synalinks_op_scope global flag set by the trainer.
The metric binds itself automatically to every LanguageModel
reachable from the program (and their .fallback chains) when
program.compile() is called, and counters are summed across all.
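The binding and summing behavior described above can be sketched as follows. FakeLM, collect_with_fallbacks, and summed_cost are hypothetical names, and the deduplication of shared fallbacks is an assumption rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeLM:
    """Hypothetical stand-in for a LanguageModel exposing one counter."""
    cumulated_cost: float = 0.0
    fallback: Optional["FakeLM"] = None


def collect_with_fallbacks(lms):
    """Return each LM plus every LM on its .fallback chain, without duplicates."""
    seen, ordered = set(), []
    for lm in lms:
        # Walk the chain until it ends or reaches an LM already collected.
        while lm is not None and id(lm) not in seen:
            seen.add(id(lm))
            ordered.append(lm)
            lm = lm.fallback
    return ordered


def summed_cost(lms):
    """Sum the counter across all bound LMs, fallbacks included."""
    return sum(lm.cumulated_cost for lm in collect_with_fallbacks(lms))


backup = FakeLM(cumulated_cost=0.02)
primary = FakeLM(cumulated_cost=0.10, fallback=backup)
judge = FakeLM(cumulated_cost=0.05, fallback=backup)  # shares the same fallback

bound = collect_with_fallbacks([primary, judge])
total = summed_cost([primary, judge])
```

Tracking visited LMs by identity keeps a shared fallback from being counted twice in this sketch.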
LMOptimizersOperationalMetric
Bases: LMOperationalMetric
Base for LM metrics scoped to the optimizer phase.
Reads from each bound LM's optimizer_cumulated_* counters, which
the LM populates while Optimizer.optimize is running. LM calls made
during nested reward computation are excluded; they are counted in
the reward counters instead.
LMRewardsOperationalMetric
Bases: LMOperationalMetric
Base for LM metrics scoped to the reward-computation phase.
Reads from each bound LM's reward_cumulated_* counters, which the
LM populates while Trainer.compute_reward is running.
OptimizerCacheCreationTokens
Bases: LMOptimizersOperationalMetric
Tokens written to the prompt cache during the optimizer step.
OptimizerCacheHitRate
Bases: LMOptimizersOperationalMetric
Prompt cache hit rate during the optimizer step.
OptimizerCachedTokens
Bases: LMOptimizersOperationalMetric
Prompt tokens served from cache during the optimizer step.
OptimizerCost
Bases: LMOptimizersOperationalMetric
Provider cost (USD) of LM calls during the optimizer step.
OptimizerInputTokens
Bases: LMOptimizersOperationalMetric
Input (prompt) tokens consumed by LM calls during the optimizer step.
OptimizerOutputTokens
Bases: LMOptimizersOperationalMetric
Output tokens generated by LM calls during the optimizer step.
OptimizerReasoningTokenShare
Bases: LMOptimizersOperationalMetric
Reasoning share of completion tokens during the optimizer step.
OptimizerReasoningTokens
Bases: LMOptimizersOperationalMetric
Reasoning tokens produced during the optimizer step.
OptimizerThroughput
Bases: LMOptimizersOperationalMetric
LM calls per second (RPS) during the optimizer step.
OptimizerTokensPerSecond
Bases: LMOptimizersOperationalMetric
Throughput in tokens per second during the optimizer step.
OptimizerTotalTokens
Bases: LMOptimizersOperationalMetric
Total tokens (input + output) consumed by LM calls during the optimizer step.
OutputTokens
Bases: LMOperationalMetric
Cumulated output (completion) tokens generated during this run.
ReasoningTokenShare
Bases: LMOperationalMetric
Fraction of completion tokens spent on reasoning: reasoning_tokens / completion_tokens.
Indicates how much of its output a thinking model actually spends on reasoning for this workload.
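A short worked example of this share. The function name and the zero guard are assumptions, as is the premise that the completion token count already includes the reasoning tokens.

```python
def reasoning_token_share(reasoning_tokens, completion_tokens):
    """Fraction of completion tokens that were reasoning tokens.

    Returns 0.0 when no completion tokens were recorded (an assumed convention).
    """
    if completion_tokens == 0:
        return 0.0
    return reasoning_tokens / completion_tokens


completion_tokens = 2_000   # assumed to include reasoning tokens
reasoning_tokens = 1_500

share = reasoning_token_share(reasoning_tokens, completion_tokens)
# Tokens left for the visible answer under the assumption above.
visible_tokens = completion_tokens - reasoning_tokens
```

A share near zero on a thinking model suggests the workload rarely triggers reasoning; a share near one means most billed output never appears in the visible answer.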
ReasoningTokens
Bases: LMOperationalMetric
Reasoning (thinking) tokens produced during this run, for example by Claude extended thinking or OpenAI o-series models.
These tokens are not part of the visible completion content but are billed as output tokens.
RewardCacheCreationTokens
Bases: LMRewardsOperationalMetric
Tokens written to the prompt cache during reward computation.
RewardCacheHitRate
Bases: LMRewardsOperationalMetric
Prompt cache hit rate during reward computation.
RewardCachedTokens
Bases: LMRewardsOperationalMetric
Prompt tokens served from cache during reward computation.
RewardCost
Bases: LMRewardsOperationalMetric
Provider cost (USD) of LM calls during reward computation.
RewardInputTokens
Bases: LMRewardsOperationalMetric
Input (prompt) tokens consumed by LM calls during reward computation.
RewardOutputTokens
Bases: LMRewardsOperationalMetric
Output tokens generated by LM calls during reward computation.
RewardReasoningTokenShare
Bases: LMRewardsOperationalMetric
Reasoning share of completion tokens during reward computation.
RewardReasoningTokens
Bases: LMRewardsOperationalMetric
Reasoning tokens produced during reward computation.
RewardThroughput
Bases: LMRewardsOperationalMetric
LM calls per second (RPS) during reward computation.
RewardTokensPerSecond
Bases: LMRewardsOperationalMetric
Throughput in tokens per second during reward computation.
RewardTotalTokens
Bases: LMRewardsOperationalMetric
Total tokens (input + output) consumed by LM calls during reward computation.
Throughput
Bases: LMOperationalMetric
Throughput in LM calls per second (RPS) over this run.
TokensPerSecond
Bases: LMOperationalMetric
Throughput in total tokens per second over this run.
TotalTokens
Bases: LMOperationalMetric
Cumulated total tokens (input + output) for this run.
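The totals, per-call averages, and rates on this page all derive from a small set of raw counters. The sketch below shows the arithmetic under stated assumptions: the Counters fields, the helper names, and the zero-denominator convention are illustrative, and how the library measures elapsed time is not specified here.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    """Hypothetical raw counters for one phase; field names are illustrative."""
    calls: int
    input_tokens: int
    output_tokens: int
    cost: float
    elapsed_seconds: float


def safe_div(numerator, denominator):
    """Divide, returning 0.0 for a zero denominator (an assumed convention)."""
    return numerator / denominator if denominator else 0.0


def derived(c):
    """Compute the derived metrics from the raw counters."""
    total = c.input_tokens + c.output_tokens
    return {
        "total_tokens": total,
        "avg_cost_per_call": safe_div(c.cost, c.calls),
        "avg_input_tokens_per_call": safe_div(c.input_tokens, c.calls),
        "avg_output_tokens_per_call": safe_div(c.output_tokens, c.calls),
        "throughput": safe_div(c.calls, c.elapsed_seconds),          # calls/s
        "tokens_per_second": safe_div(total, c.elapsed_seconds),
    }


stats = derived(
    Counters(calls=40, input_tokens=20_000, output_tokens=4_000,
             cost=0.5, elapsed_seconds=80.0)
)
```

With these inputs, 40 calls over 80 seconds give a throughput of 0.5 calls per second, and 24,000 total tokens give 300 tokens per second.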