# Observability with MLflow
Synalinks provides built-in observability through MLflow, enabling you to trace and monitor your LM programs in production.
## Overview
The observability system automatically creates spans for each module call, capturing:
- Inputs and outputs of each module
- Duration of each call
- Cost information (when available from the language model)
- Success/failure status
- Parent-child relationships between nested module calls
## Quick Start

### Enable Observability

**Important:** You must call `enable_observability()` **before** creating any modules. Hooks are registered when modules are instantiated, so enabling observability after module creation will not trace those modules.
```python
import synalinks

# Enable FIRST, before creating any modules
synalinks.enable_observability(
    tracking_uri="http://localhost:5000",
    experiment_name="my_experiment"
)

# Now create your modules - they will be automatically traced
inputs = synalinks.Input(data_model=Question)
outputs = await synalinks.Generator(...)(inputs)
```
Once enabled, all module calls in your program will be automatically traced.
### Example Usage
```python
import asyncio
import synalinks

# Enable observability before creating your program
synalinks.enable_observability(
    tracking_uri="http://localhost:5000",
    experiment_name="question_answering"
)


class Question(synalinks.DataModel):
    question: str = synalinks.Field(description="The user's question")


class Answer(synalinks.DataModel):
    answer: str = synalinks.Field(description="The answer to the question")


async def main():
    language_model = synalinks.LanguageModel(model="openai/gpt-4.1")

    # Create a simple question-answering program
    inputs = synalinks.Input(data_model=Question)
    outputs = await synalinks.Generator(
        data_model=Answer,
        language_model=language_model,
    )(inputs)

    program = synalinks.Program(
        inputs=inputs,
        outputs=outputs,
        name="qa_program",
        description="A simple QA program",
    )

    # Run the program - traces will be sent to MLflow
    result = await program(Question(question="What is the capital of France?"))
    if result:
        print(result.prettify_json())


if __name__ == "__main__":
    asyncio.run(main())
```
## Running MLflow with Docker

### Using Docker

Run the MLflow tracking server locally with artifact proxying enabled:
```bash
docker run -d \
  --name mlflow \
  -p 5000:5000 \
  -v mlflow-data:/mlflow \
  ghcr.io/mlflow/mlflow:latest \
  mlflow server \
  --host 0.0.0.0 \
  --port 5000 \
  --backend-store-uri sqlite:///mlflow/mlflow.db \
  --default-artifact-root mlflow-artifacts:/ \
  --serve-artifacts \
  --artifacts-destination /mlflow/artifacts
```
Important flags:

- `--serve-artifacts`: Enables the MLflow server to proxy artifact uploads from clients
- `--default-artifact-root mlflow-artifacts:/`: Tells clients to use the server as an artifact proxy
- `--artifacts-destination /mlflow/artifacts`: Where the server stores artifacts on disk
Then configure Synalinks to use it:
```python
import synalinks

synalinks.enable_observability(
    tracking_uri="http://localhost:5000",
    experiment_name="synalinks_traces"
)
```
### Using Docker Compose

For a more complete setup with persistent storage, create a `docker-compose.yml`:
```yaml
services:
  mlflow:
    image: ghcr.io/mlflow/mlflow:latest
    container_name: mlflow
    ports:
      - "5000:5000"
    volumes:
      - ./mlflow-data:/mlflow
    command: >
      mlflow server
      --host 0.0.0.0
      --port 5000
      --backend-store-uri sqlite:///mlflow/mlflow.db
      --default-artifact-root mlflow-artifacts:/
      --serve-artifacts
      --artifacts-destination /mlflow/artifacts
    restart: unless-stopped
```
Start the services:
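```bash
docker compose up -d
```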
Access the MLflow UI at http://localhost:5000.
### Production Setup with PostgreSQL
For production deployments, use PostgreSQL as the backend store:
```yaml
services:
  postgres:
    image: postgres:16-alpine
    container_name: mlflow-postgres
    environment:
      POSTGRES_USER: mlflow
      POSTGRES_PASSWORD: mlflow
      POSTGRES_DB: mlflow
    volumes:
      - postgres-data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U mlflow"]
      interval: 5s
      timeout: 5s
      retries: 5
    restart: unless-stopped

  mlflow:
    image: ghcr.io/mlflow/mlflow:latest
    container_name: mlflow
    depends_on:
      postgres:
        condition: service_healthy
    ports:
      - "5000:5000"
    volumes:
      - mlflow-artifacts:/mlflow/artifacts
    environment:
      MLFLOW_BACKEND_STORE_URI: postgresql://mlflow:mlflow@postgres:5432/mlflow
    command: >
      mlflow server
      --host 0.0.0.0
      --port 5000
      --backend-store-uri postgresql://mlflow:mlflow@postgres:5432/mlflow
      --default-artifact-root mlflow-artifacts:/
      --serve-artifacts
      --artifacts-destination /mlflow/artifacts
    restart: unless-stopped

volumes:
  postgres-data:
  mlflow-artifacts:
```
## Understanding Traces
When you run a Synalinks program with observability enabled, MLflow captures detailed traces.
### Span Types
Synalinks automatically categorizes spans based on module type for better visualization in MLflow:
| Module | Span Type |
|---|---|
| `Generator`, `ChainOfThought`, `SelfCritique` | `LLM` |
| `FunctionCallingAgent` | `AGENT` |
| `EmbedKnowledge`, `RetrieveKnowledge`, `UpdateKnowledge` | `RETRIEVER` |
| `Tool` | `TOOL` |
| Other modules | `CHAIN` |
### Span Attributes
Each span includes these attributes:
| Attribute | Description |
|---|---|
| `synalinks.call_id` | Unique identifier for this call |
| `synalinks.parent_call_id` | ID of the parent call (for nested modules) |
| `synalinks.module` | Module class name (e.g., `Generator`) |
| `synalinks.module_name` | Custom name given to the module |
| `synalinks.module_description` | Module description |
| `synalinks.is_symbolic` | Whether the call was symbolic (graph building) |
| `synalinks.duration` | Call duration in seconds |
| `synalinks.success` | Whether the call succeeded |
| `synalinks.cost` | LLM API cost (when available) |
### Exception Events
When a module call fails, the span automatically records an exception event with:
- `exception.type`: The exception class name
- `exception.message`: The exception message
### Viewing Traces

- Open the MLflow UI at `http://localhost:5000`
- Navigate to your experiment
- Click on a run to see detailed traces
- Use the trace view to explore the call hierarchy
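You can also query traces programmatically through the MLflow client. The following is a minimal sketch, assuming a recent MLflow release with the tracing API (2.14+) and the `my_experiment` name from the Quick Start; the `synalinks.*` keys are the span attributes listed above.

```python
import mlflow
from mlflow.client import MlflowClient

mlflow.set_tracking_uri("http://localhost:5000")
client = MlflowClient()

# Look up the experiment used in enable_observability()
experiment = client.get_experiment_by_name("my_experiment")

# Fetch the most recent traces and print the synalinks-specific span attributes
for trace in client.search_traces(experiment_ids=[experiment.experiment_id], max_results=5):
    print(trace.info.request_id, trace.info.status)
    for span in trace.data.spans:
        print("  ", span.name, span.attributes.get("synalinks.module"))
```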
## Configuration Options

### Environment Variables
You can also configure MLflow using environment variables:
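```bash
# Read by the MLflow client when no tracking_uri is passed explicitly
export MLFLOW_TRACKING_URI=http://localhost:5000
```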
Then in your code:
```python
import synalinks

# Will use MLFLOW_TRACKING_URI from environment
synalinks.enable_observability(experiment_name="my_experiment")
```
### Direct Monitor Hook

For fine-grained control, you can create a `Monitor` hook directly:
```python
import synalinks

monitor = synalinks.hooks.Monitor(
    tracking_uri="http://localhost:5000",
    experiment_name="custom_experiment"
)

# Add to a specific module
generator = synalinks.Generator(
    data_model=Answer,
    hooks=[monitor]
)
```
## Training Metrics and Artifacts

The `Monitor` callback logs training metrics and program artifacts to MLflow during `fit()`.
### Basic Usage
```python
import synalinks

# Create the monitor callback
monitor = synalinks.callbacks.Monitor(
    tracking_uri="http://localhost:5000",
    experiment_name="training_experiment",
    run_name="my_training_run",
    log_program_plot=True,  # Save program visualization as artifact
)

# Use during training
program.fit(
    x=train_inputs,
    y=train_labels,
    epochs=10,
    callbacks=[monitor]
)
```
### Program Plot Artifact

When `log_program_plot=True` (the default), the `Monitor` callback automatically saves a visualization of your program architecture as an MLflow artifact at the start of training. The plot is saved under `program_plots/` in the artifacts folder and includes:
- Module names and types
- Input/output schemas
- Trainable status of each module
You can view the program plot in the MLflow UI under the "Artifacts" tab of your run.
### Program Model Artifact

When `log_program_model=True` (the default), the `Monitor` callback saves the program's trainable state at the end of training. This includes:

- `model/state_tree.json`: Contains all trainable variables (few-shot examples, optimized prompts, etc.)
- `model/model_info.json`: Metadata about the program (name, description, number of trainable variables)
This is useful for:
- Checkpointing learned parameters during optimization
- Comparing different training runs
- Restoring program state for inference (see the sketch below)
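To pull those artifacts back down after a run, you can use MLflow's artifact API. This is a hedged sketch: `<run-id>` is a placeholder for the training run created by the `Monitor` callback, and how you load the downloaded state back into a Synalinks program depends on the library's own loading utilities, which are not shown here.

```python
import json

import mlflow

mlflow.set_tracking_uri("http://localhost:5000")

# Download the saved trainable state from the training run
state_path = mlflow.artifacts.download_artifacts(
    run_id="<run-id>",  # placeholder: the MLflow run created by the Monitor callback
    artifact_path="model/state_tree.json",
)

with open(state_path) as f:
    state_tree = json.load(f)

print(f"Downloaded state tree with {len(state_tree)} top-level entries")
```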
### Callback Parameters

| Parameter | Default | Description |
|---|---|---|
| `experiment_name` | Program name | MLflow experiment name |
| `run_name` | Auto-generated | MLflow run name |
| `tracking_uri` | Local `./mlruns` | MLflow tracking server URI |
| `log_batch_metrics` | `False` | Log metrics at batch level |
| `log_epoch_metrics` | `True` | Log metrics at epoch level |
| `log_program_plot` | `True` | Save program visualization as artifact |
| `log_program_model` | `True` | Save program trainable state as artifact |
| `tags` | `{}` | Additional tags for the run |
### Example with Full Configuration
```python
import synalinks

monitor = synalinks.callbacks.Monitor(
    tracking_uri="http://localhost:5000",
    experiment_name="gsm8k_optimization",
    run_name="chain_of_thought_v1",
    log_batch_metrics=True,
    log_epoch_metrics=True,
    log_program_plot=True,
    tags={
        "model": "gpt-4o-mini",
        "optimizer": "RandomFewShot",
        "dataset": "gsm8k"
    }
)

program.fit(
    x=train_questions,
    y=train_answers,
    epochs=5,
    callbacks=[monitor]
)
```
## Combining Tracing with Training

When using both `enable_observability()` and the `Monitor` callback for training, traces are created in different experiments depending on the context:

- **During program building (symbolic calls)**: Traces go to the experiment specified in `enable_observability()`
- **During training (`fit()`)**: Traces are associated with the training run and go to the experiment specified in the `Monitor` callback
### Full Example
```python
import synalinks

# Enable tracing for all module calls
synalinks.enable_observability(
    tracking_uri="http://localhost:5000",
    experiment_name="synalinks_traces"  # Traces during setup go here
)

# Create your program (symbolic traces created here)
inputs = synalinks.Input(data_model=Question)
outputs = await synalinks.Generator(
    data_model=Answer,
    language_model=language_model,
)(inputs)
program = synalinks.Program(inputs=inputs, outputs=outputs, name="my_program")

# Create Monitor callback for training
monitor = synalinks.callbacks.Monitor(
    tracking_uri="http://localhost:5000",
    experiment_name="training_runs",  # Training metrics + traces go here
    run_name="experiment_v1",
)

# Train - traces during fit() are associated with the training run
program.compile(reward=reward, optimizer=optimizer)
await program.fit(x=train_x, y=train_y, epochs=5, callbacks=[monitor])
```
After training, you'll have:

- `synalinks_traces` experiment: Setup traces (symbolic module calls)
- `training_runs` experiment: Training run with metrics, artifacts, and execution traces
## Best Practices
- Enable observability early in your script, before creating any modules
- Use meaningful experiment names to organize your traces by project or feature
- Use persistent storage (PostgreSQL) for production deployments
- Set up retention policies to manage storage for long-running applications (see the sketch below)
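For the retention point above, one building block worth knowing is MLflow's garbage-collection command, which permanently removes runs that have already been marked as deleted. This is only a sketch against the SQLite store from the Docker example; a real retention policy would also schedule artifact cleanup.

```bash
# Permanently delete runs that are already in the "deleted" lifecycle stage
mlflow gc --backend-store-uri sqlite:///mlflow/mlflow.db
```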
## Troubleshooting

### No traces being created
If you don't see any traces in MLflow:
- **Check call order**: Ensure `enable_observability()` is called before creating any modules
- **Verify observability is enabled**: Check with `synalinks.is_observability_enabled()` (see the snippet below)
- **Check the correct experiment**: During training, traces go to the training experiment, not the observability experiment
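A quick way to check the first two items together, using the functions mentioned above:

```python
import synalinks

# Must run before any modules are created
synalinks.enable_observability(
    tracking_uri="http://localhost:5000",
    experiment_name="my_experiment",
)

# Should print True once observability is enabled
print(synalinks.is_observability_enabled())
```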
### MLflow not receiving traces
- Verify the MLflow server is running: `curl http://localhost:5000/health`
- Check that the tracking URI is correct
- Ensure the `mlflow` package is installed: `pip install mlflow`
### Artifacts not showing in MLflow UI
If artifacts are uploaded but don't appear in the MLflow UI:
- **Check server configuration**: Ensure the MLflow server is started with the `--serve-artifacts` flag
- **Verify artifact root**: The server must use `--default-artifact-root mlflow-artifacts:/` for remote clients
- **Check permissions**: The server needs write access to the `--artifacts-destination` path
Correct server configuration:
```bash
mlflow server \
  --serve-artifacts \
  --default-artifact-root mlflow-artifacts:/ \
  --artifacts-destination /mlflow/artifacts
```
**Common mistake**: Missing `--serve-artifacts` causes clients to try writing artifacts directly to the server's local filesystem path, which typically surfaces as permission errors on the client.
### Missing cost information
Cost tracking requires the language model to return usage information. Ensure your LLM provider supports this feature.