Observability with MLflow

Synalinks provides built-in observability through MLflow, enabling you to trace and monitor your LM programs in production.

Overview

The observability system automatically creates spans for each module call, capturing:

  • Inputs and outputs of each module
  • Duration of each call
  • Cost information (when available from the language model)
  • Success/failure status
  • Parent-child relationships between nested module calls
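
Because these spans are standard MLflow traces, you can also inspect them from Python. The following is a minimal sketch, assuming an MLflow 2.14 client (trace APIs and DataFrame column names vary somewhat across MLflow versions) and the experiment name used in the examples below:

import mlflow

mlflow.set_tracking_uri("http://localhost:5000")

# Look up the experiment and fetch its most recent traces as a DataFrame
experiment = mlflow.get_experiment_by_name("my_experiment")
traces = mlflow.search_traces(
    experiment_ids=[experiment.experiment_id],
    max_results=10,
)
print(traces[["request_id", "status", "execution_time_ms"]])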

Quick Start

Enable Observability

Important: You must call enable_observability() BEFORE creating any modules. Hooks are registered when modules are instantiated, so enabling observability after module creation will not trace those modules.

import synalinks

# Enable FIRST, before creating any modules
synalinks.enable_observability(
    tracking_uri="http://localhost:5000",
    experiment_name="my_experiment"
)

# Now create your modules - they will be automatically traced
inputs = synalinks.Input(data_model=Question)
outputs = await synalinks.Generator(...)(inputs)

Once enabled, all module calls in your program will be automatically traced.
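
To double-check that the hooks are active, use is_observability_enabled() (also referenced in the troubleshooting section below):

import synalinks

print(synalinks.is_observability_enabled())  # True once enabled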

Example Usage

import asyncio
import synalinks

# Enable observability before creating your program
synalinks.enable_observability(
    tracking_uri="http://localhost:5000",
    experiment_name="question_answering"
)


class Question(synalinks.DataModel):
    question: str = synalinks.Field(description="The user's question")


class Answer(synalinks.DataModel):
    answer: str = synalinks.Field(description="The answer to the question")


async def main():
    language_model = synalinks.LanguageModel(model="openai/gpt-4.1")

    # Create a simple question-answering program
    inputs = synalinks.Input(data_model=Question)
    outputs = await synalinks.Generator(
        data_model=Answer,
        language_model=language_model,
    )(inputs)

    program = synalinks.Program(
        inputs=inputs,
        outputs=outputs,
        name="qa_program",
        description="A simple QA program",
    )

    # Run the program - traces will be sent to MLflow
    result = await program(Question(question="What is the capital of France?"))
    if result:
        print(result.prettify_json())

if __name__ == "__main__":
    asyncio.run(main())

Running MLflow with Docker

Using Docker

Run an MLflow tracking server locally with artifact proxying enabled:

docker run -d \
    --name mlflow \
    -p 5000:5000 \
    -v mlflow-data:/mlflow \
    ghcr.io/mlflow/mlflow:latest \
    mlflow server \
    --host 0.0.0.0 \
    --port 5000 \
    --backend-store-uri sqlite:////mlflow/mlflow.db \
    --default-artifact-root mlflow-artifacts:/ \
    --serve-artifacts \
    --artifacts-destination /mlflow/artifacts

Important flags:

  • --serve-artifacts: Enables the MLflow server to proxy artifact uploads from clients
  • --default-artifact-root mlflow-artifacts:/: Tells clients to use the server as an artifact proxy
  • --artifacts-destination /mlflow/artifacts: Where the server stores artifacts on disk
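
To verify that artifact proxying works end to end, you can run a quick smoke test from any client machine (a sketch using standard MLflow APIs; the experiment name is illustrative):

import mlflow

mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("smoke_test")

# This upload goes through the server's artifact proxy
with mlflow.start_run():
    mlflow.log_text("hello from the client", "test.txt")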

Then configure Synalinks to use it:

import synalinks

synalinks.enable_observability(
    tracking_uri="http://localhost:5000",
    experiment_name="synalinks_traces"
)

Using Docker Compose

For a more complete setup with persistent storage, create a docker-compose.yml:

services:
  mlflow:
    image: ghcr.io/mlflow/mlflow:latest
    container_name: mlflow
    ports:
      - "5000:5000"
    volumes:
      - ./mlflow-data:/mlflow
    command: >
      mlflow server
      --host 0.0.0.0
      --port 5000
      --backend-store-uri sqlite:////mlflow/mlflow.db
      --default-artifact-root mlflow-artifacts:/
      --serve-artifacts
      --artifacts-destination /mlflow/artifacts
    restart: unless-stopped

Start the services:

docker compose up -d

Access the MLflow UI at http://localhost:5000.

Production Setup with PostgreSQL

For production deployments, use PostgreSQL as the backend store:

services:
  postgres:
    image: postgres:16-alpine
    container_name: mlflow-postgres
    environment:
      POSTGRES_USER: mlflow
      POSTGRES_PASSWORD: mlflow
      POSTGRES_DB: mlflow
    volumes:
      - postgres-data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U mlflow"]
      interval: 5s
      timeout: 5s
      retries: 5
    restart: unless-stopped

  mlflow:
    image: ghcr.io/mlflow/mlflow:latest
    container_name: mlflow
    depends_on:
      postgres:
        condition: service_healthy
    ports:
      - "5000:5000"
    volumes:
      - mlflow-artifacts:/mlflow/artifacts
    environment:
      MLFLOW_BACKEND_STORE_URI: postgresql://mlflow:mlflow@postgres:5432/mlflow
    command: >
      mlflow server
      --host 0.0.0.0
      --port 5000
      --backend-store-uri postgresql://mlflow:mlflow@postgres:5432/mlflow
      --default-artifact-root mlflow-artifacts:/
      --serve-artifacts
      --artifacts-destination /mlflow/artifacts
    restart: unless-stopped

volumes:
  postgres-data:
  mlflow-artifacts:

Understanding Traces

When you run a Synalinks program with observability enabled, MLflow captures detailed traces.

Span Types

Synalinks automatically categorizes spans based on module type for better visualization in MLflow:

Module                                              Span Type
--------------------------------------------------  ---------
Generator, ChainOfThought, SelfCritique             LLM
FunctionCallingAgent                                AGENT
EmbedKnowledge, RetrieveKnowledge, UpdateKnowledge  RETRIEVER
Tool                                                TOOL
Other modules                                       CHAIN

Span Attributes

Each span includes these attributes:

Attribute                     Description
----------------------------  ----------------------------------------------
synalinks.call_id             Unique identifier for this call
synalinks.parent_call_id      ID of the parent call (for nested modules)
synalinks.module              Module class name (e.g., Generator)
synalinks.module_name         Custom name given to the module
synalinks.module_description  Module description
synalinks.is_symbolic         Whether the call was symbolic (graph building)
synalinks.duration            Call duration in seconds
synalinks.success             Whether the call succeeded
synalinks.cost                LLM API cost (when available)

Exception Events

When a module call fails, the span automatically records an exception event with:

  • exception.type: The exception class name
  • exception.message: The exception message
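
For example, a failing call raises in your code and records the matching event on the span (a sketch reusing the qa_program from above; the empty question is just a hypothetical way to trigger a failure):

try:
    result = await program(Question(question=""))
except Exception as exc:
    # exception.type and exception.message on the span match these values
    print(type(exc).__name__, str(exc))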

Viewing Traces

  1. Open the MLflow UI at http://localhost:5000
  2. Navigate to your experiment
  3. Click on a run to see detailed traces
  4. Use the trace view to explore the call hierarchy
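
You can also fetch a single trace from Python by its request ID (shown in the UI, or in the request_id column returned by mlflow.search_traces). A sketch, assuming an MLflow 2.x client; the placeholder ID is yours to fill in:

import mlflow

client = mlflow.MlflowClient(tracking_uri="http://localhost:5000")
trace = client.get_trace("<request-id>")
print(trace.info.status, trace.info.execution_time_ms)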

Configuration Options

Environment Variables

You can also configure MLflow using environment variables:

export MLFLOW_TRACKING_URI=http://localhost:5000

Then in your code:

import synalinks

# Will use MLFLOW_TRACKING_URI from environment
synalinks.enable_observability(experiment_name="my_experiment")
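
The same variable can be set from Python, as long as that happens before enable_observability() is called:

import os

import synalinks

os.environ["MLFLOW_TRACKING_URI"] = "http://localhost:5000"
synalinks.enable_observability(experiment_name="my_experiment")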

Direct Monitor Hook

For fine-grained control, you can create a Monitor hook directly:

import synalinks

monitor = synalinks.hooks.Monitor(
    tracking_uri="http://localhost:5000",
    experiment_name="custom_experiment"
)

# Add to a specific module
generator = synalinks.Generator(
    data_model=Answer,
    hooks=[monitor]
)

Training Metrics and Artifacts

The Monitor callback logs training metrics and program artifacts to MLflow during fit().

Basic Usage

import synalinks

# Create the monitor callback
monitor = synalinks.callbacks.Monitor(
    tracking_uri="http://localhost:5000",
    experiment_name="training_experiment",
    run_name="my_training_run",
    log_program_plot=True,  # Save program visualization as artifact
)

# Use during training
program.fit(
    x=train_inputs,
    y=train_labels,
    epochs=10,
    callbacks=[monitor]
)

Program Plot Artifact

When log_program_plot=True (the default), the Monitor callback automatically saves a visualization of your program architecture as an MLflow artifact at the start of training.

The plot is saved under program_plots/ in the artifacts folder and includes:

  • Module names and types
  • Input/output schemas
  • Trainable status of each module

You can view the program plot in the MLflow UI under the "Artifacts" tab of your run.
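
The plot can also be downloaded programmatically with MLflow's artifact API (a sketch; replace the placeholder with the run ID that the Monitor callback created):

import mlflow

mlflow.set_tracking_uri("http://localhost:5000")

# Download the program_plots/ folder from a finished run
local_dir = mlflow.artifacts.download_artifacts(
    run_id="<your-run-id>",
    artifact_path="program_plots",
)
print("Saved to", local_dir)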

Program Model Artifact

When log_program_model=True (the default), the Monitor callback saves the program's trainable state at the end of training. This includes:

  • model/state_tree.json: Contains all trainable variables (few-shot examples, optimized prompts, etc.)
  • model/model_info.json: Metadata about the program (name, description, number of trainable variables)

This is useful for:

  • Checkpointing learned parameters during optimization
  • Comparing different training runs
  • Restoring program state for inference (see the sketch below)
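
A sketch of the last use case, reading the saved state back through MLflow's artifact API (the run ID placeholder is yours to fill in):

import json

import mlflow

mlflow.set_tracking_uri("http://localhost:5000")

# Fetch the saved trainable state from a training run
path = mlflow.artifacts.download_artifacts(
    run_id="<your-run-id>",
    artifact_path="model/state_tree.json",
)
with open(path) as f:
    state_tree = json.load(f)  # few-shot examples, optimized prompts, ...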

Callback Parameters

Parameter          Default         Description
-----------------  --------------  -----------------------------------------
experiment_name    Program name    MLflow experiment name
run_name           Auto-generated  MLflow run name
tracking_uri       Local ./mlruns  MLflow tracking server URI
log_batch_metrics  False           Log metrics at batch level
log_epoch_metrics  True            Log metrics at epoch level
log_program_plot   True            Save program visualization as artifact
log_program_model  True            Save program trainable state as artifact
tags               {}              Additional tags for the run

Example with Full Configuration

import synalinks

monitor = synalinks.callbacks.Monitor(
    tracking_uri="http://localhost:5000",
    experiment_name="gsm8k_optimization",
    run_name="chain_of_thought_v1",
    log_batch_metrics=True,
    log_epoch_metrics=True,
    log_program_plot=True,
    tags={
        "model": "gpt-4o-mini",
        "optimizer": "RandomFewShot",
        "dataset": "gsm8k"
    }
)

program.fit(
    x=train_questions,
    y=train_answers,
    epochs=5,
    callbacks=[monitor]
)

Combining Tracing with Training

When using both enable_observability() and the Monitor callback for training, traces are created in different experiments depending on the context:

  1. During program building (symbolic calls): Traces go to the experiment specified in enable_observability()

  2. During training (fit()): Traces are associated with the training run and go to the experiment specified in the Monitor callback

Full Example

import synalinks

# Question, Answer, language_model, reward and optimizer are assumed to be
# defined as in the earlier examples; run this snippet inside an async function.

# Enable tracing for all module calls
synalinks.enable_observability(
    tracking_uri="http://localhost:5000",
    experiment_name="synalinks_traces"  # Traces during setup go here
)

# Create your program (symbolic traces created here)
inputs = synalinks.Input(data_model=Question)
outputs = await synalinks.Generator(
    data_model=Answer,
    language_model=language_model,
)(inputs)

program = synalinks.Program(inputs=inputs, outputs=outputs, name="my_program")

# Create Monitor callback for training
monitor = synalinks.callbacks.Monitor(
    tracking_uri="http://localhost:5000",
    experiment_name="training_runs",  # Training metrics + traces go here
    run_name="experiment_v1",
)

# Train - traces during fit() are associated with the training run
program.compile(reward=reward, optimizer=optimizer)
await program.fit(x=train_x, y=train_y, epochs=5, callbacks=[monitor])

After training, you'll have:

  • synalinks_traces experiment: Setup traces (symbolic module calls)
  • training_runs experiment: Training run with metrics, artifacts, and execution traces

Best Practices

  1. Enable observability early in your script, before creating any modules
  2. Use meaningful experiment names to organize your traces by project or feature
  3. Use persistent storage (PostgreSQL) for production deployments
  4. Set up retention policies to manage storage for long-running applications (see the sketch below)
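
MLflow does not expire traces on its own, so a periodic cleanup job is a common approach. A sketch, assuming an MLflow 2.14 client (where MlflowClient.delete_traces is available), the synalinks_traces experiment from above, and a 30-day retention window:

import time

import mlflow

client = mlflow.MlflowClient(tracking_uri="http://localhost:5000")
experiment = client.get_experiment_by_name("synalinks_traces")

# Delete all traces in this experiment older than 30 days
cutoff_ms = int((time.time() - 30 * 24 * 3600) * 1000)
deleted = client.delete_traces(
    experiment_id=experiment.experiment_id,
    max_timestamp_millis=cutoff_ms,
)
print(f"Deleted {deleted} traces")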

Troubleshooting

No traces being created

If you don't see any traces in MLflow:

  1. Check call order: Ensure enable_observability() is called before creating any modules

    # Wrong - modules created before enabling observability
    inputs = synalinks.Input(data_model=Question)
    synalinks.enable_observability()  # Too late!
    
    # Correct - enable first
    synalinks.enable_observability()
    inputs = synalinks.Input(data_model=Question)  # Now traces will be created
    

  2. Verify observability is enabled: Check with synalinks.is_observability_enabled()

  3. Check the correct experiment: During training, traces go to the training experiment, not the observability experiment

MLflow not receiving traces

  1. Verify the MLflow server is running: curl http://localhost:5000/health
  2. Check the tracking URI is correct
  3. Ensure mlflow package is installed: pip install mlflow
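
If those all check out, a quick Python-side check confirms that the client can reach the server:

import mlflow

mlflow.set_tracking_uri("http://localhost:5000")

# Should list at least the Default experiment without raising
for exp in mlflow.search_experiments():
    print(exp.experiment_id, exp.name)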

Artifacts not showing in MLflow UI

If artifacts are uploaded but don't appear in the MLflow UI:

  1. Check server configuration: Ensure the MLflow server is started with --serve-artifacts flag
  2. Verify artifact root: The server must use --default-artifact-root mlflow-artifacts:/ for remote clients
  3. Check permissions: The server needs write access to --artifacts-destination path

Correct server configuration:

mlflow server \
    --serve-artifacts \
    --default-artifact-root mlflow-artifacts:/ \
    --artifacts-destination /mlflow/artifacts

Common mistake - Missing --serve-artifacts causes clients to try writing directly to the server's local filesystem, resulting in permission errors like:

PermissionError: [Errno 13] Permission denied: '/mlflow'

Missing cost information

Cost tracking requires the language model to return usage information. Ensure your LLM provider supports this feature.