JSON Operations

In Lesson 5a, you learned about the data model operators (+, &, |, ^, ~). This lesson covers JSON operations: functions that transform, filter, and reshape data models.

Categories of Operations

1. Masking Operations (Filtering Fields)

| Operation | Description | Example |
|-----------|-------------|---------|
| in_mask | Keep only specified fields | ops.in_mask(data, mask=["answer"]) |
| out_mask | Remove specified fields | ops.out_mask(data, mask=["thinking"]) |
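
To make the difference between the two masks concrete, here is a minimal sketch on plain Python dictionaries. The helpers `in_mask_sketch` and `out_mask_sketch` are hypothetical stand-ins written for this illustration; they are not the synalinks implementation, which works on data models rather than dicts.

```python
def in_mask_sketch(data: dict, mask: list[str]) -> dict:
    """Keep only the keys listed in `mask` (hypothetical helper)."""
    return {key: value for key, value in data.items() if key in mask}


def out_mask_sketch(data: dict, mask: list[str]) -> dict:
    """Drop the keys listed in `mask`, keep everything else (hypothetical helper)."""
    return {key: value for key, value in data.items() if key not in mask}


data = {"thinking": "2 plus 2 equals 4", "answer": "4"}

kept = in_mask_sketch(data, mask=["answer"])
dropped = out_mask_sketch(data, mask=["thinking"])

print(kept)
print(dropped)
```

On a two-field model both calls leave the same single field; they diverge as soon as the model has more fields, since `in_mask` names what survives and `out_mask` names what is discarded.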

2. Renaming Operations

| Operation | Description | Example |
|-----------|-------------|---------|
| prefix | Add prefix to field names | ops.prefix(data, prefix="v1_") |
| suffix | Add suffix to field names | ops.suffix(data, suffix="_draft") |
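
The renaming operations change field names and leave the values alone. A plain-dict sketch of that behavior, using hypothetical helpers `prefix_sketch` and `suffix_sketch` (not the library functions), might look like this:

```python
def prefix_sketch(data: dict, prefix: str) -> dict:
    """Prepend `prefix` to every field name; values stay as they are."""
    return {f"{prefix}{key}": value for key, value in data.items()}


def suffix_sketch(data: dict, suffix: str) -> dict:
    """Append `suffix` to every field name; values stay as they are."""
    return {f"{key}{suffix}": value for key, value in data.items()}


data = {"thinking": "2 plus 2 equals 4", "answer": "4"}

prefixed = prefix_sketch(data, prefix="v1_")
suffixed = suffix_sketch(data, suffix="_draft")

print(list(prefixed.keys()))
print(list(suffixed.keys()))
```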

3. Aggregation Operations

| Operation | Description | Example |
|-----------|-------------|---------|
| factorize | Group similar fields into lists | ops.factorize(combined) |
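
The idea of grouping similar fields can be sketched on a plain dictionary. This sketch assumes that "similar" means field names that differ only by a trailing numeric suffix such as `_1`, and that the grouped field gets a naive plural name. Both rules are assumptions made for illustration, and `factorize_sketch` is a hypothetical helper; the library's actual matching and naming rules may differ.

```python
import re
from collections import defaultdict


def factorize_sketch(data: dict) -> dict:
    """Group fields that share a base name into one list field (hypothetical helper)."""
    groups = defaultdict(list)
    for key, value in data.items():
        # Assumption: "answer" and "answer_1" share the base name "answer".
        base = re.sub(r"_\d+$", "", key)
        groups[base].append((key, value))

    result = {}
    for base, members in groups.items():
        if len(members) == 1:
            # A field with no siblings is passed through unchanged.
            key, value = members[0]
            result[key] = value
        else:
            # Assumption: the grouped field is named with a naive plural.
            result[f"{base}s"] = [value for _, value in members]
    return result


combined = {"answer": "4", "answer_1": "four", "query": "What is 2 + 2?"}
factorized = factorize_sketch(combined)
print(factorized)
```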

4. Logical Operations (Function Form)

| Operation | Equivalent | Description |
|-----------|------------|-------------|
| ops.concat | + | Merge fields with custom naming |
| ops.logical_and | & | Safe merge |
| ops.logical_or | \| | First non-None |
| ops.logical_xor | ^ | Exactly one non-None |
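
The None-handling described in the table can be sketched with optional dictionaries. The helpers below are hypothetical and follow the table's wording literally. Two points are assumptions: that "safe merge" means returning None when either input is None, and that the merge itself is a plain dict union. The library's behavior, especially when both inputs are present, may differ.

```python
from typing import Optional


def logical_and_sketch(a: Optional[dict], b: Optional[dict]) -> Optional[dict]:
    """Assumed 'safe merge': None if either input is missing, else a merged dict."""
    if a is None or b is None:
        return None
    return {**a, **b}


def logical_or_sketch(a: Optional[dict], b: Optional[dict]) -> Optional[dict]:
    """Return the first input that is not None."""
    return a if a is not None else b


def logical_xor_sketch(a: Optional[dict], b: Optional[dict]) -> Optional[dict]:
    """Return an input only when exactly one of the two is not None."""
    if (a is None) == (b is None):
        return None
    return a if a is not None else b


answer = {"answer": "4"}
source = {"source": "arithmetic"}

print(logical_and_sketch(answer, None))
print(logical_or_sketch(None, source))
print(logical_xor_sketch(answer, source))
```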

Why Use These Operations?

graph LR
    subgraph Before
        A[thinking: ...<br/>answer: 42]
    end
    subgraph in_mask
        B[answer: 42]
    end
    A -->|in_mask| B
  1. Data Preparation: Transform data before passing to next module
  2. Field Selection: Keep only relevant fields for downstream processing
  3. Conflict Resolution: Rename fields to avoid collisions when merging
  4. Aggregation: Combine multiple similar outputs into lists
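
The conflict-resolution point is easiest to see with two outputs that share a field name. The sketch below uses plain dictionaries and a hypothetical `prefix_sketch` helper rather than the library itself: a naive merge silently overwrites one answer, while renaming first keeps both.

```python
def prefix_sketch(data: dict, prefix: str) -> dict:
    """Prepend `prefix` to every field name (hypothetical helper)."""
    return {f"{prefix}{key}": value for key, value in data.items()}


# Two module outputs that both expose an "answer" field.
draft_a = {"answer": "4"}
draft_b = {"answer": "four"}

# Naive dict merge: the second "answer" overwrites the first.
naive = {**draft_a, **draft_b}

# Renaming before the merge keeps both values under distinct names.
merged = {**prefix_sketch(draft_a, "v1_"), **prefix_sketch(draft_b, "v2_")}

print(naive)
print(merged)
```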

Complete Example: Filtering Fields

import asyncio
from dotenv import load_dotenv
import synalinks

class Query(synalinks.DataModel):
    query: str = synalinks.Field(description="The user query")

class AnswerWithThinking(synalinks.DataModel):
    thinking: str = synalinks.Field(description="Your step by step thinking")
    answer: str = synalinks.Field(description="The correct answer")

async def main():
    load_dotenv()
    language_model = synalinks.LanguageModel(model="openai/gpt-4.1")

    inputs = synalinks.Input(data_model=Query)
    x = await synalinks.Generator(
        data_model=AnswerWithThinking,
        language_model=language_model,
    )(inputs)

    # Keep only the "answer" field, discard "thinking"
    outputs = await synalinks.ops.in_mask(x, mask=["answer"])

    program = synalinks.Program(inputs=inputs, outputs=outputs)

    result = await program(Query(query="What is 2 + 2?"))
    print(f"Fields: {list(result.keys())}")  # Only ['answer']

asyncio.run(main())

Key Takeaways

  • in_mask: Keep only specified fields from a data model. Useful for filtering out intermediate fields like "thinking".
  • out_mask: Remove specified fields, keeping all others.
  • prefix/suffix: Add a constant prefix or suffix to field names, which avoids collisions when merging.
  • factorize: Group similar fields into lists, combining multiple similar outputs into one field.
  • Training Integration: Use masks to evaluate only relevant fields when computing rewards during training.
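
To illustrate why masking matters for rewards, here is a minimal sketch of an exact-match score computed on plain dictionaries. The function `exact_match_sketch` and its `in_mask` parameter are hypothetical names chosen for this illustration; they do not describe the library's reward API.

```python
from typing import Optional


def exact_match_sketch(
    y_true: dict, y_pred: dict, in_mask: Optional[list[str]] = None
) -> float:
    """Return 1.0 if the compared fields match exactly, else 0.0 (hypothetical helper)."""
    if in_mask is not None:
        # Restrict the comparison to the fields named in the mask.
        y_true = {k: v for k, v in y_true.items() if k in in_mask}
        y_pred = {k: v for k, v in y_pred.items() if k in in_mask}
    return 1.0 if y_true == y_pred else 0.0


expected = {"thinking": "2 + 2 = 4", "answer": "4"}
predicted = {"thinking": "Adding two and two gives four", "answer": "4"}

unmasked_reward = exact_match_sketch(expected, predicted)
masked_reward = exact_match_sketch(expected, predicted, in_mask=["answer"])

print(unmasked_reward, masked_reward)
```

Without the mask, differently worded reasoning makes a correct prediction score zero; with the mask, only the "answer" field is compared.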

Program Visualizations

[Figures: in_mask_example, factorize_example]

API References

Answer

Bases: DataModel

A simple answer.

Source code in examples/5b_json_ops.py
class Answer(synalinks.DataModel):
    """A simple answer."""

    answer: str = synalinks.Field(description="The correct answer")

AnswerWithThinking

Bases: DataModel

An answer with reasoning.

Source code in examples/5b_json_ops.py
class AnswerWithThinking(synalinks.DataModel):
    """An answer with reasoning."""

    thinking: str = synalinks.Field(description="Your step by step thinking")
    answer: str = synalinks.Field(description="The correct answer")

Query

Bases: DataModel

A user query.

Source code in examples/5b_json_ops.py
class Query(synalinks.DataModel):
    """A user query."""

    query: str = synalinks.Field(description="The user query")