CypherAgent module

`CypherAgent`

Bases: FunctionCallingAgent

A ready-to-use Cypher agent backed by a knowledge base.

CypherAgent is a thin specialization of FunctionCallingAgent that pre-wires three Cypher tools bound to the graph adapter of a KnowledgeBase:

- `get_graph_schema`: discovers all node and relation labels with their properties.
- `get_node_sample`: fetches a few nodes from a given label so the LM can see the data shape before writing queries.
- `run_cypher_query`: executes a read-only Cypher query via `KnowledgeBase.cypher` with `read_only=True`.

The constructor mirrors FunctionCallingAgent — every parameter on that class is accepted here with identical semantics. The only additions are knowledge_base (required, must expose a graph adapter) and output_format (controls the Cypher tools' result rendering). User-supplied tools are appended to the three built-in tools.
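
The append-and-reject rule for user-supplied tools can be pictured with a small standalone sketch. It does not import synalinks: `merge_tool_names` is a hypothetical helper, tools are stood in for by plain async functions identified by `__name__`, and the error message is invented for illustration. Only the three built-in names and the `ValueError` on collision come from this page.

```python
# Names of the three built-in Cypher tools, as listed on this page.
BUILTIN_TOOL_NAMES = ("get_graph_schema", "get_node_sample", "run_cypher_query")


def merge_tool_names(user_tools):
    """Return built-in names followed by user tool names.

    Hypothetical helper: raises ValueError when a user tool reuses a
    built-in name, mirroring the documented collision rule.
    """
    merged = list(BUILTIN_TOOL_NAMES)
    for tool in user_tools or []:
        name = tool.__name__
        if name in BUILTIN_TOOL_NAMES:
            raise ValueError(f"Tool name {name!r} collides with a built-in tool")
        merged.append(name)
    return merged


async def calculator(expression: str):
    """Illustrative extra tool; its body is irrelevant to the sketch."""
    return {"result": expression}


async def run_cypher_query(query: str):
    """Deliberately reuses a built-in name to trigger the collision."""
    return {}


print(merge_tool_names([calculator]))

try:
    merge_tool_names([run_cypher_query])
except ValueError as error:
    print(f"rejected: {error}")
```

In this sketch the built-ins always come first and the user tools keep their given order, which matches the "appended" wording above.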

Safety is enforced by the knowledge base, not by string filtering. The Ladybug adapter scans the query (after stripping comments and string literals) for write/admin keywords and rejects anything that could mutate state or access files (CREATE, MERGE, SET, DELETE, DETACH, REMOVE, DROP, ALTER, COPY, INSTALL, LOAD).
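
To make the scanning step concrete, here is a minimal standalone sketch of a keyword check that first strips comments and string literals. It is not the Ladybug adapter's code: the function name `find_write_keyword`, the regular expressions, and the assumed comment syntax (`//` and `/* ... */`) and quoting rules are illustrative assumptions. Only the keyword list is taken from the paragraph above.

```python
import re

# Keyword list copied from the description above.
WRITE_KEYWORDS = (
    "CREATE", "MERGE", "SET", "DELETE", "DETACH", "REMOVE",
    "DROP", "ALTER", "COPY", "INSTALL", "LOAD",
)

# Assumed lexical rules: block comments, line comments, and single- or
# double-quoted strings with backslash escapes.
_COMMENTS_AND_STRINGS = re.compile(
    r"""
      /\*.*?\*/              # block comment
    | //[^\n]*               # line comment
    | '(?:\\.|[^'\\])*'      # single-quoted string
    | "(?:\\.|[^"\\])*"      # double-quoted string
    """,
    re.DOTALL | re.VERBOSE,
)

_KEYWORD = re.compile(r"\b(" + "|".join(WRITE_KEYWORDS) + r")\b", re.IGNORECASE)


def find_write_keyword(query):
    """Return the first write/admin keyword found outside comments and
    string literals (upper-cased), or None if the query looks read-only."""
    stripped = _COMMENTS_AND_STRINGS.sub(" ", query)
    match = _KEYWORD.search(stripped)
    return match.group(1).upper() if match else None


# A keyword inside a string literal or a comment is ignored.
print(find_write_keyword("MATCH (n) WHERE n.name = 'CREATE' RETURN n // DELETE"))
# A keyword in the query body is reported regardless of case.
print(find_write_keyword("MATCH (n) detach delete n"))
```

A check of this kind is conservative: a property that happens to be named `set` would also be flagged, so a real adapter may apply additional rules.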

Example:

import synalinks
import asyncio

class Person(synalinks.Entity):
    name: str = synalinks.Field(description="Person name")

class City(synalinks.Entity):
    name: str = synalinks.Field(description="City name")

class LivesIn(synalinks.Relation):
    subj: Person
    obj: City

class Query(synalinks.DataModel):
    query: str = synalinks.Field(description="Natural language question")

class CypherAnswer(synalinks.DataModel):
    answer: str = synalinks.Field(description="Answer in natural language")
    cypher_query: str = synalinks.Field(description="Cypher that produced it")

async def main():
    kb = synalinks.KnowledgeBase(
        graph_uri="ladybug://my_graph.lb",
        entity_models=[Person, City],
        relation_models=[LivesIn],
        embedding_model=synalinks.EmbeddingModel(model="ollama/mxbai-embed-large"),
    )
    await kb.update_relations(
        LivesIn(subj=Person(name="Alice"), obj=City(name="Paris"))
    )

    lm = synalinks.LanguageModel(model="ollama/mistral")

    inputs = synalinks.Input(data_model=Query)
    outputs = await synalinks.CypherAgent(
        knowledge_base=kb,
        language_model=lm,
        data_model=CypherAnswer,
    )(inputs)
    agent = synalinks.Program(inputs=inputs, outputs=outputs)

    result = await agent(Query(query="Who lives in Paris?"))
    print(result.get("answer"))
    print(result.get("cypher_query"))

asyncio.run(main())

Parameters:

Name	Type	Description	Default
`knowledge_base`	`KnowledgeBase`	The knowledge base to query. Must have a graph adapter attached (i.e. constructed with `graph_uri=...`). Required.	`None`
`k`	`int`	Maximum page size (rows per call) the LM can pull through `get_node_sample` and `run_cypher_query`. `get_node_sample` clamps the LM's `limit` argument to `min(limit, k)`. `run_cypher_query` post-caps the engine result to `k` rows so even unbounded `MATCH (n) RETURN n` queries can't drain a large label into the conversation. A `LIMIT` inside the LM's own query still applies first. Defaults to 50.	`50`
`output_format`	`str`	How the Cypher tools render result sets to the LM. `"csv"` (default) is compact and minimizes input tokens; `"json"` returns a list of dicts. Applies to both `get_node_sample` and `run_cypher_query`.	`'csv'`
`tools`	`list`	Additional `Tool` instances (or plain async functions) to expose alongside the three built-in Cypher tools — for example a calculator, a datetime helper, a web-search tool. Tool names must not collide with the built-ins (`get_graph_schema`, `get_node_sample`, `run_cypher_query`) or a `ValueError` is raised.	`None`
`schema`	`dict`	JSON schema for the final answer.	`None`
`data_model`	`DataModel`	DataModel for the final answer. Mutually exclusive with `schema`.	`None`
`language_model`	`LanguageModel`	The language model that drives the agent loop.	`None`
`prompt_template`	`str`	Forwarded to the tool-call generator.	`None`
`examples`	`list`	Few-shot examples for the tool-call generator.	`None`
`instructions`	`str`	Override the default system instructions. When omitted, the default is built from the knowledge base's node and relation labels so the LM knows what's available without an extra schema call.	`None`
`final_instructions`	`str`	Instructions for the final-answer generator. Defaults to `instructions`.	`None`
`temperature`	`float`	LM sampling temperature. Defaults to `None` (the model's own default applies). Lower values make Cypher generation more deterministic.	`None`
`max_tokens`	`int`	Optional. Maximum number of tokens to generate, which caps generation length. Defaults to `None` (the model's own default).	`None`
`top_p`	`float`	Optional. Nucleus sampling probability. Defaults to `None` (the model's own default).	`None`
`top_k`	`int`	Optional. Top-k sampling cutoff. Defaults to `None` (the model's own default).	`None`
`use_inputs_schema`	`bool`	Include the input schema in the prompt.	`False`
`use_outputs_schema`	`bool`	Include the output schema in the prompt.	`False`
`reasoning_effort`	`str`	Forwarded to the generators (for reasoning-capable LMs).	`None`
`use_chain_of_thought`	`bool`	When `True`, the tool-call generator emits a `thinking` field per round.	`False`
`autonomous`	`bool`	When `True` (default), the agent runs the tool loop end-to-end. When `False`, returns one step at a time for human-in-the-loop workflows.	`True`
`return_inputs_with_trajectory`	`bool`	When `True` (default), the full message trajectory is included alongside the final answer.	`True`
`max_iterations`	`int`	Maximum number of tool-call rounds. Defaults to 5.	`5`
`streaming`	`bool`	Stream the final answer when no `schema` is set. Defaults to `False`.	`False`
`workdir`	`str`	Optional. Path to a working directory. When it contains an `AGENTS.md` file, its contents are injected as an additional input so the agent follows the declared project conventions. Defaults to `None`.	`None`
`skills`	`list`	Optional. Folder paths (Agent Skill roots) whose skills are listed for the agent as an `<available_skills>` context message (see `FunctionCallingAgent`). Defaults to `None`.	`None`
`name`	`str`	Module name.	`None`
`description`	`str`	Module description.	`None`
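
The interaction between `k` and `output_format` can be illustrated with a standalone sketch. The helper names (`clamp_limit`, `cap_rows`, `render_rows`) are hypothetical and the exact CSV and JSON layout produced by the real tools is not documented here, so treat the rendering as an assumption. The `min(limit, k)` clamp, the post-cap to `k` rows, and the `may_have_more` flag are the parts taken from this page.

```python
import csv
import io
import json


def clamp_limit(limit, k):
    """Clamp the LM-supplied limit to the page size k."""
    return min(limit, k)


def cap_rows(rows, k):
    """Keep at most k rows and report whether anything was cut off."""
    return rows[:k], len(rows) > k


def render_rows(rows, output_format="csv"):
    """Render a list of dict rows as CSV text or as a JSON list of dicts."""
    if output_format == "json":
        return json.dumps(rows)
    if output_format != "csv":
        raise ValueError(f"`output_format` must be 'csv' or 'json', got {output_format!r}")
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Illustrative rows, as an engine might return for an unbounded query.
rows = [
    {"name": "Alice", "city": "Paris"},
    {"name": "Bob", "city": "Lyon"},
    {"name": "Carol", "city": "Nice"},
]

capped, may_have_more = cap_rows(rows, k=2)
print(render_rows(capped, "csv"))
print(render_rows(capped, "json"))
print("may_have_more:", may_have_more)
```

With these sample rows the CSV form repeats the column names once in the header, while the JSON form repeats them in every row, which is why CSV tends to cost fewer input tokens.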

Source code in synalinks/src/modules/agents/cypher_agent.py

@synalinks_export(
    [
        "synalinks.modules.CypherAgent",
        "synalinks.CypherAgent",
    ]
)
class CypherAgent(FunctionCallingAgent):
    """A ready-to-use Cypher agent backed by a knowledge base.

    CypherAgent is a thin specialization of `FunctionCallingAgent`
    that pre-wires three Cypher tools bound to the graph adapter of a
    `KnowledgeBase`:

    - ``get_graph_schema``: discovers all node and relation labels
      with their properties.
    - ``get_node_sample``: fetches a few nodes from a given label so
      the LM can see the data shape before writing queries.
    - ``run_cypher_query``: executes a read-only Cypher query via
      `KnowledgeBase.cypher` with ``read_only=True``.

    The constructor mirrors `FunctionCallingAgent` — every
    parameter on that class is accepted here with identical semantics.
    The only additions are ``knowledge_base`` (required, must expose a
    graph adapter) and ``output_format`` (controls the Cypher tools'
    result rendering). User-supplied ``tools`` are appended to the
    three built-in tools.

    Safety is enforced by the knowledge base, not by string filtering.
    The Ladybug adapter scans the query (after stripping comments and
    string literals) for write/admin keywords and rejects anything
    that could mutate state or access files (``CREATE``, ``MERGE``,
    ``SET``, ``DELETE``, ``DETACH``, ``REMOVE``, ``DROP``, ``ALTER``,
    ``COPY``, ``INSTALL``, ``LOAD``).

    Example:

    ```python
    import synalinks
    import asyncio

    class Person(synalinks.Entity):
        name: str = synalinks.Field(description="Person name")

    class City(synalinks.Entity):
        name: str = synalinks.Field(description="City name")

    class LivesIn(synalinks.Relation):
        subj: Person
        obj: City

    class Query(synalinks.DataModel):
        query: str = synalinks.Field(description="Natural language question")

    class CypherAnswer(synalinks.DataModel):
        answer: str = synalinks.Field(description="Answer in natural language")
        cypher_query: str = synalinks.Field(description="Cypher that produced it")

    async def main():
        kb = synalinks.KnowledgeBase(
            graph_uri="ladybug://my_graph.lb",
            entity_models=[Person, City],
            relation_models=[LivesIn],
            embedding_model=synalinks.EmbeddingModel(model="ollama/mxbai-embed-large"),
        )
        await kb.update_relations(
            LivesIn(subj=Person(name="Alice"), obj=City(name="Paris"))
        )

        lm = synalinks.LanguageModel(model="ollama/mistral")

        inputs = synalinks.Input(data_model=Query)
        outputs = await synalinks.CypherAgent(
            knowledge_base=kb,
            language_model=lm,
            data_model=CypherAnswer,
        )(inputs)
        agent = synalinks.Program(inputs=inputs, outputs=outputs)

        result = await agent(Query(query="Who lives in Paris?"))
        print(result.get("answer"))
        print(result.get("cypher_query"))

    asyncio.run(main())
    ```

    Args:
        knowledge_base (KnowledgeBase): The knowledge base to query.
            Must have a graph adapter attached (i.e. constructed with
            ``graph_uri=...``). Required.
        k (int): Maximum page size (rows per call) the LM can pull
            through ``get_node_sample`` and ``run_cypher_query``.
            ``get_node_sample`` clamps the LM's ``limit`` argument
            to ``min(limit, k)``. ``run_cypher_query`` post-caps the
            engine result to ``k`` rows so even unbounded
            ``MATCH (n) RETURN n`` queries can't drain a large label
            into the conversation. A ``LIMIT`` inside the LM's own
            query still applies first. Defaults to 50.
        output_format (str): How the Cypher tools render result sets
            to the LM. ``"csv"`` (default) is compact and minimizes
            input tokens; ``"json"`` returns a list of dicts. Applies
            to both ``get_node_sample`` and ``run_cypher_query``.
        tools (list): Additional `Tool` instances (or plain
            async functions) to expose alongside the three built-in
            Cypher tools — for example a calculator, a datetime
            helper, a web-search tool. Tool names must not collide
            with the built-ins (``get_graph_schema``, ``get_node_sample``,
            ``run_cypher_query``) or a ``ValueError`` is raised.
        schema (dict): JSON schema for the final answer.
        data_model (DataModel): DataModel for the final answer.
            Mutually exclusive with ``schema``.
        language_model (LanguageModel): The language model that drives
            the agent loop.
        prompt_template (str): Forwarded to the tool-call generator.
        examples (list): Few-shot examples for the tool-call generator.
        instructions (str): Override the default system instructions.
            When omitted, the default is built from the knowledge
            base's node and relation labels so the LM knows what's
            available without an extra schema call.
        final_instructions (str): Instructions for the final-answer
            generator. Defaults to ``instructions``.
        temperature (float): LM sampling temperature. Defaults to ``None``
            (the model's own default applies). Lower values make Cypher
            generation more deterministic.
        max_tokens (int): Optional. Maximum number of tokens to generate,
            which caps generation length. Defaults to ``None`` (the
            model's own default).
        top_p (float): Optional. Nucleus sampling probability. Defaults to
            ``None`` (the model's own default).
        top_k (int): Optional. Top-k sampling cutoff. Defaults to ``None``
            (the model's own default).
        use_inputs_schema (bool): Include the input schema in the
            prompt.
        use_outputs_schema (bool): Include the output schema in the
            prompt.
        reasoning_effort (str): Forwarded to the generators (for
            reasoning-capable LMs).
        use_chain_of_thought (bool): When ``True``, the tool-call
            generator emits a ``thinking`` field per round.
        autonomous (bool): When ``True`` (default), the agent runs the
            tool loop end-to-end. When ``False``, returns one step at
            a time for human-in-the-loop workflows.
        return_inputs_with_trajectory (bool): When ``True`` (default),
            the full message trajectory is included alongside the
            final answer.
        max_iterations (int): Maximum number of tool-call rounds.
            Defaults to 5.
        streaming (bool): Stream the final answer when no ``schema``
            is set. Defaults to ``False``.
        workdir (str): Optional. Path to a working directory. When it contains an
            ``AGENTS.md`` file, its contents are injected as an additional input
            so the agent follows the declared project conventions. Defaults to
            ``None``.
        skills (list): Optional. Folder paths (Agent Skill roots) whose skills
            are listed for the agent as an ``<available_skills>`` context message
            (see `FunctionCallingAgent`). Defaults to ``None``.
        name (str): Module name.
        description (str): Module description.
    """

    def __init__(
        self,
        *,
        knowledge_base=None,
        k: int = 50,
        output_format: str = "csv",
        tools: Optional[List] = None,
        schema=None,
        data_model=None,
        language_model=None,
        prompt_template=None,
        examples=None,
        instructions: Optional[str] = None,
        final_instructions: Optional[str] = None,
        temperature: float | None = None,
        max_tokens: int | None = None,
        top_p: float | None = None,
        top_k: int | None = None,
        use_inputs_schema: bool = False,
        use_outputs_schema: bool = False,
        reasoning_effort: Optional[str] = None,
        use_chain_of_thought: bool = False,
        autonomous: bool = True,
        return_inputs_with_trajectory: bool = True,
        max_iterations: int = 5,
        streaming: bool = False,
        workdir: Optional[str] = None,
        skills=None,
        name: Optional[str] = None,
        description: Optional[str] = None,
    ):
        if knowledge_base is None:
            raise ValueError(
                "`knowledge_base` is required for CypherAgent: pass a "
                "KnowledgeBase with a graph adapter (graph_uri=...)."
            )
        # Domain attributes the `_get_builtin_tools` hook depends on must be set
        # before `super().__init__()` (which calls the hook).
        self.knowledge_base = _get_kb(knowledge_base)
        # Fail fast if the KB has no graph adapter — the tools all
        # call graph methods, so a SQL-only KB would only error at
        # tool-invocation time inside the agent loop.
        self.knowledge_base._require_graph_adapter()

        if output_format not in ("csv", "json"):
            raise ValueError(
                f"`output_format` must be 'csv' or 'json', got {output_format!r}"
            )
        self.output_format = output_format

        if not isinstance(k, int) or k < 1:
            raise ValueError(f"`k` must be a positive integer, got {k!r}")
        self.k = k

        if instructions is None:
            node_labels = [
                m.get_schema().get("title", "Unknown")
                for m in self.knowledge_base.get_symbolic_entities()
            ]
            rel_labels = [
                m.get_schema().get("title", "Unknown")
                for m in self.knowledge_base.get_symbolic_relations()
            ]
            instructions = get_default_instructions(node_labels, rel_labels)

        super().__init__(
            schema=schema,
            data_model=data_model,
            language_model=language_model,
            prompt_template=prompt_template,
            examples=examples,
            instructions=instructions,
            final_instructions=final_instructions,
            temperature=temperature,
            max_tokens=max_tokens,
            top_p=top_p,
            top_k=top_k,
            use_inputs_schema=use_inputs_schema,
            use_outputs_schema=use_outputs_schema,
            reasoning_effort=reasoning_effort,
            use_chain_of_thought=use_chain_of_thought,
            tools=tools,
            autonomous=autonomous,
            return_inputs_with_trajectory=return_inputs_with_trajectory,
            max_iterations=max_iterations,
            streaming=streaming,
            workdir=workdir,
            skills=skills,
            name=name,
            description=description,
        )

    def _get_builtin_tools(self):
        return [
            Tool(fn)
            for fn in _build_tools(
                self.knowledge_base,
                output_format=self.output_format,
                k=self.k,
            )
        ]

    def _builtin_tool_kind(self):
        return "Cypher"

    def get_config(self):
        config = super().get_config()
        config.update(
            {
                "k": self.k,
                "output_format": self.output_format,
                "knowledge_base": serialization_lib.serialize_synalinks_object(
                    self.knowledge_base,
                ),
            }
        )
        return config

    @classmethod
    def from_config(cls, config):
        config = dict(config)
        config["knowledge_base"] = serialization_lib.deserialize_synalinks_object(
            config.pop("knowledge_base")
        )
        return super().from_config(config)

`get_default_instructions(node_labels, rel_labels)`

Default instructions for the Cypher agent.

Parameters:

Name	Type	Description	Default
`node_labels`	`List[str]`	PascalCase node labels available in the graph.	required
`rel_labels`	`List[str]`	PascalCase relation labels available in the graph. Both are embedded in the prompt so the LM doesn't have to call `get_graph_schema` first for trivial lookups.	required

Returns:

Type	Description
`str`	A prompt string giving the LM the tool-use plan and the read-only Cypher safety constraint.
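
As a rough illustration of how the labels reach the prompt, the sketch below mimics two steps visible in the source on this page: reading each model's schema `title` with an `"Unknown"` fallback, and interpolating the resulting lists into an f-string. It does not import synalinks; the schema dictionaries and the helper names `labels_from_schemas` and `label_header` are hypothetical stand-ins.

```python
def labels_from_schemas(schemas):
    """Read each schema's title, falling back to "Unknown" when absent."""
    return [schema.get("title", "Unknown") for schema in schemas]


def label_header(node_labels, rel_labels):
    """Format the two label lines the way an f-string renders Python lists."""
    return (
        f"Available node labels: {node_labels}\n"
        f"Available relation labels: {rel_labels}"
    )


# Hypothetical schema dictionaries standing in for entity and relation models.
# The empty dictionary shows the "Unknown" fallback.
entity_schemas = [{"title": "Person"}, {"title": "City"}, {}]
relation_schemas = [{"title": "LivesIn"}]

header = label_header(
    labels_from_schemas(entity_schemas),
    labels_from_schemas(relation_schemas),
)
print(header)
```

Because the lists are interpolated directly, the LM sees them in Python list notation (brackets and quoted strings) rather than as a bulleted or comma-separated list.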

Source code in synalinks/src/modules/agents/cypher_agent.py

def get_default_instructions(node_labels: List[str], rel_labels: List[str]) -> str:
    """Default instructions for the Cypher agent.

    Args:
        node_labels: PascalCase node labels available in the graph.
        rel_labels: PascalCase relation labels available in the graph.
            Both are embedded in the prompt so the LM doesn't have to
            call ``get_graph_schema`` first for trivial lookups.

    Returns:
        A prompt string giving the LM the tool-use plan and the
        read-only Cypher safety constraint.
    """
    return f"""
You are a graph analyst with read-only access to a knowledge graph.

Available node labels: {node_labels}
Available relation labels: {rel_labels}

Plan:
1. If you don't already know the schema, call `get_graph_schema` first.
2. When you need to inspect representative nodes, call `get_node_sample`.
3. Build a single `MATCH ... RETURN` Cypher query and execute it with
   `run_cypher_query`. Iterate on the query (read the error, fix the
   Cypher, retry) until you have the data.
4. Once you have an answer, stop calling tools and produce the final response.

Constraints:
- Only read-only Cypher is accepted. `CREATE`, `MERGE`, `SET`,
  `DELETE`, `DETACH DELETE`, `REMOVE`, `DROP`, `ALTER`, `COPY`,
  `INSTALL`, and `LOAD` are rejected by the engine — don't waste turns
  trying them.
- Node and relation labels are case-sensitive PascalCase (e.g.
  ``Person``, ``LivesIn``); property names are snake_case.
- Result sets are automatically capped server-side. If the result
  shows ``may_have_more=true``, refine the query (add filters or
  ``ORDER BY ... LIMIT n``) rather than asking for more rows.
""".strip()