Knowledge graphs¶
When your data has entities and relationships, model it as a graph: entity concepts are the nodes, relationship concepts are the edges, and rules traverse the graph.
These are modeling conventions, not language rules — the compiler attaches no special meaning to a concept's name. They are the discipline that keeps a growing rule base composable, and the structure Synalog-based agent runtimes expect.
Nodes and edges are how the agent builds the knowledge-graph layer of its dynamic semantic layer: once entities and relationships are named, every later rule traverses them instead of re-joining raw tables, and a filter on one node propagates through the whole graph.
Conventions¶
- Nodes are entities, edges are relationships, rules are traversals.
- Primary key first: the first column of every concept is its primary key; sort by it with @OrderBy.
- Preserve URIs and URLs in nodes (url, href, link, website, profile_url, image_url, permalink, homepage, …). Dropping them makes the concept useless for downstream action.
- Edges join through nodes, not raw tables. This guarantees referential integrity: a filter on a node automatically applies to every edge that references it.
@OrderBy(Person, "person_id");
Person(person_id:, name:, role:) distinct :- Employees(person_id:, name:, role:);
@OrderBy(WorksIn, "person_id");
WorksIn(person_id:, department_id:) distinct :-
Person(person_id:),
Department(department_id:),
Employees(person_id:, department_id:);
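To make the referential-integrity claim concrete, here is a minimal Python sketch (not Synalog) that models concepts as sets of tuples. The rows, the engineer-only filter, and the helper names `person_node` and `works_in` are assumptions made up for illustration; the point is only that an edge built from node sets inherits any filter applied to those nodes.

```python
# Hypothetical raw table rows: (person_id, name, role, department_id).
employees = [
    (1, "Ada", "engineer", "d1"),
    (2, "Grace", "lead", "d1"),
    (3, "Alan", "engineer", "d2"),
]
departments = {"d1", "d2"}


def person_node(rows, keep=lambda row: True):
    """Node concept: primary keys of the rows that pass the filter."""
    return {row[0] for row in rows if keep(row)}


def works_in(person_ids, department_ids, rows):
    """Edge concept: keep a raw row only if both endpoints exist as nodes."""
    return sorted(
        {(pid, dep) for pid, _, _, dep in rows
         if pid in person_ids and dep in department_ids}
    )


all_people = person_node(employees)
engineers = person_node(employees, keep=lambda row: row[2] == "engineer")

edges_all = works_in(all_people, departments, employees)
# Filtering the node is enough: the edge definition itself is unchanged.
edges_engineers = works_in(engineers, departments, employees)
```

Narrowing the node to engineers removes Grace's edge without touching `works_in`, which is the behavior the convention is meant to give.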
Edge patterns¶
N-ary relationships¶
When more than two entities participate, include all of them as columns:
WorksOn(person_id:, project_id:, role:) distinct :-
Person(person_id:), Project(project_id:),
ProjectAssignments(person_id:, project_id:, role:);
Weighted edges¶
Attach a numeric attribute to the relationship — often an aggregate:
Purchased(customer_id:, product_id:, total_amount? += amount) distinct :-
Customer(customer_id:), Product(product_id:),
Orders(customer_id:, product_id:, amount:);
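As a rough illustration of what the aggregate computes, the following Python sketch (not Synalog) sums order amounts per customer and product pair while still joining through the node sets. The sample rows and the function name `purchased` are hypothetical.

```python
from collections import defaultdict

# Hypothetical node sets and raw order rows: (customer_id, product_id, amount).
customers = {"c1", "c2"}
products = {"p1"}
orders = [
    ("c1", "p1", 10.0),
    ("c1", "p1", 5.0),
    ("c2", "p1", 7.5),
    ("c9", "p1", 1.0),  # unknown customer: dropped by the node join
]


def purchased(customer_ids, product_ids, rows):
    """Weighted edge: total amount per (customer_id, product_id) pair."""
    totals = defaultdict(float)
    for customer_id, product_id, amount in rows:
        if customer_id in customer_ids and product_id in product_ids:
            totals[(customer_id, product_id)] += amount
    return dict(totals)


weighted_edges = purchased(customers, products, orders)
```

Each key of `weighted_edges` is one edge and its value is the weight, mirroring one output row per customer and product pair.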
Types and statuses as separate concepts¶
When an entity or relationship has distinct categorical states, model one concept per state — ActiveUser, ChurnedUser, ActiveContract, TerminatedContract — each joined through the base node.
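A small Python sketch (not Synalog) of the same idea, assuming a hypothetical status table keyed by user id: each state becomes its own set, and each set is restricted to ids that exist in the base node.

```python
# Hypothetical base node and raw status rows: (user_id, status).
user_node = {1, 2, 3}
status_rows = [(1, "active"), (2, "churned"), (3, "active"), (4, "active")]


def with_status(nodes, rows, status):
    """One concept per state, joined through the base node."""
    return {user_id for user_id, s in rows if s == status and user_id in nodes}


active_user = with_status(user_node, status_rows, "active")
churned_user = with_status(user_node, status_rows, "churned")
```

User 4 has a status row but is not in the base node, so it appears in neither derived concept.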
Symmetric edges¶
Define the raw direction once (e.g. with a < b), then close it with a union:
CoAuthored(author_a:, author_b:, paper_id:) distinct :-
CoAuthoredRaw(author_a:, author_b:, paper_id:) |
CoAuthoredRaw(author_a: author_b, author_b: author_a, paper_id:);
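The union above can be pictured with a short Python sketch (not Synalog). The author names and the function name `symmetric_closure` are made up for illustration; the raw set stores each pair once with author_a < author_b.

```python
# Hypothetical raw edge, one direction only: (author_a, author_b, paper_id).
co_authored_raw = {("alice", "bob", "p1"), ("bob", "carol", "p2")}


def symmetric_closure(raw):
    """Union of the raw edge with the same edge with its endpoints swapped."""
    return raw | {(b, a, paper) for a, b, paper in raw}


co_authored = symmetric_closure(co_authored_raw)
```

Because the result is a set, applying the closure twice gives the same edges, so downstream rules can query either direction without duplicates.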
Inverse edges¶
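Derive the opposite direction from an existing edge by swapping its two key columns in a new concept. The ReportsTo concept in the complete example below is presented as an inverse edge derived from the management relation.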
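As a hedged illustration in Python (not Synalog), an inverse edge is the same set of pairs with the columns swapped. The `manages` data and the helper name `inverse` are assumptions for this sketch.

```python
# Hypothetical existing edge: (manager_id, employee_id).
manages = {(2, 1), (2, 3)}


def inverse(edge):
    """Swap the two endpoints of every pair."""
    return {(b, a) for a, b in edge}


# Derived edge: (employee_id, manager_id).
reports_to = inverse(manages)
```

Inverting twice returns the original edge, which is a quick sanity check when both directions are kept as concepts.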
Edge composition¶
Chain different edge types: A→B via one relation and B→C via another gives A→C:
WorksWithClient(employee_id:, client_id:) distinct :-
MemberOf(employee_id:, team_id:),
EngagedWith(team_id:, client_id:);
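The composition can be sketched in Python (not Synalog) as a join on the shared middle column. The data below reuses the team and client values from the complete example on this page; the function name `compose` is hypothetical.

```python
# Edges as sets of pairs, joined on the shared team_id.
member_of = {(1, "core"), (2, "core"), (3, "platform")}
engaged_with = {("core", "acme"), ("platform", "globex")}


def compose(a_to_b, b_to_c):
    """A->B and B->C give A->C whenever the middle values match."""
    return {(a, c) for a, b in a_to_b for b2, c in b_to_c if b == b2}


works_with_client = sorted(compose(member_of, engaged_with))
```

The middle column is dropped from the result, just as team_id does not appear in WorksWithClient.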
Chains and paths¶
Recursion over a single edge type (parent→child, manager→employee) computes chains — see Recursion. To track the route rather than just the endpoints, accumulate it with List= or string concatenation in the recursive rule.
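To show what accumulating the route means, here is a minimal Python sketch (not Synalog) that repeatedly extends known chains by one edge and carries the visited nodes along as a tuple. The hierarchy data and the function name `chains` are assumptions; the `c not in path` guard is an extra safety added in this sketch so that it also terminates on cyclic input.

```python
# Hypothetical single edge type: (parent, child).
reports = {("ceo", "vp"), ("vp", "eng")}


def chains(edges):
    """All (start, end, path) triples reachable by following the edges."""
    paths = {(a, b, (a, b)) for a, b in edges}
    frontier = set(paths)
    while frontier:
        extended = {
            (a, c, path + (c,))
            for a, b, path in frontier
            for b2, c in edges
            if b == b2 and c not in path
        }
        frontier = extended - paths  # only chains not seen before
        paths |= frontier
    return paths


all_chains = chains(reports)
```

Dropping the third element of each triple gives the plain endpoints-only closure.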
Cycle and cardinality checks¶
A recursive closure detects hierarchy cycles: any node that reaches itself is on a cycle. For cardinality constraints, count children per parent and filter for violations:
ChildCount(parent_id:, n? += 1) distinct :- ParentOf(parent_id:, child_id:);
TooManyChildren(parent_id:, n:) :- ChildCount(parent_id:, n:), n > 2;
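Both checks can be sketched in a few lines of Python (not Synalog). The sample hierarchy, the limit of 2, and the function names are assumptions chosen to mirror the rules above.

```python
from collections import Counter

# Hypothetical ParentOf edge with a cycle a -> b -> c -> a.
parent_of = {("a", "b"), ("b", "c"), ("c", "a"), ("a", "d"), ("a", "e")}


def closure(edges):
    """Transitive closure: every (ancestor, descendant) pair."""
    reach = set(edges)
    while True:
        new = {(a, c) for a, b in reach for b2, c in edges if b == b2} - reach
        if not new:
            return reach
        reach |= new


def nodes_on_cycle(edges):
    """A node is on a cycle when the closure says it reaches itself."""
    return sorted({a for a, b in closure(edges) if a == b})


def too_many_children(edges, limit=2):
    """Parents whose child count exceeds the limit."""
    counts = Counter(parent for parent, _ in edges)
    return {parent: n for parent, n in counts.items() if n > limit}


cyclic = nodes_on_cycle(parent_of)
violations = too_many_children(parent_of)
```

Here `cyclic` lists a, b and c, and `violations` reports that a has three children.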
Temporal edges¶
Include start_date/end_date extracted via the temporal pipeline, then filter with Today for "active today" queries. To test whether two edges with intervals [s1, e1] and [s2, e2] overlap, use s1 <= e2 && s2 <= e1.
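A short Python sketch (not Synalog) of the two temporal filters. The edge rows and function names are hypothetical, and treating a missing end_date as "still open" is an assumption of this sketch, not something the text specifies.

```python
from datetime import date

# Hypothetical temporal edge rows: (person, team, start_date, end_date).
memberships = [
    ("ada", "core", date(2023, 1, 1), date(2023, 12, 31)),
    ("ada", "platform", date(2024, 1, 1), None),  # assumed: None means open-ended
]


def overlaps(s1, e1, s2, e2):
    """Closed intervals [s1, e1] and [s2, e2] share at least one day."""
    return s1 <= e2 and s2 <= e1


def active_on(edges, day):
    """Edges whose interval contains the given day."""
    return [
        (who, what)
        for who, what, start, end in edges
        if start <= day and (end is None or day <= end)
    ]


active = active_on(memberships, date(2024, 6, 1))
```

Passing `date.today()` as `day` gives the "active today" query; a fixed date is used here so the result is reproducible.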
Key principles¶
- Every edge joins through node concepts for referential integrity.
- Reuse aggressively — once nodes and edges exist, all rules build on them instead of going back to raw tables.
- The graph is the agent's memory: each new concept or rule extends what every later query can express.
Complete example¶
A small employee/team/client graph: nodes with primary keys and preserved URLs, edges joined through nodes, an inverse edge, and an edge composition:
# run: Person, Team, Client, MemberOf, EngagedWith, WorksWithClient, ReportsTo
@Engine("duckdb");
# Tables
Employees(person_id: 1, name: "Ada", role: "engineer", manager_id: 2);
Employees(person_id: 2, name: "Grace", role: "lead", manager_id: null);
Employees(person_id: 3, name: "Alan", role: "engineer", manager_id: 2);
TeamAssignments(person_id: 1, team_id: "core");
TeamAssignments(person_id: 2, team_id: "core");
TeamAssignments(person_id: 3, team_id: "platform");
Engagements(team_id: "core", client_id: "acme", website: "https://acme.example.com");
Engagements(team_id: "platform", client_id: "globex", website: "https://globex.example.com");
# Concepts
## Nodes: primary key first, URLs preserved.
@OrderBy(Person, "person_id");
Person(person_id:, name:, role:) distinct :- Employees(person_id:, name:, role:);
@OrderBy(Team, "team_id");
Team(team_id:) distinct :- TeamAssignments(team_id:);
@OrderBy(Client, "client_id");
Client(client_id:, website:) distinct :- Engagements(client_id:, website:);
## Edges join through nodes, not raw tables.
@OrderBy(MemberOf, "person_id");
MemberOf(person_id:, team_id:) distinct :-
Person(person_id:),
Team(team_id:),
TeamAssignments(person_id:, team_id:);
@OrderBy(EngagedWith, "team_id");
EngagedWith(team_id:, client_id:) distinct :-
Team(team_id:),
Client(client_id:),
Engagements(team_id:, client_id:);
## Inverse edge derived from the management relation.
@OrderBy(ReportsTo, "employee_id");
ReportsTo(employee_id:, manager_id:) distinct :-
Person(person_id: employee_id),
Person(person_id: manager_id),
Employees(person_id: employee_id, manager_id:), manager_id is not null;
# Rules
## Edge composition: person -> team -> client gives person -> client.
@OrderBy(WorksWithClient, "person_id");
WorksWithClient(person_id:, client_id:) distinct :-
MemberOf(person_id:, team_id:),
EngagedWith(team_id:, client_id:);
Generated SQL and execution results
$ synalog.check('knowledge_graphs.l')
No errors found.
$ synalog.compile('knowledge_graphs.l', 'Person')
-- Initializing DuckDB environment.
create schema if not exists logica_home;
-- Empty record, has to have a field by DuckDB syntax.
drop type if exists logicarecord893574736 cascade; create type logicarecord893574736 as struct(nirvana numeric);
create sequence if not exists eternal_logical_sequence;
WITH t_0_Employees AS (SELECT * FROM (
SELECT
1 AS person_id,
E'Ada' AS name,
E'engineer' AS role,
2 AS manager_id
UNION ALL
SELECT
2 AS person_id,
E'Grace' AS name,
E'lead' AS role,
null AS manager_id
UNION ALL
SELECT
3 AS person_id,
E'Alan' AS name,
E'engineer' AS role,
2 AS manager_id
) AS UNUSED_TABLE_NAME )
SELECT
Employees.person_id AS person_id,
Employees.name AS name,
Employees.role AS role
FROM
t_0_Employees AS Employees
GROUP BY Employees.person_id, Employees.name, Employees.role ORDER BY person_id;
-- Executed on DuckDB:
| person_id | name | role |
|-----------|-------|----------|
| 1 | Ada | engineer |
| 2 | Grace | lead |
| 3 | Alan | engineer |
(3 rows)
$ synalog.compile('knowledge_graphs.l', 'Team')
-- Initializing DuckDB environment.
create schema if not exists logica_home;
-- Empty record, has to have a field by DuckDB syntax.
drop type if exists logicarecord893574736 cascade; create type logicarecord893574736 as struct(nirvana numeric);
create sequence if not exists eternal_logical_sequence;
WITH t_0_TeamAssignments AS (SELECT * FROM (
SELECT
1 AS person_id,
E'core' AS team_id
UNION ALL
SELECT
2 AS person_id,
E'core' AS team_id
UNION ALL
SELECT
3 AS person_id,
E'platform' AS team_id
) AS UNUSED_TABLE_NAME )
SELECT
TeamAssignments.team_id AS team_id
FROM
t_0_TeamAssignments AS TeamAssignments
GROUP BY TeamAssignments.team_id ORDER BY team_id;
-- Executed on DuckDB:
| team_id |
|----------|
| core |
| platform |
(2 rows)
$ synalog.compile('knowledge_graphs.l', 'Client')
-- Initializing DuckDB environment.
create schema if not exists logica_home;
-- Empty record, has to have a field by DuckDB syntax.
drop type if exists logicarecord893574736 cascade; create type logicarecord893574736 as struct(nirvana numeric);
create sequence if not exists eternal_logical_sequence;
WITH t_0_Engagements AS (SELECT * FROM (
SELECT
E'core' AS team_id,
E'acme' AS client_id,
E'https://acme.example.com' AS website
UNION ALL
SELECT
E'platform' AS team_id,
E'globex' AS client_id,
E'https://globex.example.com' AS website
) AS UNUSED_TABLE_NAME )
SELECT
Engagements.client_id AS client_id,
Engagements.website AS website
FROM
t_0_Engagements AS Engagements
GROUP BY Engagements.client_id, Engagements.website ORDER BY client_id;
-- Executed on DuckDB:
| client_id | website |
|-----------|----------------------------|
| acme | https://acme.example.com |
| globex | https://globex.example.com |
(2 rows)
$ synalog.compile('knowledge_graphs.l', 'MemberOf')
-- Initializing DuckDB environment.
create schema if not exists logica_home;
-- Empty record, has to have a field by DuckDB syntax.
drop type if exists logicarecord893574736 cascade; create type logicarecord893574736 as struct(nirvana numeric);
create sequence if not exists eternal_logical_sequence;
WITH t_1_Employees AS (SELECT * FROM (
SELECT
1 AS person_id,
E'Ada' AS name,
E'engineer' AS role,
2 AS manager_id
UNION ALL
SELECT
2 AS person_id,
E'Grace' AS name,
E'lead' AS role,
null AS manager_id
UNION ALL
SELECT
3 AS person_id,
E'Alan' AS name,
E'engineer' AS role,
2 AS manager_id
) AS UNUSED_TABLE_NAME ),
t_0_Person AS (SELECT
Employees.person_id AS person_id,
Employees.name AS name,
Employees.role AS role
FROM
t_1_Employees AS Employees
GROUP BY Employees.person_id, Employees.name, Employees.role ORDER BY person_id),
t_4_TeamAssignments AS (SELECT * FROM (
SELECT
1 AS person_id,
E'core' AS team_id
UNION ALL
SELECT
2 AS person_id,
E'core' AS team_id
UNION ALL
SELECT
3 AS person_id,
E'platform' AS team_id
) AS UNUSED_TABLE_NAME ),
t_2_Team AS (SELECT
t_3_TeamAssignments.team_id AS team_id
FROM
t_4_TeamAssignments AS t_3_TeamAssignments
GROUP BY t_3_TeamAssignments.team_id ORDER BY team_id)
SELECT
Person.person_id AS person_id,
Team.team_id AS team_id
FROM
t_0_Person AS Person, t_2_Team AS Team, t_4_TeamAssignments AS TeamAssignments
WHERE
(TeamAssignments.person_id = Person.person_id) AND
(TeamAssignments.team_id = Team.team_id)
GROUP BY Person.person_id, Team.team_id ORDER BY person_id;
-- Executed on DuckDB:
| person_id | team_id |
|-----------|----------|
| 1 | core |
| 2 | core |
| 3 | platform |
(3 rows)
$ synalog.compile('knowledge_graphs.l', 'EngagedWith')
-- Initializing DuckDB environment.
create schema if not exists logica_home;
-- Empty record, has to have a field by DuckDB syntax.
drop type if exists logicarecord893574736 cascade; create type logicarecord893574736 as struct(nirvana numeric);
create sequence if not exists eternal_logical_sequence;
WITH t_1_TeamAssignments AS (SELECT * FROM (
SELECT
1 AS person_id,
E'core' AS team_id
UNION ALL
SELECT
2 AS person_id,
E'core' AS team_id
UNION ALL
SELECT
3 AS person_id,
E'platform' AS team_id
) AS UNUSED_TABLE_NAME ),
t_0_Team AS (SELECT
TeamAssignments.team_id AS team_id
FROM
t_1_TeamAssignments AS TeamAssignments
GROUP BY TeamAssignments.team_id ORDER BY team_id),
t_4_Engagements AS (SELECT * FROM (
SELECT
E'core' AS team_id,
E'acme' AS client_id,
E'https://acme.example.com' AS website
UNION ALL
SELECT
E'platform' AS team_id,
E'globex' AS client_id,
E'https://globex.example.com' AS website
) AS UNUSED_TABLE_NAME ),
t_2_Client AS (SELECT
t_3_Engagements.client_id AS client_id,
t_3_Engagements.website AS website
FROM
t_4_Engagements AS t_3_Engagements
GROUP BY t_3_Engagements.client_id, t_3_Engagements.website ORDER BY client_id)
SELECT
Team.team_id AS team_id,
Client.client_id AS client_id
FROM
t_0_Team AS Team, t_2_Client AS Client, t_4_Engagements AS Engagements
WHERE
(Engagements.team_id = Team.team_id) AND
(Engagements.client_id = Client.client_id)
GROUP BY Team.team_id, Client.client_id ORDER BY team_id;
-- Executed on DuckDB:
| team_id | client_id |
|----------|-----------|
| core | acme |
| platform | globex |
(2 rows)
$ synalog.compile('knowledge_graphs.l', 'WorksWithClient')
-- Initializing DuckDB environment.
create schema if not exists logica_home;
-- Empty record, has to have a field by DuckDB syntax.
drop type if exists logicarecord893574736 cascade; create type logicarecord893574736 as struct(nirvana numeric);
create sequence if not exists eternal_logical_sequence;
WITH t_2_Employees AS (SELECT * FROM (
SELECT
1 AS person_id,
E'Ada' AS name,
E'engineer' AS role,
2 AS manager_id
UNION ALL
SELECT
2 AS person_id,
E'Grace' AS name,
E'lead' AS role,
null AS manager_id
UNION ALL
SELECT
3 AS person_id,
E'Alan' AS name,
E'engineer' AS role,
2 AS manager_id
) AS UNUSED_TABLE_NAME ),
t_1_Person AS (SELECT
Employees.person_id AS person_id,
Employees.name AS name,
Employees.role AS role
FROM
t_2_Employees AS Employees
GROUP BY Employees.person_id, Employees.name, Employees.role ORDER BY person_id),
t_5_TeamAssignments AS (SELECT * FROM (
SELECT
1 AS person_id,
E'core' AS team_id
UNION ALL
SELECT
2 AS person_id,
E'core' AS team_id
UNION ALL
SELECT
3 AS person_id,
E'platform' AS team_id
) AS UNUSED_TABLE_NAME ),
t_3_Team AS (SELECT
t_4_TeamAssignments.team_id AS team_id
FROM
t_5_TeamAssignments AS t_4_TeamAssignments
GROUP BY t_4_TeamAssignments.team_id ORDER BY team_id),
t_0_MemberOf AS (SELECT
Person.person_id AS person_id,
Team.team_id AS team_id
FROM
t_1_Person AS Person, t_3_Team AS Team, t_5_TeamAssignments AS TeamAssignments
WHERE
(TeamAssignments.person_id = Person.person_id) AND
(TeamAssignments.team_id = Team.team_id)
GROUP BY Person.person_id, Team.team_id ORDER BY person_id),
t_11_Engagements AS (SELECT * FROM (
SELECT
E'core' AS team_id,
E'acme' AS client_id,
E'https://acme.example.com' AS website
UNION ALL
SELECT
E'platform' AS team_id,
E'globex' AS client_id,
E'https://globex.example.com' AS website
) AS UNUSED_TABLE_NAME ),
t_9_Client AS (SELECT
t_10_Engagements.client_id AS client_id,
t_10_Engagements.website AS website
FROM
t_11_Engagements AS t_10_Engagements
GROUP BY t_10_Engagements.client_id, t_10_Engagements.website ORDER BY client_id),
t_6_EngagedWith AS (SELECT
t_7_Team.team_id AS team_id,
Client.client_id AS client_id
FROM
t_3_Team AS t_7_Team, t_9_Client AS Client, t_11_Engagements AS Engagements
WHERE
(Engagements.team_id = t_7_Team.team_id) AND
(Engagements.client_id = Client.client_id)
GROUP BY t_7_Team.team_id, Client.client_id ORDER BY team_id)
SELECT
MemberOf.person_id AS person_id,
EngagedWith.client_id AS client_id
FROM
t_0_MemberOf AS MemberOf, t_6_EngagedWith AS EngagedWith
WHERE
(EngagedWith.team_id = MemberOf.team_id)
GROUP BY MemberOf.person_id, EngagedWith.client_id ORDER BY person_id;
-- Executed on DuckDB:
| person_id | client_id |
|-----------|-----------|
| 1 | acme |
| 2 | acme |
| 3 | globex |
(3 rows)
$ synalog.compile('knowledge_graphs.l', 'ReportsTo')
-- Initializing DuckDB environment.
create schema if not exists logica_home;
-- Empty record, has to have a field by DuckDB syntax.
drop type if exists logicarecord893574736 cascade; create type logicarecord893574736 as struct(nirvana numeric);
create sequence if not exists eternal_logical_sequence;
WITH t_3_Employees AS (SELECT * FROM (
SELECT
1 AS person_id,
E'Ada' AS name,
E'engineer' AS role,
2 AS manager_id
UNION ALL
SELECT
2 AS person_id,
E'Grace' AS name,
E'lead' AS role,
null AS manager_id
UNION ALL
SELECT
3 AS person_id,
E'Alan' AS name,
E'engineer' AS role,
2 AS manager_id
) AS UNUSED_TABLE_NAME ),
t_1_Person AS (SELECT
t_2_Employees.person_id AS person_id,
t_2_Employees.name AS name,
t_2_Employees.role AS role
FROM
t_3_Employees AS t_2_Employees
GROUP BY t_2_Employees.person_id, t_2_Employees.name, t_2_Employees.role ORDER BY person_id)
SELECT
Person.person_id AS employee_id,
t_0_Person.person_id AS manager_id
FROM
t_1_Person AS Person, t_1_Person AS t_0_Person, t_3_Employees AS Employees
WHERE
(t_0_Person.person_id IS NOT null) AND
(Employees.person_id = Person.person_id) AND
(Employees.manager_id = t_0_Person.person_id)
GROUP BY Person.person_id, t_0_Person.person_id ORDER BY employee_id;
-- Executed on DuckDB:
| employee_id | manager_id |
|-------------|------------|
| 1 | 2 |
| 3 | 2 |
(2 rows)
To turn an existing OWL/RDF ontology into a knowledge graph like this automatically, see Ontologies.