OMEGA

Bases: EvolutionaryOptimizer

OMEGA: OptiMizEr as Genetic Algorithm.

A genetic optimizer with dominated novelty search.

This optimizer is unique to Synalinks and the result of our research effort on advancing neuro-symbolic AI.

Dominated Novelty Search (DNS) is a SOTA Quality-Diversity optimization method that implements a competition function in a classic genetic algorithm.

The key insight behind Dominated Novelty Search is that candidates should be eliminated from the population if they are both:

  • Inferior in reward/fitness
  • Similar to existing candidates/solutions

This algorithm creates an evolutionary pressure to focus on high-performing candidates or candidates that explore other approaches.

This approach adds only one step to the traditional genetic algorithm and outperforms MAP-Elites, Threshold-Elites, and Cluster-Elites.

This allows the system to explore the search space more quickly by eliminating unpromising candidates while preserving diversity to avoid local optima.
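
The elimination rule is simple enough to sketch in a few lines. The following is a minimal, synchronous simplification of the `competition` method shown further down this page (the `1.0 / k_nearest_fitter` similarity threshold comes from this implementation, not from the DNS paper, and the real method additionally falls back to keeping the first candidate if everything gets eliminated):

def dns_filter(candidates, distance, k_nearest_fitter=5):
    # Keep a candidate unless some other candidate is both nearby
    # (similar descriptor) and strictly better in reward.
    survivors = []
    for candidate in candidates:
        dominated = any(
            distance(candidate, other) < 1.0 / k_nearest_fitter
            and candidate["reward"] < other["reward"]
            for other in candidates
            if other is not candidate
        )
        if not dominated:
            survivors.append(candidate)
    return survivors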

At Synalinks, we adapted this algorithm for LM-based optimization. To do so, we use an embedding model to compute each candidate's descriptor and a cosine distance between solutions.

Note: In Synalinks, unlike other in-context learning frameworks, a variable (the module's state to optimize) is a JSON object, not a simple string. This has multiple implications: we maintain a 100% correct structure through constrained JSON decoding, and we allow the state to have a variable/dynamic number of fields, which this approach handles by embedding each field and averaging the embeddings before computing the distance required by DNS.
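
To make the descriptor computation concrete, here is a minimal sketch assuming a hypothetical embed(text) function that returns a 1-D vector. The actual implementation is the `similarity_distance` helper documented at the bottom of this page, which uses the configured EmbeddingModel and flattens nested fields:

import numpy as np

def candidate_descriptor(candidate, embed):
    # Embed each field of the JSON candidate, normalize each embedding
    # to unit length, then average them into a single descriptor.
    vectors = [np.asarray(embed(str(value))) for value in candidate.values()]
    vectors = [v / np.linalg.norm(v) for v in vectors]
    return np.mean(vectors, axis=0)

def cosine_distance(candidate1, candidate2, embed):
    d1 = candidate_descriptor(candidate1, embed)
    d2 = candidate_descriptor(candidate2, embed)
    similarity = (np.dot(d1, d2) + 1) / 2  # map from [-1, 1] to [0, 1]
    return 1 - similarity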

Example:

import synalinks
import asyncio

async def main():
    # ... your program definition

    program.compile(
        reward=synalinks.rewards.ExactMatch(),
        optimizer=synalinks.optimizers.OMEGA(
            language_model=language_model,
            embedding_model=embedding_model,
        )
    )

    history = await program.fit(...)

if __name__ == "__main__":
    asyncio.run(main())

Concerning the inspirations for this optimizer:
  • Dominated Novelty Search for their elegant Quality-Diversity algorithm that outperforms many other evolutionary strategies.
  • DSPy's GEPA for feeding the optimizer program with the raw training data and for formalizing the evolutionary optimization strategy (NOT its MAP-Elites method).
  • DeepMind's AlphaEvolve has been a huge inspiration, more on the motivational side, as they didn't release the code.
References
  • Dominated Novelty Search: Rethinking Local Competition in Quality-Diversity (https://arxiv.org/html/2502.00593v1)
  • GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning (https://arxiv.org/pdf/2507.19457)
  • AlphaEvolve: A coding agent for scientific and algorithmic discovery (https://arxiv.org/pdf/2506.13131)

Parameters:

Name Type Description Default
instructions str

Additional instructions about the task for the optimizer.

None
language_model LanguageModel

The language model to use.

None
embedding_model EmbeddingModel

The embedding model used to compute candidate descriptors according to Dominated Novelty Search.

None
k_nearest_fitter int

The number K of nearest fitter candidates used by Dominated Novelty Search.

5
distance_function callable

Optional. The distance function used by Dominated Novelty Search. If no function is provided, the default cosine distance is used.

None
mutation_temperature float

The temperature for the LM calls of the mutation programs.

0.3
crossover_temperature float

The temperature for the LM calls of the crossover programs.

0.3
reasoning_effort str

Optional. The reasoning effort for the LM calls, between ['minimal', 'low', 'medium', 'high', 'disable', 'none', None]. Defaults to None (no reasoning).

None
algorithm str

The mechanism to use for the genetic algorithm, between ['ga', 'dns']. This parameter is provided for ablation studies and shouldn't be modified. (Defaults to 'dns').

'dns'
selection str

The method to select the candidate to evolve at the beginning of a batch, between ['random', 'best', 'softmax']. (Defaults to 'softmax').

'softmax'
selection_temperature float

The temperature for softmax selection. Used only when selection='softmax'. Lower values concentrate selection on high-reward candidates; higher values make selection more uniform. (Defaults to 0.3).

0.3
merging_rate float

The rate at which crossover is selected instead of mutation. (Defaults to 0.02).

0.02
population_size int

The maximum number of best candidates to keep during the optimization process.

10
name str

Optional name for the optimizer instance.

None
description str

Optional description of the optimizer instance.

None
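
For reference, here is a fuller construction showing the tunable parameters described above. All values are illustrative, and language_model / embedding_model are assumed to be defined elsewhere:

optimizer = synalinks.optimizers.OMEGA(
    language_model=language_model,
    embedding_model=embedding_model,
    instructions="Focus on concise, factual answers.",  # optional task hints
    k_nearest_fitter=5,
    mutation_temperature=0.3,
    crossover_temperature=0.3,
    selection="softmax",          # 'random' | 'best' | 'softmax'
    selection_temperature=0.3,    # lower = greedier toward high rewards
    merging_rate=0.02,            # rate of crossover vs mutation
    population_size=10,
)
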
Source code in synalinks/src/optimizers/omega.py
@synalinks_export("synalinks.optimizers.OMEGA")
class OMEGA(EvolutionaryOptimizer):
    """OMEGA: OptiMizEr as Genetic Algorithm.

    A genetic optimizer with dominated novelty search.

    This optimizer is **unique to Synalinks** and the result of our research
    effort on advancing neuro-symbolic AI.

    Dominated Novelty Search (DNS) is a SOTA Quality-Diversity optimization
    method that implements a competition function in a classic genetic
    algorithm.

    The key insight behind Dominated Novelty Search is that candidates should
    be eliminated from the population if they are both:

    - Inferior in reward/fitness
    - Similar to existing candidates/solutions

    This algorithm creates an evolutionary pressure to focus on high-performing
    candidates **or** candidates that explore other approaches.

    This approach adds only one step to the traditional genetic algorithm and
    *outperforms* MAP-Elites, Threshold-Elites and Cluster-Elites.

    This allows the system to explore the search space more quickly by
    eliminating unpromising candidates while preserving diversity to avoid
    local optima.

    At Synalinks, we adapted this algorithm for LM-based optimization. To do
    so, we use an embedding model to compute each candidate's descriptor and a
    cosine distance between solutions.

    **Note**: In Synalinks, unlike other in-context learning frameworks, a
    variable (the module's state to optimize) is a JSON object, not a simple
    string. This has multiple implications: we maintain a 100% correct
    structure through constrained JSON decoding, and we allow the state to
    have a variable/dynamic number of fields, which this approach handles by
    embedding each field and averaging the embeddings before computing the
    distance required by DNS.

    Example:
    ```
    import synalinks
    import asyncio

    async def main():
        # ... your program definition

        program.compile(
            reward=synalinks.rewards.ExactMatch(),
            optimizer=synalinks.optimizers.OMEGA(
                language_model=language_model,
                embedding_model=embedding_model,
            )
        )

        history = await program.fit(...)

    if __name__ == "__main__":
        asyncio.run(main())
    ```

    Concerning the inspirations for this optimizer:
        - Dominated Novelty Search for their elegant Quality-Diversity
          algorithm that outperforms many other evolutionary strategies.
        - DSPy's GEPA for feeding the optimizer program with the raw training
          data and for formalizing the evolutionary optimization strategy
          (**NOT** its MAP-Elites method).
        - DeepMind's AlphaEvolve has been a huge inspiration, more on the
          motivational side, as they didn't release the code.

    References:
        - Dominated Novelty Search: Rethinking Local Competition in
          Quality-Diversity (https://arxiv.org/html/2502.00593v1)
        - GEPA: Reflective Prompt Evolution Can Outperform Reinforcement
          Learning (https://arxiv.org/pdf/2507.19457)
        - AlphaEvolve: A coding agent for scientific and algorithmic
          discovery (https://arxiv.org/pdf/2506.13131)

    Args:
        instructions (str): Additional instructions about the task for the
            optimizer.
        language_model (LanguageModel): The language model to use.
        embedding_model (EmbeddingModel): The embedding model used to
            compute candidate descriptors according to Dominated Novelty
            Search.
        k_nearest_fitter (int): The number K of nearest fitter candidates
            used by Dominated Novelty Search.
        distance_function (callable): Optional. The distance function used
            by Dominated Novelty Search. If no function is provided, the
            default cosine distance is used.
        mutation_temperature (float): The temperature for the LM calls of
            the mutation programs.
        crossover_temperature (float): The temperature for the LM calls of
            the crossover programs.
        reasoning_effort (str): Optional. The reasoning effort for the LM calls,
            between ['minimal', 'low', 'medium', 'high', 'disable', 'none', None].
            Defaults to None (no reasoning).
        algorithm (str): The mechanism to use for the genetic algorithm,
            between ['ga', 'dns']. This parameter is provided for ablation
            studies and shouldn't be modified. (Defaults to 'dns').
        selection (str): The method to select the candidate to evolve at the
            beginning of a batch, between ['random', 'best', 'softmax'].
            (Defaults to 'softmax').
        selection_temperature (float): The temperature for softmax selection.
            Used only when `selection='softmax'`. Lower values concentrate
            selection on high-reward candidates; higher values make selection
            more uniform. (Defaults to 0.3).
        merging_rate (float): The rate at which crossover is selected instead
            of mutation. (Defaults to 0.02).
        population_size (int): The maximum number of best candidates to keep
            during the optimization process.
        name (str): Optional name for the optimizer instance.
        description (str): Optional description of the optimizer instance.
    """

    def __init__(
        self,
        instructions=None,
        language_model=None,
        embedding_model=None,
        k_nearest_fitter=5,
        distance_function=None,
        mutation_temperature=0.3,
        crossover_temperature=0.3,
        reasoning_effort=None,
        merging_rate=0.02,
        algorithm="dns",
        selection="softmax",
        selection_temperature=0.3,
        population_size=10,
        name=None,
        description=None,
        **kwargs,
    ):
        super().__init__(
            language_model=language_model,
            mutation_temperature=mutation_temperature,
            crossover_temperature=crossover_temperature,
            selection=selection,
            selection_temperature=selection_temperature,
            merging_rate=merging_rate,
            population_size=population_size,
            name=name,
            description=description,
            **kwargs,
        )
        if not instructions:
            instructions = ""
        self.instructions = instructions
        self.reasoning_effort = reasoning_effort

        # DNS-specific parameters
        self.embedding_model = embedding_model
        self.k_nearest_fitter = k_nearest_fitter
        self.distance_function = distance_function

        algorithms = ["ga", "dns"]
        if algorithm not in algorithms:
            raise ValueError(f"Parameter `algorithm` should be between {algorithms}")
        self.algorithm = algorithm

    async def build(self, trainable_variables):
        """
        Build the optimizer programs based on the trainable variables.

        Args:
            trainable_variables (list): List of variables that will be optimized
        """
        for trainable_variable in trainable_variables:
            schema_id = id(trainable_variable.get_schema())
            mask = list(Trainable.keys())
            symbolic_variable = trainable_variable.to_symbolic_data_model().out_mask(
                mask=mask
            )

            if schema_id not in self.mutation_programs:
                inputs = Input(data_model=MutationInputs)
                outputs = await ChainOfThought(
                    data_model=symbolic_variable,
                    language_model=self.language_model,
                    temperature=self.mutation_temperature,
                    reasoning_effort=self.reasoning_effort,
                    instructions=(
                        "\n".join(
                            [
                                base_instructions(),
                                mutation_instructions(list(symbolic_variable.keys())),
                            ]
                        )
                        if not self.instructions
                        else "\n".join(
                            [
                                self.instructions,
                                base_instructions(),
                                mutation_instructions(list(symbolic_variable.keys())),
                            ]
                        )
                    ),
                    name=f"mutation_cot_{schema_id}_" + self.name,
                )(inputs)
                outputs = outputs.in_mask(mask=list(symbolic_variable.keys()))
                program = Program(
                    inputs=inputs,
                    outputs=outputs,
                    name=f"mutation_{schema_id}_" + self.name,
                    description="The mutation program that fix/optimize variables",
                )
                self.mutation_programs[schema_id] = program

            if schema_id not in self.crossover_programs:
                inputs = Input(data_model=CrossoverInputs)
                outputs = await ChainOfThought(
                    data_model=symbolic_variable,
                    language_model=self.language_model,
                    temperature=self.crossover_temperature,
                    reasoning_effort=self.reasoning_effort,
                    instructions=(
                        "\n".join(
                            [
                                base_instructions(),
                                crossover_instructions(list(symbolic_variable.keys())),
                            ]
                        )
                        if not self.instructions
                        else "\n".join(
                            [
                                self.instructions,
                                base_instructions(),
                                crossover_instructions(list(symbolic_variable.keys())),
                            ]
                        )
                    ),
                    name=f"crossover_cot_{schema_id}_" + self.name,
                )(inputs)
                outputs = outputs.in_mask(mask=list(symbolic_variable.keys()))
                program = Program(
                    inputs=inputs,
                    outputs=outputs,
                    name=f"crossover_{schema_id}_" + self.name,
                    description="Crossover program combining high performing variables",
                )
                self.crossover_programs[schema_id] = program

        self.built = True

    async def mutate_candidate(
        self,
        step: int,
        trainable_variable: "Variable",
        selected_candidate: Dict[str, Any],
        x: Optional[List[Any]] = None,
        y: Optional[List[Any]] = None,
        y_pred: Optional[List[Any]] = None,
        training: bool = False,
    ) -> Dict[str, Any]:
        """Apply mutation to generate a new candidate using LLM.

        Creates mutation inputs from the selected candidate and training data,
        then calls the mutation program to generate an optimized variant.

        Args:
            step (int): The current training step
            trainable_variable (Variable): The trainable variable (for metadata access)
            selected_candidate (dict): The selected candidate to mutate
            x (list): Input data batch
            y (list): Ground truth data batch
            y_pred (list): Predicted outputs from the current model
            training (bool): Whether in training mode

        Returns:
            dict: The mutated candidate from the mutation program
        """
        mask = list(Trainable.keys())
        schema_id = id(trainable_variable.get_schema())
        masked_variable = out_mask_json(
            selected_candidate,
            mask=mask,
        )
        inputs = MutationInputs(
            program_description=self.program.description,
            program_inputs=[inp.get_json() for inp in x],
            program_predicted_outputs=[
                pred.get_json() if pred else None for pred in y_pred
            ],
            program_ground_truth=([gt.get_json() for gt in y] if y is not None else []),
            variable_description=trainable_variable.description,
            current_variable=masked_variable,
        )
        program = self.mutation_programs[schema_id]
        return await program(inputs, training=training)

    async def merge_candidate(
        self,
        step: int,
        trainable_variable: "Variable",
        current_candidate: Dict[str, Any],
        other_candidate: Dict[str, Any],
        x: Optional[List[Any]] = None,
        y: Optional[List[Any]] = None,
        y_pred: Optional[List[Any]] = None,
        training: bool = False,
    ) -> Dict[str, Any]:
        """Apply crossover to merge two selected candidates.

        Creates crossover inputs combining two high-performing candidates,
        then calls the crossover program to generate a merged variant.

        Args:
            step (int): The current training step
            trainable_variable (Variable): The trainable variable (for metadata access)
            current_candidate (dict): First selected candidate to merge
            other_candidate (dict): Second selected candidate to merge
            x (list): Input data batch
            y (list): Ground truth data batch
            y_pred (list): Predicted outputs from the current model
            training (bool): Whether in training mode

        Returns:
            dict: The merged candidate from the crossover program
        """
        mask = list(Trainable.keys())
        schema_id = id(trainable_variable.get_schema())
        current_variable = out_mask_json(
            current_candidate,
            mask=mask,
        )
        other_variable = out_mask_json(
            other_candidate,
            mask=mask,
        )
        inputs = CrossoverInputs(
            program_description=self.program.description,
            program_inputs=[inp.get_json() for inp in x],
            program_predicted_outputs=[
                pred.get_json() if pred else None for pred in y_pred
            ],
            program_ground_truth=([gt.get_json() for gt in y] if y is not None else []),
            variable_description=trainable_variable.description,
            other_variable=other_variable,
            current_variable=current_variable,
        )
        program = self.crossover_programs[schema_id]
        return await program(inputs, training=training)

    async def competition(self, candidates: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        """Apply Dominated Novelty Search (DNS) competition.

        DNS filters candidates by removing those that are both:
        - Inferior in reward (dominated)
        - Similar to existing candidates (not novel)

        This maintains diversity while focusing on high-performing candidates.

        Args:
            candidates (list): List of candidate dictionaries with 'reward' key

        Returns:
            list: Filtered list of candidates that passed the DNS competition
        """
        if len(candidates) <= 1:
            return candidates

        distance_function = (
            self.distance_function if self.distance_function else similarity_distance
        )

        selected_candidates = []
        for candidate in candidates:
            is_dominated = False
            for other in candidates:
                if other is candidate:
                    continue
                distance = await distance_function(
                    candidate,
                    other,
                    embedding_model=self.embedding_model,
                )
                # Check if within k-nearest neighborhood
                if distance < 1.0 / self.k_nearest_fitter:
                    # Check if dominated (lower reward)
                    if candidate.get("reward", 0) < other.get("reward", 0):
                        is_dominated = True
                        break
            if not is_dominated:
                selected_candidates.append(candidate)

        return selected_candidates if selected_candidates else [candidates[0]]

    async def on_epoch_end(self, epoch, trainable_variables):
        """Called at the end of each epoch.

        Applies DNS competition (if algorithm='dns') to filter candidates,
        then selects the top candidates based on population_size.

        Args:
            epoch (int): The epoch number
            trainable_variables (list): The list of trainable variables
        """
        for trainable_variable in trainable_variables:
            candidates = trainable_variable.get("candidates")
            best_candidates = trainable_variable.get("best_candidates")

            # Combine current candidates with best candidates
            all_candidates = candidates + best_candidates

            # Apply DNS competition if enabled
            if self.algorithm == "dns" and len(all_candidates) > 1:
                all_candidates = await self.competition(all_candidates)

            # Sort by reward and keep top population_size candidates
            all_candidates = sorted(
                all_candidates,
                key=lambda x: x.get("reward", 0),
                reverse=True,
            )
            trainable_variable.update(
                {
                    "candidates": [],
                    "best_candidates": all_candidates[: self.population_size],
                }
            )

    def get_config(self):
        config = super().get_config()
        config.update(
            {
                "instructions": self.instructions,
                "reasoning_effort": self.reasoning_effort,
                "k_nearest_fitter": self.k_nearest_fitter,
                "algorithm": self.algorithm,
            }
        )
        if self.embedding_model:
            config["embedding_model"] = serialization_lib.serialize_synalinks_object(
                self.embedding_model
            )
        return config

    @classmethod
    def from_config(cls, config):
        embedding_model = None
        if "embedding_model" in config:
            embedding_model = serialization_lib.deserialize_synalinks_object(
                config.pop("embedding_model")
            )
        language_model = serialization_lib.deserialize_synalinks_object(
            config.pop("language_model")
        )
        return cls(
            language_model=language_model,
            embedding_model=embedding_model,
            **config,
        )

build(trainable_variables) async

Build the optimizer programs based on the trainable variables.

Parameters:

Name Type Description Default
trainable_variables list

List of variables that will be optimized

required
Source code in synalinks/src/optimizers/omega.py
async def build(self, trainable_variables):
    """
    Build the optimizer programs based on the trainable variables.

    Args:
        trainable_variables (list): List of variables that will be optimized
    """
    for trainable_variable in trainable_variables:
        schema_id = id(trainable_variable.get_schema())
        mask = list(Trainable.keys())
        symbolic_variable = trainable_variable.to_symbolic_data_model().out_mask(
            mask=mask
        )

        if schema_id not in self.mutation_programs:
            inputs = Input(data_model=MutationInputs)
            outputs = await ChainOfThought(
                data_model=symbolic_variable,
                language_model=self.language_model,
                temperature=self.mutation_temperature,
                reasoning_effort=self.reasoning_effort,
                instructions=(
                    "\n".join(
                        [
                            base_instructions(),
                            mutation_instructions(list(symbolic_variable.keys())),
                        ]
                    )
                    if not self.instructions
                    else "\n".join(
                        [
                            self.instructions,
                            base_instructions(),
                            mutation_instructions(list(symbolic_variable.keys())),
                        ]
                    )
                ),
                name=f"mutation_cot_{schema_id}_" + self.name,
            )(inputs)
            outputs = outputs.in_mask(mask=list(symbolic_variable.keys()))
            program = Program(
                inputs=inputs,
                outputs=outputs,
                name=f"mutation_{schema_id}_" + self.name,
                description="The mutation program that fix/optimize variables",
            )
            self.mutation_programs[schema_id] = program

        if schema_id not in self.crossover_programs:
            inputs = Input(data_model=CrossoverInputs)
            outputs = await ChainOfThought(
                data_model=symbolic_variable,
                language_model=self.language_model,
                temperature=self.crossover_temperature,
                reasoning_effort=self.reasoning_effort,
                instructions=(
                    "\n".join(
                        [
                            base_instructions(),
                            crossover_instructions(list(symbolic_variable.keys())),
                        ]
                    )
                    if not self.instructions
                    else "\n".join(
                        [
                            self.instructions,
                            base_instructions(),
                            crossover_instructions(list(symbolic_variable.keys())),
                        ]
                    )
                ),
                name=f"crossover_cot_{schema_id}_" + self.name,
            )(inputs)
            outputs = outputs.in_mask(mask=list(symbolic_variable.keys()))
            program = Program(
                inputs=inputs,
                outputs=outputs,
                name=f"crossover_{schema_id}_" + self.name,
                description="Crossover program combining high performing variables",
            )
            self.crossover_programs[schema_id] = program

    self.built = True

competition(candidates) async

Apply Dominated Novelty Search (DNS) competition.

DNS filters candidates by removing those that are both:

  • Inferior in reward (dominated)
  • Similar to existing candidates (not novel)

This maintains diversity while focusing on high-performing candidates.

Parameters:

Name Type Description Default
candidates list

List of candidate dictionaries with 'reward' key

required

Returns:

Name Type Description
list List[Dict[str, Any]]

Filtered list of candidates that passed the DNS competition

Source code in synalinks/src/optimizers/omega.py
async def competition(self, candidates: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Apply Dominated Novelty Search (DNS) competition.

    DNS filters candidates by removing those that are both:
    - Inferior in reward (dominated)
    - Similar to existing candidates (not novel)

    This maintains diversity while focusing on high-performing candidates.

    Args:
        candidates (list): List of candidate dictionaries with 'reward' key

    Returns:
        list: Filtered list of candidates that passed the DNS competition
    """
    if len(candidates) <= 1:
        return candidates

    distance_function = (
        self.distance_function if self.distance_function else similarity_distance
    )

    selected_candidates = []
    for candidate in candidates:
        is_dominated = False
        for other in candidates:
            if other is candidate:
                continue
            distance = await distance_function(
                candidate,
                other,
                embedding_model=self.embedding_model,
            )
            # Check if within k-nearest neighborhood
            if distance < 1.0 / self.k_nearest_fitter:
                # Check if dominated (lower reward)
                if candidate.get("reward", 0) < other.get("reward", 0):
                    is_dominated = True
                    break
        if not is_dominated:
            selected_candidates.append(candidate)

    return selected_candidates if selected_candidates else [candidates[0]]
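
A toy illustration of the competition (the candidate dicts below are invented for readability; real candidates are the optimizer's internal dicts carrying the trainable variable's fields alongside 'reward', and the call must run inside an async context):

candidates = [
    {"reward": 0.9, "prompt": "Answer step by step."},
    {"reward": 0.4, "prompt": "Answer step-by-step."},       # similar but worse: eliminated
    {"reward": 0.5, "prompt": "Answer with a short poem."},  # worse but novel: kept
]
survivors = await optimizer.competition(candidates)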

merge_candidate(step, trainable_variable, current_candidate, other_candidate, x=None, y=None, y_pred=None, training=False) async

Apply crossover to merge two selected candidates.

Creates crossover inputs combining two high-performing candidates, then calls the crossover program to generate a merged variant.

Parameters:

Name Type Description Default
step int

The current training step

required
trainable_variable Variable

The trainable variable (for metadata access)

required
current_candidate dict

First selected candidate to merge

required
other_candidate dict

Second selected candidate to merge

required
x list

Input data batch

None
y list

Ground truth data batch

None
y_pred list

Predicted outputs from the current model

None
training bool

Whether in training mode

False

Returns:

Name Type Description
dict Dict[str, Any]

The merged candidate from the crossover program

Source code in synalinks/src/optimizers/omega.py
async def merge_candidate(
    self,
    step: int,
    trainable_variable: "Variable",
    current_candidate: Dict[str, Any],
    other_candidate: Dict[str, Any],
    x: Optional[List[Any]] = None,
    y: Optional[List[Any]] = None,
    y_pred: Optional[List[Any]] = None,
    training: bool = False,
) -> Dict[str, Any]:
    """Apply crossover to merge two selected candidates.

    Creates crossover inputs combining two high-performing candidates,
    then calls the crossover program to generate a merged variant.

    Args:
        step (int): The current training step
        trainable_variable (Variable): The trainable variable (for metadata access)
        current_candidate (dict): First selected candidate to merge
        other_candidate (dict): Second selected candidate to merge
        x (list): Input data batch
        y (list): Ground truth data batch
        y_pred (list): Predicted outputs from the current model
        training (bool): Whether in training mode

    Returns:
        dict: The merged candidate from the crossover program
    """
    mask = list(Trainable.keys())
    schema_id = id(trainable_variable.get_schema())
    current_variable = out_mask_json(
        current_candidate,
        mask=mask,
    )
    other_variable = out_mask_json(
        other_candidate,
        mask=mask,
    )
    inputs = CrossoverInputs(
        program_description=self.program.description,
        program_inputs=[inp.get_json() for inp in x],
        program_predicted_outputs=[
            pred.get_json() if pred else None for pred in y_pred
        ],
        program_ground_truth=([gt.get_json() for gt in y] if y is not None else []),
        variable_description=trainable_variable.description,
        other_variable=other_variable,
        current_variable=current_variable,
    )
    program = self.crossover_programs[schema_id]
    return await program(inputs, training=training)

mutate_candidate(step, trainable_variable, selected_candidate, x=None, y=None, y_pred=None, training=False) async

Apply mutation to generate a new candidate using an LLM.

Creates mutation inputs from the selected candidate and training data, then calls the mutation program to generate an optimized variant.

Parameters:

Name Type Description Default
step int

The current training step

required
trainable_variable Variable

The trainable variable (for metadata access)

required
selected_candidate dict

The selected candidate to mutate

required
x list

Input data batch

None
y list

Ground truth data batch

None
y_pred list

Predicted outputs from the current model

None
training bool

Whether in training mode

False

Returns:

Name Type Description
dict Dict[str, Any]

The mutated candidate from the mutation program

Source code in synalinks/src/optimizers/omega.py
async def mutate_candidate(
    self,
    step: int,
    trainable_variable: "Variable",
    selected_candidate: Dict[str, Any],
    x: Optional[List[Any]] = None,
    y: Optional[List[Any]] = None,
    y_pred: Optional[List[Any]] = None,
    training: bool = False,
) -> Dict[str, Any]:
    """Apply mutation to generate a new candidate using LLM.

    Creates mutation inputs from the selected candidate and training data,
    then calls the mutation program to generate an optimized variant.

    Args:
        step (int): The current training step
        trainable_variable (Variable): The trainable variable (for metadata access)
        selected_candidate (dict): The selected candidate to mutate
        x (list): Input data batch
        y (list): Ground truth data batch
        y_pred (list): Predicted outputs from the current model
        training (bool): Whether in training mode

    Returns:
        dict: The mutated candidate from the mutation program
    """
    mask = list(Trainable.keys())
    schema_id = id(trainable_variable.get_schema())
    masked_variable = out_mask_json(
        selected_candidate,
        mask=mask,
    )
    inputs = MutationInputs(
        program_description=self.program.description,
        program_inputs=[inp.get_json() for inp in x],
        program_predicted_outputs=[
            pred.get_json() if pred else None for pred in y_pred
        ],
        program_ground_truth=([gt.get_json() for gt in y] if y is not None else []),
        variable_description=trainable_variable.description,
        current_variable=masked_variable,
    )
    program = self.mutation_programs[schema_id]
    return await program(inputs, training=training)

on_epoch_end(epoch, trainable_variables) async

Called at the end of each epoch.

Applies DNS competition (if algorithm='dns') to filter candidates, then selects the top candidates based on population_size.

Parameters:

Name Type Description Default
epoch int

The epoch number

required
trainable_variables list

The list of trainable variables

required
Source code in synalinks/src/optimizers/omega.py
async def on_epoch_end(self, epoch, trainable_variables):
    """Called at the end of each epoch.

    Applies DNS competition (if algorithm='dns') to filter candidates,
    then selects the top candidates based on population_size.

    Args:
        epoch (int): The epoch number
        trainable_variables (list): The list of trainable variables
    """
    for trainable_variable in trainable_variables:
        candidates = trainable_variable.get("candidates")
        best_candidates = trainable_variable.get("best_candidates")

        # Combine current candidates with best candidates
        all_candidates = candidates + best_candidates

        # Apply DNS competition if enabled
        if self.algorithm == "dns" and len(all_candidates) > 1:
            all_candidates = await self.competition(all_candidates)

        # Sort by reward and keep top population_size candidates
        all_candidates = sorted(
            all_candidates,
            key=lambda x: x.get("reward", 0),
            reverse=True,
        )
        trainable_variable.update(
            {
                "candidates": [],
                "best_candidates": all_candidates[: self.population_size],
            }
        )

base_instructions()

Base instructions that define the context for all optimization programs.

These instructions explain that the system optimizes JSON variables in a computation graph.

Source code in synalinks/src/optimizers/omega.py
def base_instructions():
    """Base instructions that define the context for all optimization programs.

    These instructions explain that the system optimizes JSON variables
    in a computation graph.
    """
    return """
You are an integral part of an optimization system designed to improve
JSON variables within a computation graph (i.e. the program).
Each module in the graph performs specific computations, with JSON variables
serving as the state.
These variables can represent prompts, code, plans, rules, or any other
JSON-compatible data.
""".strip()

crossover_instructions(variables_keys)

Instructions for the crossover program that optimizes variables.

Parameters:

Name Type Description Default
variables_keys list

List of keys that the variable should contain

required
Source code in synalinks/src/optimizers/omega.py
def crossover_instructions(variables_keys):
    """Instructions for the crossover program that optimizes variables.

    Args:
        variables_keys (list): List of keys that the variable should contain
    """
    return f"""
Your responsibility is to create a new, optimized variable by strategically
combining features from the current variable and a high-performing candidate.
The new variable should improve the alignment of the predicted output with
the ground truth.

Guidelines:
- Analyze both the current variable and the other high-performing variable,
  identifying their respective strengths and weaknesses.
- Pay close attention to the variable's description, its intended use, and the
  broader context of the computation graph.
- Ensure the new variable is generalizable and performs well across various
  inputs of the same kind.
- Include all specified keys: {variables_keys}.
- Justify each feature you incorporate, explaining how it contributes to
  better performance or alignment with the ground truth.
- If no ground truth is provided, the goal is to critically enhance the
  predicted output.
- If you have to optimize a variable containing code, provide a generalizable
  algorithm.
- Always focus on ONLY one aspect at a time.
- If the instructions/prompt contains general information, keep it.
""".strip()

mutation_instructions(variables_keys)

Instructions for the mutation program that optimizes variables.

Parameters:

Name Type Description Default
variables_keys list

List of keys that the variable should contain

required
Source code in synalinks/src/optimizers/omega.py
def mutation_instructions(variables_keys):
    """Instructions for the mutation program that optimizes variables.

    Args:
        variables_keys (list): List of keys that the variable should contain
    """
    return f"""
Your primary task is to creatively enhance the provided variable so that the
predicted output aligns as closely as possible with the ground truth.
Pay close attention to the variable's description, its intended use, and the
broader context of the computation graph.

Guidelines:
- Ensure the new variable is generalizable and performs well across various
  inputs of the same kind.
- Include all specified keys: {variables_keys}.
- Justify each change with clear reasoning, referencing the variable's purpose
  and the desired output.
- If no ground truth is provided, the goal is to critically enhance the
  predicted output.
- If you have to optimize a variable containing code, provide a generalizable
  algorithm.
- Always focus on ONLY one aspect at a time.
- If the instructions/prompt contains general information, keep it.
""".strip()

similarity_distance(candidate1, candidate2, embedding_model=None, axis=-1) async

Compute distance between two candidates using embeddings.

This function computes the cosine distance between the mean embeddings of two candidate JSON objects. Each field of the JSON is embedded separately, normalized to unit length, then averaged.

Parameters:

Name Type Description Default
candidate1 dict

First candidate (dict or JSON-serializable object)

required
candidate2 dict

Second candidate (dict or JSON-serializable object)

required
embedding_model EmbeddingModel

The embedding model for computing embeddings

None
axis int

The axis along which to compute the similarity (default: -1)

-1

Returns:

Name Type Description
float float

Cosine distance between candidates (0 = identical, 1 = opposite)

Source code in synalinks/src/optimizers/omega.py
async def similarity_distance(
    candidate1: Dict[str, Any],
    candidate2: Dict[str, Any],
    embedding_model: Optional["EmbeddingModel"] = None,
    axis: int = -1,
) -> float:
    """Compute distance between two candidates using embeddings.

    This function computes the cosine distance between the mean embeddings
    of two candidate JSON objects. Each field of the JSON is embedded
    separately, normalized to unit length, then averaged.

    Args:
        candidate1 (dict): First candidate (dict or JSON-serializable object)
        candidate2 (dict): Second candidate (dict or JSON-serializable object)
        embedding_model (EmbeddingModel): The embedding model for computing embeddings
        axis (int): The axis along which to compute the similarity (default: -1)

    Returns:
        float: Cosine distance between candidates (0 = identical, 1 = opposite)
    """
    embeddings1 = await embedding_model(tree.flatten(candidate1))
    embeddings2 = await embedding_model(tree.flatten(candidate2))
    embeddings1 = embeddings1["embeddings"]
    embeddings2 = embeddings2["embeddings"]
    embeddings1 = np.convert_to_tensor(embeddings1)
    embeddings2 = np.convert_to_tensor(embeddings2)
    embeddings1, embeddings2 = squeeze_or_expand_to_same_rank(embeddings1, embeddings2)
    embeddings1 = np.normalize(embeddings1, axis=axis)
    embeddings2 = np.normalize(embeddings2, axis=axis)
    embeddings1 = np.mean(embeddings1, axis=0)
    embeddings2 = np.mean(embeddings2, axis=0)
    similarity = (np.sum(embeddings1 * embeddings2, axis=axis) + 1) / 2
    return 1 - similarity
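
Restated as a formula matching the code above: if candidate $i$ has field embeddings $e_{i,j}$, its descriptor is the mean of the unit-normalized embeddings, and the distance is

$$\bar{e}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} \frac{e_{i,j}}{\lVert e_{i,j} \rVert}, \qquad d(c_1, c_2) = 1 - \frac{\bar{e}_1 \cdot \bar{e}_2 + 1}{2} = \frac{1 - \bar{e}_1 \cdot \bar{e}_2}{2}$$

so identical descriptors give a distance of 0, orthogonal descriptors 0.5, and opposite descriptors 1.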