HumanEval
get_input_data_model()
get_output_data_model()
iterable_dataset(repeat=1, batch_size=1, limit=None, split='test')
Streaming dataset for RL-style training.
Returns:
| Type | Description |
|---|---|
| `HuggingFaceDataset` | A streaming, iterable dataset. |
Source code in synalinks/src/datasets/built_in/humaneval.py
load_data(validation_split=0.2)
Load HumanEval.
Hugging Face ships only a test split (164 problems), so we split it
deterministically into train and test sets. Real scoring requires running
unit tests against the completion; exact match against the
canonical solution is only a placeholder reward.
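To illustrate why exact match is only a placeholder, the following minimal sketch compares it with an execution-based reward on a toy problem. The field names `prompt`, `canonical_solution`, `test`, and `entry_point` follow the public HumanEval dataset; the helpers `exact_match_reward` and `unit_test_reward` are hypothetical, are not part of synalinks, and do not describe how the library scores completions.

```python
def exact_match_reward(completion: str, canonical_solution: str) -> float:
    """Placeholder reward: 1.0 only if the texts match after stripping whitespace."""
    return float(completion.strip() == canonical_solution.strip())


def unit_test_reward(prompt: str, completion: str, test: str, entry_point: str) -> float:
    """Execution-based reward: 1.0 if the completed function passes check()."""
    # HumanEval tests define `check(candidate)`; call it on the entry point.
    program = f"{prompt}{completion}\n\n{test}\n\ncheck({entry_point})\n"
    namespace: dict = {}
    try:
        # WARNING: exec runs untrusted code. A real harness needs a sandbox
        # and a timeout; this sketch has neither.
        exec(program, namespace)
    except Exception:
        return 0.0
    return 1.0


# Toy problem in the HumanEval layout (hypothetical, not from the dataset).
prompt = 'def add(a, b):\n    """Return a + b."""\n'
canonical_solution = "    return a + b\n"
test = "def check(candidate):\n    assert candidate(1, 2) == 3\n"

# A correct completion that differs textually from the canonical solution.
completion = "    return b + a\n"

em_score = exact_match_reward(completion, canonical_solution)
ut_score = unit_test_reward(prompt, completion, test, "add")
print(em_score, ut_score)
```

The completion is functionally correct, yet exact match rejects it, while the unit-test reward accepts it. This is the gap the placeholder reward leaves open.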
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `validation_split` | `float` | Fraction held out for evaluation. | `0.2` |
Returns:
| Type | Description |
|---|---|
| `tuple` | |
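A deterministic split of a single test set can be sketched as follows. This is an assumption-laden illustration, not the synalinks implementation: the helper `deterministic_split`, the fixed `seed`, the shuffle-based selection, and the `(train, test)` return shape are all hypothetical, and the actual function may order, seed, or package the data differently.

```python
import random


def deterministic_split(examples, validation_split=0.2, seed=42):
    """Split examples into (train, test) the same way on every call."""
    indices = list(range(len(examples)))
    # A locally seeded RNG keeps the shuffle reproducible and leaves the
    # global random state untouched.
    random.Random(seed).shuffle(indices)
    n_test = int(len(examples) * validation_split)
    test_idx = set(indices[:n_test])
    # Preserve the original ordering inside each partition.
    train = [ex for i, ex in enumerate(examples) if i not in test_idx]
    test = [ex for i, ex in enumerate(examples) if i in test_idx]
    return train, test


# Stand-ins for the 164 HumanEval problems.
problems = [f"HumanEval/{i}" for i in range(164)]
train, test = deterministic_split(problems, validation_split=0.2)
print(len(train), len(test))
```

With 164 problems and `validation_split=0.2`, this sketch holds out `int(164 * 0.2)` problems for evaluation and keeps the rest for training. Repeated calls with the same seed return identical partitions.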