Registration API
JaxARC uses a registration system to manage environment creation, task IDs, and dataset organization. Each environment is registered under a unique ID that can be passed to make().
Module Contents
Registration system for JaxARC environments.
This package provides a lean registry that maps simple dataset keys to environment specs. Dataset parsing and task loading are no longer part of this module. Environments are expected to be constructed with buffer-based EnvParams (JAX-native, JIT-friendly) and not depend on parsers at runtime.
Core ideas:
- A global registry maps dataset keys (e.g., "Mini", "Concept", "AGI1", "AGI2") to EnvSpec definitions.
- No parser entry points or subset inference live here anymore.
- make(id, **kwargs) only parses the dataset key and returns the environment and parameters built from the provided kwargs (e.g., a prebuilt buffer in EnvParams or an explicit params).
Named subsets can be registered (e.g., register_subset("Mini", "easy", [...])) and then selected via make("Mini-easy") to load exactly those tasks. This makes it easy to publish curated benchmarks and implement curriculum learning.
Typical usage:
    from jaxarc.registration import make
    # Build EnvParams with a pre-stacked task buffer outside this module.
    env, params = make("Mini", params=my_params)
Notes:
- This module keeps a single way of doing things: buffer-based, JIT-friendly EnvParams.
- Dataset downloading/parsing and subset handling should be done outside this module.
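To make the ID convention concrete, the following is a minimal sketch of how an ID such as "Mini-easy" could be separated into a dataset key and a subset selector. The helper name split_env_id and the fall-back to "all" for a bare key are assumptions made for illustration; this is not JaxARC's parsing code, and it does not handle prefixed or versioned IDs.

```python
def split_env_id(env_id: str) -> tuple[str, str]:
    """Split 'Dataset-selector' into (dataset_key, selector).

    Hypothetical helper: assumes the text before the first hyphen is the
    dataset key and that a bare key selects every task ('all').
    """
    dataset_key, _, selector = env_id.partition("-")
    if not dataset_key:
        raise ValueError(f"empty dataset key in {env_id!r}")
    return dataset_key, selector or "all"


print(split_env_id("Mini"))       # ('Mini', 'all')
print(split_env_id("Mini-easy"))  # ('Mini', 'easy')
```

Splitting on the first hyphen only means a subset name that itself contains a hyphen (such as "my-benchmark") stays intact.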
- class jaxarc.registration.EnvRegistry
Bases: object
Global environment registry with gym-like semantics.
- available_named_subsets(dataset_key: str, include_builtin: bool = True) -> tuple[str, ...]
Return names of available subsets for a dataset.
- Parameters:
dataset_key – Dataset name (Mini, Concept, AGI1, AGI2)
include_builtin – Include built-in selectors (‘all’, ‘train’, ‘eval’) and concept groups (default: True)
- Returns:
Tuple of subset names, sorted alphabetically
Examples
>>> available_named_subsets("Mini")  # Mini doesn't have train/eval splits
('all',)
>>> available_named_subsets("Concept")  # Includes concept groups
('AboveBelow', 'Center', 'all', ...)
>>> available_named_subsets("AGI1")  # AGI has splits
('all', 'eval', 'train')
>>> available_named_subsets("Mini", include_builtin=False)  # Only custom subsets
()
- available_task_ids(dataset_key: str, config: Any | None = None, auto_download: bool = False) -> list[str]
Return all available task IDs for a dataset key after ensuring dataset availability.
- get_subset_task_ids(dataset_key: str, selector: str = 'all', config: Any | None = None, auto_download: bool = False) -> list[str]
Get task IDs for a specific subset without creating an environment.
This allows users to query what tasks will be loaded before calling make().
- Parameters:
dataset_key – Dataset name (Mini, Concept, AGI1, AGI2)
selector – Subset selector (‘all’, ‘train’, ‘easy’, task_id, etc.)
config – Optional config
auto_download – Download dataset if missing
- Returns:
List of task IDs that will be loaded
Examples
>>> get_subset_task_ids("Mini", "all")
['Most_Common_color_l6ab0lf3xztbyxsu3p', ...]
>>> get_subset_task_ids("Mini", "easy")  # Only tasks in 'easy' subset
['task1', 'task2', 'task3']
>>> get_subset_task_ids("Concept", "Center")  # Tasks in Center concept
['Center_001', 'Center_002', ...]
>>> get_subset_task_ids("Mini", "Most_Common_color_l6ab0lf3xztbyxsu3p")  # Single task
['Most_Common_color_l6ab0lf3xztbyxsu3p']
- make(id: str, **kwargs: Any) -> Tuple[Any, Any]
Create an environment instance and parameters for a registered spec.
- Expected kwargs:
params: EnvParams (preferred; buffer-based, JIT-friendly)
env_entry: str (optional) override of environment entry point
- Returns:
env: Environment instance
params: EnvParams provided directly
- Return type:
(env, params) tuple
- register(id: str, env_entry: str = 'jaxarc.envs:Environment', max_episode_steps: int = 100, **kwargs: Any) -> None
Register a new environment specification.
- Parameters:
id – Unique environment ID (e.g., “JaxARC-Mini-v0”)
env_entry – Dotted path or colon path to class/factory (e.g., "jaxarc.envs:Environment")
max_episode_steps – Default max steps for this environment family
**kwargs – Additional metadata stored with the spec
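The entry point is described as a dotted or colon path to a class or factory. As an illustration of how such a string is commonly resolved in Python, here is a small sketch built on importlib. The function name resolve_entry_point is hypothetical, and the registry's real resolution logic is not shown in this reference, so treat this as one plausible approach and not as the library's implementation.

```python
import importlib
from typing import Any


def resolve_entry_point(path: str) -> Any:
    """Import and return the object named by 'pkg.mod:Attr' or 'pkg.mod.Attr'."""
    if ":" in path:
        # Colon form: everything before ':' is the module, the rest the attribute.
        module_name, _, attr = path.partition(":")
    else:
        # Dotted form: the last component is taken to be the attribute.
        module_name, _, attr = path.rpartition(".")
    if not module_name or not attr:
        raise ValueError(f"not a valid entry point: {path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


# A standard-library target is used so the example runs without JaxARC installed.
cls = resolve_entry_point("collections:OrderedDict")
print(cls.__name__)  # OrderedDict
```

The colon form is unambiguous about where the module path ends, which is one reason it is often preferred for entry points.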
- register_subset(dataset_key: str, name: str, task_ids: list[str] | tuple[str, ...]) -> None
Register a named subset (e.g., ‘Mini-easy’) that maps to specific task IDs.
- Parameters:
dataset_key – Base dataset key (e.g., ‘Mini’, ‘Concept’, ‘AGI1’, ‘AGI2’ or synonyms)
name – Subset name (e.g., ‘easy’, ‘hard’, ‘my-benchmark’)
task_ids – Sequence of task IDs to include in this subset
- class jaxarc.registration.EnvSpec(id: str, env_entry: str = 'jaxarc.envs:Environment', max_episode_steps: int = 100, kwargs: Dict[str, Any] = <factory>)
Bases: object
Environment specification for registration.
- jaxarc.registration.available_named_subsets(dataset_key: str, include_builtin: bool = True) -> tuple[str, ...]
List available subset names for a dataset (includes built-in selectors by default).
- Parameters:
dataset_key – Dataset name (Mini, Concept, AGI1, AGI2)
include_builtin – Include built-in selectors like ‘all’, ‘train’, ‘eval’ (default: True)
- Returns:
Tuple of subset names
Examples
>>> available_named_subsets("Mini")
('all', 'easy', 'eval', 'train')
>>> available_named_subsets("Mini", include_builtin=False)  # Only custom subsets
('easy',)
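The interaction between built-in selectors and custom subsets can be sketched with a toy model. Everything here is assumed for illustration: the function list_subset_names, the BUILTIN_SELECTORS constant, and the sample data are not part of JaxARC, and real datasets may expose different built-ins.

```python
# Assumed built-in selectors, taken from the parameter description above.
BUILTIN_SELECTORS = ("all", "train", "eval")


def list_subset_names(custom, dataset_key, include_builtin=True):
    """Return sorted subset names for one dataset in a toy registry.

    `custom` maps dataset_key -> {subset_name: [task_id, ...]}.
    """
    names = set(custom.get(dataset_key, {}))
    if include_builtin:
        names.update(BUILTIN_SELECTORS)
    return tuple(sorted(names))


# Made-up data: one custom subset registered for "Mini".
custom_subsets = {"Mini": {"easy": ["task1", "task2", "task3"]}}
print(list_subset_names(custom_subsets, "Mini"))
# ('all', 'easy', 'eval', 'train')
print(list_subset_names(custom_subsets, "Mini", include_builtin=False))
# ('easy',)
```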
- jaxarc.registration.available_task_ids(dataset_key: str, config: Any | None = None, auto_download: bool = False) -> list[str]
List all available task IDs (equivalent to get_subset_task_ids with selector='all').
- jaxarc.registration.get_subset_task_ids(dataset_key: str, selector: str = 'all', config: Any | None = None, auto_download: bool = False) -> list[str]
Get task IDs for a specific subset without creating an environment.
This allows users to query what tasks will be loaded before calling make().
- Parameters:
dataset_key – Dataset name (Mini, Concept, AGI1, AGI2)
selector – Subset selector (‘all’, ‘train’, ‘easy’, task_id, etc.)
config – Optional config
auto_download – Download dataset if missing
- Returns:
List of task IDs that will be loaded
Examples
>>> get_subset_task_ids("Mini", "all")
['Most_Common_color_l6ab0lf3xztbyxsu3p', ...]
>>> get_subset_task_ids("Mini", "easy")
['task1', 'task2', 'task3']
>>> get_subset_task_ids("Mini", "Most_Common_color_l6ab0lf3xztbyxsu3p")
['Most_Common_color_l6ab0lf3xztbyxsu3p']
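The three kinds of selector shown in these examples ('all', a named subset, a single task ID) suggest a simple resolution order. The sketch below models that order on in-memory data. The function resolve_selector, its precedence rules, and the error on unknown selectors are assumptions, since the reference does not state how conflicts or misses are handled.

```python
def resolve_selector(all_task_ids, named_subsets, selector="all"):
    """Map a selector to a list of task IDs in a toy model.

    Assumed precedence: 'all', then named subsets, then a literal task ID.
    """
    if selector == "all":
        return list(all_task_ids)
    if selector in named_subsets:
        return list(named_subsets[selector])
    if selector in all_task_ids:
        return [selector]
    raise ValueError(f"unknown selector: {selector!r}")


# Made-up task IDs and subset for demonstration only.
tasks = ["task1", "task2", "task3", "task4"]
subsets = {"easy": ["task1", "task2", "task3"]}

print(resolve_selector(tasks, subsets, "easy"))   # ['task1', 'task2', 'task3']
print(resolve_selector(tasks, subsets, "task4"))  # ['task4']
```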
- jaxarc.registration.load_all_subsets_for_dataset(dataset: str, config_root: Path | None = None) -> int
Discover and register all YAML-defined subsets for dataset.
Returns the number of subsets successfully loaded.
- jaxarc.registration.load_subset(name: str, dataset: str, config_root: Path | None = None) -> list[str] | None
Load task IDs for a named subset from a YAML file.
- Parameters:
name – Subset name (e.g. "easy").
dataset – Dataset key (e.g. "Mini", "AGI1").
config_root – Path to the configs/ directory. When None, pyprojroot is used to locate it automatically.
- Returns:
List of task ID strings, or None if the file was not found or could not be parsed.
- jaxarc.registration.load_subset_if_needed(name: str, dataset: str, config_root: Path | None = None) -> bool
Load and register a subset only if it is not already registered.
Returns True when the subset is available (either already present or freshly loaded).
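The "load only if not already registered" contract can be illustrated with a small idempotent helper. This sketch uses a plain dict as a stand-in registry and a caller-supplied loader in place of YAML parsing. The names ensure_subset and fake_loader are hypothetical, and the real function's internals are not documented here.

```python
from typing import Callable, Optional


def ensure_subset(
    registry: dict,
    dataset: str,
    name: str,
    loader: Callable[[str, str], Optional[list]],
) -> bool:
    """Return True if (dataset, name) is registered, loading it on first use."""
    key = (dataset, name)
    if key in registry:
        return True  # Already present: the loader is not called again.
    task_ids = loader(name, dataset)
    if task_ids is None:
        return False  # Mirrors a loader that reports "not found" with None.
    registry[key] = list(task_ids)
    return True


calls = []


def fake_loader(name, dataset):
    """Stand-in for a file-backed loader; only knows the 'easy' subset."""
    calls.append((name, dataset))
    return ["task1", "task2"] if name == "easy" else None


registry = {}
print(ensure_subset(registry, "Mini", "easy", fake_loader))     # True
print(ensure_subset(registry, "Mini", "easy", fake_loader))     # True
print(ensure_subset(registry, "Mini", "missing", fake_loader))  # False
print(len(calls))  # 2
```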
- jaxarc.registration.make(id: str, **kwargs: Any) -> tuple[Any, Any]
Create an environment instance and EnvParams using a registered spec.
See EnvRegistry.make for details on supported kwargs.
- jaxarc.registration.register(id: str, entry_point: str | None = None, env_entry: str = 'jaxarc.envs:Environment', max_episode_steps: int = 100, **kwargs: Any) -> None
Register an environment spec in the global registry.
Task ID Format
Task IDs follow the pattern: Dataset-TaskName_taskId
Examples:
- Mini-Most_Common_color_l6ab0lf3xztbyxsu3p
- ConceptARC-denoising_0c9aba6e
- ARC-007bbfb7
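A task ID in this pattern can be taken apart with two string splits. The sketch below assumes that the first hyphen ends the dataset prefix and that the last underscore separates the task name from the trailing ID; IDs with no underscore are treated as having no task name. The names ParsedTaskId and parse_task_id are hypothetical and not part of JaxARC.

```python
from typing import NamedTuple, Optional


class ParsedTaskId(NamedTuple):
    dataset: str
    task_name: Optional[str]
    task_id: str


def parse_task_id(full_id: str) -> ParsedTaskId:
    """Parse 'Dataset-TaskName_taskId' under the assumptions stated above."""
    dataset, sep, rest = full_id.partition("-")
    if not sep or not dataset or not rest:
        raise ValueError(f"expected 'Dataset-...' but got {full_id!r}")
    task_name, underscore, task_id = rest.rpartition("_")
    if not underscore:
        # No underscore: the whole remainder is the ID (e.g. 'ARC-007bbfb7').
        return ParsedTaskId(dataset, None, rest)
    return ParsedTaskId(dataset, task_name, task_id)


print(parse_task_id("ConceptARC-denoising_0c9aba6e"))
# ParsedTaskId(dataset='ConceptARC', task_name='denoising', task_id='0c9aba6e')
```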