Registration API

The JaxARC registration system manages environment creation, task IDs, and dataset organization. Each environment is registered with a unique ID that can be passed to make().

Module Contents

Registration system for JaxARC environments.

This package provides a lean registry that maps simple dataset keys to environment specs. Dataset parsing and task loading are no longer part of this module. Environments are expected to be constructed with buffer-based EnvParams (JAX-native, JIT-friendly) and not depend on parsers at runtime.

Core ideas:
  • A global registry maps dataset keys (e.g., "Mini", "Concept", "AGI1", "AGI2") to EnvSpec definitions.

  • No parser entry points or subset inference live here anymore.

  • make(id, **kwargs) only parses the dataset key and returns the environment and parameters built from provided kwargs (e.g., a prebuilt buffer in EnvParams or an explicit params).

  • Named subsets can be registered (e.g., register_subset("Mini", "easy", [...])) and then selected via make("Mini-easy") to load exactly those tasks. This makes it easy to publish curated benchmarks and implement curriculum learning.
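The ID convention described above ("Mini" versus "Mini-easy") can be illustrated with a small sketch. The helper name parse_env_id is hypothetical and is not part of jaxarc; it also assumes that a bare dataset key selects 'all' and that everything after the first hyphen is the selector, which may differ from the actual parsing rules.

```python
from typing import Tuple


def parse_env_id(env_id: str) -> Tuple[str, str]:
    """Split an ID such as 'Mini-easy' into (dataset_key, selector).

    Hypothetical helper for illustration only. Assumption: a bare
    dataset key (no hyphen) maps to the 'all' selector.
    """
    # Split on the first hyphen only, so selectors that themselves
    # contain hyphens (e.g. 'my-benchmark') stay intact.
    dataset_key, sep, selector = env_id.partition("-")
    if not dataset_key:
        raise ValueError(f"Missing dataset key in ID: {env_id!r}")
    if not sep or not selector:
        return dataset_key, "all"
    return dataset_key, selector


if __name__ == "__main__":
    for example in ("Mini", "Mini-easy", "Mini-my-benchmark"):
        print(example, "->", parse_env_id(example))
```

Splitting on the first hyphen is a design choice of this sketch: it keeps subset names such as 'my-benchmark' usable without escaping.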

Typical usage:

from jaxarc.registration import make
# Build EnvParams with a pre-stacked task buffer outside this module.
env, params = make("Mini", params=my_params)

Notes:
  • This module keeps a single way of doing things: buffer-based, JIT-friendly EnvParams.

  • Dataset downloading/parsing and subset handling should be done outside this module.

class jaxarc.registration.EnvRegistry

Bases: object

Global environment registry with gym-like semantics.

available_named_subsets(dataset_key: str, include_builtin: bool = True) -> tuple[str, ...]

Return names of available subsets for a dataset.

Parameters:
  • dataset_key – Dataset name (Mini, Concept, AGI1, AGI2)

  • include_builtin – Include built-in selectors (‘all’, ‘train’, ‘eval’) and concept groups (default: True)

Returns:

Tuple of subset names, sorted alphabetically

Examples

>>> available_named_subsets("Mini")
('all',)  # Mini doesn't have train/eval splits
>>> available_named_subsets("Concept")
('AboveBelow', 'Center', 'all', ...)  # Includes concept groups
>>> available_named_subsets("AGI1")
('all', 'eval', 'train')  # AGI has splits
>>> available_named_subsets("Mini", include_builtin=False)
()  # Only custom subsets
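
One way the sorted tuples in these examples could be produced is sketched below. The function named_subsets and its arguments are hypothetical and not part of jaxarc; the sketch simply merges built-in and custom names and sorts them with Python's default string ordering, which places uppercase names before lowercase ones, as in the "Concept" example.

```python
from typing import Iterable, Tuple


def named_subsets(
    builtin: Iterable[str],
    custom: Iterable[str],
    include_builtin: bool = True,
) -> Tuple[str, ...]:
    """Merge built-in and custom subset names into a sorted tuple.

    Hypothetical helper: which built-in selectors exist for a dataset
    is passed in by the caller rather than looked up.
    """
    names = set(custom)
    if include_builtin:
        names |= set(builtin)
    # Default string sort: 'AboveBelow' < 'Center' < 'all'.
    return tuple(sorted(names))


if __name__ == "__main__":
    print(named_subsets(["all", "Center", "AboveBelow"], []))
    print(named_subsets(["all"], ["easy"], include_builtin=False))
```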
available_task_ids(dataset_key: str, config: Any | None = None, auto_download: bool = False) -> list[str]

Return all available task IDs for a dataset key after ensuring dataset availability.

get_subset_task_ids(dataset_key: str, selector: str = 'all', config: Any | None = None, auto_download: bool = False) -> list[str]

Get task IDs for a specific subset without creating an environment.

This allows users to query what tasks will be loaded before calling make().

Parameters:
  • dataset_key – Dataset name (Mini, Concept, AGI1, AGI2)

  • selector – Subset selector (‘all’, ‘train’, ‘easy’, task_id, etc.)

  • config – Optional config

  • auto_download – Download dataset if missing

Returns:

List of task IDs that will be loaded

Examples

>>> get_subset_task_ids("Mini", "all")
['Most_Common_color_l6ab0lf3xztbyxsu3p', ...]
>>> get_subset_task_ids("Mini", "easy")
['task1', 'task2', 'task3']  # Only tasks in 'easy' subset
>>> get_subset_task_ids("Concept", "Center")
['Center_001', 'Center_002', ...]  # Tasks in Center concept
>>> get_subset_task_ids("Mini", "Most_Common_color_l6ab0lf3xztbyxsu3p")
['Most_Common_color_l6ab0lf3xztbyxsu3p']  # Single task
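
The selector behaviour shown in these examples ('all', a named subset, or a single task ID) can be sketched as a simple lookup chain. This is an illustrative stand-in, not the jaxarc implementation: resolve_selector and its arguments are hypothetical, and the order in which the real registry checks the cases is an assumption.

```python
from typing import Dict, List, Sequence, Tuple


def resolve_selector(
    all_task_ids: Sequence[str],
    named: Dict[str, Tuple[str, ...]],
    selector: str = "all",
) -> List[str]:
    """Resolve a selector to a list of task IDs (illustrative only)."""
    if selector == "all":
        return list(all_task_ids)
    if selector in named:
        # A registered subset such as 'easy'.
        return list(named[selector])
    if selector in all_task_ids:
        # A single task ID selects exactly that task.
        return [selector]
    raise ValueError(f"Unknown selector: {selector!r}")


if __name__ == "__main__":
    tasks = ["task1", "task2", "task3", "task4"]
    subsets = {"easy": ("task1", "task2")}
    print(resolve_selector(tasks, subsets, "all"))
    print(resolve_selector(tasks, subsets, "easy"))
    print(resolve_selector(tasks, subsets, "task3"))
```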
make(id: str, **kwargs: Any) -> Tuple[Any, Any]

Create an environment instance and parameters for a registered spec.

Expected kwargs:
  • params: EnvParams (preferred; buffer-based, JIT-friendly)

  • env_entry: str (optional) override of environment entry point

Returns:

  • env – Environment instance

  • params – EnvParams provided directly

Return type:

(env, params) tuple
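
An entry point string such as 'jaxarc.envs:Environment' is typically resolved by importing the module and looking up the attribute. The sketch below shows one common way to do this with importlib; load_entry_point is a hypothetical helper, and whether jaxarc resolves env_entry in exactly this way is an assumption. The demonstration uses a standard-library class so that it runs without jaxarc installed.

```python
import importlib
from typing import Any


def load_entry_point(entry: str) -> Any:
    """Resolve 'pkg.module:Name' or 'pkg.module.Name' to an object.

    Hypothetical helper for illustration; not part of jaxarc.
    """
    module_name, sep, attr = entry.partition(":")
    if not sep:
        # Dotted form: the last component is the attribute name.
        module_name, _, attr = entry.rpartition(".")
    if not module_name or not attr:
        raise ValueError(f"Invalid entry point: {entry!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


if __name__ == "__main__":
    # Standard-library target used purely as a stand-in.
    cls = load_entry_point("collections:OrderedDict")
    print(cls.__name__)
```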

register(id: str, env_entry: str = 'jaxarc.envs:Environment', max_episode_steps: int = 100, **kwargs: Any) -> None

Register a new environment specification.

Parameters:
  • id – Unique environment ID (e.g., “JaxARC-Mini-v0”)

  • env_entry – Dotted path or colon path to class/factory (e.g., “jaxarc.envs:Environment”)

  • max_episode_steps – Default max steps for this environment family

  • **kwargs – Additional metadata stored with the spec

register_subset(dataset_key: str, name: str, task_ids: list[str] | tuple[str, ...]) -> None

Register a named subset (e.g., ‘Mini-easy’) that maps to specific task IDs.

Parameters:
  • dataset_key – Base dataset key (e.g., ‘Mini’, ‘Concept’, ‘AGI1’, ‘AGI2’ or synonyms)

  • name – Subset name (e.g., ‘easy’, ‘hard’, ‘my-benchmark’)

  • task_ids – Sequence of task IDs to include in this subset

subset_task_ids(dataset_key: str, name: str) -> tuple[str, ...]

Return the task IDs registered for a named subset (e.g., dataset_key ‘Mini’, name ‘easy’).

class jaxarc.registration.EnvSpec(id: str, env_entry: str = 'jaxarc.envs:Environment', max_episode_steps: int = 100, kwargs: Dict[str, Any] = <factory>)

Bases: object

Environment specification for registration.

env_entry: str = 'jaxarc.envs:Environment'
id: str
kwargs: Dict[str, Any]
max_episode_steps: int = 100
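
The field listing above, including the <factory> default for kwargs, is what a dataclass with a default_factory renders as. The sketch below mirrors those fields under that assumption; EnvSpecSketch is a stand-in name, and details such as whether the real class is frozen are not stated in this reference.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class EnvSpecSketch:
    """Stand-in mirroring the documented EnvSpec fields (illustrative)."""

    id: str
    env_entry: str = "jaxarc.envs:Environment"
    max_episode_steps: int = 100
    # default_factory gives each spec its own dict instead of a shared one.
    kwargs: Dict[str, Any] = field(default_factory=dict)


if __name__ == "__main__":
    spec = EnvSpecSketch(id="Mini", max_episode_steps=50)
    print(spec)
```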
jaxarc.registration.available_named_subsets(dataset_key: str, include_builtin: bool = True) -> tuple[str, ...]

List available subset names for a dataset (includes built-in selectors by default).

Parameters:
  • dataset_key – Dataset name (Mini, Concept, AGI1, AGI2)

  • include_builtin – Include built-in selectors like ‘all’, ‘train’, ‘eval’ (default: True)

Returns:

Tuple of subset names

Examples

>>> available_named_subsets("Mini")
('all', 'easy', 'eval', 'train')
>>> available_named_subsets("Mini", include_builtin=False)
('easy',)  # Only custom subsets
jaxarc.registration.available_task_ids(dataset_key: str, config: Any | None = None, auto_download: bool = False) -> list[str]

List all available task IDs (equivalent to get_subset_task_ids with selector='all').

jaxarc.registration.get_subset_task_ids(dataset_key: str, selector: str = 'all', config: Any | None = None, auto_download: bool = False) -> list[str]

Get task IDs for a specific subset without creating an environment.

This allows users to query what tasks will be loaded before calling make().

Parameters:
  • dataset_key – Dataset name (Mini, Concept, AGI1, AGI2)

  • selector – Subset selector (‘all’, ‘train’, ‘easy’, task_id, etc.)

  • config – Optional config

  • auto_download – Download dataset if missing

Returns:

List of task IDs that will be loaded

Examples

>>> get_subset_task_ids("Mini", "all")
['Most_Common_color_l6ab0lf3xztbyxsu3p', ...]
>>> get_subset_task_ids("Mini", "easy")
['task1', 'task2', 'task3']
>>> get_subset_task_ids("Mini", "Most_Common_color_l6ab0lf3xztbyxsu3p")
['Most_Common_color_l6ab0lf3xztbyxsu3p']
jaxarc.registration.load_all_subsets_for_dataset(dataset: str, config_root: Path | None = None) -> int

Discover and register all YAML-defined subsets for dataset.

Returns the number of subsets successfully loaded.

jaxarc.registration.load_subset(name: str, dataset: str, config_root: Path | None = None) -> list[str] | None

Load task IDs for a named subset from a YAML file.

Parameters:
  • name – Subset name (e.g. "easy").

  • dataset – Dataset key (e.g. "Mini", "AGI1").

  • config_root – Path to the configs/ directory. When None, pyprojroot is used to locate it automatically.

Returns:

List of task ID strings, or None if the file was not found or could not be parsed.

jaxarc.registration.load_subset_if_needed(name: str, dataset: str, config_root: Path | None = None) -> bool

Load and register a subset only if it is not already registered.

Returns True when the subset is available (either already present or freshly loaded).
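
The "load only if not already registered" behaviour can be sketched with a plain dictionary and an injected loader. Everything here is hypothetical: load_if_needed, the store layout, and the loader callback are stand-ins, and returning False when the loader yields None is an assumption inferred from the load_subset return value described above.

```python
from typing import Callable, Dict, List, Optional, Tuple

SubsetStore = Dict[Tuple[str, str], Tuple[str, ...]]
Loader = Callable[[str, str], Optional[List[str]]]


def load_if_needed(store: SubsetStore, name: str, dataset: str, loader: Loader) -> bool:
    """Register a subset from `loader` unless it is already in `store`."""
    key = (dataset, name)
    if key in store:
        return True  # Already registered; the loader is not called.
    task_ids = loader(name, dataset)
    if task_ids is None:
        return False  # Assumed outcome when the subset file is missing.
    store[key] = tuple(task_ids)
    return True


if __name__ == "__main__":
    def fake_loader(name: str, dataset: str) -> Optional[List[str]]:
        # Stand-in for reading a YAML file.
        return ["task1", "task2"] if name == "easy" else None

    store: SubsetStore = {}
    print(load_if_needed(store, "easy", "Mini", fake_loader))
    print(load_if_needed(store, "hard", "Mini", fake_loader))
```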

jaxarc.registration.make(id: str, **kwargs: Any) -> tuple[Any, Any]

Create an environment instance and EnvParams using a registered spec.

See EnvRegistry.make for details on supported kwargs.

jaxarc.registration.register(id: str, entry_point: str | None = None, env_entry: str = 'jaxarc.envs:Environment', max_episode_steps: int = 100, **kwargs: Any) -> None

Register an environment spec in the global registry.

jaxarc.registration.register_subset(dataset_key: str, name: str, task_ids: list[str] | tuple[str, ...]) -> None

Register a named subset for a dataset key, enabling IDs like ‘Mini-easy’.

jaxarc.registration.subset_task_ids(dataset_key: str, name: str) -> tuple[str, ...]

Return the task IDs registered under a named subset.

This only works for explicitly registered subsets (via register_subset). For more flexible queries, use get_subset_task_ids() instead.

Task ID Format

Task IDs follow the pattern: Dataset-TaskName_taskId

Examples:

  • Mini-Most_Common_color_l6ab0lf3xztbyxsu3p

  • ConceptARC-denoising_0c9aba6e

  • ARC-007bbfb7
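
The three example IDs above can be taken apart with two string splits. The sketch below is inferred only from those examples and is not a jaxarc function: split_task_id is a hypothetical name, the dataset prefix is assumed to end at the first hyphen, and the trailing ID is assumed to follow the last underscore (with no task name when there is no underscore, as in the ARC example).

```python
from typing import Optional, Tuple


def split_task_id(full_id: str) -> Tuple[str, Optional[str], str]:
    """Split 'Dataset-TaskName_taskId' into (dataset, task_name, task_id).

    Hypothetical helper; task_name is None when the part after the
    dataset prefix contains no underscore.
    """
    dataset, sep, rest = full_id.partition("-")
    if not sep or not dataset or not rest:
        raise ValueError(f"Expected 'Dataset-...' but got {full_id!r}")
    # The ID is whatever follows the last underscore.
    task_name, _, task_id = rest.rpartition("_")
    return dataset, (task_name or None), task_id


if __name__ == "__main__":
    for example in (
        "Mini-Most_Common_color_l6ab0lf3xztbyxsu3p",
        "ConceptARC-denoising_0c9aba6e",
        "ARC-007bbfb7",
    ):
        print(split_task_id(example))
```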