petitRADTRANS.sbi.benchmark

Benchmarking interfaces for comparing amortized and exact retrievals.

Classes#

`RetrievalBenchmarkCase`	One benchmark problem used to compare inference backends.
`BenchmarkMetrics`	Metrics summarizing agreement and predictive performance.
`BenchmarkComparison`	Compare an amortized result to one or more exact retrieval baselines.
`RetrievalBenchmarkSuite`	Run standardized benchmark comparisons for SBI tasks.

Module Contents#

class petitRADTRANS.sbi.benchmark.RetrievalBenchmarkCase#

One benchmark problem used to compare inference backends.

name: str#

task: petitRADTRANS.sbi.task.SBITask#

observation: Any#

reference_posterior: Any = None#

metadata: Mapping[str, Any]#

class petitRADTRANS.sbi.benchmark.BenchmarkMetrics#

Metrics summarizing agreement and predictive performance.

calibration: Mapping[str, float]#

posterior_distance: Mapping[str, float]#

predictive_checks: Mapping[str, float]#

runtime: Mapping[str, float]#
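Each field above maps a metric name to a float, but this page does not say which keys are used or how the values are computed. As a minimal sketch, assuming a stand-in dataclass (not the library class) and hypothetical metric keys, the mappings might be populated from two sets of one-dimensional posterior samples like this:

```python
from dataclasses import dataclass, field
from statistics import fmean, stdev
from typing import Mapping


@dataclass(frozen=True)
class BenchmarkMetrics:
    """Stand-in mirroring the documented fields; the empty defaults are an assumption."""

    calibration: Mapping[str, float] = field(default_factory=dict)
    posterior_distance: Mapping[str, float] = field(default_factory=dict)
    predictive_checks: Mapping[str, float] = field(default_factory=dict)
    runtime: Mapping[str, float] = field(default_factory=dict)


def standardized_mean_shift(amortized, exact):
    """Absolute difference of sample means, in units of the exact sample std."""
    return abs(fmean(amortized) - fmean(exact)) / stdev(exact)


# Toy samples for a single parameter; illustrative values only.
amortized_samples = [1.0, 2.0, 3.0, 4.0, 5.0]
exact_samples = [2.0, 3.0, 4.0, 5.0, 6.0]

metrics = BenchmarkMetrics(
    # "temperature_mean_shift" is a hypothetical key name.
    posterior_distance={
        "temperature_mean_shift": standardized_mean_shift(
            amortized_samples, exact_samples
        )
    },
    # Placeholder wall-clock values, not measurements.
    runtime={"amortized_seconds": 0.5, "exact_seconds": 1800.0},
)
```

Keeping every metric as a flat name-to-float mapping makes results easy to tabulate across cases, whichever distance or calibration measure a concrete suite chooses.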

class petitRADTRANS.sbi.benchmark.BenchmarkComparison#

Compare an amortized result to one or more exact retrieval baselines.

case_name: str#

amortized_result: petitRADTRANS.sbi.inference.AmortizedRetrievalResult#

exact_results: Mapping[str, Any]#

metrics: BenchmarkMetrics#

metadata: Mapping[str, Any]#

class petitRADTRANS.sbi.benchmark.RetrievalBenchmarkSuite(cases: list[RetrievalBenchmarkCase])#

Run standardized benchmark comparisons for SBI tasks.

cases#

abstractmethod run_case(case: RetrievalBenchmarkCase) → BenchmarkComparison#

Run one benchmark case and compute comparison metrics.

Parameters#

case: Benchmark case describing the task, observation, and optional exact reference posterior to compare against.

Returns#

BenchmarkComparison: Comparison payload combining amortized and exact results together with any derived metrics.

Notes#

The base class is intentionally abstract. Concrete suites are expected to bind exact and amortized inference backends and define the metric computations appropriate for the comparison.

run_all() → list[BenchmarkComparison]#

Run all configured benchmark cases.
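The abstract `run_case` contract and `run_all` can be illustrated with a self-contained sketch. Everything below is a stand-in that mirrors the documented attributes, not the library implementation. In particular, `ToySuite`, its placeholder "backends", the plain-dict metrics, and the assumption that `run_all` simply calls `run_case` on each case in order are hypothetical.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Mapping


# Stand-ins mirroring the documented attributes; not the library classes.
@dataclass
class RetrievalBenchmarkCase:
    name: str
    task: Any
    observation: Any
    reference_posterior: Any = None
    metadata: Mapping[str, Any] = field(default_factory=dict)


@dataclass
class BenchmarkComparison:
    case_name: str
    amortized_result: Any
    exact_results: Mapping[str, Any]
    metrics: Any  # a plain dict here, standing in for BenchmarkMetrics
    metadata: Mapping[str, Any] = field(default_factory=dict)


class RetrievalBenchmarkSuite(ABC):
    def __init__(self, cases: list[RetrievalBenchmarkCase]):
        self.cases = cases

    @abstractmethod
    def run_case(self, case: RetrievalBenchmarkCase) -> BenchmarkComparison:
        """Run one benchmark case and compute comparison metrics."""

    def run_all(self) -> list[BenchmarkComparison]:
        # Assumed behaviour: run every configured case, in order.
        return [self.run_case(case) for case in self.cases]


class ToySuite(RetrievalBenchmarkSuite):
    """Hypothetical concrete suite; the 'backends' are trivial placeholders."""

    def run_case(self, case: RetrievalBenchmarkCase) -> BenchmarkComparison:
        # Placeholder "amortized" estimate: the mean of the observation.
        amortized = {"mean": sum(case.observation) / len(case.observation)}
        # Placeholder "exact" baseline: reuse the stored reference posterior.
        exact = {"reference": case.reference_posterior}
        diff = abs(amortized["mean"] - case.reference_posterior["mean"])
        metrics = {"posterior_distance": {"mean_abs_diff": diff}}
        return BenchmarkComparison(case.name, amortized, exact, metrics)


cases = [
    RetrievalBenchmarkCase("a", task=None, observation=[1.0, 2.0, 3.0],
                           reference_posterior={"mean": 2.5}),
    RetrievalBenchmarkCase("b", task=None, observation=[4.0, 6.0],
                           reference_posterior={"mean": 5.0}),
]
comparisons = ToySuite(cases).run_all()
```

In this sketch `exact_results` is keyed by a baseline label, so several exact retrievals could sit side by side in one comparison. Instantiating the stand-in `RetrievalBenchmarkSuite` directly raises `TypeError`, because `run_case` is abstract.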