petitRADTRANS.sbi.benchmark
===========================

.. py:module:: petitRADTRANS.sbi.benchmark

.. autoapi-nested-parse::

   Benchmarking interfaces for comparing amortized and exact retrievals.


Classes
-------

.. autoapisummary::

   petitRADTRANS.sbi.benchmark.RetrievalBenchmarkCase
   petitRADTRANS.sbi.benchmark.BenchmarkMetrics
   petitRADTRANS.sbi.benchmark.BenchmarkComparison
   petitRADTRANS.sbi.benchmark.RetrievalBenchmarkSuite


Module Contents
---------------

.. py:class:: RetrievalBenchmarkCase

   A single benchmark problem (a task, an observation, and an optional
   reference posterior) used to compare inference backends.


   .. py:attribute:: name
      :type:  str


   .. py:attribute:: task
      :type:  petitRADTRANS.sbi.task.SBITask


   .. py:attribute:: observation
      :type:  Any


   .. py:attribute:: reference_posterior
      :type:  Any
      :value: None


   .. py:attribute:: metadata
      :type:  Mapping[str, Any]
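
As an illustration of how the ``RetrievalBenchmarkCase`` fields fit together, the following minimal sketch builds a case-like record. It does not import petitRADTRANS: ``CaseSketch`` is a hypothetical stand-in that only mirrors the documented attribute names, and the task is a placeholder string rather than a real ``SBITask``. Whether the real class is a dataclass, and whether ``metadata`` defaults to an empty mapping, are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Any, Mapping


@dataclass(frozen=True)
class CaseSketch:
    """Hypothetical stand-in mirroring the documented attribute names."""

    name: str
    task: Any  # the documented type is SBITask; a placeholder is used here
    observation: Any
    reference_posterior: Any = None  # documented default value: None
    # Assumption: an empty mapping is a sensible default for metadata.
    metadata: Mapping[str, Any] = field(default_factory=dict)


case = CaseSketch(
    name="toy-case",
    task="placeholder-task",
    observation=[1.0, 0.98, 1.02],
    metadata={"note": "illustrative only"},
)
print(case.name, case.reference_posterior)
```

Leaving ``reference_posterior`` unset models a case for which no exact posterior has been computed yet.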


.. py:class:: BenchmarkMetrics

   Metrics summarizing agreement between amortized and exact results,
   together with predictive performance and runtime.


   .. py:attribute:: calibration
      :type:  Mapping[str, float]


   .. py:attribute:: posterior_distance
      :type:  Mapping[str, float]


   .. py:attribute:: predictive_checks
      :type:  Mapping[str, float]


   .. py:attribute:: runtime
      :type:  Mapping[str, float]
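
Each ``BenchmarkMetrics`` field is a ``Mapping[str, float]``, but this page does not specify which keys or distance measures are used. As one possible way to fill ``posterior_distance``, the sketch below computes a per-parameter one-dimensional Wasserstein-1 distance between two equal-sized sample sets. The function names, the choice of distance, and the use of parameter names as keys are all assumptions for illustration.

```python
from typing import Mapping, Sequence


def wasserstein_1d(a: Sequence[float], b: Sequence[float]) -> float:
    """Empirical 1-D Wasserstein-1 distance for equal-sized samples."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    # For equal-sized samples, W1 is the mean gap between sorted values.
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def posterior_distance(
    amortized: Mapping[str, Sequence[float]],
    exact: Mapping[str, Sequence[float]],
) -> dict[str, float]:
    """Per-parameter distance for parameters present in both sample sets."""
    shared = sorted(set(amortized) & set(exact))
    return {name: wasserstein_1d(amortized[name], exact[name]) for name in shared}


# Hypothetical posterior samples for a single parameter.
amortized_samples = {"temperature": [1000.0, 1100.0, 1200.0]}
exact_samples = {"temperature": [1050.0, 1150.0, 1250.0]}
distances = posterior_distance(amortized_samples, exact_samples)
print(distances)
```

The returned dictionary has the shape expected by a ``Mapping[str, float]`` field, so it could be stored directly in a metrics record.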


.. py:class:: BenchmarkComparison

   Compare an amortized result to one or more exact retrieval baselines.


   .. py:attribute:: case_name
      :type:  str


   .. py:attribute:: amortized_result
      :type:  petitRADTRANS.sbi.inference.AmortizedRetrievalResult


   .. py:attribute:: exact_results
      :type:  Mapping[str, Any]


   .. py:attribute:: metrics
      :type:  BenchmarkMetrics


   .. py:attribute:: metadata
      :type:  Mapping[str, Any]


.. py:class:: RetrievalBenchmarkSuite(cases: list[RetrievalBenchmarkCase])

   Run standardized benchmark comparisons for SBI tasks.


   .. py:attribute:: cases


   .. py:method:: run_case(case: RetrievalBenchmarkCase) -> BenchmarkComparison
      :abstractmethod:


      Run one benchmark case and compute comparison metrics.

      Parameters
      ----------
      case : RetrievalBenchmarkCase
          Benchmark case describing the task, observation, and optional exact
          reference posterior to compare against.

      Returns
      -------
      BenchmarkComparison
          Comparison payload combining amortized and exact results together
          with any derived metrics.

      Notes
      -----
      The base class is intentionally abstract. Concrete suites are expected
      to bind exact and amortized inference backends and define the metric
      computations appropriate for the comparison.


   .. py:method:: run_all() -> list[BenchmarkComparison]

      Run all configured benchmark cases and return the resulting
      comparisons.
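
The split between an abstract ``run_case`` and a concrete ``run_all`` in ``RetrievalBenchmarkSuite`` can be sketched without petitRADTRANS as follows. ``ToyCase``, ``ToyComparison``, ``SuiteSketch``, and ``MeanOffsetSuite`` are hypothetical stand-ins, the two "backends" are trivial arithmetic, and the assumption that ``run_all`` calls ``run_case`` sequentially in the order the cases were given is not confirmed by this page.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any


@dataclass
class ToyCase:
    """Hypothetical stand-in for RetrievalBenchmarkCase."""

    name: str
    observation: Any


@dataclass
class ToyComparison:
    """Hypothetical stand-in for BenchmarkComparison."""

    case_name: str
    metrics: dict[str, float]


class SuiteSketch(ABC):
    """Stand-in mirroring the documented suite interface."""

    def __init__(self, cases: list[ToyCase]) -> None:
        self.cases = list(cases)

    @abstractmethod
    def run_case(self, case: ToyCase) -> ToyComparison:
        """Run one benchmark case and compute comparison metrics."""

    def run_all(self) -> list[ToyComparison]:
        # Assumption: cases run sequentially, in the order they were given.
        return [self.run_case(case) for case in self.cases]


class MeanOffsetSuite(SuiteSketch):
    """Toy concrete suite binding two trivial 'backends'."""

    def run_case(self, case: ToyCase) -> ToyComparison:
        exact = sum(case.observation) / len(case.observation)
        amortized = exact + 0.5  # pretend the amortized estimate is biased
        return ToyComparison(case.name, {"abs_error": abs(amortized - exact)})


suite = MeanOffsetSuite([ToyCase("a", [1.0, 2.0, 3.0]), ToyCase("b", [4.0, 6.0])])
results = suite.run_all()
print([r.case_name for r in results])
```

Because ``run_case`` is abstract, instantiating ``SuiteSketch`` directly raises ``TypeError``; only subclasses that bind concrete backends can be run.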