petitRADTRANS.retrieval#

Submodules#

Classes#

`Retrieval`	Implement the retrieval method using petitRADTRANS.
`RetrievalConfig`	Contain all the data and model level information necessary to run a petitRADTRANS retrieval.

Package Contents#

class petitRADTRANS.retrieval.Retrieval(configuration: petitRADTRANS.retrieval.retrieval_config.RetrievalConfig, output_directory: str = os.getcwd(), evaluate_sample_spectra: bool = False, corner_plot_names: list[str] = None, reference_data_name: str = None, use_prt_plot_style: bool = True, test_plotting: bool = False, uncertainties_mode: str = 'default', print_log_likelihood_for_debugging: bool = False, generate_mock_data: bool = False, compilation_cache_dir: str | bool | None = None)#

Implement the retrieval method using petitRADTRANS.

A RetrievalConfig object is passed to this class to describe the retrieval data, parameters and priors. The run() method then samples the parameter space, producing posterior distributions for parameters and Bayesian evidence for models. Various useful plotting functions are also included, and can be run once the retrieval is complete.

Args:

configuration (RetrievalConfig)

A RetrievalConfig object that describes the retrieval to be run. This is the user-facing class that must be set up for every retrieval.

output_directory (str)

The directory in which the output folders should be written.

evaluate_sample_spectra (bool)

Produce plots and data files for random samples drawn from the outputs of the sampler.

corner_plot_names (list[str])

List of additional retrieval names that should be included in the corner plot.

reference_data_name (str)

Name of the dataset to use as the central plotting reference. This controls which dataset supplies the default Radtrans object and forward model for pressure-temperature plots, abundance plots, and related full-range evaluation utilities. If None, the first key in configuration.data is used.

use_prt_plot_style (bool)

Use the petitRADTRANS plotting style as described in style.py. Setting this parameter to False is recommended if you want to use interactive plotting, or if the test_plotting parameter is True.

test_plotting (bool)

Only use when running locally. A boolean flag that will produce plots for each sample.

uncertainties_mode (str)

Uncertainties handling method during the retrieval.

“default”: the uncertainties are fixed.
“optimize”: automatically optimize for uncertainties, following Gibson et al. 2020 (https://doi.org/10.1093/mnras/staa228).
“retrieve”: uncertainties are scaled with a coefficient, which is retrieved.
“retrieve_add”: a single scalar, which is retrieved, is added to the uncertainties.

print_log_likelihood_for_debugging (bool)

If True, the current log likelihood of a forward model run will be printed to the console.

generate_mock_data (bool)

If True, the retrieval will generate a mock data set by sampling the prior distributions and casting it into exactly the same shape as the input data. This is useful for testing the retrieval setup in input = output tests. The mock data will be saved in the mock_data folder in the run directory, with the following file names: data.name + ‘_mock_data.dat’.
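As a rough illustration of how the uncertainties_mode options differ, the toy function below applies the "default", "retrieve", and "retrieve_add" modes to an array of data uncertainties. The function name, the `coefficient` argument, and the exact arithmetic are assumptions made for this sketch; petitRADTRANS may parameterize the scaling differently (for example in log space or in quadrature), and the "optimize" mode is not reproduced here.

```python
import numpy as np


def effective_uncertainties(sigma, mode="default", coefficient=None):
    """Toy model of the documented modes (hypothetical helper, not the pRT code)."""
    sigma = np.asarray(sigma, dtype=float)

    if mode == "default":
        # The uncertainties are used as provided.
        return sigma
    if mode == "retrieve":
        # The uncertainties are scaled by a retrieved coefficient.
        return coefficient * sigma
    if mode == "retrieve_add":
        # A retrieved scalar is added to the uncertainties.
        return sigma + coefficient

    raise ValueError(f"mode '{mode}' is not covered by this sketch")


print(effective_uncertainties([1.0, 2.0], "retrieve", coefficient=2.0))
```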

configuration#

uncertainties_mode = 'default'#

print_log_likelihood_for_debugging = False#

generate_mock_data = False#

output_directory#

corner_files = None#

best_fit_spectra#

best_fit_parameters#

chi2 = None#

posterior_sample_spectra#

test_plotting = False#

evaluate_sample_spectra = False#

analyzer = None#

sampler = None#

samples#

param_dictionary#

plotter#

prt_plot_style = True#

path#

parameter_layout#

_n_free_params_total_retrieval#

_free_parameter_names_cache#

runtime#

latest_sampler_results = None#

_configure_compilation_cache(compilation_cache_dir)#

Enable a shared on-disk XLA compilation cache for this retrieval.

Compiling a differentiable forward model can take minutes. Sequential multi-chain runs and freshly spawned chain workers otherwise recompile it from scratch every time, because each sampler run clears the in-memory cache to free device memory after it finishes. Pointing JAX at a persistent on-disk cache turns those recompiles into cache reads.

compilation_cache_dir selects the location:

None (default): use {output_directory}/.jax_cache.
a path string: use that directory.
False: leave JAX’s cache configuration untouched.

A cache directory already configured by the user (for example in a launch script) is always respected and never overridden. The whole step is best-effort: a failure to configure the cache only emits a warning.
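The selection rules above can be summarized in a small stand-alone function. This is only a sketch of the documented behavior: `resolve_cache_dir` and its `already_configured` argument are hypothetical names, and the real method may perform additional checks.

```python
import os


def resolve_cache_dir(output_directory, compilation_cache_dir=None, already_configured=None):
    """Return the cache directory that would be in effect (hypothetical helper)."""
    if already_configured:
        # A directory set by the user beforehand is never overridden.
        return already_configured
    if compilation_cache_dir is False:
        # Leave the cache configuration untouched.
        return None
    if compilation_cache_dir is None:
        # Default location inside the retrieval output directory.
        return os.path.join(output_directory, ".jax_cache")
    return str(compilation_cache_dir)


cache_dir = resolve_cache_dir("my_run")
# With JAX installed, the directory could then be applied with:
#     jax.config.update("jax_compilation_cache_dir", cache_dir)
print(cache_dir)
```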

_check_errors()#

_data_are_valid(data=None)#

_error_check_model_function()#

_rebin_opacities(resolution: float)#

build_param_dict(sample: numpy.typing.NDArray[float], free_param_names: list[str]) → dict[str, petitRADTRANS.retrieval.parameter.Parameter]#

This function builds a dictionary of parameters that can be passed to the model building functions. It requires a numpy array with the same length as the number of free parameters, and a list of all the parameter names in the order they appear in the array. The returned dictionary will contain all of these parameters, together with the fixed retrieval parameters.

Args:

sample (numpy.ndarray): An array or list of free parameter values.
free_param_names (list[str]): A list of names for each of the free parameters.

Returns:

parameters (dict): A dictionary of Parameters, with values set to the values in sample.
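The merging of sampled and fixed values described above can be pictured with the following sketch. `ToyParameter` is a minimal stand-in for the real Parameter class, and `build_param_dict_sketch` takes the fixed values as an explicit argument; both are assumptions made for illustration, since the actual method reads the fixed parameters from the retrieval configuration.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class ToyParameter:
    """Minimal stand-in for a retrieval parameter (hypothetical)."""
    name: str
    value: float
    is_free_parameter: bool = False


def build_param_dict_sketch(sample, free_param_names, fixed_values):
    sample = np.asarray(sample, dtype=float)
    if sample.shape != (len(free_param_names),):
        raise ValueError("sample must hold exactly one value per free parameter")

    # Start from the fixed parameters, then add the sampled free parameters.
    parameters = {
        name: ToyParameter(name, float(value)) for name, value in fixed_values.items()
    }
    for name, value in zip(free_param_names, sample):
        parameters[name] = ToyParameter(name, float(value), is_free_parameter=True)
    return parameters


params = build_param_dict_sketch(
    [1200.0, -3.5], ["temperature", "log_kappa"], {"planet_radius": 1.0}
)
print(sorted(params))
```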

static _call_model_generating_function(model_generating_function: Callable, radtrans_object, parameters: dict[str, petitRADTRANS.retrieval.parameter.Parameter], *, data_name: str, data_object: petitRADTRANS.retrieval.data.Data | None = None, pt_plot_mode: bool = False, adaptive_mesh_refinement: bool = False)#

calculate_forward_model(parameters=None, data=None, pt_plot_mode: bool = False, adaptive_mesh_refinement: bool = False, copy_configuration_parameters: bool = True, **kwargs)#

Calculate the forward model associated with the given data, for the given parameters.

Args:

parameters:

Parameters of the forward models. Can be a dictionary with parameter names as keys, or a string, or None. Possible string values are:

‘best fit’: return the forward model with the retrieved best fit parameters.

‘median’: return the forward model with the retrieved median parameters.

‘quantile’: return the forward model with retrieved parameters at the given quantile.

If parameters='quantile', then a quantile (float) keyword argument must be added. If None, the free parameters of the retrieval are set to their prior mid-range value.

data:

Can be a string, or a dictionary with data name as keys and Data objects as values, or None. If None, the forward models of all the data in the retrieval configuration are calculated.

pt_plot_mode:

If True, the model function should return the pressure and temperature arrays before computing the flux.

adaptive_mesh_refinement:

If True, use an adaptive high resolution pressure grid around the location of cloud condensation. This will increase the size of the pressure grid by a constant factor that can be adjusted in the setup_pres function.

copy_configuration_parameters:

If True, copy the configuration parameters to prevent unwanted modifications.

Returns:

A dictionary with the outputs of the forward models for the requested data, or, if data is a string, the output of the forward model corresponding to these data.

calculate_temperature_profile(parameters=None, data=None, adaptive_mesh_refinement: bool = False, copy_configuration_parameters: bool = True, **kwargs)#

Calculate the temperature profile associated with the given data, for the given parameters. Wrapper for calculate_forward_model.

Args:

parameters:

Parameters of the forward models. Can be a dictionary with parameter names as keys, or a string, or None. Possible string values are:

‘best fit’: return the forward model with the retrieved best fit parameters.

‘median’: return the forward model with the retrieved median parameters.

‘quantile’: return the forward model with retrieved parameters at the given quantile.

If parameters='quantile', then a quantile (float) keyword argument must be added. If None, the free parameters of the retrieval are set to their prior mid-range value.

data:

Can be a string, or a dictionary with data name as keys and Data objects as values, or None. If None, the forward models of all the data in the retrieval configuration are calculated.

pt_plot_mode:

If True, the model function should return the pressure and temperature arrays before computing the flux.

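adaptive_mesh_refinement:

If True, use an adaptive high resolution pressure grid around the location of cloud condensation. This will increase the size of the pressure grid by a constant factor that can be adjusted in the setup_pres function.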

copy_configuration_parameters:

If True, copy the configuration parameters to prevent unwanted modifications.

Returns:

A dictionary with the outputs of the forward models for the requested data, or, if data is a string, the output of the forward model corresponding to these data.

classmethod from_data(data: dict[str, petitRADTRANS.retrieval.data.Data], retrieved_parameters: dict[str, dict], retrieval_name: str = 'retrieval_name', run_mode: str = 'retrieval', adaptive_mesh_refinement: bool = False, output_directory: str = '', evaluate_sample_spectra: bool = False, corner_plot_names: list[str] = None, reference_data_name: str = None, use_prt_plot_style: bool = True, test_plotting: bool = False, uncertainties_mode: str = 'default', scattering_in_emission: bool = False, pressures: numpy.ndarray = None)#

Instantiate a Retrieval object with a dictionary of Data objects. Intended to be used in conjunction with the SpectralModel.init_data function.

The RetrievalConfig object is automatically generated. No fixed parameters will be used; those must be stored in their respective Data.model_generating_function. This is automatically done when using the SpectralModel.init_data function.

Args:

data (dict)

A dictionary with data names as keys and Data objects as values.

retrieved_parameters (dict)

A dictionary with retrieved parameter names as keys and dictionaries as values. Those sub-dictionaries must have keys ‘prior_parameters’ and ‘prior_type’. This can also be a list of RetrievalParameter objects.

retrieval_name (str)

Name of this retrieval. Make it informative so that you can keep track of the outputs!

run_mode (str)

Can be either ‘retrieval’, which runs the retrieval, or ‘evaluate’, which produces plots from the best fit parameters stored in the samples file.

adaptive_mesh_refinement (bool)

Use an adaptive high resolution pressure grid around the location of cloud condensation. This will increase the size of the pressure grid by a constant factor that can be adjusted in the setup_pres function.

output_directory (str)

The directory in which the output folders should be written.

evaluate_sample_spectra (bool)

Produce plots and data files for random samples drawn from the outputs of the sampler.

corner_plot_names (list[str])

List of additional retrieval names that should be included in the corner plot.

reference_data_name (str)

Name of the dataset to use as the central plotting reference. If None, the first key in data is used.

use_prt_plot_style (bool)

Use the petitRADTRANS plotting style as described in style.py. Setting this parameter to False is recommended if you want to use interactive plotting, or if the test_plotting parameter is True.

test_plotting (bool)

Only use when running locally. A boolean flag that will produce plots for each sample.

uncertainties_mode (str)

Uncertainties handling method during the retrieval.

“default”: the uncertainties are fixed.
“optimize”: automatically optimize for uncertainties, following Gibson et al. 2020 (https://doi.org/10.1093/mnras/staa228).
“retrieve”: uncertainties are scaled with a coefficient, which is retrieved.
“retrieve_add”: a single scalar, which is retrieved, is added to the uncertainties.

scattering_in_emission (bool)

If using emission spectra, turn scattering on or off.

pressures (numpy.ndarray)

A log-spaced array of pressures over which to retrieve. 100 points is standard, between 10^-6 and 10^3.

Returns:

A Retrieval object instance.
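As an illustration of the input shapes described for from_data, the snippet below builds a pressure grid with the standard 100 log-spaced points and a retrieved_parameters dictionary with the two required keys. The parameter names, the prior bounds, and the ‘uniform’ prior type are placeholders chosen for this sketch, not values taken from the library.

```python
import numpy as np

# 100 log-spaced pressures between 10^-6 and 10^3, as suggested for `pressures`.
pressures = np.logspace(-6, 3, 100)

# Each retrieved parameter maps to a sub-dictionary with the two required keys.
# Names, bounds, and prior type are illustrative assumptions.
retrieved_parameters = {
    "temperature": {"prior_parameters": [500.0, 2500.0], "prior_type": "uniform"},
    "log10_cloud_pressure": {"prior_parameters": [-6.0, 3.0], "prior_type": "uniform"},
}

print(pressures[0], pressures[-1], len(pressures))
```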

generate_retrieval_summary(stats: dict = None, sampler_parameters: dict = None) → None#

This function produces a human-readable text file describing the retrieval. It includes all the fixed and free parameters, the limits of the priors (if uniform), a description of the data used, and if the retrieval is complete, a summary of the best fit parameters and model evidence.

Args:

stats (dict): Optional sampler-specific statistics object. If the active sampler exposes a standardized summary payload, that is preferred.

_collect_best_fit_summary() → dict | None#

_count_data_points() → int#

_count_free_parameters() → int#

_get_log_weight_samples(results=None) → numpy.ndarray | None#

_collect_evidence_approximations(max_logl: float, logl_samples: numpy.typing.ArrayLike, log_weights: numpy.typing.ArrayLike = None) → dict[str, float] | None#

_collect_sampler_summary(stats: dict = None, sampler_parameters: dict = None) → dict#

_write_sampler_summary_section(summary, sampler_summary: dict) → None#

static _get_results_field(results, *field_names)#

_get_active_sampler_results()#

_coerce_parameter_sample_matrix(parameter_samples, parameter_names)#

_evaluate_sample_log_likelihoods(parameter_samples)#

_build_samples_from_sampler_results(results=None, parameter_names=None)#

_get_samples_archive_path(ret_name: str | None = None) → str#

_load_samples_archive(ret_name: str | None = None)#

_load_jaxns_raw_results_samples(*, output_directory: str, ret_name: str, parameter_names: list[str])#

_build_samples_for_export()#

_save_samples_archive() → str | None#

_register_sampler_results(sampler, results)#

_finalize_sampler_run(sampler, results, stats=None, sampler_parameters=None)#

get_pymultinest_analyzer(ret_name: str = '')#

Get the PyMultiNest (PMN) analyzer from a retrieval run.

This function gets the PMN analyzer object for the current retrieval_name by default, though any retrieval_name in the out_PMN folder can be passed as an argument, which is useful when comparing multiple similar models.

Args:

ret_name (str): The name of the retrieval that prefixes all the PMN output files.

get_base_figure_name() → str#

get_best_fit_chi2(samples: numpy.typing.NDArray[numpy.floating]) → float#

Get the 𝛘^2 of the best fit model, obtained by removing the normalization term from log L.

Args:

samples (numpy.ndarray): An array of samples and likelihoods taken from a post_equal_weights file.

static get_best_fit_likelihood(samples: numpy.typing.NDArray[numpy.floating], print_value: bool = True) → tuple[float, int]#

Get the log likelihood of the best fit model

Args:

samples (numpy.ndarray): An array of samples and likelihoods taken from a post_equal_weights file.
print_value (bool): If True, print the best fit likelihood value.
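A minimal sketch of how a best-fit log likelihood and its row index can be read from such a sample array is shown below. It assumes that the last column of the array holds the log likelihood (the usual layout of a PyMultiNest post_equal_weights file); `best_fit_from_samples` is a hypothetical helper, not the library method.

```python
import numpy as np


def best_fit_from_samples(samples):
    """Return (max log likelihood, row index), assuming log L is the last column."""
    samples = np.asarray(samples, dtype=float)
    log_likelihoods = samples[:, -1]
    best_index = int(np.argmax(log_likelihoods))
    return float(log_likelihoods[best_index]), best_index


toy_samples = np.array([
    [1.0, 2.0, -10.0],
    [3.0, 4.0, -2.0],
    [5.0, 6.0, -7.0],
])
print(best_fit_from_samples(toy_samples))
```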

get_best_fit_model(best_fit_params: numpy.typing.NDArray[numpy.floating], parameters_read: list[str], ret_name: str = None, contribution: bool = False, prt_reference: str = None, model_generating_function: Callable = None, refresh: bool = True, mode: str = 'bestfit', save: bool = True)#

This function uses the best fit parameters to generate a pRT model that spans the entire wavelength range of the retrieval, to be used in plots.

Args:

best_fit_params (numpy.ndarray): A numpy array containing the best fit parameters, to be passed to build_param_dict.
parameters_read (list): A list of the free parameters as read from the output files.
ret_name (str): If plotting a fit from a different retrieval, input the retrieval name to be included.
contribution (bool): If True, calculate the emission or transmission contribution function as well as the spectrum.
prt_reference (str): If specified, the pRT object of the data with name prt_reference will be used for plotting, instead of generating a new pRT object at R = 1000.
model_generating_function (callable, optional): A function that returns the wavelength and spectrum, and takes a Radtrans object and the current set of parameters stored in self.configuration.parameters. This should be the same model function used in the retrieval.
refresh (bool): If True (default value), the .npy files in the evaluate_[retrieval_name] folder will be replaced by recalculating the best fit model. This is useful if plotting intermediate results from a retrieval that is still running. If False, no new spectrum will be calculated and the plot will be generated from the .npy files in the evaluate_[retrieval_name] folder.
mode (str): If “bestfit” (the default), will use the maximum likelihood parameter values to calculate the best fit model and contribution. If “median”, uses the median parameter values.
save (bool): If True, save the best fit spectrum.

Returns:

best_fit_wavelengths (numpy.ndarray): The wavelength array of the best fit model.
best_fit_spectrum (numpy.ndarray): The emission or transmission spectrum array, with the same shape as best_fit_wavelengths.

static _coerce_best_fit_auxiliary_outputs(auxiliary_outputs)#

get_best_fit_parameters(return_max_likelihood: bool = False)#

Get the retrieved best fit parameters.

Args:

return_max_likelihood: If True, also return the max (log) likelihood and the chi2.

Returns:

A dict containing the free parameter names as keys and their retrieved best-fit values as values. If return_max_likelihood is True, return in addition the max (log) likelihood and the chi2.

get_chi2_from_sample(sample: numpy.typing.NDArray[numpy.floating]) → float#

Get the 𝛘^2 of the given sample relative to the data, obtained by removing the normalization term from log L.

Args:

sample (numpy.ndarray): A single sample and likelihood taken from a post_equal_weights file.

get_chi2_normalisation_from_sample(sample: numpy.typing.NDArray[numpy.floating]) → float#

Get the 𝛘^2 normalization term from log L.

Args:

sample (numpy.ndarray): A single sample and likelihood taken from a post_equal_weights file.
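The relation between log L, its normalization term, and 𝛘^2 can be sketched for the simplest case of independent Gaussian uncertainties, where log L = -𝛘^2/2 - (1/2) Σ ln(2π σ²). The helper names below are hypothetical, and the library may handle covariance matrices or scaled uncertainties that this sketch ignores.

```python
import numpy as np


def gaussian_log_likelihood(data, model, sigma):
    """Return (log L, normalization term) for independent Gaussian errors."""
    data, model, sigma = (np.asarray(x, dtype=float) for x in (data, model, sigma))
    chi2 = np.sum(((data - model) / sigma) ** 2)
    normalization = -0.5 * np.sum(np.log(2.0 * np.pi * sigma ** 2))
    return -0.5 * chi2 + normalization, normalization


def chi2_from_log_likelihood(log_likelihood, normalization):
    """Recover chi^2 by removing the normalization term from log L."""
    return -2.0 * (log_likelihood - normalization)


log_l, norm = gaussian_log_likelihood(
    data=[1.0, 2.0, 3.0], model=[1.1, 1.9, 3.2], sigma=[0.1, 0.1, 0.1]
)
chi2 = chi2_from_log_likelihood(log_l, norm)
print(chi2)
```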

get_elpd_per_datapoint(ret_name: str = None)#

get_evidence(ret_name: str = '') → tuple[float, float]#

Get the log10 Z and error for the retrieval

This function uses the pymultinest analyzer to get the evidence for the current retrieval_name by default, though any retrieval_name in the out_PMN folder can be passed as an argument - useful for when you’re comparing multiple similar models. This value is also printed in the summary file.

Args:

ret_name (str): The name of the retrieval that prefixes all the PMN output files.
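Since the evidence is reported as log10 Z, comparing two retrievals reduces to a subtraction. The sketch below assumes that the sampler provides the natural-log evidence and its error, which is the usual PyMultiNest convention; both helper functions and the numerical values are illustrative only.

```python
import math


def ln_to_log10_evidence(ln_z, ln_z_error):
    """Convert a natural-log evidence and its error to base 10."""
    return ln_z / math.log(10.0), ln_z_error / math.log(10.0)


def log10_bayes_factor(log10_z_model_a, log10_z_model_b):
    """log10 of the Bayes factor of model A over model B."""
    return log10_z_model_a - log10_z_model_b


# Made-up evidences for two hypothetical retrievals.
log10_z_a, _ = ln_to_log10_evidence(-1200.0, 0.2)
log10_z_b, _ = ln_to_log10_evidence(-1205.0, 0.2)
print(log10_bayes_factor(log10_z_a, log10_z_b))
```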

get_full_range_model(parameters: dict[str, petitRADTRANS.retrieval.parameter.Parameter], model_generating_function: Callable = None, contribution: bool = False, prt_object: petitRADTRANS.radtrans.Radtrans = None, prt_reference: petitRADTRANS.retrieval.data.Data = None) → tuple#

Retrieve a full wavelength range model based on the given parameters.

Parameters:

parameters (dict): A dictionary containing parameters used to generate the model.
model_generating_function (callable, optional): A function to generate the model. Defaults to None.
contribution (bool, optional): Return the emission or transmission contribution function. Defaults to False.
prt_object (Radtrans, optional): Radtrans object for calculating the spectrum. Defaults to None.
prt_reference (Data, optional): Reference Data object for calculating the spectrum. Defaults to None.

Returns:

object: The generated full range model.

_get_batched_full_range_model_outputs(free_parameter_values: numpy.typing.ArrayLike, free_param_names: list[str], *, model_generating_function: Callable = None, prt_object: petitRADTRANS.radtrans.Radtrans = None, prt_reference: petitRADTRANS.retrieval.data.Data | str = None, contribution: bool = False, pt_plot_mode: bool = False, adaptive_mesh_refinement: bool | None = None) → tuple[numpy.ndarray, numpy.ndarray]#

Return batched model outputs for a batch of sampled free parameters.

Model functions are evaluated with a single static model context and jax.vmap over the leading sample dimension when possible. Legacy adapters and differentiable models that still require Python-side concretization fall back to scalar evaluation to preserve compatibility.

When pt_plot_mode=False this returns batched wavelength and spectrum arrays. When pt_plot_mode=True this returns a shared pressure grid and a temperature matrix with the leading dimension matching the sample batch.

get_log_likelihood_per_datapoint(samples_use: numpy.typing.NDArray[numpy.floating], ret_name: str = None, n_samples_max: int = 2000, batch_size: int = 128, sample_thinning_seed: int = None) → dict[str, numpy.typing.NDArray[numpy.floating]]#

Compute and save per-datapoint log-likelihoods over posterior samples.

The per-point log-likelihood terms are evaluated through the JAX RetrievalRuntime. When all model groups use the differentiable model contract, samples are evaluated in jit-compiled, vmapped chunks of batch_size; otherwise (or when a model is not vmap-safe) the samples are evaluated one at a time through the same runtime path.

The resulting arrays of shape (n_samples, n_datapoints) are saved to evaluate_{ret_name}/{ret_name}_logL_per_datapoint_{data_name}.npy for use by get_elpd_per_datapoint().

Args:

samples_use (numpy.ndarray): Posterior free-parameter samples with shape (n_samples, n_free_parameters), e.g. the equal-weighted posterior without the log-likelihood column.
ret_name (str): Retrieval name used for the output file names. Defaults to the current retrieval name.
n_samples_max (int): Maximum number of posterior samples to evaluate. PSIS tail fits stabilize with ~1000-2000 samples, so evaluating more equal-weight samples is usually wasted forward-model work. If the input holds more samples, a random subset of this size is used. Set to None to evaluate all samples.
batch_size (int): Number of samples evaluated per vmapped chunk. Bounds device memory usage during batched forward-model evaluation.
sample_thinning_seed (int): Seed for the random subsampling applied when n_samples_max truncates the input.

Returns:

A dictionary mapping each data name to its per-datapoint log-likelihood array of shape (n_samples, n_datapoints).
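The subsampling and chunking described for n_samples_max, sample_thinning_seed, and batch_size can be sketched with plain NumPy. `thin_samples` and `iter_batches` are hypothetical helpers that mimic the documented behavior; the real implementation evaluates each chunk through the JAX runtime, which is not reproduced here.

```python
import numpy as np


def thin_samples(samples, n_samples_max=2000, seed=None):
    """Keep at most n_samples_max rows, chosen at random without replacement."""
    samples = np.asarray(samples)
    if n_samples_max is None or samples.shape[0] <= n_samples_max:
        return samples
    rng = np.random.default_rng(seed)
    indices = np.sort(rng.choice(samples.shape[0], size=n_samples_max, replace=False))
    return samples[indices]


def iter_batches(samples, batch_size=128):
    """Yield consecutive chunks of at most batch_size rows."""
    for start in range(0, samples.shape[0], batch_size):
        yield samples[start:start + batch_size]


posterior = np.arange(15000, dtype=float).reshape(5000, 3)
thinned = thin_samples(posterior, n_samples_max=2000, seed=1)
batch_sizes = [batch.shape[0] for batch in iter_batches(thinned, batch_size=128)]
print(thinned.shape, len(batch_sizes), batch_sizes[-1])
```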

get_mass_fractions(sample: numpy.typing.NDArray[numpy.floating], parameters_read: list[str] = None)#

This function returns the mass fraction abundances of each species as a function of pressure

Args:

sample (numpy.ndarray): A sample from the pymultinest output. The abundances returned will be computed for this set of parameters.
parameters_read (list): A list of the free parameters as read from the output files.

Returns:

abundances (dict): A dictionary of abundances. The keys are the species names, the values are the mass fraction abundances at each pressure.
MMW (numpy.ndarray): The mean molecular weight at each pressure level in the atmosphere.

get_median_params(samples: numpy.typing.NDArray[numpy.floating], parameters_read: list[str], return_array=False)#

Build a parameter dictionary from the marginal median of each parameter.

Args:

samples (numpy.ndarray): An array of samples and likelihoods taken from a post_equal_weights file.
parameters_read (list): A list of the free parameter names as read from the output files.
return_array (bool): If True, also return the median parameters as an array.

get_parameters_prior_mid_range_values() → numpy.typing.NDArray[numpy.floating]#

Return the prior mid-range values of the free parameters.

get_quantile_parameters(quantile: float) → dict[str, float]#

Return the given quantile of the retrieved parameters’ posteriors. For example, if quantile = 0.84, the 84th percentile (~+1 sigma) of the retrieved parameters’ posterior is returned.

Args:

quantile: Quantile of the retrieved parameters’ posteriors.

Returns:

A dict with the free parameter names as keys and their value at the requested quantile.
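As a rough sketch of what a per-parameter posterior quantile looks like, the function below applies numpy.quantile column by column to an equal-weighted sample array. `quantile_parameters` is a hypothetical stand-in for the method, and the assumption that the parameter columns come first (with any log-likelihood column after them) is made for illustration.

```python
import numpy as np


def quantile_parameters(samples, parameter_names, quantile):
    """Marginal quantile of each parameter; extra trailing columns are ignored."""
    samples = np.asarray(samples, dtype=float)
    values = np.quantile(samples[:, :len(parameter_names)], quantile, axis=0)
    return dict(zip(parameter_names, values.tolist()))


# Toy posterior: two parameter columns followed by a log-likelihood column.
toy_posterior = np.column_stack([
    np.arange(101.0),
    2.0 * np.arange(101.0),
    np.zeros(101),
])
upper = quantile_parameters(toy_posterior, ["a", "b"], 0.84)
median = quantile_parameters(toy_posterior, ["a", "b"], 0.5)
print(upper, median)
```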

get_reduced_chi2_from_sample(sample, subtract_n_parameters=False, verbose=False, show_chi2=False)#

Get the reduced 𝛘^2 of the given model: 𝛘^2 divided either by the degrees of freedom (DoF) or by the number of wavelength channels.

Args:

sample (numpy.ndarray): A single sample and likelihood taken from a post_equal_weights file.
subtract_n_parameters (bool): If True, divide the 𝛘^2 by the degrees of freedom (n_data - n_parameters). If False, divide only by n_data.
verbose (bool): If True, display the calculated best fit reduced 𝛘^2, and also the best fit 𝛘^2 if show_chi2 is True.
show_chi2 (bool): If True, additionally display the calculated best fit 𝛘^2 if verbose is True.
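The effect of subtract_n_parameters amounts to a choice of denominator, as the short sketch below shows. `reduced_chi2` is a hypothetical helper and the numbers are made up for the example.

```python
def reduced_chi2(chi2, n_data, n_parameters=0, subtract_n_parameters=False):
    """Divide chi^2 by (n_data - n_parameters) or by n_data alone."""
    denominator = n_data - n_parameters if subtract_n_parameters else n_data
    if denominator <= 0:
        raise ValueError("the number of data points must exceed the number of parameters")
    return chi2 / denominator


# 100 data points, 20 free parameters, chi^2 = 120 (illustrative values).
per_data_point = reduced_chi2(120.0, 100, 20, subtract_n_parameters=False)
per_dof = reduced_chi2(120.0, 100, 20, subtract_n_parameters=True)
print(per_data_point, per_dof)
```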

get_reduced_chi2_from_model(wavelengths_model, spectrum_model, parameters, subtract_n_parameters=False, verbose=False, show_chi2=False)#

Get the reduced 𝛘^2 of the supplied spectrum: 𝛘^2 divided by the DoF.

Args:

wavelengths_model (numpy.ndarray): The wavelength grid of the model spectrum in micron.
spectrum_model (numpy.ndarray): The model flux in the same units as the data.
parameters (dict[Parameter]): Dictionary of Parameters passed to the forward model used to calculate the spectrum.
subtract_n_parameters (bool): If True, divide the 𝛘^2 by the degrees of freedom (n_data - n_parameters). If False, divide only by n_data.
verbose (bool): If True, display the calculated best fit 𝛘^2 and reduced 𝛘^2.
show_chi2 (bool): If True, additionally display the calculated best fit 𝛘^2 if verbose is True.

get_samples(names: list[str] = None, output_directory: str = os.getcwd(), ret_names: list[str] = None) → tuple[dict[str, numpy.typing.NDArray[numpy.floating]], dict[str, list[str]]]#

Get the samples of the requested retrievals.

Args:

names (list[str]): Names of the retrievals from which to get the samples.
output_directory (str): Directory in which the retrievals are stored.
ret_names (list[str]): Additional list of retrieval names. Used if multiple retrievals are to be included in a corner plot.

Returns:

A dict containing the retrieval names as keys and the samples of each requested retrieval as values, and a dict containing the retrieval names as keys and the free parameter names as values.

get_samples_dict(return_likelihood: bool = False) → dict[str, numpy.typing.NDArray[numpy.floating]]#

Return the samples of this retrieval as a dict.

Args:

return_likelihood: If True, the samples dictionary also contains the log likelihood of each sample.

Returns:

A dict containing the free parameters names as keys, and their sampled values as values.

combine_chains(num_chains: int | None = None, *, on_missing: str = 'warn', rhat_threshold: float = 1.01, verbose: bool = True)#

Combine per-chain MCMC outputs into one pooled posterior + diagnostics.

Reads the per-chain artifacts written by a multi-chain run (out_<Backend>/<retrieval_name>_chain<c>_samples.npz), pools the draws, computes convergence diagnostics (split-R-hat, ESS, divergence fraction), and writes the combined posterior under <retrieval_name>_combined in the same on-disk format that get_samples() reads, plus a <retrieval_name>_combined_diagnostics.json sidecar.

Gradient-MCMC backends (NumPyro, BlackJAX) are pooled with R-hat/ESS diagnostics. Nested-sampling backends (JAXNS) are combined in an evidence-aware way (per-run log Z weighting plus between-run log Z scatter) via a separate code path.

Args:

num_chains: If given, expect chains 0..num_chains-1 and report any missing. If None, combine whatever _chain<c>_ files are present.
on_missing: "warn" (default) combines the survivors and warns; "raise" raises if any expected chain is missing.
rhat_threshold: R-hat value above which the run is flagged as not converged.
verbose: Print a short diagnostics summary.

Returns:

The combine summary, including the combined retrieval name, written file paths, and the diagnostics dictionary.
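For readers unfamiliar with the split-R-hat diagnostic mentioned above, the sketch below implements the classic (non rank-normalized) version for a single parameter: each chain is split in half, and the between-half variance is compared with the within-half variance. This is a textbook formulation offered as an assumption about what the diagnostic measures; the estimator used by combine_chains may differ, for example by applying rank normalization.

```python
import numpy as np


def split_rhat(chains):
    """Classic split-R-hat for one parameter; chains has shape (n_chains, n_draws)."""
    chains = np.asarray(chains, dtype=float)
    n_draws = chains.shape[1]
    half = n_draws // 2

    # Split every chain into two halves (the middle draw is dropped if n_draws is odd).
    halves = np.concatenate([chains[:, :half], chains[:, n_draws - half:]], axis=0)
    n = halves.shape[1]

    within = halves.var(axis=1, ddof=1).mean()        # W: mean within-chain variance
    between = n * halves.mean(axis=1).var(ddof=1)     # B: between-chain variance
    pooled = (n - 1) / n * within + between / n       # pooled variance estimate
    return float(np.sqrt(pooled / within))


rng = np.random.default_rng(0)
mixed_chains = rng.normal(size=(4, 1000))
stuck_chains = mixed_chains + np.array([[0.0], [0.0], [5.0], [5.0]])
print(split_rhat(mixed_chains), split_rhat(stuck_chains))
```

Values close to 1 indicate that the chains agree; the rhat_threshold argument sets the value above which a run is flagged as not converged.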

get_convergence_diagnostics() → dict | None#

Return the most recent multi-chain convergence diagnostics, if any.

Populated by combine_chains(); returns None if chains have not been combined in this session.

classmethod run_multichain(config_builder, *, num_chains: int, output_directory: str, retrieval_kwargs: dict | None = None, chain_execution: str = 'auto', chain_seed: int = 12345, per_chain_memory_gb: float | None = None, on_chain_failure: str = 'warn', combine: bool = True, rhat_threshold: float = 1.01, max_workers: int | None = None, sampler_type: str | None = None, base_retrieval_name: str | None = None, resume: bool = False, verbose: bool = True, **run_kwargs) → dict#

Run a multi-chain retrieval and (optionally) combine the chains.

This is the native entry point for multi-chain JAX-sampler runs. It keeps the parent process lightweight for the out-of-process modes (only the children build the heavy runtime) and merges the per-chain posteriors into one combined result with convergence diagnostics.

Parameters#

config_builder: A top-level (module-level) callable returning a fresh RetrievalConfig. For the processes mode it is dispatched to spawn workers (so it must be picklable by reference: no closures or lambdas), and the call site must be guarded by if __name__ == "__main__":. It should be cheap (it is also called once in the parent to read the sampler type and retrieval name unless those are given explicitly).
num_chains: Number of independent chains.
output_directory: Output directory passed to each per-chain Retrieval.
retrieval_kwargs: Extra keyword arguments for the Retrieval(config, ...) constructor.
chain_execution: auto | vectorised | sharded | sequential | processes. auto chooses based on devices and num_chains (see petitRADTRANS.retrieval.multichain.resolve_chain_execution()).
chain_seed: Base seed; chain c uses derive_chain_seed(chain_seed, c).
per_chain_memory_gb: Resident memory per chain, used for the printed peak-memory estimate and to cap process concurrency.
on_chain_failure: "warn" (combine survivors) or "raise".
combine: If True (and the sampler is gradient-MCMC), combine the per-chain posteriors and compute diagnostics after the chains finish.

Returns#

dict with keys execution, num_chains, succeeded_chain_ids, failed (chain_id -> error), combine (the combine summary or None), and diagnostics (or None).

classmethod run_single_chain(config_builder, chain_id: int, *, output_directory: str, num_chains: int, chain_seed: int = 12345, sampler_type: str | None = None, base_retrieval_name: str | None = None, retrieval_kwargs: dict | None = None, resume: bool = True, verbose: bool = True, **run_kwargs) → str | None#

Run exactly one chain (used by the SLURM-array worker and as a primitive).

Builds a Retrieval from config_builder and runs chain chain_id into its per-chain namespace. With resume=True the chain is skipped (returns None) if its *_samples.npz artifact already exists. Returns the per-chain retrieval name, or None if skipped.
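The resume behavior can be pictured as a simple file-existence check. In the sketch below, the file name follows the <retrieval_name>_chain<c>_samples.npz pattern quoted for combine_chains, while the helper names, the "out_backend" folder, and the retrieval name are placeholders invented for the example.

```python
import tempfile
from pathlib import Path


def chain_samples_path(output_directory, backend_folder, retrieval_name, chain_id):
    """Expected location of the per-chain samples artifact (assumed layout)."""
    file_name = f"{retrieval_name}_chain{chain_id}_samples.npz"
    return Path(output_directory) / backend_folder / file_name


def should_skip_chain(output_directory, backend_folder, retrieval_name, chain_id, resume=True):
    """A chain is skipped only when resuming and its artifact already exists."""
    path = chain_samples_path(output_directory, backend_folder, retrieval_name, chain_id)
    return resume and path.is_file()


with tempfile.TemporaryDirectory() as tmp:
    # Pretend that chain 0 has already finished.
    finished = chain_samples_path(tmp, "out_backend", "my_retrieval", 0)
    finished.parent.mkdir(parents=True)
    finished.touch()

    skip_flags = [should_skip_chain(tmp, "out_backend", "my_retrieval", c) for c in range(3)]
    rerun_flag = should_skip_chain(tmp, "out_backend", "my_retrieval", 0, resume=False)

print(skip_flags, rerun_flag)
```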

classmethod combine_chains_from_builder(config_builder, *, output_directory: str, num_chains: int, retrieval_kwargs: dict | None = None, on_missing: str = 'warn', verbose: bool = True)#

Build a Retrieval from config_builder and combine its chains.

Convenience for the SLURM-array combine task: it rebuilds a (cheap) Retrieval and calls combine_chains().

classmethod write_chain_launcher(config_builder, *, num_chains: int, output_directory: str, script_dir: str = '.', scheduler: str = 'slurm_array', builder_module: str | None = None, builder_function: str | None = None, retrieval_kwargs: dict | None = None, run_kwargs: dict | None = None, chain_seed: int = 12345, resume: bool = True, base_retrieval_name: str | None = None, python: str = 'python', cpus_per_task: int = 8, partition: str | None = None, time: str = '24:00:00', mem: str | None = None, gres: str | None = None, account: str | None = None, log_dir: str = 'out', job_name: str | None = None, env_setup_lines: tuple = (), extra_sbatch_lines: tuple = (), verbose: bool = True) → dict#

Generate a SLURM job-array launcher for a multi-node multi-chain run.

Writes a per-task worker script plus an array sbatch, a dependent combine sbatch, and a submit driver into script_dir. The array runs one chain per task (SLURM_ARRAY_TASK_ID); the combine task depends on it (afterany) and merges the survivors. config_builder must live in an importable module (not __main__) and be a top-level function; retrieval_kwargs/run_kwargs must be Python literals (load arrays inside the builder).

static get_special_parameters() → set[str]#

get_volume_mixing_ratios(sample: numpy.typing.NDArray[numpy.floating], parameters_read: list[str] = None)#

This function returns the VMRs of each species as a function of pressure.

Args:

samplenumpy.ndarray: A sample from the pymultinest output, the abundances returned will be computed for this set of parameters.
parameters_readlist: A list of the free parameters as read from the output files.

Returns:

vmrdict: A dictionary of abundances. The keys are the species names, the values are the volume mixing ratios at each pressure.
MMWnumpy.ndarray: The mean molecular weight at each pressure level in the atmosphere.

log_likelihood(cube: numpy.typing.NDArray[numpy.floating], ndim: int = 0, nparam: int = 0, log_l_per_datapoint_dict: dict = None, return_model: bool = False)#

Sampler-facing log-likelihood entry point.

The normal scalar retrieval path now executes through RetrievalRuntime. Less common debug modes still fall back to the legacy implementation until the entire retrieval stack has been migrated.

When log_l_per_datapoint_dict is a dict, the per-datapoint log-likelihood vector of each dataset is appended to the corresponding dict entry and the dict is returned (used by PSIS leave-one-out analysis; see get_log_likelihood_per_datapoint()).
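The accumulation pattern for per-datapoint log-likelihoods might look like the following sketch. It assumes independent Gaussian uncertainties, which may differ from the likelihood actually used for a given dataset; the function names and the `"demo"` dataset are hypothetical.

```python
import math


def gaussian_log_l_per_datapoint(data, model, sigma):
    """Per-point Gaussian log-likelihood for independent uncertainties."""
    return [
        -0.5 * ((d - m) / s) ** 2 - 0.5 * math.log(2.0 * math.pi * s * s)
        for d, m, s in zip(data, model, sigma)
    ]


def accumulate(log_l_dict, dataset_name, data, model, sigma):
    """Append one per-datapoint vector to the dict entry and return the dict."""
    vector = gaussian_log_l_per_datapoint(data, model, sigma)
    log_l_dict.setdefault(dataset_name, []).append(vector)
    return log_l_dict


per_point = {}
# Two posterior samples evaluated against the same two data points.
accumulate(per_point, "demo", [1.0, 2.0], [1.0, 2.5], [0.5, 0.5])
accumulate(per_point, "demo", [1.0, 2.0], [1.1, 2.0], [0.5, 0.5])
total_first_sample = sum(per_point["demo"][0])
```

Each call adds one vector per dataset, so after many samples the dict holds the sample-by-datapoint matrix that a leave-one-out analysis needs.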

_log_likelihood_legacy(cube: numpy.typing.NDArray[numpy.floating], ndim: int = 0, nparam: int = 0, return_model: bool = False)#: Legacy scalar likelihood implementation retained for fallback modes.

_set_free_parameter_values(values: numpy.typing.ArrayLike) → None#: Copy sampled free-parameter values into the active retrieval configuration.

static _flatten_free_parameter_values(values: numpy.typing.ArrayLike) → numpy.ndarray#

Return a 1D numpy array from sampler-provided parameter values.

PyMultiNest can pass a ctypes-backed buffer that NumPy 2 refuses to wrap directly via np.asarray because of an unsupported PEP 3118 format string. Falling back to plain indexing avoids that incompatibility while keeping support for numpy, JAX, and Python sequence inputs.
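The plain-indexing fallback can be sketched with the standard library alone. This is an illustration under stated assumptions: the `ndim` argument is hypothetical (the documented method takes only `values`), and the sketch returns a Python list where the real method returns a NumPy array.

```python
import ctypes


def flatten_parameter_values(values, ndim=None):
    """Return a flat list of floats from a sequence or a ctypes buffer.

    Plain indexing works for ctypes pointers, which have no len(), so the
    number of entries must then be given through ndim.
    """
    if ndim is None:
        ndim = len(values)
    return [float(values[i]) for i in range(ndim)]


# A ctypes array and a pointer into it, standing in for a sampler buffer.
buffer = (ctypes.c_double * 3)(0.1, 0.2, 0.3)
pointer = ctypes.cast(buffer, ctypes.POINTER(ctypes.c_double))

from_array = flatten_parameter_values(buffer)
from_pointer = flatten_parameter_values(pointer, ndim=3)
from_list = flatten_parameter_values([1, 2, 3])
```

Element-wise indexing sidesteps the buffer-protocol format string entirely, at the cost of a Python-level loop over a handful of parameters.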

_get_parameter_layout() → petitRADTRANS.retrieval.runtime.ParameterLayout#

_get_runtime() → petitRADTRANS.retrieval.runtime.RetrievalRuntime#

_has_only_legacy_model_groups() → bool#

_ensure_jax_sampler_model_support(sampler_name: str) → None#

_get_ultranest_vectorization_disable_reason() → str | None#

_resolve_ultranest_vectorized_mode(requested: bool) → bool#

log_likelihood_jax(parameters=None, configuration=None, log_l_per_datapoint_dict: dict = None, return_model: bool = False, uncertainties_mode: str = 'default', generate_mock_data: bool = False, print_log_likelihood_for_debugging: bool = False, *, physical_params: PhysicalParams | None = None)#

JAX-compatible log likelihood computation.

Uses the pre-built RetrievalRuntime stored on self instead of reconstructing it on every call. Accepts either a legacy parameters dict (for backward compatibility) or a physical_params instance.

Falls back to the legacy implementation when return_model or generate_mock_data are requested. When log_l_per_datapoint_dict is a dict, the per-datapoint log-likelihood vector of each dataset is appended to the corresponding dict entry and the dict is returned.

static _log_likelihood_jax_legacy(parameters, configuration, return_model: bool = False, uncertainties_mode: str = 'default', generate_mock_data: bool = False, print_log_likelihood_for_debugging: bool = False)#

static _compute_object_array_log_likelihood(data, spectrum_model: jax.numpy.ndarray, beta: float, beta_mode: str) → float#

Helper method to compute log likelihood for object array spectra.

This extracts the complex object array handling logic to improve clarity.

Args:

data: Data object with object array spectra
spectrum_model: Model spectrum
beta: Uncertainty scaling parameter
beta_mode: Mode for applying beta

Returns:

Total log likelihood for all object array entries

prior(cube: numpy.typing.NDArray[float], ndim: int = 0, nparams: int = 0) → numpy.typing.NDArray[float]#: PyMultiNest prior function. Transforms the unit hypercube into physical space.

prior_ultranest(cube: numpy.typing.NDArray[numpy.floating]) → numpy.typing.NDArray[numpy.floating]#: UltraNest prior function. Transforms unit hypercube into physical space.
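The hypercube transform that both prior functions perform can be sketched as follows, assuming uniform priors only. The parameter names and bounds are hypothetical, and the sketch does not reproduce the library's own parameter handling.

```python
def uniform_prior(lower, upper):
    """Return a function mapping u in [0, 1] to [lower, upper]."""
    return lambda u: lower + (upper - lower) * u


# Hypothetical free parameters and their uniform prior bounds.
priors = {
    "temperature": uniform_prior(500.0, 2500.0),
    "log_g": uniform_prior(2.0, 5.5),
    "H2O": uniform_prior(-6.0, -0.5),  # log10 mass fraction
}


def transform_cube(cube):
    """Map one unit-hypercube point to physical parameter values."""
    return [transform(u) for transform, u in zip(priors.values(), cube)]


physical = transform_cube([0.5, 0.0, 1.0])
```

Non-uniform priors follow the same pattern, with the inverse cumulative distribution function of the prior replacing the linear map.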

prior_model_jaxns()#

_build_sampler_context() → petitRADTRANS.retrieval.sampler.SamplerContext#: Build a SamplerContext from the current Retrieval state.

_build_jaxns_sampler_interface()#

_build_unconstrained_mcmc_interface()#

_build_blackjax_logdensity_interface()#

run(sampler_object: petitRADTRANS.retrieval.sampler.Sampler = None, **sampler_kwargs)#

Run mode for the class. Uses pymultinest to sample parameter space and produce standard PMN outputs.

Args:

sampler_typestr

The type of sampler to use. Can be one of the following:
pymultinest - requires Multinest installation.
dynesty - requires Dynesty and runs dynamic nested sampling.
blackjaxnuts - requires BlackJAX and runs NUTS with window adaptation.
blackjaxhmc - requires BlackJAX and runs HMC with window adaptation.
numpyronuts - requires NumPyro and runs NUTS with warmup adaptation.
numpyrohmc - requires NumPyro and runs HMC with warmup adaptation.
jaxns - requires JAXNS. Default choice for JAXNS.
jaxnsshardedstaticnestedsampler - requires JAXNS. Advanced option for JAXNS.
ultranest - requires ultranest. Implements.

sample_effective_temperature(sample_dict, param_dict, ret_names=None, nsample=None, resolution=40)#

This function samples the outputs of a retrieval and computes Teff for each sample. For each sample, a model is computed at low resolution and integrated to find the total radiant emittance, which is converted into a temperature using the Stefan-Boltzmann law: $j^{\star} = \sigma T^{4}$. Teff itself is computed using util.calc_teff.

Args:

sample_dictdict: A dictionary, where each key is the name of a retrieval, and the values are the equal weighted samples.
param_dictdict: A dictionary where each key is the name of a retrieval, and the values are the names of the free parameters associated with that retrieval.
ret_namesOptional(list(string)): A list of retrieval names, each should be included in the sample_dict. If left as none, it defaults to only using the current retrieval name.
nsampleOptional(int): The number of times to compute Teff. If left empty, uses the configured reference_data_name. Recommended to use ~300 samples, probably more than the default plotting sample count.
resolutionint: The spectra resolution to compute the models at. Typically, this should be very low in order to enable rapid calculation.

Returns:

temperature_dictdict: A dictionary with retrieval names for keys, and the values are the calculated values of Teff for each sample.
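The conversion from an integrated spectrum to an effective temperature can be sketched as below. This is not util.calc_teff: it is a standalone illustration that integrates a blackbody flux with the trapezoidal rule and inverts $j = \sigma T^{4}$. The wavelength grid and the blackbody stand-in for a model spectrum are assumptions.

```python
import math

SIGMA_SB = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4
H = 6.62607015e-34         # Planck constant, J s
C = 2.99792458e8           # speed of light, m s^-1
K_B = 1.380649e-23         # Boltzmann constant, J K^-1


def planck_flux(wavelength_m, temperature):
    """Surface flux pi * B_lambda of a blackbody, in W m^-2 m^-1."""
    x = H * C / (wavelength_m * K_B * temperature)
    return math.pi * 2.0 * H * C**2 / (wavelength_m**5 * math.expm1(x))


def effective_temperature(wavelengths_m, flux):
    """Trapezoidal integral of the flux, inverted with j = sigma * T^4."""
    total = 0.0
    for i in range(len(wavelengths_m) - 1):
        width = wavelengths_m[i + 1] - wavelengths_m[i]
        total += 0.5 * (flux[i] + flux[i + 1]) * width
    return (total / SIGMA_SB) ** 0.25


# Log-spaced grid from 0.1 to 1000 micron, standing in for a model spectrum.
n_points = 2000
grid = [1e-7 * 10 ** (4.0 * i / (n_points - 1)) for i in range(n_points)]
spectrum = [planck_flux(w, 1500.0) for w in grid]
t_eff = effective_temperature(grid, spectrum)
```

For a blackbody input the recovered temperature should be close to the input temperature, provided the grid covers nearly all of the emitted flux. This is why a wide wavelength range matters more than a high resolution.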

save_best_fit_outputs(parameters, only_return_best_fit_spectra=False, retain_best_fit_spectra=True)#

save_best_fit_outputs_external_variability(parameters, only_return_best_fit_spectra=False, retain_best_fit_spectra=True)#

save_mass_fractions(sample_dict, parameter_dict, rets=None)#

Save mass fractions and line absorber species information for specified retrievals.

Parameters:

- self: The instance of the class containing the function.
- sample_dict (dict): A dictionary mapping retrieval names to lists of samples.
- parameter_dict (dict): A dictionary mapping retrieval names to parameter values.
- rets (list, optional): List of retrieval names to process. If None, uses the default retrieval name.

Returns: - mass_fractions (numpy.ndarray): Array containing mass fractions for each sample and species.

The function processes the specified retrievals and saves the corresponding mass fractions and line absorber species information to files in the output directory. If ‘rets’ is not provided, the default retrieval name is used. The mass fractions are saved in a numpy file, and the line absorber species are saved in a JSON file.

Example usage::

    sample_dict = {'Retrieval1': [...], 'Retrieval2': [...]}
    parameter_dict = {'Retrieval1': {...}, 'Retrieval2': {...}}
    mass_fractions = save_mass_fractions(sample_dict, parameter_dict)

save_volume_mixing_ratios(sample_dict, parameter_dict, rets=None)#

Save volume mixing ratios (VMRs) and line absorber species information for specified retrievals.

Returns: - vmrs (numpy.ndarray): Array containing volume mixing ratios for each sample and species.

The function processes the specified retrievals and saves the corresponding VMRs and line absorber species information to files in the output directory. If ‘rets’ is not provided, the default retrieval name is used. The VMRs are saved in a numpy file, and the line absorber species are saved in a JSON file.

Example usage::

    sample_dict = {'Retrieval1': [...], 'Retrieval2': [...]}
    parameter_dict = {'Retrieval1': {...}, 'Retrieval2': {...}}
    vmrs = save_volume_mixing_ratios(sample_dict, parameter_dict)

initialise_radtrans_objects(scaling=10, width=3)#

Creates a pRT object for each data set that asks for a unique object. Checks if there are low resolution c-k models from exo-k, and creates them if necessary. The scaling and width parameters are retained for API compatibility and are ignored by the fixed-size adaptive mesh refinement path.

Args:

scalingint: A multiplicative factor that determines the size of the full high resolution pressure grid, which will have length self.p_global.shape[0] * scaling.
widthint: The number of cells in the low pressure grid to replace with the high resolution grid.

plot_abundances(samples_use, parameters_read, species_to_plot=None, contribution=False, refresh=True, model_generating_function=None, prt_reference=None, mode='bestfit', sample_posteriors=False, volume_mixing_ratio=False)#

Plot abundance profiles (mass fractions or volume mixing ratios) as a function of pressure. This is a wrapper for RetrievalPlotter.plot_abundances.

Args:

samples_use: Array of samples from the posterior.
parameters_read: List of free parameter names.
species_to_plot: List of species to plot (optional).
contribution: If True, overplot the contribution function.
refresh: If True, recalculate and overwrite cached results.
model_generating_function: Optional model function for spectrum generation.
prt_reference: Optional reference for pRT object.
mode: ‘bestfit’ or ‘median’ for which sample to plot.
sample_posteriors: If True, plot posterior intervals.
volume_mixing_ratio: If True, plot volume mixing ratios instead of mass fractions.

Returns:

fig, ax: Matplotlib figure and axis objects.

plot_all(output_directory=None, ret_names=None, contribution=False, model_generating_function=None, prt_reference=None, mode='bestfit')#

Generate all standard plots for a retrieval, including best-fit spectrum, sample spectra, PT profile, corner plot, and abundances. This is a wrapper for RetrievalPlotter.plot_all.

Args:

output_directory: Directory to save plots (optional).
ret_names: List of retrieval names for plotting (optional).
contribution: If True, plot contribution function.
model_generating_function: Optional model function for spectrum generation.
prt_reference: Optional reference for pRT object.
mode: ‘bestfit’ or ‘median’ for which sample to plot.

Returns:

None

plot_contribution(samples_use, parameters_read, model_generating_function=None, prt_reference=None, log_scale_contribution=False, n_contour_levels=30, refresh=True, mode='bestfit')#

Plot the contribution function of the best-fit or median model from a retrieval. This is a wrapper for RetrievalPlotter.plot_contribution.

Args:

samples_use: Array of samples from the posterior.
parameters_read: List of free parameter names.
model_generating_function: Optional model function for spectrum generation.
prt_reference: Optional reference for pRT object.
log_scale_contribution: If True, plot -log10(weighted flux).
n_contour_levels: Number of contour levels in the plot.
refresh: If True, recalculate and overwrite cached results.
mode: ‘bestfit’ or ‘median’ for which sample to plot.

Returns:

fig, ax: Matplotlib figure and axis objects.

plot_corner(sample_dict, parameter_dict, parameters_read, plot_best_fit=True, true_values=None, **kwargs)#

Make a corner plot of the posterior samples for the retrieved parameters. This is a wrapper for RetrievalPlotter.plot_corner.

Args:

sample_dict: Dictionary of posterior samples for each retrieval.
parameter_dict: Dictionary of parameter names for each retrieval.
parameters_read: List of free parameter names.
plot_best_fit: If True, mark the best-fit point on the plot.
true_values: Optional dictionary of true parameter values for reference.
**kwargs: Additional keyword arguments passed to the plotting function.

Returns:

fig: Matplotlib figure object.

plot_data(yscale='linear')#

Plot the observational data used in the retrieval. This is a wrapper for RetrievalPlotter.plot_data.

Args:

yscale: Y-axis scaling for the plot (default ‘linear’).

Returns:

None

plot_pt(sample_dict, parameters_read, contribution=False, refresh=False, model_generating_function=None, prt_reference=None, mode='bestfit')#

Plot the pressure-temperature (PT) profile with error contours for the retrieval. This is a wrapper for RetrievalPlotter.plot_pt.

Args:

sample_dict: Dictionary of posterior samples for each retrieval.
parameters_read: List of free parameter names.
contribution: If True, overplot the contribution function.
refresh: If True, recalculate and overwrite cached results.
model_generating_function: Optional model function for spectrum generation.
prt_reference: Optional reference for pRT object.
mode: ‘bestfit’ or ‘median’ for which sample to plot.

Returns:

fig, ax: Matplotlib figure and axis objects.

plot_sampled(samples_use, parameters_read, downsample_factor=None, save_outputs=False, nsample=None, model_generating_function=None, prt_reference=None, refresh=True)#

Plot a set of randomly sampled output spectra for each dataset in the retrieval. This is a wrapper for RetrievalPlotter.plot_sampled.

Args:

samples_use: Array of samples from the posterior.
parameters_read: List of free parameter names.
downsample_factor: Optional factor to downsample the spectra.
save_outputs: If True, save the sampled spectra to disk.
nsample: Number of samples to plot (optional).
model_generating_function: Optional model function for spectrum generation.
prt_reference: Optional reference for pRT object.
refresh: If True, recalculate and overwrite cached results.

Returns:

fig, ax: Matplotlib figure and axis objects.

plot_spectra(samples_use, parameters_read, model_generating_function=None, prt_reference=None, refresh=True, mode='bestfit', marker_color_type=None, marker_cmap=None, marker_label='', only_save_best_fit_spectra=False)#

Plot the best-fit spectrum, the data from each dataset, and the residuals between the two. This is a wrapper for RetrievalPlotter.plot_spectra.

Args:

samples_use: Array of samples from the posterior.
parameters_read: List of free parameter names.
model_generating_function: Optional model function for spectrum generation.
prt_reference: Optional reference for pRT object.
refresh: If True, recalculate and overwrite cached results.
mode: ‘bestfit’ or ‘median’ for which sample to plot.
marker_color_type: Optional marker color type for data points.
marker_cmap: Optional colormap for markers.
marker_label: Optional label for markers.
only_save_best_fit_spectra: If True, only save the best-fit spectra to disk.

Returns:

fig, ax, ax_r: Matplotlib figure and axis objects for the spectrum and residuals.

save_configuration()#: Save the retrieval_config configuration to file. Warning: this can take up a significant amount of storage space, as the opacities for each Radtrans object will also be saved to file.

class petitRADTRANS.retrieval.RetrievalConfig(retrieval_name: str = 'retrieval_name', run_mode: str = 'retrieval', sampler_type: str = 'pymultinest', adaptive_mesh_refinement: bool = False, scattering_in_emission: bool = False, pressures: jax.typing.ArrayLike | None = None, feautrier_chunk_size: int | None = None, chunk_feautrier: bool = True, adaptive_feautrier_iterations: bool = False, amr=_UNSET)#

Contain all the data and model level information necessary to run a petitRADTRANS retrieval.

The retrieval name given to this class will be used to name outputs. This class is passed to the Retrieval, which runs the actual pymultinest retrieval and produces the outputs.

The general usage of this class is to define it, add the parameters and their priors, add the opacity sources, the data together with a model for each dataset, and then configure a few plotting arguments.

Args:

retrieval_namestr: Name of this retrieval. Make it informative so that you can keep track of the outputs!
run_modestr: Can be either ‘retrieval’, which runs the retrieval normally using pymultinest, or ‘evaluate’, which produces plots from the best fit parameters stored in the output post_equal_weights file.
adaptive_mesh_refinementbool: Use an adaptive high resolution pressure grid around the location of cloud condensation. This will increase the size of the pressure grid by a constant factor that can be adjusted in the setup_pres function.
scattering_in_emissionbool: If using emission spectra, turn scattering on or off.
pressuresnumpy.array: A log-spaced array of pressures over which to retrieve. 100 points is standard, between 10^-6 and 10^3.
feautrier_chunk_sizeint, optional: Number of frequency bins solved per chunk of the Feautrier scattering solver used for emission spectra. Smaller values reduce the peak memory of the radiative transfer at the cost of a modest amount of extra scan overhead, which is useful on memory-constrained GPUs. If None (default), the value is taken from the PRT_FEAUTRIER_CHUNK_SIZE environment variable (falling back to a built-in default). This is forwarded to the ModelContext objects used to compute the radiative transfer.
chunk_feautrierbool: Whether the Feautrier scattering solver chunks the frequency axis (see feautrier_chunk_size). Only relevant when scattering_in_emission is True. Forwarded to the ModelContext objects.
adaptive_feautrier_iterationsbool: Whether the Feautrier scattering solver adaptively sets the number of iterations based on the photon destruction probability. Only relevant when scattering_in_emission is True. Forwarded to the ModelContext objects.
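The precedence described for feautrier_chunk_size (explicit argument, then the PRT_FEAUTRIER_CHUNK_SIZE environment variable, then a built-in default) can be sketched as follows. The function name and the fallback value of 512 are hypothetical; the actual built-in default is not stated here.

```python
import os

BUILT_IN_DEFAULT_CHUNK_SIZE = 512  # hypothetical fallback value


def resolve_feautrier_chunk_size(feautrier_chunk_size=None, environ=None):
    """Explicit argument first, then the environment, then the default."""
    environ = os.environ if environ is None else environ
    if feautrier_chunk_size is not None:
        return int(feautrier_chunk_size)
    from_env = environ.get("PRT_FEAUTRIER_CHUNK_SIZE")
    if from_env is not None:
        return int(from_env)
    return BUILT_IN_DEFAULT_CHUNK_SIZE


explicit = resolve_feautrier_chunk_size(128, environ={})
from_environment = resolve_feautrier_chunk_size(
    None, environ={"PRT_FEAUTRIER_CHUNK_SIZE": "256"}
)
fallback = resolve_feautrier_chunk_size(None, environ={})
```

Setting the environment variable is convenient on a cluster, since the same retrieval script can then run with different chunk sizes on different GPU types.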

retrieval_name = 'retrieval_name'#

run_mode = 'retrieval'#

sampler_type = 'pymultinest'#

adaptive_mesh_refinement = False#

scaling = 1#

width = 1#

scattering_in_emission = False#

feautrier_chunk_size = None#

chunk_feautrier = True#

adaptive_feautrier_iterations = False#

parameters#

data#

instruments = ()#

line_species = ()#

cloud_species = ()#

rayleigh_species = ()#

continuum_opacities = ()#

_setup_pres(scaling: int = 10, width: int = 3) → jax.typing.ArrayLike#

add_parameter(name: str, is_free_parameter: bool, value: float | None = None, distribution: Any | None = None, plot_in_corner: bool = False, corner_ranges: tuple[float, float] | None = None, corner_transform: Callable | None = None, transform_prior_cube_coordinate: Callable | None = None, corner_label: str | None = None, units: Any | None = None, free=_UNSET) → None#

This function adds a Parameter (see parameter.py) to the dictionary of parameters. A Parameter has a name and a boolean parameter to set whether it is a free or fixed parameter during the retrieval. In addition, a value can be set, or a prior function can be given that transforms a random variable in [0,1] to the physical dimensions of the Parameter.

Args:

namestr: The name of the parameter. Must match the name used in the model function for the retrieval.
is_free_parameterbool: True if the parameter is a free parameter in the retrieval, False if it is fixed.
valuefloat: The value of the parameter in the units used by the model function.
transform_prior_cube_coordinatemethod: A function that transforms the unit interval to the physical units of the parameter. Typically given as a lambda function.
unitsastropy.units.UnitBase, optional: The physical units of the parameter value. Defaults to None (value already in the units expected by the model function). When set to an astropy unit, model functions can convert the value to the units required by the radiative transfer (e.g. cm, g). Currently used by planet_radius, system_distance, and planet_mass (and their aliases).

_add_uniform_free_parameter(name: str, lower: float, upper: float, corner_label: str = None) → None#

static _default_covariance_parameter_priors(data_object: petitRADTRANS.retrieval.data.Data) → dict[str, tuple[float, float]]#

_add_covariance_hyperparameters(data_object: petitRADTRANS.retrieval.data.Data, covariance_mode: str, n_local_covariance_kernels: int, covariance_parameter_priors: dict | None = None) → None#

static list_available_line_species() → set[str]#: List the currently installed opacity tables that are available for species that contribute to the line opacity.

static list_available_cloud_species() → set[str]#: List the currently installed opacity tables that are available for cloud species.

static list_available_cia_species() → set[str]#: List the currently installed opacity tables that are available for CIA species.

set_line_species(line_species: list[str] | tuple[str], use_equilibrium_chemistry: bool = False, free_mass_fraction_limits: tuple[float, float] = (-6.0, -0.5), plot_in_corner: bool = True, linelist=_UNSET, eq=_UNSET, abund_lim=_UNSET) → None#

Set RadTrans.line_species

This function adds a list of species to the pRT object that will define the line opacities of the model. The values in the list are strings, with the names matching the pRT opacity names, which vary between the c-k line opacities and the line-by-line opacities.

Args:

line_speciesList(str): The list of species to include in the retrieval
use_equilibrium_chemistrybool: If False, the retrieval should use free chemistry, and Parameters for the abundance of each species in the line_species will be added to the retrieval. Otherwise, equilibrium chemistry will be used. If you need fine control over individual species, use add_line_species and set up each species individually.
free_mass_fraction_limitsTuple(float,float): If use_equilibrium_chemistry is False, this sets the boundaries of the uniform prior that will be applied for each species in line_species. The range of the prior goes from free_mass_fraction_limits[0] to free_mass_fraction_limits[1]. The abundance limits must be given in log10 units of the mass fraction.

set_rayleigh_species(rayleigh_species: list[str] = _UNSET, linelist=_UNSET) → None#

Set the list of species that contribute to the rayleigh scattering in the pRT object.

Args:

rayleigh_speciesList(str): A list of species that contribute to the rayleigh opacity.

set_continuum_opacities(continuum_opacities: list[str] = _UNSET, linelist=_UNSET) → None#

Set the list of species that contribute to the continuum opacity in the pRT object.

Args:

continuum_opacitiesList(str): A list of species that contribute to the continuum opacity.

add_line_species(species: str, use_equilibrium_chemistry: bool = False, free_mass_fraction_limits: tuple[float, float] = (-6.0, -0.5), fixed_mass_fraction_value: float | None = None, plot_in_corner: bool = True, corner_label: str | None = None, corner_ranges: tuple[float, float] | None = None, eq=_UNSET, abund_lim=_UNSET, fixed_abund=_UNSET) → None#

This function adds a single species to the pRT object that will define the line opacities of the model. The name must match the pRT opacity name, which varies between the c-k line opacities and the line-by-line opacities.

Args:

speciesstr: The species to include in the retrieval
use_equilibrium_chemistrybool: If False, the retrieval should use free chemistry, and Parameters for the abundance of the species will be added to the retrieval. Otherwise, (dis)equilibrium chemistry will be used.
free_mass_fraction_limitsTuple(float,float): If use_equilibrium_chemistry is False, this sets the boundaries of the uniform prior that will be applied to the given species. The range of the prior goes from free_mass_fraction_limits[0] to free_mass_fraction_limits[1]. The abundance limits must be given in log10 units of the mass fraction.
fixed_mass_fraction_valuefloat: The log-mass fraction abundance of the species. Currently only supports vertically constant abundances. If this is set, then the species will not be a free parameter in the retrieval.

add_pressure_varying_line_species(species: str, mode: str = 'linear', pressure_spacing: str = 'relative', n_nodes: int = 3, free_mass_fraction_limits: tuple[float, float] = (-7.0, 0.0), log_pressure_range_prior: tuple[float, float] = (0, 9), fixed_pressure_node_species: str | None = None, abund_lim=_UNSET) → None#

This function adds a single species to the Radtrans object that will define the line opacities of the model. The name must match the pRT opacity name, which varies between the c-k line opacities and the line-by-line opacities. This species will have an abundance that varies with pressure, defined by the retrieved abundance at each of the provided abundance nodes.

This function adds a set of parameters to the retrieval to define the abundance profile of the species.

{species}_{mode}_abundance_profile defines which profile will be used (fixed parameter)
{species}_n_abundance_nodes sets the number of abundance nodes (fixed parameter)
{species}_n_pressure_nodes sets the number of pressure nodes (n_nodes - 2) (fixed parameter)
{species}_interpolation_mode sets the pressure spacing mode (fixed parameter)
{species}_pressure_node_{i} for i in 0 to n_nodes-2, the pressure at each node (free parameter)
{species}_abundance_node_{i} for i in 0 to n_nodes, the abundance at each node (free parameter)

Args:

species: str: The species to include in the retrieval
mode: str: One of ‘linear’, ‘cubic’, or ‘stepped’ - determines the interpolation method between abundance nodes.
pressure_spacing: str: One of ‘relative’, ‘absolute’, or ‘fixed’ - determines whether the pressure nodes are spaced relative to the top of the atmosphere pressure, retrieved in absolute log pressure, or fixed to the pressure nodes set by fixed_pressure_node_species.
n_nodes: int: The number of abundance nodes to use in the retrieval, including top and bottom nodes.
free_mass_fraction_limitsTuple(float,float): Sets the boundaries of the uniform prior that will be applied to the abundance nodes of the given species. The range of the prior goes from free_mass_fraction_limits[0] to free_mass_fraction_limits[1]. The abundance limits must be given in log10 units of the mass fraction.
log_pressure_range_prior: Tuple(float,float): The prior range on the log pressure of the pressure nodes in log bar, only used if pressure_spacing is ‘absolute’ or ‘relative’.
fixed_pressure_node_species: str: If pressure_spacing is ‘fixed’, this species pressure nodes will be used as the pressure nodes. Note that this species must already have been added to the retrieval with add_pressure_varying_line_species.
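The parameter naming scheme listed above can be sketched as a small helper. This is an illustration only: it assumes half-open index ranges (n_nodes abundance nodes and n_nodes - 2 pressure nodes), and it treats the pressure nodes as free parameters, which may not hold when pressure_spacing is ‘fixed’. The helper name is hypothetical.

```python
def pressure_varying_parameter_names(species, mode="linear", n_nodes=3):
    """List the documented parameter names for one species.

    Assumes half-open index ranges: n_nodes abundance nodes and
    n_nodes - 2 pressure nodes.
    """
    fixed = [
        f"{species}_{mode}_abundance_profile",
        f"{species}_n_abundance_nodes",
        f"{species}_n_pressure_nodes",
        f"{species}_interpolation_mode",
    ]
    free = [f"{species}_pressure_node_{i}" for i in range(n_nodes - 2)]
    free += [f"{species}_abundance_node_{i}" for i in range(n_nodes)]
    return fixed, free


fixed_names, free_names = pressure_varying_parameter_names("H2O", n_nodes=3)
```

Under these assumptions, each extra node adds two free parameters (one pressure and one abundance), so the dimensionality of the retrieval grows quickly with n_nodes.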

remove_species_lines(species: str, free: bool = False) → None#

This function removes a species from the pRT line list, and if using a free chemistry retrieval, removes the associated Parameter of the species.

Args:

speciesstr: The species to remove from the retrieval
freebool: If True, the retrieval should use free chemistry, and Parameters for the abundance of the species will be removed from the retrieval.

add_cloud_species(species: str, use_equilibrium_chemistry: bool = True, free_mass_fraction_limits: tuple[float, float] = (-3.5, 1.5), cloud_base_pressure_limits: tuple[float, float] | None = None, equilbrium_mass_fraction_scaling_factor: tuple[float, float] | None = None, fixed_mass_fraction_value: float | None = None, fixed_base_pressure_value: float | None = None, eq=_UNSET, abund_lim=_UNSET, p_base_lim=_UNSET, scaling_factor=_UNSET, fixed_abund=_UNSET, fixed_base=_UNSET) → None#

This function adds a single cloud species to the list of species. Optionally, it will add parameters to allow for a retrieval using an Ackerman-Marley model. If an equilibrium condensation model is used in the retrieval model function (use_equilibrium_chemistry=True), then a parameter is added that scales the equilibrium cloud abundance, as in Molliere (2020). If use_equilibrium_chemistry is False, two parameters are added: the cloud abundance and the cloud base pressure. The limits set the prior ranges, both on a log scale.

NOTE: As of pRT version 2.4.9, the behaviour of this function has changed. In previous versions the abundance limits were set from free_mass_fraction_limits[0] to (free_mass_fraction_limits[0] + free_mass_fraction_limits[1]). This has been changed so that the limits of the prior range are from free_mass_fraction_limits[0] to free_mass_fraction_limits[1] (i.e. the actual boundaries). The same is true for cloud_base_pressure_limits.
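The difference between the two conventions is easy to get wrong when porting older scripts, so a small sketch may help. The helper names and the `legacy` flag are hypothetical and exist only to contrast the two readings of the same tuple.

```python
def prior_bounds(limits, legacy=False):
    """Return (lower, upper) of the uniform prior on log10 mass fraction.

    legacy=True reproduces the pre-2.4.9 reading, where the second entry
    was a width added to the first.
    """
    lower, second = limits
    upper = lower + second if legacy else second
    return lower, upper


def sample_log_mass_fraction(u, limits, legacy=False):
    """Map u in [0, 1] onto the prior range."""
    lower, upper = prior_bounds(limits, legacy=legacy)
    return lower + (upper - lower) * u


limits = (-3.5, 1.5)
current_range = prior_bounds(limits)              # actual boundaries
legacy_range = prior_bounds(limits, legacy=True)  # lower plus width
midpoint = sample_log_mass_fraction(0.5, limits)
```

With the tuple (-3.5, 1.5), the current convention spans five dex while the legacy reading spans only 1.5 dex, so limits copied from an older script should be converted before reuse.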

Args:

speciesstr: Name of the pRT cloud species, including the cloud shape tag.
use_equilibrium_chemistrybool: Does the retrieval model use an equilibrium cloud model. This restricts the available species!
free_mass_fraction_limitstuple(float,float): If use_equilibrium_chemistry is True, this sets the prior range on the log mass fraction of the cloud species relative to the equilibrium abundance, with a typical range being (-3.5, 1.5). If use_equilibrium_chemistry is False, this sets the prior range on the actual log mass fraction abundance of the cloud species, with a typical range being (-5.0, 0.0). The abundance limits must be given in log10 units of the mass fraction.
cloud_base_pressure_limitstuple(float,float): Only used if not using an equilibrium model. Sets the prior range on the log10 of the cloud base pressure in bar, e.g. (-3, 3). If None, no cloud base pressure parameter is added.
equilbrium_mass_fraction_scaling_factortuple(float,float): If provided, adds a free parameter eq_scaling_{cloud_name} that scales the equilibrium condensate mass fraction. The tuple defines the (lower, upper) bounds of the uniform prior on this scaling factor in log10 space, e.g. (-3.0, 1.0). Only used when use_equilibrium_chemistry is True. If None, no scaling parameter is added.
fixed_mass_fraction_valueOptional(float): A vertically constant log mass fraction abundance for the cloud species. If set, this will not be a free parameter in the retrieval. Only compatible with non-equilibrium clouds.
fixed_base_pressure_valueOptional(float): The log cloud base pressure. If set, fixes this parameter to a constant value, and it will not be a free parameter in the retrieval. Only compatible with non-equilibrium clouds. Not yet compatible with most built in pRT models.

add_data(name: str, model_generating_function: Callable[Ellipsis, tuple[jax.typing.ArrayLike, jax.typing.ArrayLike, jax.typing.ArrayLike]] | None = None, path_to_observations: str | None = None, wavelengths: jax.typing.ArrayLike | None = None, spectrum: jax.typing.ArrayLike | None = None, uncertainties: jax.typing.ArrayLike | None = None, covariance: jax.typing.ArrayLike | None = None, line_opacity_mode: str = 'c-k', data_resolution: float | None = None, model_resolution: float | None = None, scale_flux: bool = False, scale_uncertainties: bool = False, fit_flux_offset: bool = False, fit_covariance: bool = False, covariance_mode: str = 'none', global_covariance_kernel: str = 'squared_exponential', local_covariance_kernel: str = 'squared_exponential', n_local_covariance_kernels: int = 0, covariance_jitter: float = 0.0, covariance_parameter_priors: dict | None = None, fit_instrumental_resolution: bool = False, external_radtrans_reference: object | None = None, wavelength_boundaries: tuple[float, float] | None = None, wavelength_bin_widths: jax.typing.ArrayLike | None = None, subtract_continuum: bool = False, radtrans_grid: bool = False, radtrans_object: object = None, mask: jax.typing.ArrayLike | None = None, photometric_transformation_function: Callable[Ellipsis, tuple[jax.typing.ArrayLike, jax.typing.ArrayLike]] | None = None, photometric_bin_edges: jax.typing.ArrayLike | None = None, scale=_UNSET, scale_err=_UNSET, offset_bool=_UNSET, resample=_UNSET, concatenate_flux_epochs_variability=_UNSET, variability_atmospheric_column_model_flux_return_mode=_UNSET, atmospheric_column_flux_mixer=_UNSET, photometry=_UNSET) → None#

Create a Data object for the Retrieval class. Each dataset is associated with an instance of petitRADTRANS and an atmospheric model. The pRT instance can be overridden and tied to an existing pRT instance with the external_radtrans_reference parameter. This setup allows for joint or independent retrievals on multiple datasets.

Args:

namestr: Identifier for this data set.
model_generating_functionmethod: A function, typically defined in run_definition.py that returns the model wavelength and spectrum (emission or transmission). This is the function that contains the physics of the model, and calls pRT in order to compute the spectrum.
path_to_observationsstr: Path to observations file, including filename. This can be a txt or dat file containing the wavelength, flux or transit depth, and error, or a fits file containing the wavelength, spectrum and covariance matrix. Alternatively, the data can be given directly through the wavelengths, spectrum, uncertainties, and mask arguments.
wavelengths:: (um) Wavelengths of the data.
spectrum:: Spectrum of the data.
uncertainties:: Uncertainties of the data, in the same units as the spectrum.
covariance:: Covariance matrix of the data, in the same units as the spectrum squared.
line_opacity_modestr: Should the retrieval be run using correlated-k opacities (default, ‘c-k’), or line by line (‘lbl’) opacities? If ‘lbl’ is selected, it is HIGHLY recommended to set the model_resolution parameter. In general, ‘c-k’ mode is recommended for retrievals of everything other than high-resolution (R>40000) spectra.
data_resolutionfloat or jnp.ndarray: Spectral resolution of the instrument. Optional, allows convolution of the model to the instrumental line width. If data_resolution is an array, the resolution can vary as a function of wavelength. The array should have the same shape as the input wavelength array, and should specify the spectral resolution at each wavelength bin.
model_resolutionfloat: Will be None by default. The resolution of the c-k opacity tables in pRT. This will generate a new c-k table using exo-k. The default (and maximum) correlated k resolution in pRT is $\lambda/\Delta \lambda > 1000$ (R=500). Lowering the resolution will speed up the computation. If set to a positive integer while line_opacity_mode is 'lbl', the high-resolution opacities are sampled at the specified resolution. This may be desired when medium-resolution spectra are required, with $\lambda/\Delta \lambda > 1000$ but much smaller than $10^6$, which is the resolution of the lbl mode. In this case it may make sense to carry out the calculations with line_by_line_opacity_sampling = 10e5, for example, and then re-bin to the final desired resolution, which may save time. The user should verify that this leads to solutions identical to the re-binned results of the fiducial $10^6$ resolution. If not, this parameter must not be used. Note the difference between this parameter and the line_by_line_opacity_sampling parameter in the Radtrans class: the actual desired resolution should be set here.
scale_fluxbool: Turn on or off scaling the data by a constant factor. Set to True if scaling the data during the retrieval.
scale_uncertainties:: Turn on or off scaling the uncertainties by a constant factor. Set to True if scaling the uncertainties during the retrieval.
fit_flux_offset:: Turn on or off fitting a flux offset. Set to True if fitting a flux offset during the retrieval.
fit_covariance:: If True, construct a fitted covariance matrix during retrieval using GP-like kernel terms. This is currently supported for 1D spectroscopic datasets only.
covariance_mode:: Covariance model to use. Supported values are 'none', 'local', 'global', and 'global_local'.
global_covariance_kernel:: Global kernel name used when covariance_mode includes a global term.
local_covariance_kernel:: Local kernel name used when covariance_mode includes local terms.
n_local_covariance_kernels:: Number of local covariance kernels to include when using 'global_local' mode.
covariance_jitter:: Small diagonal stabilization term added to the fitted covariance matrix.
external_radtrans_referenceobject: An existing Radtrans object. Leave as None unless you’re sure of what you’re doing.
wavelength_boundariestuple,list: Set the wavelength range of the pRT object. Defaults to a range +/-5% greater than that of the data. Must at least be equal to the range of the data.
wavelength_bin_widthsnumpy.ndarray: Set the wavelength bin width to bin the Radtrans object to the data. Defaults to the data bins.
radtrans_grid: bool: Set to True if the data has been binned to a pRT c-k grid.
radtrans_object:: An instance of Radtrans object to be used to generate model spectra in retrievals.
mask:: Mask of the data.
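
The fit_covariance, covariance_mode and covariance_jitter entries above describe a covariance matrix built from kernel terms plus a diagonal stabilization term. The following is a minimal NumPy sketch of that idea for a single global squared-exponential kernel. The function name and the amplitude and length_scale parameters are hypothetical, and the exact parametrization used by petitRADTRANS may differ.

```python
import numpy as np


def squared_exponential_covariance(wavelengths, uncertainties, amplitude, length_scale, jitter=0.0):
    """Diagonal noise plus one global squared-exponential term (illustrative only)."""
    wavelengths = np.asarray(wavelengths, dtype=float)
    uncertainties = np.asarray(uncertainties, dtype=float)
    # Pairwise wavelength separations, shape (N, N).
    delta = wavelengths[:, None] - wavelengths[None, :]
    # Correlated term: decays smoothly with wavelength separation.
    kernel = amplitude**2 * np.exp(-0.5 * (delta / length_scale) ** 2)
    # Per-point variances and the jitter both sit on the diagonal.
    diagonal = np.diag(uncertainties**2 + jitter)
    return diagonal + kernel


wavelengths = np.linspace(1.0, 2.0, 5)  # micron
uncertainties = np.full(5, 0.1)  # same units as the spectrum
covariance = squared_exponential_covariance(
    wavelengths, uncertainties, amplitude=0.05, length_scale=0.2, jitter=1e-8
)
```

The result is symmetric and has units of the spectrum squared, matching the covariance argument. The jitter keeps the matrix numerically invertible when the kernel term dominates.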

add_time_series_data(name: str, observation_times: jax.typing.ArrayLike, model_generating_function: Callable, n_model_timesteps: int, time_varying_parameters: list[str] | None = None, sinusoidal_parameters: list[str] | None = None, *, path_to_observations: str | None = None, filename_list: list[str] | None = None, wavelengths: jax.typing.ArrayLike | None = None, spectrum: jax.typing.ArrayLike | None = None, uncertainties: jax.typing.ArrayLike | None = None, mask: jax.typing.ArrayLike | None = None, wavelength_bin_widths: jax.typing.ArrayLike | None = None, data_resolution: float | None = None, model_resolution: float | None = None, wavelength_boundaries: tuple | None = None, external_radtrans_reference: str | None = None, line_opacity_mode: str = 'c-k', scale_flux: bool = False, scale_uncertainties: bool = False, fit_flux_offset: bool = False, radtrans_object: object = None, covariance: jax.typing.ArrayLike | None = None, mean_divide: bool = False)#

Create a TimeSeriesData object for time-variable retrievals.

This method creates the data object and registers the fixed N_time parameter. The caller is still responsible for adding the per-timestep or sinusoidal free parameters with appropriate priors via add_parameter().
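
As a rough guide to which free parameters the caller has to add, the naming rules given for time_varying_parameters and sinusoidal_parameters in the Args below can be enumerated with a small helper. This is a sketch: required_time_series_parameters and the parameter names in the example are hypothetical and not part of petitRADTRANS.

```python
def required_time_series_parameters(n_model_timesteps, time_varying_parameters=(), sinusoidal_parameters=()):
    """List the free-parameter names implied by the documented naming convention."""
    names = []
    # One free parameter per model timestep: {P}_t_0 ... {P}_t_{n_model_timesteps - 1}.
    for parameter in time_varying_parameters:
        names.extend(f"{parameter}_t_{i}" for i in range(n_model_timesteps))
    # Four free parameters per sinusoidally varying quantity.
    for parameter in sinusoidal_parameters:
        names.extend(f"{parameter}_{suffix}" for suffix in ("amplitude", "period", "phase", "offset"))
    return names


# Hypothetical parameter names, for illustration only.
names = required_time_series_parameters(
    3,
    time_varying_parameters=["temperature"],
    sinusoidal_parameters=["cloud_fraction"],
)
```

Each returned name would then need its own add_parameter() call with a suitable prior.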

Args:

namestr: Identifier for this data set.
observation_timesArrayLike: 1-D array of observation times in seconds (absolute or relative to the first observation).
model_generating_functionCallable: A runtime-native model function (e.g. time_series_gradient_emission).
n_model_timestepsint: Number of model spectra computed at regular intervals spanning the observation window. Spectra at the actual observation times are interpolated from these.
time_varying_parameterslist[str], optional: Parameter names that are freely varied at each model timestep. For each name P the retrieval must contain free parameters {P}_t_0 through {P}_t_{n_model_timesteps - 1}.
sinusoidal_parameterslist[str], optional: Parameter names whose time variation is described by a sinusoid. For each name P the retrieval must contain free parameters {P}_amplitude, {P}_period, {P}_phase, and {P}_offset.
path_to_observationsstr, optional: Path to a directory containing individual epoch files when using filename_list.
filename_listlist[str], optional: Per-epoch filenames inside path_to_observations. If given, data will be loaded via TimeSeriesData.load_single_spectrum_txt().
wavelengthsArrayLike, optional: 1-D wavelength grid (micron) shared by all epochs.
spectrumArrayLike, optional: 2-D flux array of shape (N_obs, N_wavelength).
uncertaintiesArrayLike, optional: 2-D uncertainty array matching spectrum.
maskArrayLike, optional: 2-D boolean mask matching spectrum.
wavelength_bin_widthsArrayLike, optional: 1-D or 2-D bin widths.

data_resolution, model_resolution, wavelength_boundaries, external_radtrans_reference, line_opacity_mode, scale_flux, scale_uncertainties, fit_flux_offset, radtrans_object, covariance: Same semantics as add_data().

mean_dividebool: Fit in mean-divided (relative-variability) space: the stored data become F(lambda, t) / <F(lambda)>_t with uncertainties sigma(lambda, t) / <F(lambda)>_t (the time mean is taken over valid epochs per wavelength), and at scoring time the projected model is likewise divided by its own time mean (M(lambda, t) / <M(lambda)>_t). Static (time-constant) model/calibration deficiencies cancel in this ratio while the time-variability signal is preserved against the measured uncertainties – no error inflation is involved. Requires inline spectrum and uncertainties arrays.
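
The mean-division described for mean_divide can be sketched in NumPy as follows. This is an illustration of the arithmetic only, not the library's implementation. The function name is hypothetical, and it assumes that a True mask entry marks a valid point and that every wavelength has at least one valid epoch.

```python
import numpy as np


def mean_divide(spectrum, uncertainties, mask=None):
    """Divide a (N_obs, N_wavelength) time series by its per-wavelength time mean."""
    spectrum = np.asarray(spectrum, dtype=float)
    uncertainties = np.asarray(uncertainties, dtype=float)
    # Assumption: True marks a valid data point.
    valid = np.ones(spectrum.shape, dtype=bool) if mask is None else np.asarray(mask, dtype=bool)
    # Time mean over valid epochs only, one value per wavelength.
    time_mean = np.where(valid, spectrum, 0.0).sum(axis=0) / valid.sum(axis=0)
    # Flux and uncertainties are divided by the same mean, so no error inflation occurs.
    return spectrum / time_mean, uncertainties / time_mean


spectrum = np.array([[1.0, 4.0], [3.0, 4.0]])  # two epochs, two wavelengths
uncertainties = np.array([[0.1, 0.2], [0.1, 0.2]])
relative_flux, relative_uncertainties = mean_divide(spectrum, uncertainties)

# A model that differs from the data by a static factor gives the same ratio.
relative_model, _ = mean_divide(2.0 * spectrum, uncertainties)
```

Because the model is divided by its own time mean, any time-constant multiplicative offset between model and data drops out of the comparison.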

add_photometry(path: str, model_generating_function: Callable, model_resolution: float = 10.0, scale_flux: bool = False, wlen_range_micron: tuple[float, float] = None, photometric_transformation_function: Callable = None, external_prt_reference: object = None, opacity_mode: str = 'c-k') → None#

Create a Data class object for each photometric point in a photometry file. The photometry file must be a csv file with the following columns: name, lower wavelength boundary [um], upper wavelength boundary [um], flux [W/m2/micron], flux error [W/m2/micron].

Photometric data requires a transformation function to convert a spectrum into synthetic photometry. You must provide this function yourself, or have the species package installed. If using species, the name in the data file must be of the format instrument/filter.

Args:

model_generating_functionCallable: A function that returns the model wavelength and spectrum (emission or transmission) for the photometric points.
pathstr: Path to observations file, including filename.
model_resolutionfloat: Spectral resolution of the model, allowing for low resolution correlated k tables from exo-k.
scale_fluxbool: Turn on or off scaling the data by a constant factor. Currently only set up to scale all photometric data in a given file.
wlen_range_micronTuple: A pair of wavelengths in units of micron that determine the lower and upper boundaries of the model computation.
external_prt_referencestr: The name of an existing Data object. This object’s prt_object will be used to calculate the chi squared of the new Data object. This is useful when two datasets overlap, as only one model computation is required to compute the log likelihood of both datasets.
photometric_transformation_functionmethod: A function that will transform a spectrum into an average synthetic photometric point, typically accounting for filter transmission.
opacity_modestr: Opacity mode used for the model computation. Defaults to 'c-k' (correlated-k).
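
To make the file layout and the role of the transformation function concrete, here is a minimal sketch that parses rows in the documented column order and averages a model spectrum over a top-hat band. Both function names are hypothetical, and the top-hat average stands in for a real filter transmission curve. The call signature that petitRADTRANS expects for photometric_transformation_function is not specified here, so the one below is an assumption, as is the skipping of '#' comment lines.

```python
import csv
import io

import numpy as np


def read_photometry(text):
    """Parse rows of: name, lower bound [um], upper bound [um], flux, flux error."""
    points = []
    for row in csv.reader(io.StringIO(text)):
        # Assumption: blank lines and '#' comment lines carry no data.
        if not row or row[0].lstrip().startswith("#"):
            continue
        lower, upper, flux, flux_error = (float(value) for value in row[1:5])
        points.append((row[0].strip(), lower, upper, flux, flux_error))
    return points


def top_hat_photometry(wavelengths, flux, lower, upper):
    """Average the model flux over [lower, upper], ignoring any filter shape."""
    wavelengths = np.asarray(wavelengths, dtype=float)
    flux = np.asarray(flux, dtype=float)
    inside = (wavelengths >= lower) & (wavelengths <= upper)
    return float(np.mean(flux[inside]))


# Hypothetical file content with one photometric point.
points = read_photometry("instrument/filter_a, 1.15, 1.35, 1.0e-15, 1.0e-16\n")
name, lower, upper, observed_flux, observed_error = points[0]

model_wavelengths = np.linspace(1.0, 1.5, 6)  # micron
model_flux = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
synthetic_flux = top_hat_photometry(model_wavelengths, model_flux, lower, upper)
```

A realistic transformation would weight the model by the filter transmission, which is what the species package provides when the name follows the instrument/filter format.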