petitRADTRANS.retrieval

Submodules

Package Contents

Classes

Retrieval

This class implements the retrieval method using petitRADTRANS and pymultinest.

RetrievalConfig

The RetrievalConfig class contains all of the data and model level information necessary to run a petitRADTRANS retrieval.

Attributes

__author__

__copyright__

__maintainer__

__email__

__status__

petitRADTRANS.retrieval.__author__ = 'Evert Nasedkin'
petitRADTRANS.retrieval.__maintainer__ = 'Evert Nasedkin'
petitRADTRANS.retrieval.__email__ = 'nasedkinevert@gmail.com'
petitRADTRANS.retrieval.__status__ = 'Development'
class petitRADTRANS.retrieval.Retrieval(run_definition, output_dir='', use_MPI=False, sample_spec=False, ultranest=False, bayes_factor_species=None, corner_plot_names=None, short_names=None, pRT_plot_style=True, test_plotting=False)

This class implements the retrieval method using petitRADTRANS and pymultinest. A RetrievalConfig object is passed to this class to describe the retrieval data, parameters and priors. The run() method then uses pymultinest to sample the parameter space, producing posterior distributions for the parameters and the Bayesian evidence for the model. Various plotting functions are also included, and can be run once the retrieval is complete.

Args:
run_definitionRetrievalConfig

A RetrievalConfig object that describes the retrieval to be run. This is the user-facing class that must be set up for every retrieval.

output_dirStr

The directory in which the output folders should be written

sample_specBool

Produce plots and data files for random samples drawn from the outputs of pymultinest.

ultranestbool

If true, use Ultranest sampling rather than pymultinest. Provides a more accurate evidence estimate, but is significantly slower.

bayes_factor_speciesStr

A pRT species that should be removed from the model in order to test the Bayesian evidence for its presence.

corner_plot_namesList(Str)

List of additional retrieval names that should be included in the corner plot.

short_namesList(Str)

For each corner_plot_name, a shorter name to be included when plotting.

pRT_plot_styleBool

Use the petitRADTRANS plotting style as described in plot_style.py. It is recommended to set this parameter to False if you want to use interactive plotting, or if the test_plotting parameter is True.

test_plottingBool

Only use when running locally. A boolean flag that will produce plots for each sample when pymultinest is run.

run(sampling_efficiency=0.8, const_efficiency_mode=False, n_live_points=4000, log_z_convergence=0.5, step_sampler=False, warmstart_max_tau=0.5, n_iter_before_update=50, resume=True, max_iters=0, frac_remain=0.1, importance_nested_sampling=True, Lepsilon=0.3, error_checking=True)

Run mode for the class. Uses pymultinest to sample parameter space and produce standard PMN outputs.

Args:
sampling_efficiencyFloat

pymultinest sampling efficiency. If const efficiency mode is true, should be set to around 0.05. Otherwise, it should be around 0.8 for parameter estimation and 0.3 for evidence comparison.

const_efficiency_modeBool

pymultinest constant efficiency mode

n_live_pointsInt

Number of live points to use in pymultinest, or the minimum number of live points to use for the Ultranest reactive sampler.

log_z_convergencefloat

If ultranest is being used, the convergence criterion on log z.

step_samplerbool

Use a step sampler to improve the efficiency in ultranest.

warmstart_max_taufloat

Warm start allows accelerated computation based on a different but similar UltraNest run.

n_iter_before_updateint

Number of live point replacements before printing an update to a log file.

max_itersint

Maximum number of sampling iterations. If 0, will continue until convergence criteria are satisfied.

frac_remainfloat

Ultranest convergence criterion. Halts integration if live point weights are below the specified value.

Lepsilonfloat

Ultranest convergence criterion. Use with noisy likelihoods. Halts integration if live points are within Lepsilon.

resumebool

Continue an existing retrieval. If False, THIS WILL OVERWRITE YOUR EXISTING RETRIEVAL.

error_checkingbool

Test the model generating function for typical errors. ONLY TURN THIS OFF IF YOU KNOW WHAT YOU’RE DOING!
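The guidance given for sampling_efficiency can be captured in a small helper. This is a minimal sketch and not part of petitRADTRANS: the function name recommended_sampling_efficiency and its goal argument are hypothetical, and only the numerical values (0.05, 0.3, 0.8) and the keyword names in run_kwargs come from the descriptions above.

```python
def recommended_sampling_efficiency(const_efficiency_mode=False, goal="parameters"):
    """Return the sampling efficiency suggested in the argument description.

    Hypothetical helper, not part of petitRADTRANS.
    goal: "parameters" for parameter estimation, "evidence" for evidence comparison.
    """
    if const_efficiency_mode:
        return 0.05
    if goal == "evidence":
        return 0.3
    if goal == "parameters":
        return 0.8
    raise ValueError(f"unknown goal: {goal!r}")


# Keyword arguments that could then be passed on to Retrieval.run().
run_kwargs = {
    "sampling_efficiency": recommended_sampling_efficiency(goal="evidence"),
    "const_efficiency_mode": False,
    "n_live_points": 4000,
    "resume": True,  # False would overwrite an existing retrieval
}
```

Keeping the choice in one place makes it harder to combine constant efficiency mode with an efficiency value intended for the default mode.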

_run_ultranest(n_live_points, log_z_convergence, step_sampler, warmstart_max_tau, resume, max_iters, frac_remain, Lepsilon)

Run mode for the class. Uses ultranest to sample parameter space and produce standard outputs.

Args:
n_live_pointsInt

The minimum number of live points to use for the Ultranest reactive sampler.

log_z_convergencefloat

The convergence criterion on log z.

step_samplerbool

Use a step sampler to improve the efficiency in ultranest.

max_itersint

Maximum number of sampling iterations. If 0, will continue until convergence criteria are satisfied.

frac_remainfloat

Ultranest convergence criterion. Halts integration if live point weights are below the specified value.

Lepsilonfloat

Ultranest convergence criterion. Use with noisy likelihoods. Halts integration if live points are within Lepsilon.

resumebool

Continue an existing retrieval. If False, THIS WILL OVERWRITE YOUR EXISTING RETRIEVAL.

generate_retrieval_summary(stats=None)

This function produces a human-readable text file describing the retrieval. It includes all the fixed and free parameters, the limits of the priors (if uniform), a description of the data used, and if the retrieval is complete, a summary of the best fit parameters and model evidence.

Args:
statsdict

A Pymultinest stats dictionary, from Analyzer.get_stats(). This contains the evidence and best fit parameters.

setup_data(scaling=10, width=3)

Creates a pRT object for each data set that asks for a unique object. Checks if there are low resolution c-k models from exo-k, and creates them if necessary. The scaling and width parameters adjust the AMR grid as described in RetrievalConfig.setup_pres and models.fixed_length_amr. It is recommended to keep the defaults.

Args:
scalingint

A multiplicative factor that determines the size of the full high resolution pressure grid, which will have length self.p_global.shape[0] * scaling.

widthint

The number of cells in the low pressure grid to replace with the high resolution grid.

_error_check_model_function()
prior(cube, ndim=0, nparams=0)

pyMultinest Prior function. Transforms unit hypercube into physical space.

prior_ultranest(cube)

Ultranest prior function. Transforms unit hypercube into physical space.
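Both prior functions map a point in the unit hypercube onto physical parameter values. The sketch below illustrates that idea in isolation, assuming one independent transform per parameter; the helper names (uniform, log_uniform, prior_transform) and the two example parameters are hypothetical and are not the functions used inside petitRADTRANS.

```python
import math


def uniform(u, low, high):
    """Map u in [0, 1] onto a uniform prior between low and high."""
    return low + u * (high - low)


def log_uniform(u, low, high):
    """Map u in [0, 1] onto a prior that is uniform in log10 between low and high."""
    return 10.0 ** uniform(u, math.log10(low), math.log10(high))


def prior_transform(cube, transforms):
    """Apply one transform per dimension of the unit hypercube."""
    return [transform(u) for u, transform in zip(cube, transforms)]


# Hypothetical two-parameter example: a temperature and a mass fraction.
transforms = [
    lambda u: uniform(u, 500.0, 1500.0),
    lambda u: log_uniform(u, 1e-6, 1e-2),
]
physical = prior_transform([0.5, 0.5], transforms)
```

The centre of the hypercube lands on the midpoint of the uniform prior and on the geometric mean of the log-uniform prior.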

log_likelihood(cube, ndim=0, nparam=0, logL_per_datapoint_dict=None)

pyMultiNest required likelihood function.

This function wraps the model computation and log-likelihood calculations for pyMultiNest to sample. If PT_plot_mode is True, it will return only the pressure and temperature arrays rather than the wavelength and flux. If run_mode is ‘evaluate’, it will save the provided sample to the best-fit spectrum file, and add it to the best_fit_specs dictionary. If evaluate_sample_spectra is true, it will store the spectrum in posterior_sample_specs.

Args:
cubenumpy.ndarray

The transformed unit hypercube, providing the parameter values to be passed to the model_generating_function.

ndimint

The number of dimensions of the problem

nparamint

The number of parameters in the fit.

logL_per_datapoint_dictdict

Dictionary with instrument-entries. If provided, log likelihood per datapoint is appended to existing list.

Returns:
log_likelihoodfloat

The (negative) log likelihood of the model given the data.
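For orientation, the quantity a nested sampler maximises can be sketched for the simplest case of independent Gaussian uncertainties. This is an illustrative assumption: the likelihood actually evaluated by the retrieval may include further terms (for example covariances or scaling factors) that are not described here, and gaussian_log_likelihood is a hypothetical name.

```python
import math


def gaussian_log_likelihood(data, model, sigma):
    """Log-likelihood for independent Gaussian errors.

    Returns -0.5 * sum(((d - m) / s)**2 + ln(2 * pi * s**2)).
    """
    log_l = 0.0
    for d, m, s in zip(data, model, sigma):
        log_l += -0.5 * (((d - m) / s) ** 2 + math.log(2.0 * math.pi * s ** 2))
    return log_l


data = [1.0, 2.0, 3.0]
sigma = [0.1, 0.1, 0.1]
good = gaussian_log_likelihood(data, [1.0, 2.0, 3.0], sigma)  # model matches the data
bad = gaussian_log_likelihood(data, [1.5, 2.5, 3.5], sigma)   # model offset by 5 sigma
```

The difference between the two values is half the chi-squared of the offset model, because the normalisation term depends only on the uncertainties.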

save_best_fit_outputs(parameters)
static _get_samples(ultranest, names, output_dir=None, ret_names=None)
get_samples(output_dir=None, ret_names=None)

This function looks in the given output directory and finds the post_equal_weights file associated with the current retrieval name.

Args:
output_dirstr

Parent directory of the out_PMN/RETRIEVALNAME_post_equal_weights.dat file

ret_namesList(str)

A list of retrieval names to add to the sample and parameter dictionary. Functions the same as setting corner_plot_names during initialisation.

Returns:
sample_dictdict

A dictionary with keys being the name of the retrieval, and values are a numpy ndarray containing the samples in the post_equal_weights file

parameter_dictdict

A dictionary with keys being the name of the retrieval, and values are a list of names of the parameters used in the retrieval. The first name corresponds to the first column of the samples, and so on.

get_max_likelihood_params(best_fit_params, parameters_read)

This function takes the sample with the maximum log likelihood from the post_equal_weights file and converts it into a dictionary of Parameters that can be used in a model function.

Args:
best_fit_paramsnumpy.ndarray

An array of the best fit parameter values (or any other sample)

parameters_readlist

A list of the free parameter names as read from the output files.

get_median_params(samples, parameters_read, return_array=False)

This function builds a parameter dictionary based on the median value of each parameter. This will update the best_fit_parameter dictionary!

Args:
samplesnumpy.ndarray

An array of posterior samples, from which the median value of each parameter is computed.

parameters_readlist

A list of the free parameter names as read from the output files.
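The median-based dictionary can be illustrated with plain Python. This is a sketch under the assumption that each row of the samples holds one value per free parameter, in the same order as the names; median_parameters is a hypothetical helper and returns plain floats rather than Parameter objects.

```python
from statistics import median


def median_parameters(samples, parameter_names):
    """Return {name: median value} from rows of posterior samples.

    Columns beyond the named parameters (for example a trailing
    log-likelihood column) are ignored because zip stops at the shortest input.
    """
    columns = list(zip(*samples))
    return {name: median(col) for name, col in zip(parameter_names, columns)}


# Hypothetical samples of two free parameters.
samples = [
    [900.0, -3.0],
    [1000.0, -3.5],
    [1100.0, -4.0],
]
medians = median_parameters(samples, ["temperature", "log_H2O"])
```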

get_full_range_model(parameters, model_generating_function=None, ret_name=None, contribution=False, pRT_object=None, pRT_reference=None)

Retrieve a full wavelength range model based on the given parameters.

Parameters:

parameters (dict): A dictionary containing parameters used to generate the model.

model_generating_function (callable, optional): A function to generate the model. Defaults to None.

ret_name (str, optional): Name of the model to be returned. Currently unused and marked for removal. Defaults to None.

contribution (bool, optional): Return the emission or transmission contribution function. Defaults to False.

pRT_object (object, optional): RadTrans object for calculating the spectrum. Defaults to None.

pRT_reference (object, optional): Reference Data object for calculating the spectrum. Defaults to None.

Returns:

object: The generated full range model.

get_best_fit_model(best_fit_params, parameters_read, ret_name=None, contribution=False, pRT_reference=None, model_generating_function=None, refresh=True, mode='bestfit')

This function uses the best fit parameters to generate a pRT model that spans the entire wavelength range of the retrieval, to be used in plots.

Args:
best_fit_paramsnumpy.ndarray

A numpy array containing the best fit parameters, to be passed to get_max_likelihood_params

parameters_readlist

A list of the free parameters as read from the output files.

ret_namestr

If plotting a fit from a different retrieval, input the retrieval name to be included.

contributionbool

If True, calculate the emission or transmission contribution function as well as the spectrum.

pRT_referencestr

If specified, the pRT object of the data with name pRT_reference will be used for plotting, instead of generating a new pRT object at R = 1000.

model_generating_function(callable, optional):

A function that returns the wavelength and spectrum, and takes a pRT_Object and the current set of parameters stored in self.parameters. This should be the same model function used in the retrieval.

refreshbool

If True (default value) the .npy files in the evaluate_[retrieval_name] folder will be replaced by recalculating the best fit model. This is useful if plotting intermediate results from a retrieval that is still running. If False no new spectrum will be calculated and the plot will be generated from the .npy files in the evaluate_[retrieval_name] folder.

modestr

If “bestfit”, will use the maximum likelihood parameter values to calculate the best fit model and contribution. If “median”, uses the median parameter values.

Returns:
bf_wlennumpy.ndarray

The wavelength array of the best fit model

bf_spectrumnumpy.ndarray

The emission or transmission spectrum array, with the same shape as bf_wlen

get_mass_fractions(sample, parameters_read=None)

This function returns the mass fraction abundances of each species as a function of pressure

Args:
samplenumpy.ndarray

A sample from the pymultinest output, the abundances returned will be computed for this set of parameters.

parameters_readlist

A list of the free parameters as read from the output files.

Returns:
abundancesdict

A dictionary of abundances. The keys are the species name, the values are the mass fraction abundances at each pressure

MMWnumpy.ndarray

The mean molecular weight at each pressure level in the atmosphere.

get_volume_mixing_ratios(sample, parameters_read=None)

This function returns the volume mixing ratios (VMRs) of each species as a function of pressure

Args:
samplenumpy.ndarray

A sample from the pymultinest output, the abundances returned will be computed for this set of parameters.

parameters_readlist

A list of the free parameters as read from the output files.

Returns:
vmrdict

A dictionary of abundances. The keys are the species names, the values are the volume mixing ratios at each pressure

MMWnumpy.ndarray

The mean molecular weight at each pressure level in the atmosphere.
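Mass fractions and volume mixing ratios are related through the mean molecular weight. The sketch below shows the standard conversion at a single pressure level, assuming the mass fractions sum to one; the function name and the rounded molar masses are illustrative and are not taken from petitRADTRANS.

```python
def mass_fractions_to_vmr(mass_fractions, molar_masses):
    """Convert mass fractions X_i to volume mixing ratios at one pressure level.

    Uses VMR_i = X_i * MMW / mu_i, with MMW = 1 / sum(X_i / mu_i).
    Assumes the mass fractions sum to one.
    """
    mmw = 1.0 / sum(x / molar_masses[name] for name, x in mass_fractions.items())
    vmr = {name: x * mmw / molar_masses[name] for name, x in mass_fractions.items()}
    return vmr, mmw


# Illustrative two-species atmosphere with rounded molar masses.
vmr, mmw = mass_fractions_to_vmr(
    {"H2": 0.5, "He": 0.5},
    {"H2": 2.0, "He": 4.0},
)
```

Lighter species end up with a larger volume mixing ratio than their mass fraction, and the volume mixing ratios again sum to one.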

save_volume_mixing_ratios(sample_dict, parameter_dict, rets=None)

Save volume mixing ratios (VMRs) and line absorber species information for specified retrievals.

Parameters:

- self: The instance of the class containing the function.
- sample_dict (dict): A dictionary mapping retrieval names to lists of samples.
- parameter_dict (dict): A dictionary mapping retrieval names to parameter values.
- rets (list, optional): List of retrieval names to process. If None, uses the default retrieval name.

Returns:

- vmrs (numpy.ndarray): Array containing volume mixing ratios for each sample and species.

The function processes the specified retrievals and saves the corresponding VMRs and line absorber species information to files in the output directory. If ‘rets’ is not provided, the default retrieval name is used. The VMRs are saved in a numpy file, and the line absorber species are saved in a JSON file.

Example usage:

`sample_dict = {'Retrieval1': [...], 'Retrieval2': [...]}`
`parameter_dict = {'Retrieval1': {...}, 'Retrieval2': {...}}`
`vmrs = save_volume_mixing_ratios(sample_dict, parameter_dict)`

save_mass_fractions(sample_dict, parameter_dict, rets=None)

Save mass fractions and line absorber species information for specified retrievals.

Parameters:

- self: The instance of the class containing the function.
- sample_dict (dict): A dictionary mapping retrieval names to lists of samples.
- parameter_dict (dict): A dictionary mapping retrieval names to parameter values.
- rets (list, optional): List of retrieval names to process. If None, uses the default retrieval name.

Returns:

- mass_fractions (numpy.ndarray): Array containing mass fractions for each sample and species.

The function processes the specified retrievals and saves the corresponding mass fractions and line absorber species information to files in the output directory. If ‘rets’ is not provided, the default retrieval name is used. The mass fractions are saved in a numpy file, and the line absorber species are saved in a JSON file.

Example usage:

`sample_dict = {'Retrieval1': [...], 'Retrieval2': [...]}`
`parameter_dict = {'Retrieval1': {...}, 'Retrieval2': {...}}`
`mass_fractions = save_mass_fractions(sample_dict, parameter_dict)`

get_evidence(ret_name='')

Get the log10 Z and error for the retrieval

This function uses the pymultinest analyzer to get the evidence for the current retrieval_name by default, though any retrieval_name in the out_PMN folder can be passed as an argument - useful for when you’re comparing multiple similar models. This value is also printed in the summary file.

Args:
ret_namestring

The name of the retrieval that prepends all the PMN output files.
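When two retrievals are compared, their log10 evidences are usually combined into a Bayes factor. The following sketch assumes the two evidence estimates are independent, so their errors add in quadrature; log10_bayes_factor is a hypothetical helper and the numbers are invented for illustration.

```python
import math


def log10_bayes_factor(log10_z_a, err_a, log10_z_b, err_b):
    """Return the log10 Bayes factor of model A over model B and its uncertainty.

    Assumes the two evidence estimates are independent.
    """
    delta = log10_z_a - log10_z_b
    err = math.hypot(err_a, err_b)
    return delta, err


# Invented evidences for a model with (A) and without (B) a given species.
delta, err = log10_bayes_factor(-100.0, 0.3, -103.0, 0.4)
bayes_factor = 10.0 ** delta
```

A positive log10 Bayes factor favours model A; whether the difference is significant depends on its size relative to the combined uncertainty.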

static get_best_fit_likelihood(samples)

Get the log likelihood of the best fit model

Args:
samplesnumpy.ndarray

An array of samples and likelihoods taken from a post_equal_weights file
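Selecting the best fit from an array of samples and likelihoods amounts to finding the row with the highest log-likelihood. The sketch below assumes the log-likelihood is stored in the last column of each row, as in MultiNest's post_equal_weights output; best_fit_row is a hypothetical helper and the sample values are invented.

```python
def best_fit_row(samples):
    """Return (index, row) of the sample with the highest log-likelihood.

    Assumes the log-likelihood is stored in the last column of each row.
    """
    index = max(range(len(samples)), key=lambda i: samples[i][-1])
    return index, samples[index]


# Invented rows: two parameter values followed by the log-likelihood.
samples = [
    [950.0, -3.2, -12.4],
    [1010.0, -3.5, -10.1],
    [1080.0, -3.9, -11.7],
]
index, row = best_fit_row(samples)
best_fit_params, best_log_l = row[:-1], row[-1]
```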

get_best_fit_chi2(samples)

Get the 𝛘^2 of the best fit model - removing normalization term from log L

Args:
samplesnumpy.ndarray

An array of samples and likelihoods taken from a post_equal_weights file

get_log_likelihood_per_datapoint(samples_use, ret_name=None)
get_elpd_per_datapoint(ret_name=None)
get_chi2(sample)

Get the 𝛘^2 of the given sample relative to the data - removing normalization term from log L

Args:
samplenumpy.ndarray

A single sample and likelihood taken from a post_equal_weights file

get_chi2_normalisation(sample)

Get the 𝛘^2 normalization term from log L

Args:
samplenumpy.ndarray

A single sample and likelihood taken from a post_equal_weights file

get_reduced_chi2(sample, subtract_n_parameters=False)

Get the 𝛘^2/DoF of the given model - divide chi^2 by DoF or number of wavelength channels.

Args:
samplenumpy.ndarray

A single sample and likelihoods taken from a post_equal_weights file

subtract_n_parametersbool

If True, divide the Chi2 by the degrees of freedom (n_data - n_parameters). If False, divide only by n_data
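The relation between the log-likelihood, the normalisation term and the reduced chi-squared can be sketched as follows. The sign convention (log L = -0.5 * chi^2 + normalisation) is an assumption made for this example and may differ from the one used by get_chi2_normalisation; both function names are hypothetical.

```python
def chi2_from_log_likelihood(log_l, normalisation):
    """Recover chi^2, assuming log L = -0.5 * chi^2 + normalisation."""
    return -2.0 * (log_l - normalisation)


def reduced_chi2(chi2, n_data, n_parameters=0, subtract_n_parameters=False):
    """Divide chi^2 by n_data, or by n_data - n_parameters if requested."""
    dof = n_data - n_parameters if subtract_n_parameters else n_data
    if dof <= 0:
        raise ValueError("need more data points than free parameters")
    return chi2 / dof


# Invented numbers: 100 wavelength channels and 20 free parameters.
chi2 = chi2_from_log_likelihood(log_l=-60.0, normalisation=-10.0)
per_channel = reduced_chi2(chi2, n_data=100)
per_dof = reduced_chi2(chi2, n_data=100, n_parameters=20, subtract_n_parameters=True)
```

Dividing by the degrees of freedom always gives a value at least as large as dividing by the number of channels, so it matters which of the two is quoted.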

get_reduced_chi2_from_model(wlen_model, spectrum_model, subtract_n_parameters=False)

Get the 𝛘^2/DoF of the supplied spectrum - divide chi^2 by DoF

Args:
wlen_modelnp.ndarray

The wavelength grid of the model spectrum in micron.

spectrum_modelnp.ndarray

The model flux in the same units as the data.

subtract_n_parametersbool

If True, divide the Chi2 by the degrees of freedom (n_data - n_parameters). If False, divide only by n_data

get_analyzer(ret_name='')

Get the PMN analyzer from a retrieval run

This function gets the PMN analyzer object for the current retrieval_name by default, though any retrieval_name in the out_PMN folder can be passed as an argument - useful for when you’re comparing multiple similar models.

Args:
ret_namestring

The name of the retrieval that prepends all the PMN output files.

build_param_dict(sample, free_param_names)

This function builds a dictionary of parameters that can be passed to the model building functions. It requires a numpy array with the same length as the number of free parameters, and a list of all the parameter names in the order they appear in the array. The returned dictionary will contain all of these parameters, together with the fixed retrieval parameters.

Args:
samplenumpy.ndarray

An array or list of free parameter values

free_param_nameslist(string)

A list of names for each of the free parameters.

Returns:
paramsdict

A dictionary of Parameters, with values set to the values in sample.
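The merging of free and fixed parameters can be illustrated with a stand-in parameter type. Everything named here (SimpleParameter, build_param_dict_sketch, the fixed_params argument and the example parameter names) is hypothetical; the real method uses the retrieval's own Parameter objects and takes the fixed parameters from the run definition.

```python
from dataclasses import dataclass


@dataclass
class SimpleParameter:
    """Hypothetical stand-in for a retrieval Parameter object."""
    name: str
    value: float
    is_free: bool


def build_param_dict_sketch(sample, free_param_names, fixed_params):
    """Combine free parameter values from a sample with fixed parameters."""
    if len(sample) != len(free_param_names):
        raise ValueError("sample and free_param_names must have the same length")
    # Start from the fixed parameters, then add the free ones in sample order.
    params = {
        name: SimpleParameter(name, value, False)
        for name, value in fixed_params.items()
    }
    for name, value in zip(free_param_names, sample):
        params[name] = SimpleParameter(name, value, True)
    return params


params = build_param_dict_sketch(
    [1200.0, -3.5],
    ["temperature", "log_H2O"],
    {"distance": 10.0},
)
```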

sample_teff(sample_dict, param_dict, ret_names=None, nsample=None, resolution=40)

This function samples the outputs of a retrieval and computes Teff for each sample. For each sample, a model is computed at low resolution, and integrated to find the total radiant emittance, which is converted into a temperature using the Stefan-Boltzmann law: $j^{\star} = \sigma T^{4}$. Teff itself is computed using util.calc_teff.

Args:
sample_dictdict

A dictionary, where each key is the name of a retrieval, and the values are the equal weighted samples.

param_dictdict

A dictionary where each key is the name of a retrieval, and the values are the names of the free parameters associated with that retrieval.

ret_namesOptional(list(string))

A list of retrieval names, each should be included in the sample_dict. If left as none, it defaults to only using the current retrieval name.

nsampleOptional(int)

The number of times to compute Teff. If left empty, uses the “take_PTs_from” plot_kwarg. Recommended to use ~300 samples, probably more than is set in the kwarg!

resolutionint

The spectral resolution at which to compute the models. Typically, this should be very low in order to enable rapid calculation.

Returns:
tdictdict

A dictionary with retrieval names for keys, and the values are the calculated values of Teff for each sample.
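The conversion from an integrated spectrum to an effective temperature can be sketched independently of the library. This example assumes SI units (wavelength in m, flux density per unit wavelength in W m^-2 m^-1) and a simple trapezoidal integration; it is not the implementation of util.calc_teff, and the function names are hypothetical.

```python
STEFAN_BOLTZMANN = 5.670374419e-8  # W m^-2 K^-4


def trapezoid(x, y):
    """Integrate y over x with the trapezoidal rule."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1))


def effective_temperature(wavelength, flux):
    """Teff from the Stefan-Boltzmann law, given a spectrum in SI units."""
    total_emittance = trapezoid(wavelength, flux)
    return (total_emittance / STEFAN_BOLTZMANN) ** 0.25


# Artificial flat spectrum whose integral equals sigma * (1000 K)^4.
wavelength = [1e-6 + i * 9e-7 for i in range(11)]
level = STEFAN_BOLTZMANN * 1000.0 ** 4 / (wavelength[-1] - wavelength[0])
flux = [level] * len(wavelength)
teff = effective_temperature(wavelength, flux)
```

A spectrum that does not cover the full wavelength range carrying significant flux will underestimate the integral, and therefore Teff.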

plot_all(output_dir=None, ret_names=None, contribution=False, model_generating_function=None, pRT_reference=None, mode='bestfit')

Produces plots for the best fit spectrum, a sample of 100 output spectra, the best fit PT profile and a corner plot for parameters specified in the run definition.

By default, this runs the following functions:

plot_spectra: Plots the best fit spectrum together with the data, with an extra panel showing the residuals between the model and data.

plot_PT: Plots the pressure-temperature profile contours.

plot_corner: Corner plot based on the posterior sample distributions.

plot_abundances: Abundance profiles for each line species used.

If contribution = True:

plot_contribution: The emission or transmission contribution function. In addition to plotting the contribution function, the contribution will also be overlaid on top of the PT profiles and abundance profiles.

If self.evaluate_sample_spectra = True:

plot_sampled: Randomly draws N samples from the posterior distribution, and plots the resulting spectrum overtop the data.

Args:
output_dir: string

Output directory to store the plots. Defaults to self.output_dir.

ret_nameslist(str)

List of retrieval names. Used if multiple retrievals are to be included in a single corner plot.

contributionbool

If true, plot the emission or transmission contribution function.

pRT_referencestr

If specified, the pRT object of the data with name pRT_reference will be used for plotting, instead of generating a new pRT object at R = 1000.

model_generating_function(callable, optional):

A function that returns the wavelength and spectrum, and takes a pRT_Object and the current set of parameters stored in self.parameters. This should be the same model function used in the retrieval.

modestr

If ‘bestfit’, use the maximum likelihood sample for plotting. If ‘median’, calculate the model based on the median retrieved parameters.

plot_spectra(samples_use, parameters_read, model_generating_function=None, pRT_reference=None, refresh=True, mode='bestfit', marker_color_type=None, marker_cmap=plt.cm.bwr, marker_label='')

Plot the best fit spectrum, the data from each dataset and the residuals between the two. Saves a file to OUTPUT_DIR/evaluate_RETRIEVAL_NAME/RETRIEVAL_NAME_MODE_spec.pdf

Args:
samples_usenumpy.ndarray

An array of the samples from the post_equal_weights file, used to find the best fit sample

parameters_readlist

A list of the free parameters as read from the output files.

model_generating_functionmethod

A function that will take in the standard ‘model’ arguments (pRT_object, params, pt_plot_mode, AMR, resolution) and will return the wavelength and flux arrays as calculated by petitRADTRANS. If no argument is given, it uses the method of the first dataset included in the retrieval.

pRT_referencestr

If specified, the pRT object of the data with name pRT_reference will be used for plotting, instead of generating a new pRT object at R = 1000.

model_generating_function(callable, optional):

A function that returns the wavelength and spectrum, and takes a pRT_Object and the current set of parameters stored in self.parameters. This should be the same model function used in the retrieval.

refreshbool

If True (default value) the .npy files in the evaluate_[retrieval_name] folder will be replaced by recalculating the best fit model. This is useful if plotting intermediate results from a retrieval that is still running. If False no new spectrum will be calculated and the plot will be generated from the .npy files in the evaluate_[retrieval_name] folder.

modestr

Use ‘bestfit’ (maximum likelihood) parameters, or ‘median’ parameter values.

marker_color_typestr

Data-attribute to plot as marker colors. Use ‘delta_elpd’, ‘elpd’, or ‘pareto_k’.

marker_cmapmatplotlib colormap

Colormap to use for marker colors.

marker_labelstr

Label to add to colorbar corresponding to marker colors.

Returns:
figmatplotlib.figure

The matplotlib figure, containing the data, best fit spectrum and residuals.

axmatplotlib.axes

The upper pane of the plot, containing the best fit spectrum and data

ax_rmatplotlib.axes

The lower pane of the plot, containing the residuals between the fit and the data

plot_sampled(samples_use, parameters_read, downsample_factor=None, save_outputs=False, nsample=None, model_generating_function=None, pRT_reference=None, refresh=True)

Plot a set of randomly sampled output spectra for each dataset in the retrieval.

This will save nsample files for each dataset included in the retrieval. Note that if you change the model_resolution of your Data and rerun this function, the files will NOT be updated: if the files exist, the function defaults to reading from file rather than recomputing. To force a recalculation, delete all of the sample files and run the function again.

Args:
samples_usenp.ndarray

posterior samples from pymultinest outputs (post_equal_weights)

parameters_readlist(str)

list of free parameters as read from the output files.

downsample_factorint

Factor by which to reduce the resolution of the sampled model, for smoother plotting. Defaults to None. A value of None will result in the full resolution spectrum. Note that this factor can only reduce the resolution from the underlying model_resolution of the data.

nsampleint

Number of samples to draw from the posterior distribution. Defaults to the value of self.rd.plot_kwargs[“nsample”].

save_outputsbool

If true, saves each calculated spectrum as a .npy file. The name of the file indicates the index from the post_equal_weights file that was used to generate the sample.

pRT_referencestr

If specified, the pRT object of the data with name pRT_reference will be used for plotting, instead of generating a new pRT object at R = 1000.

model_generating_function(callable, optional):

A function that returns the wavelength and spectrum, and takes a pRT_Object and the current set of parameters stored in self.parameters. This should be the same model function used in the retrieval.

refreshbool

If True (default value) the .npy files in the evaluate_[retrieval_name] folder will be replaced by recalculating the best fit model. This is useful if plotting intermediate results from a retrieval that is still running. If False no new spectrum will be calculated and the plot will be generated from the .npy files in the evaluate_[retrieval_name] folder.
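Drawing a random subset of the posterior can be sketched with the standard library. This example assumes rows are drawn without replacement from equally weighted samples, which is not stated above; draw_sample_indices and its seed argument are hypothetical.

```python
import random


def draw_sample_indices(n_posterior, nsample, seed=None):
    """Pick distinct row indices from n_posterior equally weighted samples.

    Returns at most n_posterior indices; sampling is without replacement.
    """
    rng = random.Random(seed)
    return rng.sample(range(n_posterior), min(nsample, n_posterior))


# Indices of 100 rows to evaluate from a file with 5000 posterior samples.
indices = draw_sample_indices(n_posterior=5000, nsample=100, seed=42)
```

Recording the drawn indices matches the file naming described for save_outputs, where each file name carries the index of the sample it was generated from.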

plot_PT(sample_dict, parameters_read, contribution=False, refresh=False, model_generating_function=None, pRT_reference=None, mode='bestfit')

Plot the PT profile with error contours

Args:
sample_dictnp.ndarray

posterior samples from pymultinest outputs (post_equal_weights)

parameters_readList

List of free parameters as read from the output file.

contributionbool

Weight the opacity of the PT profile by the emission contribution function, and overplot the contribution curve.

refreshbool

If True, the .npy files in the evaluate_[retrieval_name] folder will be replaced by recalculating the best fit model. This is useful if plotting intermediate results from a retrieval that is still running. If False (the default for this method), no new spectrum will be calculated and the plot will be generated from the .npy files in the evaluate_[retrieval_name] folder.

pRT_referencestr

If specified, the pRT object of the data with name pRT_reference will be used for plotting, instead of generating a new pRT object at R = 1000.

model_generating_function(callable, optional):

A function that returns the wavelength and spectrum, and takes a pRT_Object and the current set of parameters stored in self.parameters. This should be the same model function used in the retrieval.

modestr

‘bestfit’ or ‘median’, indicating which set of values should be used to calculate the contribution function.

Returns:

fig : matplotlib.figure

ax : matplotlib.axes

plot_corner(sample_dict, parameter_dict, parameters_read, plot_best_fit=True, true_values=None, **kwargs)

Make the corner plots

Args:
sample_dictDict

Dictionary of samples from PMN outputs, with keys being retrieval names

parameter_dictDict

Dictionary of parameters for each of the retrievals to be plotted.

parameters_readList

Used to plot correct parameters, as some in self.parameters are not free, and aren’t included in the PMN outputs

plot_best_fitbool

If true, plot vertical lines to indicate the maximum likelihood parameter values.

true_valuesnp.ndarray

An array of values for each plotted parameter, where a vertical line will be plotted for each value. Can be used to indicate true values if retrieving on synthetic data, or to overplot additional measurements.

kwargsdict

Each kwarg can be one of the kwargs used in corner.corner. These can be used to adjust the title_kwargs, label_kwargs, hist_kwargs, hist2d_kwargs or the contour kwargs. Each kwarg must be a dictionary with the arguments as keys and values as the values.

plot_data(yscale='linear')

Plot the data used in the retrieval.

plot_contribution(samples_use, parameters_read, model_generating_function=None, pRT_reference=None, log_scale_contribution=False, n_contour_levels=30, refresh=True, mode='bestfit')

Plot the contribution function of the bestfit or median model from a retrieval. This plot indicates the relative contribution from each wavelength and each pressure level in the atmosphere to the spectrum.

Args:
samples_usenumpy.ndarray

An array of the samples from the post_equal_weights file, used to find the best fit sample

parameters_readlist

A list of the free parameters as read from the output files.

pRT_referencestr

If specified, the pRT object of the data with name pRT_reference will be used for plotting, instead of generating a new pRT object at R = 1000.

model_generating_function(callable, optional):

A function that returns the wavelength and spectrum, and takes a pRT_Object and the current set of parameters stored in self.parameters. This should be the same model function used in the retrieval.

log_scale_contributionbool

If true, take the log10 of the contribution function to visualise faint features.

n_contour_levelsint

Number of contour levels to pass to the matplotlib contourf function.

refreshbool

If True (default value) the .npy files in the evaluate_[retrieval_name] folder will be replaced by recalculating the best fit model. This is useful if plotting intermediate results from a retrieval that is still running. If False no new spectrum will be calculated and the plot will be generated from the .npy files in the evaluate_[retrieval_name] folder.

modestr

‘bestfit’ or ‘median’, indicating which set of values should be used to calculate the contribution function.

Returns:
figmatplotlib.figure

The matplotlib figure, containing the data, best fit spectrum and residuals.

axmatplotlib.axes

The upper pane of the plot, containing the best fit spectrum and data

ax_rmatplotlib.axes

The lower pane of the plot, containing the residuals between the fit and the data

plot_abundances(samples_use, parameters_read, species_to_plot=None, contribution=False, refresh=True, model_generating_function=None, pRT_reference=None, mode='bestfit', sample_posteriors=False, volume_mixing_ratio=False)

Plot the abundance profiles in mass fractions or volume mixing ratios as a function of pressure.

Args:
samples_usenumpy.ndarray

An array of the samples from the post_equal_weights file, used to find the best fit sample

parameters_readlist

A list of the free parameters as read from the output files.

species_to_plotlist

A list of which molecular species to include in the plot.

contributionbool

If true, overplot the emission or transmission contribution function.

pRT_referencestr

If specified, the pRT object of the data with name pRT_reference will be used for plotting, instead of generating a new pRT object at R = 1000.

model_generating_function(callable, optional):

A function that returns the wavelength and spectrum, and takes a pRT_Object and the current set of parameters stored in self.parameters. This should be the same model function used in the retrieval.

refreshbool

If True (default value) the .npy files in the evaluate_[retrieval_name] folder will be replaced by recalculating the best fit model. This is useful if plotting intermediate results from a retrieval that is still running. If False no new spectrum will be calculated and the plot will be generated from the .npy files in the evaluate_[retrieval_name] folder.

modestr

‘bestfit’ or ‘median’, indicating which set of values should be used for plotting the abundances.

sample_posteriorsbool

If true, sample the posterior distributions to calculate confidence intervals for the retrieved abundance profiles.

volume_mixing_ratiobool

If true, plot in units of volume mixing ratio (number fraction) instead of mass fractions.

Returns:
figmatplotlib.figure

The matplotlib figure, containing the data, best fit spectrum and residuals.

axmatplotlib.axes

The upper pane of the plot, containing the best fit spectrum and data

ax_rmatplotlib.axes

The lower pane of the plot, containing the residuals between the fit and the data
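The difference between the two unit choices controlled by volume_mixing_ratio can be sketched as follows. This is an illustrative conversion only, not the code used by plot_abundances; the function name mass_fractions_to_vmr and the example composition are assumptions made for this sketch.

```python
def mass_fractions_to_vmr(mass_fractions, molar_masses):
    """Convert mass fractions X_i to volume mixing ratios n_i.

    Uses n_i = X_i * MMW / mu_i, with MMW = 1 / sum_i(X_i / mu_i).
    """
    inverse_mmw = sum(x / molar_masses[s] for s, x in mass_fractions.items())
    mmw = 1.0 / inverse_mmw  # mean molecular weight of the mixture
    return {s: x * mmw / molar_masses[s] for s, x in mass_fractions.items()}


# Hypothetical H2/He/H2O mixture (values chosen for illustration only).
molar_masses = {"H2": 2.016, "He": 4.003, "H2O": 18.015}
mass_fractions = {"H2": 0.74, "He": 0.25, "H2O": 0.01}

vmr = mass_fractions_to_vmr(mass_fractions, molar_masses)
```

Heavy species such as H2O end up with a volume mixing ratio smaller than their mass fraction, because each molecule carries more mass than the average particle in the mixture.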

class petitRADTRANS.retrieval.RetrievalConfig(retrieval_name='retrieval_name', run_mode='retrieval', AMR=False, scattering=False, distribution='lognormal', pressures=None, write_out_spec_sample=False)

The RetrievalConfig class contains all of the data and model level information necessary to run a petitRADTRANS retrieval. The retrieval_name given to the class will be used to name outputs. This class is passed to the Retrieval, which runs the actual pymultinest retrieval and produces the outputs.

The general usage of this class is to define it, add the parameters and their priors, add the opacity sources, the data together with a model for each dataset, and then configure a few plotting arguments.

Args:
retrieval_namestr

Name of this retrieval. Make it informative so that you can keep track of the outputs!

run_modestr

Can be either ‘retrieval’, which runs the retrieval normally using pymultinest, or ‘evaluate’, which produces plots from the best fit parameters stored in the output post_equal_weights file.

AMRbool

Use an adaptive high resolution pressure grid around the location of cloud condensation. This will increase the size of the pressure grid by a constant factor that can be adjusted in the setup_pres function.

scatteringbool

If using emission spectra, turn scattering on or off.

pressuresnumpy.array

A log-spaced array of pressures over which to retrieve. 100 points is standard, between 10^-6 and 10^3.
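A log-spaced grid like the one described for pressures could be built as in the sketch below. The helper name log_spaced_pressures is an assumption; it uses only the standard library, and with NumPy installed, numpy.logspace(-6, 3, 100) gives the same values as an array.

```python
def log_spaced_pressures(log_p_min=-6.0, log_p_max=3.0, n_points=100):
    """Return n_points pressures spaced evenly in log10 between the two bounds."""
    step = (log_p_max - log_p_min) / (n_points - 1)
    return [10.0 ** (log_p_min + i * step) for i in range(n_points)]


# The standard grid from the description: 100 points from 10^-6 to 10^3.
pressures = log_spaced_pressures()
```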

_plot_defaults()
_setup_pres(scaling=10, width=3)

This converts the standard pressure grid into the correct length for the AMR pressure grid. The scaling adjusts the resolution of the high resolution grid, while the width determines the size of the high resolution region. This function is automatically called in Retrieval.setupData().

Args:
scalingint

A multiplicative factor that determines the size of the full high resolution pressure grid, which will have length self.p_global.shape[0] * scaling.

widthint

The number of cells in the low resolution pressure grid to replace with the high resolution grid.

add_parameter(name, free, value=None, transform_prior_cube_coordinate=None)

This function adds a Parameter (see parameter.py) to the dictionary of parameters. A Parameter has a name and a boolean parameter to set whether it is a free or fixed parameter during the retrieval. In addition, a value can be set, or a prior function can be given that transforms a random variable in [0,1] to the physical dimensions of the Parameter.

Args:
namestr

The name of the parameter. Must match the name used in the model function for the retrieval.

freebool

True if the parameter is a free parameter in the retrieval, false if it is fixed.

valuefloat

The value of the parameter in the units used by the model function.

transform_prior_cube_coordinatemethod

A function that transforms the unit interval to the physical units of the parameter. Typically given as a lambda function.
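A prior transform of the kind described here can be sketched as below. The helper uniform_prior, the parameter names 'log_g' and 'D_pl', the example values, and the instance name config are all assumptions for illustration; only the add_parameter signature is taken from the description above.

```python
def uniform_prior(lower, upper):
    """Return a function mapping x in [0, 1] onto a uniform prior on [lower, upper]."""
    def transform(x):
        return lower + (upper - lower) * x
    return transform


# A uniform prior between 2.0 and 5.5 for a hypothetical 'log_g' parameter.
log_g_prior = uniform_prior(2.0, 5.5)

# Hypothetical usage, assuming `config` is a RetrievalConfig instance:
# config.add_parameter('log_g', True, transform_prior_cube_coordinate=log_g_prior)
# config.add_parameter('D_pl', False, value=3.0857e19)  # fixed parameter, value only
```

The same thing is often written inline as a lambda, for example lambda x: 2.0 + 3.5 * x.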

list_available_line_species()

List the currently installed opacity tables that are available for species that contribute to the line opacity.

list_available_cloud_species()

List the currently installed opacity tables that are available for cloud species.

list_available_cia_species()

List the currently installed opacity tables that are available for CIA species.

set_line_species(linelist, eq=False, abund_lim=(-6.0, -0.5))

Set RadTrans.line_species

This function adds a list of species to the pRT object that will define the line opacities of the model. The values in the list are strings, with the names matching the pRT opacity names, which vary between the c-k line opacities and the line-by-line opacities.

NOTE: As of pRT version 2.4.9, the behaviour of this function has changed. In previous versions the abundance limits were set from abund_lim[0] to (abund_lim[0] + abund_lim[1]). This has been changed so that the limits of the prior range are from abund_lim[0] to abund_lim[1] (ie the actual boundaries).

Args:
linelistList(str)

The list of species to include in the retrieval

eqbool

If false, the retrieval should use free chemistry, and Parameters for the abundance of each species in the linelist will be added to the retrieval. Otherwise, equilibrium chemistry will be used. If you need fine control over individual species, use add_line_species and set up each species individually.

abund_limTuple(float,float)

If eq is False, this sets the boundaries of the uniform prior that will be applied for each species in linelist. The range of the prior goes from abund_lim[0] to abund_lim[1]. The abundance limits must be given in log10 units of the mass fraction.
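The change in the meaning of abund_lim described in the NOTE above can be illustrated with a short sketch. The function names are assumptions made for this example and are not part of pRT.

```python
def prior_range_old(abund_lim):
    """Pre-2.4.9 convention: abund_lim is (lower bound, width of the range)."""
    return abund_lim[0], abund_lim[0] + abund_lim[1]


def prior_range_new(abund_lim):
    """Convention from 2.4.9 on: abund_lim is (lower bound, upper bound)."""
    return abund_lim[0], abund_lim[1]


def convert_old_to_new(abund_lim):
    """Rewrite an old-style abund_lim so that it gives the same range today."""
    return prior_range_old(abund_lim)


# An old-style (-6.0, 6.0) covered log10 mass fractions from -6 to 0 ...
old_style = (-6.0, 6.0)
# ... so the equivalent new-style argument is (-6.0, 0.0).
new_style = convert_old_to_new(old_style)
```

Scripts written for older versions that pass abund_lim unchanged will therefore sample a different prior range after upgrading.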

set_rayleigh_species(linelist)

Set the list of species that contribute to the rayleigh scattering in the pRT object.

Args:
linelistList(str)

A list of species that contribute to the rayleigh opacity.

set_continuum_opacities(linelist)

Set the list of species that contribute to the continuum opacity in the pRT object.

Args:
linelistList(str)

A list of species that contribute to the continuum opacity.

add_line_species(species, eq=False, abund_lim=(-7.0, 0.0), fixed_abund=None)

This function adds a single species to the pRT object that will define the line opacities of the model. The name must match the pRT opacity name, which varies between the c-k line opacities and the line-by-line opacities.

NOTE: As of pRT version 2.4.9, the behaviour of this function has changed. In previous versions the abundance limits were set from abund_lim[0] to (abund_lim[0] + abund_lim[1]). This has been changed so that the limits of the prior range are from abund_lim[0] to abund_lim[1] (ie the actual boundaries).

Args:
speciesstr

The species to include in the retrieval

eqbool

If False, the retrieval should use free chemistry, and Parameters for the abundance of the species will be added to the retrieval. Otherwise, (dis)equilibrium chemistry will be used.

abund_limTuple(float,float)

If eq is False, this sets the boundaries of the uniform prior that will be applied to the species given. The range of the prior goes from abund_lim[0] to abund_lim[1]. The abundance limits must be given in log10 units of the mass fraction.

fixed_abundfloat

The log-mass fraction abundance of the species. Currently only supports vertically constant abundances. If this is set, then the species will not be a free parameter in the retrieval.

remove_species_lines(species, free=False)

This function removes a species from the pRT line list, and if using a free chemistry retrieval, removes the associated Parameter of the species.

Args:
speciesstr

The species to remove from the retrieval

freebool

If true, the retrieval should use free chemistry, and Parameters for the abundance of the species will be removed from the retrieval.

add_cloud_species(species, eq=True, abund_lim=(-3.5, 1.5), scaling_factor=None, PBase_lim=None, fixed_abund=None, fixed_base=None)

This function adds a single cloud species to the list of species. Optionally, it will add parameters to allow for a retrieval using an Ackermann-Marley model. If an equilibrium condensation model is used in the retrieval model function (eq=True), then a parameter is added that scales the equilibrium cloud abundance, as in Molliere (2020). If eq is False, two parameters are added: the cloud abundance and the cloud base pressure. The limits set the prior ranges, both on a log scale.

NOTE: As of pRT version 2.4.9, the behaviour of this function has changed. In previous versions the abundance limits were set from abund_lim[0] to (abund_lim[0] + abund_lim[1]). This has been changed so that the limits of the prior range are from abund_lim[0] to abund_lim[1] (ie the actual boundaries). The same is true for PBase_lim.

Args:
speciesstr

Name of the pRT cloud species, including the cloud shape tag.

eqbool

Whether the retrieval model uses an equilibrium cloud model. This restricts the available species!

abund_limtuple(float,float)

If eq is True, this sets the prior range of the scaling factor for the equilibrium condensate abundance; a typical range would be (-3,1). If eq is False, this sets the prior range of the actual cloud abundance, with a typical range being (-5,0).

PBase_limtuple(float,float)

Only used if not using an equilibrium model. Sets the limits on the log of the cloud base pressure. Obsolete.

fixed_abundOptional(float)

A vertically constant log mass fraction abundance for the cloud species. If set, this will not be a free parameter in the retrieval. Only compatible with non-equilibrium clouds.

fixed_baseOptional(float)

The log cloud base pressure. If set, fixes this parameter to a constant value, and it will not be a free parameter in the retrieval. Only compatible with non-equilibrium clouds. Not yet compatible with most built in pRT models.

add_data(name, path, model_generating_function, data_resolution=None, model_resolution=None, distance=None, scale=False, scale_err=False, offset_bool=False, wlen_range_micron=None, external_pRT_reference=None, opacity_mode='c-k', wlen_bins=None, pRT_grid=False, pRT_object=None, wlen=None, flux=None, flux_error=None, mask=None)

Create a Data class object.

Args:
namestr

Identifier for this data set.

pathstr

Path to observations file, including filename. This can be a txt or dat file containing the wavelength, flux or transit depth, and error, or a fits file containing the wavelength, spectrum and covariance matrix.

model_generating_functionfnc

A function, typically defined in run_definition.py that returns the model wavelength and spectrum (emission or transmission). This is the function that contains the physics of the model, and calls pRT in order to compute the spectrum.

data_resolutionfloat

Spectral resolution of the instrument. Optional, allows convolution of model to instrumental line width.

model_resolutionfloat

Spectral resolution of the model, allowing for low resolution correlated k tables from exo-k.

distancefloat

The distance to the object in cgs units. Defaults to a 10pc normalized distance. All data must be scaled to the same distance before running the retrieval, which can be done using the scale_to_distance method in the Data class.

scalebool

Turn on or off scaling the data by a constant factor.

wlen_range_micronTuple

A pair of wavelengths in units of micron that determine the lower and upper boundaries of the model computation.

external_pRT_referencestr

The name of an existing Data object. This object’s prt_object will be used to calculate the chi squared of the new Data object. This is useful when two datasets overlap, as only one model computation is required to compute the log likelihood of both datasets.

opacity_modestr

Should the retrieval be run using correlated-k opacities (default, ‘c-k’), or line by line (‘lbl’) opacities? If ‘lbl’ is selected, it is HIGHLY recommended to set the model_resolution parameter.

pRT_grid: bool

Set to true if data has been binned to pRT R = 1,000 c-k grid.
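Because distance is given in cgs units and all data sets must share one distance, a small worked sketch of the conversion may help. This is not the implementation of scale_to_distance in the Data class; the function name scale_flux_to_distance and the example fluxes are assumptions, and the rescaling is the plain inverse-square law.

```python
PARSEC_CM = 3.0857e18  # one parsec in centimetres (cgs)


def scale_flux_to_distance(flux, current_distance_cm, new_distance_cm):
    """Rescale fluxes observed at one distance to another distance.

    Flux falls off as 1/d^2, so moving the object closer increases the flux.
    """
    factor = (current_distance_cm / new_distance_cm) ** 2
    return [f * factor for f in flux]


# Hypothetical fluxes for an object at 20 pc, rescaled to the 10 pc default.
observed_flux = [1.0e-15, 2.0e-15]
scaled_flux = scale_flux_to_distance(observed_flux, 20.0 * PARSEC_CM, 10.0 * PARSEC_CM)
```

Halving the distance multiplies every flux by four; the flux uncertainties scale by the same factor.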

add_photometry(path, model_generating_function, model_resolution=10, distance=None, scale=False, wlen_range_micron=None, photometric_transformation_function=None, external_pRT_reference=None, opacity_mode='c-k')

Create a Data class object for each photometric point in a photometry file. The photometry file must be a csv file and have the following structure: name, lower wavelength bound [um], upper wavelength bound [um], flux [W/m2/micron], flux error [W/m2/micron]

Photometric data requires a transformation function to convert a spectrum into synthetic photometry. You must provide this function yourself, or have the species package installed. If using species, the name in the data file must be of the format instrument/filter.

Args:
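model_generating_functionfnc

A function, typically defined in run_definition.py, that returns the model wavelength and spectrum (emission or transmission). This is the function that contains the physics of the model, and calls pRT in order to compute the spectrum.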

pathstr

Path to observations file, including filename.

model_resolutionfloat

Spectral resolution of the model, allowing for low resolution correlated k tables from exo-k.

scalebool

Turn on or off scaling the data by a constant factor. Currently only set up to scale all photometric data in a given file.

distancefloat

The distance to the object in cgs units. Defaults to a 10pc normalized distance. All data must be scaled to the same distance before running the retrieval, which can be done using the scale_to_distance method in the Data class.

wlen_range_micronTuple

A pair of wavelengths in units of micron that determine the lower and upper boundaries of the model computation.

external_pRT_referencestr

The name of an existing Data object. This object’s prt_object will be used to calculate the chi squared of the new Data object. This is useful when two datasets overlap, as only one model computation is required to compute the log likelihood of both datasets.

photometric_transformation_functionmethod

A function that will transform a spectrum into an average synthetic photometric point, typically accounting for filter transmission.

opacity_mode: str

Opacity mode: correlated-k opacities (default, ‘c-k’) or line by line (‘lbl’) opacities.
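The file layout and the role of the transformation function can be sketched as below. The filter names, flux values, and both helper functions are assumptions made for illustration; in particular, the call signature that pRT expects for photometric_transformation_function is not specified here, and the top-hat average ignores the real filter transmission curve.

```python
import csv
import io

# Columns: name, lower bound [um], upper bound [um],
# flux [W/m2/micron], flux error [W/m2/micron]
EXAMPLE_FILE = """\
filter_A, 1.10, 1.40, 2.0e-15, 1.0e-16
filter_B, 1.50, 1.80, 3.0e-15, 2.0e-16
"""


def read_photometry(text):
    """Parse photometry rows in the five-column layout described above."""
    points = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not row:
            continue  # skip blank lines
        low, high, flux, error = (float(value) for value in row[1:5])
        points.append({"name": row[0].strip(), "low": low, "high": high,
                       "flux": flux, "error": error})
    return points


def tophat_photometry(wavelength, flux, low, high):
    """Mean model flux inside [low, high], i.e. a top-hat filter."""
    inside = [f for w, f in zip(wavelength, flux) if low <= w <= high]
    return sum(inside) / len(inside)


points = read_photometry(EXAMPLE_FILE)
# Toy model spectrum evaluated for the first photometric point.
synthetic = tophat_photometry([1.0, 1.2, 1.3, 1.6], [1.0, 2.0, 4.0, 8.0],
                              points[0]["low"], points[0]["high"])
```

A realistic transformation would weight the model flux by the filter transmission, which is what the species package provides when the name follows the instrument/filter format.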