petitRADTRANS.retrieval

Submodules

Package Contents

Classes

Retrieval

This class implements the retrieval method using petitRADTRANS and pymultinest.

RetrievalConfig

The RetrievalConfig class contains all of the data and model level information necessary to run a petitRADTRANS retrieval.

Attributes

__author__

__copyright__

__maintainer__

__email__

__status__

petitRADTRANS.retrieval.__author__ = Evert Nasedkin
petitRADTRANS.retrieval.__maintainer__ = Evert Nasedkin
petitRADTRANS.retrieval.__email__ = nasedkinevert@gmail.com
petitRADTRANS.retrieval.__status__ = Development
class petitRADTRANS.retrieval.Retrieval(run_definition, output_dir='', test_plotting=False, sample_spec=False, ultranest=False, sampling_efficiency=None, const_efficiency_mode=None, n_live_points=None, resume=None, bayes_factor_species=None, corner_plot_names=None, short_names=None, pRT_plot_style=True)

This class implements the retrieval method using petitRADTRANS and pymultinest. A RetrievalConfig object is passed to this class to describe the retrieval data, parameters and priors. The run() method then uses pymultinest to sample the parameter space, producing posterior distributions for parameters and Bayesian evidence for models. Various useful plotting functions are also included, and can be run once the retrieval is complete.

Args:
run_definitionRetrievalConfig

A RetrievalConfig object that describes the retrieval to be run. This is the user-facing class that must be set up for every retrieval.

output_dirStr

The directory in which the output folders should be written

test_plottingBool

Only use when running locally. A boolean flag that will produce plots for each sample when pymultinest is run.

sample_specBool

Produce plots and data files for 100 randomly sampled outputs from pymultinest.

ultranestbool

If true, use Ultranest sampling rather than pymultinest. This is still a work in progress, so use with caution!

bayes_factor_speciesStr

A pRT species that should be removed to test the Bayesian evidence for its presence.

corner_plot_namesList(Str)

List of additional retrieval names that should be included in the corner plot.

short_namesList(Str)

For each corner_plot_name, a shorter name to be included when plotting.

pRT_plot_styleBool

Use the petitRADTRANS plotting style as described in plot_style.py. Recommended to turn this parameter to false if you want to use interactive plotting, or if the test_plotting parameter is True.

run(self, sampling_efficiency=0.8, const_efficiency_mode=True, n_live_points=4000, log_z_convergence=0.5, step_sampler=False, warmstart_max_tau=0.5, resume=True)

Run mode for the class. Uses pymultinest to sample parameter space and produce standard PMN outputs.

Args:
sampling_efficiencyFloat

pymultinest sampling efficiency. If const efficiency mode is true, should be set to around 0.05. Otherwise, it should be around 0.8 for parameter estimation and 0.3 for evidence comparison.

const_efficiency_modeBool

pymultinest constant efficiency mode

n_live_pointsInt

Number of live points to use in pymultinest, or the minimum number of live points to use for the Ultranest reactive sampler.

log_z_convergencefloat

If ultranest is being used, the convergence criterion on log z.

step_samplerbool

Use a step sampler to improve the efficiency in ultranest.

warmstart_max_taufloat

Warm start allows accelerated computation based on a different but similar UltraNest run.

resumebool

Continue existing retrieval. If FALSE THIS WILL OVERWRITE YOUR EXISTING RETRIEVAL.

_run_ultranest(self, n_live_points=4000, log_z_convergence=0.5, step_sampler=True, warmstart_max_tau=0.5, resume=True)

Run mode for the class. Uses ultranest to sample parameter space and produce standard outputs.

Args:
n_live_pointsInt

The minimum number of live points to use for the Ultranest reactive sampler.

log_z_convergencefloat

The convergence criterion on log z.

step_samplerbool

Use a step sampler to improve the efficiency in ultranest.

resumebool

Continue existing retrieval. If FALSE THIS WILL OVERWRITE YOUR EXISTING RETRIEVAL.

generate_retrieval_summary(self, stats=None)

This function produces a human-readable text file describing the retrieval. It includes all of the fixed and free parameters, the limits of the priors (if uniform), a description of the data used, and if the retrieval is complete, a summary of the best fit parameters and model evidence.

Args:
statsdict

A Pymultinest stats dictionary, from Analyzer.get_stats(). This contains the evidence and best fit parameters.

setup_data(self, scaling=10, width=3)

Creates a pRT object for each data set that asks for a unique object. Checks if there are low resolution c-k models from exo-k, and creates them if necessary. The scaling and width parameters adjust the AMR grid as described in RetrievalConfig.setup_pres and models.fixed_length_amr. It is recommended to keep the defaults.

Args:
scalingint

A multiplicative factor that determines the size of the full high resolution pressure grid, which will have length self.p_global.shape[0] * scaling.

widthint

The number of cells in the low pressure grid to replace with the high resolution grid.

prior(self, cube, ndim=0, nparams=0)

pyMultinest Prior function. Transforms unit hypercube into physical space.

prior_ultranest(self, cube)

Ultranest prior function. Transforms unit hypercube into physical space.
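As a rough illustration of what transforming the unit hypercube means, the sketch below maps each coordinate in [0, 1] to a physical value through a per-parameter function. The parameter meanings, prior ranges and both helper functions are hypothetical and are not the pRT implementation. The in-place versus returned-copy distinction follows the usual pymultinest and Ultranest calling conventions.

```python
# Hypothetical per-parameter transforms: unit interval -> physical units.
transforms = [
    lambda x: 500.0 + 2500.0 * x,   # temperature, uniform in [500, 3000] K
    lambda x: 2.0 + 3.5 * x,        # log g, uniform in [2.0, 5.5]
]


def prior_in_place(cube, ndim=0, nparams=0):
    """pymultinest-style prior: overwrite the cube entries in place."""
    for i, transform in enumerate(transforms):
        cube[i] = transform(cube[i])


def prior_returning_copy(cube):
    """Ultranest-style prior: leave the input alone, return new values."""
    return [transform(x) for transform, x in zip(transforms, cube)]


cube = [0.5, 0.0]
physical = prior_returning_copy(cube)   # cube itself is unchanged here
prior_in_place(cube)                    # cube now holds physical values
```

Both calls yield the same physical values; only the way the result is handed back to the sampler differs.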

log_likelihood(self, cube, ndim=0, nparam=0)

pyMultiNest required likelihood function.

This function wraps the model computation and log-likelihood calculations for pyMultiNest to sample. If PT_plot_mode is True, it will calculate and return only the pressure and temperature arrays rather than the wavelength and flux. If run_mode is evaluate, it will save the provided sample to the best-fit spectrum file, and add it to the best_fit_specs dictionary. If evaluate_sample_spectra is true, it will store the spectrum in posterior_sample_specs.

Args:
cubenumpy.ndarray

The transformed unit hypercube, providing the parameter values to be passed to the model_generating_function.

ndimint

The number of dimensions of the problem

nparamint

The number of parameters in the fit.

Returns:
log_likelihoodfloat

The (negative) log likelihood of the model given the data.
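For orientation, a log-likelihood of the kind a nested sampler expects can be sketched as below. This assumes independent Gaussian uncertainties on each data point. It is not the pRT implementation, which may also handle covariance matrices. The function name and the toy numbers are hypothetical.

```python
import math


def gaussian_log_likelihood(data, model, sigma):
    """Log-likelihood for independent Gaussian errors on each data point."""
    log_l = 0.0
    for d, m, s in zip(data, model, sigma):
        # Chi-squared term plus the Gaussian normalisation term.
        log_l += -0.5 * ((d - m) / s) ** 2 - 0.5 * math.log(2.0 * math.pi * s ** 2)
    return log_l


flux = [1.00, 0.98, 1.02]
error = [0.01, 0.01, 0.01]
perfect = gaussian_log_likelihood(flux, flux, error)
# A model that is off by one sigma at every point.
offset = gaussian_log_likelihood(flux, [1.01, 0.99, 1.03], error)
```

A model that misses every point by one sigma loses 0.5 in log-likelihood per point relative to a perfect fit.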

get_samples(self, output_dir=None, ret_names=[])

This function looks in the given output directory and finds the post_equal_weights file associated with the current retrieval name.

Args:
output_dirstr

Parent directory of the out_PMN/RETRIEVALNAME_post_equal_weights.dat file

ret_namesList(str)

A list of retrieval names to add to the sample and parameter dictionary. Functions the same as setting corner_plot_names during initialisation.

Returns:
sample_dictdict

A dictionary with keys being the name of the retrieval, and values are a numpy ndarray containing the samples in the post_equal_weights file

parameter_dictdict

A dictionary with keys being the name of the retrieval, and values are a list of names of the parameters used in the retrieval. The first name corresponds to the first column of the samples, and so on.

get_best_fit_params(self, best_fit_params, parameters_read)

This function takes the sample from the post_equal_weights file with the maximum log likelihood and converts it into a dictionary of Parameters that can be used in a model function.

Args:
best_fit_paramsnumpy.ndarray

An array of the best fit parameter values (or any other sample)

parameters_readlist

A list of the free parameters as read from the output files.
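The selection and labelling described above can be sketched with plain Python. The sketch assumes the MultiNest convention that each row of the post_equal_weights file holds the parameter values followed by the log-likelihood in the last column. The helper name, parameter names and sample values are hypothetical, and the real method returns Parameter objects rather than bare floats.

```python
def best_fit_as_dict(samples, parameter_names):
    """Pick the max-likelihood row and label its values by parameter name.

    Assumes each row is [param_1, ..., param_n, log_likelihood].
    """
    best_row = max(samples, key=lambda row: row[-1])
    return dict(zip(parameter_names, best_row[:-1]))


samples = [
    [1200.0, 3.5, -10.2],
    [1350.0, 3.7, -4.1],
    [1100.0, 3.2, -7.9],
]
best = best_fit_as_dict(samples, ["T_int", "log_g"])
```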

get_best_fit_model(self, best_fit_params, parameters_read, model_generating_func=None, ret_name=None)

This function uses the best fit parameters to generate a pRT model that spans the entire wavelength range of the retrieval, to be used in plots.

Args:
best_fit_paramsnumpy.ndarray

A numpy array containing the best fit parameters, to be passed to get_best_fit_params

parameters_readlist

A list of the free parameters as read from the output files.

model_generating_funcmethod

A function that will take in the standard ‘model’ arguments (pRT_object, params, pt_plot_mode, AMR, resolution) and will return the wavelength and flux arrays as calculated by petitRADTRANS. If no argument is given, it uses the method of the dataset given in the take_PTs_from kwarg.

ret_namestr

If plotting a fit from a different retrieval, input the retrieval name to be included.

Returns:
bf_wlennumpy.ndarray

The wavelength array of the best fit model

bf_spectrumnumpy.ndarray

The emission or transmission spectrum array, with the same shape as bf_wlen

get_abundances(self, sample, parameters_read=None)

This function returns the abundances of each species as a function of pressure

Args:
samplenumpy.ndarray

A sample from the pymultinest output, the abundances returned will be computed for this set of parameters.

Returns:
abundancesdict

A dictionary of abundances. The keys are the species name, the values are the mass fraction abundances at each pressure

MMWnumpy.ndarray

The mean molecular weight at each pressure level in the atmosphere.

get_evidence(self, ret_name='')

Get the log10 Z and error for the retrieval

This function uses the pymultinest analyzer to get the evidence for the current retrieval_name by default, though any retrieval_name in the out_PMN folder can be passed as an argument - useful for when you’re comparing multiple similar models. This value is also printed in the summary file.

Args:
ret_namestring

The name of the retrieval that prepends all of the PMN output files.
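When comparing two similar models, the log10 evidences can be combined into a Bayes factor. The sketch below assumes both values are log10 Z, as stated above. The function name and the evidence values are hypothetical.

```python
def bayes_factor(log10_z_a, log10_z_b):
    """Bayes factor of model A over model B from their log10 evidences."""
    return 10.0 ** (log10_z_a - log10_z_b)


# Hypothetical evidences for a full model and one with a species removed.
log10_z_full = -120.0
log10_z_reduced = -122.0
factor = bayes_factor(log10_z_full, log10_z_reduced)
```

A difference of 2 in log10 Z corresponds to a Bayes factor of 100 in favour of the model with the larger evidence.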

get_analyzer(self, ret_name='')

Get the PMN analyzer from a retrieval run

This function gets the PMN analyzer object for the current retrieval_name by default, though any retrieval_name in the out_PMN folder can be passed as an argument - useful for when you’re comparing multiple similar models.

Args:
ret_namestring

The name of the retrieval that prepends all of the PMN output files.

plot_all(self, output_dir=None, ret_names=[])

Produces plots for the best fit spectrum, a sample of 100 output spectra, the best fit PT profile and a corner plot for parameters specified in the run definition.

plot_spectra(self, samples_use, parameters_read, model_generating_func=None)

Plot the best fit spectrum, the data from each dataset and the residuals between the two. Saves a file to OUTPUT_DIR/evaluate_RETRIEVAL_NAME/best_fit_spec.pdf

Args:
samples_usenumpy.ndarray

An array of the samples from the post_equal_weights file, used to find the best fit sample

parameters_readlist

A list of the free parameters as read from the output files.

model_generating_funcmethod

A function that will take in the standard ‘model’ arguments (pRT_object, params, pt_plot_mode, AMR, resolution) and will return the wavelength and flux arrays as calculated by petitRADTRANS. If no argument is given, it uses the method of the first dataset included in the retrieval.

Returns:
figmatplotlib.figure

The matplotlib figure, containing the data, best fit spectrum and residuals.

axmatplotlib.axes

The upper pane of the plot, containing the best fit spectrum and data

ax_rmatplotlib.axes

The lower pane of the plot, containing the residuals between the fit and the data

plot_sampled(self, samples_use, parameters_read, downsample_factor=None)

Plot a set of randomly sampled output spectra for each dataset in the retrieval.

This will save nsample files for each dataset included in the retrieval. Note that if you change the model_resolution of your Data and rerun this function, the files will NOT be updated: if the files exist, the function defaults to reading from file rather than recomputing. To regenerate them, delete all of the sample files and run it again.

Args:
samples_usenp.ndarray

posterior samples from pymultinest outputs (post_equal_weights)

downsample_factorint

Factor by which to reduce the resolution of the sampled model, for smoother plotting. Defaults to None. A value of None will result in the full resolution spectrum. Note that this factor can only reduce the resolution from the underlying model_resolution of the data.

plot_PT(self, sample_dict, parameters_read)

Plot the PT profile with error contours

Args:
samples_usenp.ndarray

posterior samples from pymultinest outputs (post_equal_weights)

parameters_readList

Used to plot correct parameters, as some in self.parameters are not free, and aren’t included in the PMN outputs

Returns:

fig : matplotlib.figure

ax : matplotlib.axes

plot_corner(self, sample_dict, parameter_dict, parameters_read, **kwargs)

Make the corner plots

Args:
sample_dictDict

Dictionary of samples from PMN outputs, with keys being retrieval names

parameter_dictDict

Dictionary of parameters for each of the retrievals to be plotted.

parameters_readList

Used to plot correct parameters, as some in self.parameters are not free, and aren’t included in the PMN outputs

kwargsdict

Each kwarg can be one of the kwargs used in corner.corner. These can be used to adjust the title_kwargs, label_kwargs, hist_kwargs, hist2d_kwargs or the contour kwargs. Each kwarg must be a dictionary with the arguments as keys and values as the values.

class petitRADTRANS.retrieval.RetrievalConfig(retrieval_name='retrieval_name', run_mode='retrieval', AMR=False, scattering=False, pressures=None, write_out_spec_sample=False)

The RetrievalConfig class contains all of the data and model level information necessary to run a petitRADTRANS retrieval. The name of the class will be used to name outputs. This class is passed to the Retrieval, which runs the actual pymultinest retrieval and produces the outputs.

The general usage of this class is to define it, add the parameters and their priors, add the opacity sources, the data together with a model for each dataset, and then configure a few plotting arguments.
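The steps above can be sketched as follows, using only the method signatures listed on this page. The retrieval name, parameter names, prior range, species names, `model_function` and `data_path` are illustrative placeholders, and the plotting configuration is left out. Treat it as an outline under those assumptions rather than a tested setup.

```python
def build_config(model_function, data_path):
    """Assemble a RetrievalConfig following the steps described above."""
    # Imported lazily so this sketch can be read without pRT installed.
    from petitRADTRANS.retrieval import RetrievalConfig

    config = RetrievalConfig(
        retrieval_name="example_retrieval",
        run_mode="retrieval",
        AMR=False,
        scattering=False,
    )

    # A fixed parameter and a free one with a uniform prior (placeholder values).
    config.add_parameter("D_pl", False, value=3.0857e19)  # 10 pc in cm
    config.add_parameter(
        "log_g",
        True,
        transform_prior_cube_coordinate=lambda x: 2.0 + 3.5 * x,
    )

    # Opacity sources; the species names are illustrative placeholders.
    config.set_rayleigh_species(["H2", "He"])
    config.set_continuum_opacities(["H2-H2", "H2-He"])
    config.set_line_species(["H2O_Exomol"], eq=False, abund_lim=(-6.0, 6.0))

    # One dataset together with the model used to fit it.
    config.add_data("example_data", data_path, model_function)
    return config
```

The returned object would then be passed to Retrieval as its run_definition argument.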

Args:
retrieval_namestr

Name of this retrieval. Make it informative so that you can keep track of the outputs!

run_modestr

Can be either ‘retrieval’, which runs the retrieval normally using pymultinest, or ‘evaluate’, which produces plots from the best fit parameters stored in the output post_equal_weights file.

AMRbool

Use an adaptive high resolution pressure grid around the location of cloud condensation. This will increase the size of the pressure grid by a constant factor that can be adjusted in the setup_pres function.

scatteringbool

If using emission spectra, turn scattering on or off.

pressuresnumpy.array

A log-spaced array of pressures over which to retrieve. 100 points is standard, between 10^-6 and 10^3.
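A log-spaced grid of the kind described for the pressures argument can be sketched without dependencies as below. The helper name is hypothetical. With NumPy the same grid is `numpy.logspace(-6, 3, 100)`, and the list produced here would need converting with `numpy.array` before being passed in.

```python
def log_spaced_pressures(log_p_min=-6.0, log_p_max=3.0, n_points=100):
    """Return n_points pressures spaced evenly in log10 between the bounds."""
    step = (log_p_max - log_p_min) / (n_points - 1)
    return [10.0 ** (log_p_min + i * step) for i in range(n_points)]


pressures = log_spaced_pressures()
```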

_plot_defaults(self)
_setup_pres(self, scaling=10, width=3)

This converts the standard pressure grid into the correct length for the AMR pressure grid. The scaling adjusts the resolution of the high resolution grid, while the width determines the size of the high pressure region. This function is automatically called in Retrieval.setup_data().

Args:
scalingint

A multiplicative factor that determines the size of the full high resolution pressure grid, which will have length self.p_global.shape[0] * scaling.

widthint

The number of cells in the low pressure grid to replace with the high resolution grid.

add_parameter(self, name, free, value=None, transform_prior_cube_coordinate=None)

This function adds a Parameter (see parameter.py) to the dictionary of parameters. A Parameter has a name and a boolean parameter to set whether it is a free or fixed parameter during the retrieval. In addition, a value can be set, or a prior function can be given that transforms a random variable in [0,1] to the physical dimensions of the Parameter.

Args:
namestr

The name of the parameter. Must match the name used in the model function for the retrieval.

freebool

True if the parameter is a free parameter in the retrieval, false if it is fixed.

valuefloat

The value of the parameter in the units used by the model function.

transform_prior_cube_coordinatemethod

A function that transforms the unit interval to the physical units of the parameter. Typically given as a lambda function.
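Two common shapes for transform_prior_cube_coordinate are sketched below: a uniform prior and a Gaussian prior built from the inverse cumulative distribution function. The ranges, mean and standard deviation are hypothetical, and `statistics.NormalDist` is used only to keep the example free of third-party dependencies.

```python
from statistics import NormalDist

# Uniform prior between 2.0 and 5.5.
uniform_prior = lambda x: 2.0 + 3.5 * x

# Gaussian prior with mean 1.0 and standard deviation 0.1, via the
# inverse CDF. NormalDist.inv_cdf only accepts 0 < x < 1.
gaussian_prior = lambda x: NormalDist(mu=1.0, sigma=0.1).inv_cdf(x)

midpoint_uniform = uniform_prior(0.5)
midpoint_gaussian = gaussian_prior(0.5)
```

Either lambda could be passed as transform_prior_cube_coordinate when adding a free parameter.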

list_available_line_species(self)

List the currently installed opacity tables that are available for species that contribute to the line opacity.

list_available_cloud_species(self)

List the currently installed opacity tables that are available for cloud species.

list_available_cia_species(self)

List the currently installed opacity tables that are available for CIA species.

set_line_species(self, linelist, eq=False, abund_lim=(- 6.0, 6.0))

Set RadTrans.line_species

This function adds a list of species to the pRT object that will define the line opacities of the model. The values in the list are strings, with the names matching the pRT opacity names, which vary between the c-k line opacities and the line-by-line opacities.

Args:
linelistList(str)

The list of species to include in the retrieval

eqbool

If false, the retrieval should use free chemistry, and Parameters for the abundance of each species in the linelist will be added to the retrieval. Otherwise equilibrium chemistry will be used. If you need fine control over individual species, use add_line_species and set up each species individually.

abund_limTuple(float,float)

If free chemistry is used (eq is False), this sets the boundaries of the uniform prior that will be applied for each species in linelist. The range of the prior goes from abund_lim[0] to abund_lim[0] + abund_lim[1]. The abundance limits must be given in log10 units of the mass fraction.
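Because the second entry of abund_lim is a width rather than an upper bound, it can help to work out the resulting range explicitly. The two helpers below are hypothetical and only restate the arithmetic given above. The linear mapping from the unit interval is an assumption about how a uniform prior over that range would be applied.

```python
def abundance_prior_bounds(abund_lim):
    """Return (lower, upper) of the log10 mass-fraction prior."""
    lower = abund_lim[0]
    upper = abund_lim[0] + abund_lim[1]
    return lower, upper


def abundance_prior(x, abund_lim):
    """Map x in [0, 1] onto the uniform prior range."""
    return abund_lim[0] + abund_lim[1] * x


bounds = abundance_prior_bounds((-6.0, 6.0))
```

With the default of (-6.0, 6.0) the log10 mass fraction runs from -6 to 0. The add_line_species default of (-8.0, 7.0) gives -8 to -1.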

set_rayleigh_species(self, linelist)

Set the list of species that contribute to the Rayleigh scattering in the pRT object.

Args:
linelistList(str)

A list of species that contribute to the Rayleigh opacity.

set_continuum_opacities(self, linelist)

Set the list of species that contribute to the continuum opacity in the pRT object.

Args:
linelistList(str)

A list of species that contribute to the continuum opacity.

add_line_species(self, species, eq=False, abund_lim=(- 8.0, 7.0), fixed_abund=None)

This function adds a single species to the pRT object that will define the line opacities of the model. The name must match the pRT opacity name, which varies between the c-k line opacities and the line-by-line opacities.

Args:
speciesstr

The species to include in the retrieval

eqbool

If False, the retrieval should use free chemistry, and Parameters for the abundance of the species will be added to the retrieval. Otherwise (dis)equilibrium chemistry will be used.

abund_limTuple(float,float)

If free chemistry is used (eq is False), this sets the boundaries of the uniform prior that will be applied to the species given. The range of the prior goes from abund_lim[0] to abund_lim[0] + abund_lim[1]. The abundance limits must be given in log10 units of the mass fraction.

fixed_abundfloat

The log-mass fraction abundance of the species. Currently only supports vertically constant abundances. If this is set, then the species will not be a free parameter in the retrieval.

remove_species_lines(self, species, free=False)

This function removes a species from the pRT line list, and if using a free chemistry retrieval, removes the associated Parameter of the species.

Args:
speciesstr

The species to remove from the retrieval

freebool

If true, the retrieval should use free chemistry, and Parameters for the abundance of the species will be removed from the retrieval.

add_cloud_species(self, species, eq=True, abund_lim=(- 3.5, 4.5), PBase_lim=(- 5.0, 7.0), fixed_abund=None, fixed_base=None)

This function adds a single cloud species to the list of species. Optionally, it will add parameters to allow for a retrieval using an Ackermann-Marley model. If an equilibrium condensation model is used in the retrieval model function (eq=True), then a parameter is added that scales the equilibrium cloud abundance, as in Molliere (2020). If eq is False, two parameters are added: the cloud abundance and the cloud base pressure. The limits set the prior ranges, both on a log scale.

Args:
speciesstr

Name of the pRT cloud species, including the cloud shape tag.

eqbool

Whether the retrieval model uses an equilibrium cloud model. This restricts the available species!

abund_limtuple(float,float)

If eq is True, this sets the range of the scaling factor for the equilibrium condensate abundance; a typical range would be (-3,0). If eq is False, this sets the range on the actual cloud abundance, with a typical range being (-5,7). Note that the upper limit is set from abund_lim[0] + abund_lim[1].

PBase_limtuple(float,float)

Only used if not using an equilibrium model. Sets the limits on the log of the cloud base pressure. Obsolete.

fixed_abundOptional(float)

A vertically constant log mass fraction abundance for the cloud species. If set, this will not be a free parameter in the retrieval. Only compatible with non-equilibrium clouds.

fixed_baseOptional(float)

The log cloud base pressure. If set, fixes this parameter to a constant value, and it will not be a free parameter in the retrieval. Only compatible with non-equilibrium clouds. Not yet compatible with most built in pRT models.

add_data(self, name, path, model_generating_function, data_resolution=None, model_resolution=None, distance=None, scale=False, wlen_range_micron=None, external_pRT_reference=None, opacity_mode='c-k')

Create a Data class object.

Args:
namestr

Identifier for this data set.

pathstr

Path to observations file, including filename. This can be a txt or dat file containing the wavelength, flux, transit depth and error, or a fits file containing the wavelength, spectrum and covariance matrix.

model_generating_functionfnc

A function, typically defined in run_definition.py that returns the model wavelength and spectrum (emission or transmission). This is the function that contains the physics of the model, and calls pRT in order to compute the spectrum.

data_resolutionfloat

Spectral resolution of the instrument. Optional, allows convolution of model to instrumental line width.

model_resolutionfloat

Spectral resolution of the model, allowing for low resolution correlated k tables from exo-k.

distancefloat

The distance to the object in cgs units. Defaults to a 10pc normalized distance. All data must be scaled to the same distance before running the retrieval, which can be done using the scale_to_distance method in the Data class.

scalebool

Turn on or off scaling the data by a constant factor.

wlen_range_micronTuple

A pair of wavelengths in units of micron that determine the lower and upper boundaries of the model computation.

external_pRT_referencestr

The name of an existing Data object. This object’s pRT_object will be used to calculate the chi squared of the new Data object. This is useful when two datasets overlap, as only one model computation is required to compute the log likelihood of both datasets.

opacity_modestr

Should the retrieval be run using correlated-k opacities (default, ‘c-k’), or line by line (‘lbl’) opacities? If ‘lbl’ is selected, it is HIGHLY recommended to set the model_resolution parameter. In general, ‘c-k’ mode is recommended for retrievals of everything other than high-resolution (R>40000) spectra.
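Since the distance argument is expected in cgs units and all data must share one distance, a small conversion sketch may be useful. It applies the inverse-square law to move a flux to 10 pc. The helper name and the example numbers are hypothetical, and this is not a description of what scale_to_distance does internally.

```python
PARSEC_IN_CM = 3.0857e18  # one parsec in centimetres (cgs)


def flux_at_10pc(flux, distance_pc):
    """Rescale a flux observed at distance_pc to a distance of 10 pc."""
    return flux * (distance_pc / 10.0) ** 2


distance_cgs = 20.0 * PARSEC_IN_CM      # a 20 pc distance expressed in cm
scaled = flux_at_10pc(1.0e-15, 20.0)    # 4x brighter when moved to 10 pc
```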

add_photometry(self, path, model_generating_function, model_resolution=10, distance=None, scale=False, wlen_range_micron=None, photometric_transformation_function=None, external_pRT_reference=None, opacity_mode='c-k')

Create a Data class object for each photometric point in a photometry file. The photometry file must be a csv file and have the following structure: name, lower wavelength bound [um], upper wavelength boundary [um], flux [W/m2/micron], flux error [W/m2/micron]

Photometric data requires a transformation function to convert a spectrum into synthetic photometry. You must provide this function yourself, or have the species package installed. If using species, the name in the data file must be of the format instrument/filter.
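To make the expected column order concrete, the sketch below parses two made-up rows with the standard csv module. The filter names and all numbers are hypothetical, and it assumes a file without a header row. It only illustrates the layout and is not how add_photometry reads the file.

```python
import csv
import io

# Hypothetical file contents following the column order described above.
example_file = io.StringIO(
    "instrument/filter_a, 1.50, 1.67, 1.2e-15, 1.0e-16\n"
    "instrument/filter_b, 3.49, 4.11, 4.0e-16, 5.0e-17\n"
)

photometry = []
for row in csv.reader(example_file, skipinitialspace=True):
    name, wlen_low, wlen_high, flux, flux_error = row
    photometry.append({
        "name": name,
        "wlen_range_micron": (float(wlen_low), float(wlen_high)),
        "flux": float(flux),              # W/m2/micron
        "flux_error": float(flux_error),  # W/m2/micron
    })
```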

Args:
namestr

Identifier for this data set.

pathstr

Path to observations file, including filename.

model_resolutionfloat

Spectral resolution of the model, allowing for low resolution correlated k tables from exo-k.

scalebool

Turn on or off scaling the data by a constant factor. Currently only set up to scale all photometric data in a given file.

distancefloat

The distance to the object in cgs units. Defaults to a 10pc normalized distance. All data must be scaled to the same distance before running the retrieval, which can be done using the scale_to_distance method in the Data class.

wlen_range_micronTuple

A pair of wavelengths in units of micron that determine the lower and upper boundaries of the model computation.

external_pRT_referencestr

The name of an existing Data object. This object’s pRT_object will be used to calculate the chi squared of the new Data object. This is useful when two datasets overlap, as only one model computation is required to compute the log likelihood of both datasets.

photometric_transformation_functionmethod

A function that will transform a spectrum into an average synthetic photometric point, typically accounting for filter transmission.
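A very simple transformation function can be sketched with a top-hat filter, averaging the model flux between the lower and upper wavelength bounds. The factory name and the assumed call signature (wavelengths, fluxes) are hypothetical. A realistic function would weight the flux by the filter transmission curve, and a plain mean is only reasonable for evenly sampled wavelengths.

```python
def make_boxcar_transformation(wlen_low, wlen_high):
    """Build a function averaging a spectrum over [wlen_low, wlen_high].

    A top-hat filter is assumed; real filters have a transmission curve.
    """
    def transform(wavelengths, fluxes):
        in_band = [f for w, f in zip(wavelengths, fluxes)
                   if wlen_low <= w <= wlen_high]
        if not in_band:
            raise ValueError("spectrum does not cover the filter band")
        return sum(in_band) / len(in_band)
    return transform


to_photometry = make_boxcar_transformation(1.5, 1.7)
point = to_photometry([1.4, 1.5, 1.6, 1.7, 1.8], [1.0, 2.0, 3.0, 4.0, 5.0])
```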