petitRADTRANS.retrieval.retrieval
¶
Module Contents¶
Classes¶
This class implements the retrieval method using petitRADTRANS and pymultinest. |
- class petitRADTRANS.retrieval.retrieval.Retrieval(run_definition, output_dir='', test_plotting=False, sample_spec=False, ultranest=False, sampling_efficiency=None, const_efficiency_mode=None, n_live_points=None, resume=None, bayes_factor_species=None, corner_plot_names=None, short_names=None, pRT_plot_style=True)¶
This class implements the retrieval method using petitRADTRANS and pymultinest. A RetrievalConfig object is passed to this class to describe the retrieval data, parameters and priors. The run() method then uses pymultinest to sample the parameter space, producing posterior distributions for parameters and bayesian evidence for models. Various useful plotting functions have also been included, and can be run once the retrieval is complete.
- Args:
- run_definitionRetrievalConfig
A RetrievalConfig object that describes the retrieval to be run. This is the user facing class that must be setup for every retrieval.
- output_dirStr
The directory in which the output folders should be written
- test_plottingBool
Only use when running locally. A boolean flag that will produce plots for each sample when pymultinest is run.
- sample_specBool
Produce plots and data files for 100 randomly sampled outputs from pymultinest.
- ultranestbool
If true, use Ultranest sampling rather than pymultinest. This is still a work in progress, so use with caution!
- bayes_factor_speciesStr
A pRT species that should be removed to test for the bayesian evidence for it’s presence.
- corner_plot_namesList(Str)
List of additional retrieval names that should be included in the corner plot.
- short_namesList(Str)
For each corner_plot_name, a shorter name to be included when plotting.
- pRT_plot_styleBool
Use the petitRADTRANS plotting style as described in plot_style.py. Recommended to turn this parameter to false if you want to use interactive plotting, or if the test_plotting parameter is True.
- run(sampling_efficiency=0.8, const_efficiency_mode=True, n_live_points=4000, log_z_convergence=0.5, step_sampler=False, warmstart_max_tau=0.5, n_iter_before_update=50, resume=True, max_iters=0, frac_remain=0.1, Lepsilon=0.3)¶
Run mode for the class. Uses pynultinest to sample parameter space and produce standard PMN outputs.
- Args:
- sampling_efficiencyFloat
pymultinest sampling efficiency. If const efficiency mode is true, should be set to around 0.05. Otherwise, it should be around 0.8 for parameter estimation and 0.3 for evidence comparison.
- const_efficiency_modeBool
pymultinest constant efficiency mode
- n_live_pointsInt
Number of live points to use in pymultinest, or the minimum number of live points to use for the Ultranest reactive sampler.
- log_z_convergencefloat
If ultranest is being used, the convergence criterion on log z.
- step_samplerbool
Use a step sampler to improve the efficiency in ultranest.
- warmstart_max_taufloat
Warm start allows accelerated computation based on a different but similar UltraNest run.
- resumebool
Continue existing retrieval. If FALSE THIS WILL OVERWRITE YOUR EXISTING RETRIEVAL.
- _run_ultranest(n_live_points, log_z_convergence, step_sampler, warmstart_max_tau, resume, max_iters, frac_remain, Lepsilon)¶
Run mode for the class. Uses ultranest to sample parameter space and produce standard outputs.
- Args:
- n_live_pointsInt
The minimum number of live points to use for the Ultranest reactive sampler.
- log_z_convergencefloat
The convergence criterion on log z.
- step_samplerbool
Use a step sampler to improve the efficiency in ultranest.
- resumebool
Continue existing retrieval. If FALSE THIS WILL OVERWRITE YOUR EXISTING RETRIEVAL.
- generate_retrieval_summary(stats=None)¶
This function produces a human-readable text file describing the retrieval. It includes all of the fixed and free parameters, the limits of the priors (if uniform), a description of the data used, and if the retrieval is complete, a summary of the best fit parameters and model evidence.
- Args:
- statsdict
A Pymultinest stats dictionary, from Analyzer.get_stats(). This contains the evidence and best fit parameters.
- setup_data(scaling=10, width=3)¶
Creates a pRT object for each data set that asks for a unique object. Checks if there are low resolution c-k models from exo-k, and creates them if necessary. The scaling and width parameters adjust the AMR grid as described in RetrievalConfig.setup_pres and models.fixed_length_amr. It is recommended to keep the defaults.
- Args:
- scalingint
A multiplicative factor that determines the size of the full high resolution pressure grid, which will have length self.p_global.shape[0] * scaling.
- widthint
The number of cells in the low pressure grid to replace with the high resolution grid.
- prior(cube, ndim=0, nparams=0)¶
pyMultinest Prior function. Transforms unit hypercube into physical space.
- prior_ultranest(cube)¶
pyMultinest Prior function. Transforms unit hypercube into physical space.
- log_likelihood(cube, ndim=0, nparam=0)¶
pyMultiNest required likelihood function.
This function wraps the model computation and log-likelihood calculations for pyMultiNest to sample. If PT_plot_mode is True, it will return the calculate only the pressure and temperature arrays rather than the wavlength and flux. If run_mode is evaluate, it will save the provided sample to the best-fit spectrum file, and add it to the best_fit_specs dictionary. If evaluate_sample_spectra is true, it will store the spectrum in posterior_sample_specs.
- Args:
- cubenumpy.ndarray
The transformed unit hypercube, providing the parameter values to be passed to the model_generating_function.
- ndimint
The number of dimensions of the problem
- nparamint
The number of parameters in the fit.
- Returns:
- log_likelihoodfloat
The (negative) log likelihood of the model given the data.
- get_samples(output_dir=None, ret_names=[])¶
This function looks in the given output directory and finds the post_equal_weights file associated with the current retrieval name.
- Args:
- output_dirstr
Parent directory of the out_PMN/RETRIEVALNAME_post_equal_weights.dat file
- ret_namesList(str)
A list of retrieval names to add to the sample and parameter dictionary. Functions the same as setting corner_files during initialisation.
- Returns:
- sample_dictdict
A dictionary with keys being the name of the retrieval, and values are a numpy ndarray containing the samples in the post_equal_weights file
- parameter_dictdict
A dictionary with keys being the name of the retrieval, and values are a list of names of the parameters used in the retrieval. The first name corresponds to the first column of the samples, and so on.
- get_best_fit_params(best_fit_params, parameters_read)¶
This function converts the sample from the post_equal_weights file with the maximum log likelihood, and converts it into a dictionary of Parameters that can be used in a model function.
- Args:
- best_fit_paramsnumpy.ndarray
An array of the best fit parameter values (or any other sample)
- parameters_readlist
A list of the free parameters as read from the output files.
- get_full_range_model(parameters, model_generating_func=None, ret_name=None, contribution=False, pRT_object=None)¶
- get_best_fit_model(best_fit_params, parameters_read, ret_name=None, contribution=False)¶
This function uses the best fit parameters to generate a pRT model that spans the entire wavelength range of the retrieval, to be used in plots.
- Args:
- best_fit_paramsnumpy.ndarray
A numpy array containing the best fit parameters, to be passed to get_best_fit_params
- parameters_readlist
A list of the free parameters as read from the output files.
- model_generating_funmethod
A function that will take in the standard ‘model’ arguments (pRT_object, params, pt_plot_mode, AMR, resolution) and will return the wavlength and flux arrays as calculated by petitRadTrans. If no argument is given, it uses the method of the dataset given in the take_PTs_from kwarg.
- ret_namestr
If plotting a fit from a different retrieval, input the retrieval name to be included.
- Returns:
- bf_wlennumpy.ndarray
The wavelength array of the best fit model
- bf_spectrumnumpy.ndarray
The emission or transmission spectrum array, with the same shape as bf_wlen
- get_abundances(sample, parameters_read=None)¶
This function returns the abundances of each species as a function of pressure
- Args:
- samplenumpy.ndarray
A sample from the pymultinest output, the abundances returned will be computed for this set of parameters.
- Returns:
- abundancesdict
A dictionary of abundances. The keys are the species name, the values are the mass fraction abundances at each pressure
- MMWnumpy.ndarray
The mean molecular weight at each pressure level in the atmosphere.
- get_evidence(ret_name='')¶
Get the log10 Z and error for the retrieval
This function uses the pymultinest analyzer to get the evidence for the current retrieval_name by default, though any retrieval_name in the out_PMN folder can be passed as an argument - useful for when you’re comparing multiple similar models. This value is also printed in the summary file.
- Args:
- ret_namestring
The name of the retrieval that prepends all of the PMN output files.
- get_best_fit_likelihood(samples)¶
Get the log likelihood of the best fit model
- Args:
- samplesnumpy.ndarray
An array of samples and likelihoods taken from a post_equal_weights file
- get_best_fit_chi2(samples)¶
Get the 𝛘^2 of the best fit model - removing normalization term from log L
- Args:
- samplesnumpy.ndarray
An array of samples and likelihoods taken from a post_equal_weights file
- get_reduced_chi2(samples)¶
Get the 𝛘^2/DoF of the best fit model - divide chi^2 by DoF
- Args:
- samplesnumpy.ndarray
An array of samples and likelihoods taken from a post_equal_weights file
- get_analyzer(ret_name='')¶
Get the PMN analyer from a retrieval run
This function uses gets the PMN analyzer object for the current retrieval_name by default, though any retrieval_name in the out_PMN folder can be passed as an argument - useful for when you’re comparing multiple similar models.
- Args:
- ret_namestring
The name of the retrieval that prepends all of the PMN output files.
- build_param_dict(sample, free_param_names)¶
This function builds a dictionary of parameters that can be passed to the model building functions. It requires a numpy array with the same length as the number of free parameters, and a list of all of the parameter names in the order they appear in the array. The returned dictionary will contain all of these parameters, together with the fixed retrieval parameters.
- Args:
- samplenumpy.ndarray
An array or list of free parameter values
- free_param_nameslist(string)
A list of names for each of the free parameters.
- Returns:
- paramsdict
A dictionary of Parameters, with values set to the values in sample.
- sample_teff(sample_dict, param_dict, ret_names=None, nsample=None, resolution=40)¶
This function samples the outputs of a retrieval and computes Teff for each sample. For each sample, a model is computed at low resolution, and integrated to find the total radiant emittance, which is converted into a temperature using the stefan boltzmann law: $j^{star} = sigma T^{4}$. Teff itself is computed using util.calc_teff.
- Args:
- sample_dictdict
A dictionary, where each key is the name of a retrieval, and the values are the equal weighted samples.
- param_dictdict
A dictionary where each key is the name of a retrieval, and the values are the names of the free parameters associated with that retrieval.
- ret_namesOptional(list(string))
A list of retrieval names, each should be included in the sample_dict. If left as none, it defaults to only using the current retrieval name.
- nsampleOptional(int)
The number of times to compute Teff. If left empty, uses the “take_PTs_from” plot_kwarg. Recommended to use ~300 samples, probably more than is set in the kwarg!
- resolutionint
The spectra resolution to compute the models at. Typically, this should be very low in order to enable rapid calculation.
- Returns:
- tdictdict
A dictionary with retrieval names for keys, and the values are the calculated values of Teff for each sample.
- plot_all(output_dir=None, ret_names=[], contribution=False)¶
Produces plots for the best fit spectrum, a sample of 100 output spectra, the best fit PT profile and a corner plot for parameters specified in the run definition.
- plot_spectra(samples_use, parameters_read, model_generating_func=None)¶
Plot the best fit spectrum, the data from each dataset and the residuals between the two. Saves a file to OUTPUT_DIR/evaluate_RETRIEVAL_NAME/best_fit_spec.pdf
- Args:
- samples_usenumpy.ndarray
An array of the samples from the post_equal_weights file, used to find the best fit sample
- parameters_readlist
A list of the free parameters as read from the output files.
- model_generating_funmethod
A function that will take in the standard ‘model’ arguments (pRT_object, params, pt_plot_mode, AMR, resolution) and will return the wavlength and flux arrays as calculated by petitRadTrans. If no argument is given, it uses the method of the first dataset included in the retrieval.
- Returns:
- figmatplotlib.figure
The matplotlib figure, containing the data, best fit spectrum and residuals.
- axmatplotlib.axes
The upper pane of the plot, containing the best fit spectrum and data
- ax_rmatplotlib.axes
The lower pane of the plot, containing the residuals between the fit and the data
- plot_sampled(samples_use, parameters_read, downsample_factor=None, save_outputs=False)¶
Plot a set of randomly sampled output spectra for each dataset in the retrieval.
This will save nsample files for each dataset included in the retrieval. Note that if you change the model_resolution of your Data and rerun this function, the files will NOT be updated - if the files exists the function defaults to reading from file rather than recomputing. Delete all of the sample functions and run it again.
- Args:
- samples_usenp.ndarray
posterior samples from pynmultinest outputs (post_equal_weights)
- downsample_factorint
Factor by which to reduce the resolution of the sampled model, for smoother plotting. Defaults to None. A value of None will result in the full resolution spectrum. Note that this factor can only reduce the resolution from the underlying model_resolution of the data.
- plot_PT(sample_dict, parameters_read, contribution=False)¶
Plot the PT profile with error contours
- Args:
- samples_usenp.ndarray
posterior samples from pynmultinest outputs (post_equal_weights)
- parameters_readList
Used to plot correct parameters, as some in self.parameters are not free, and aren’t included in the PMN outputs
- weightedbool
Weight the opacity of the pt profile by the emission contribution function, and overplot the contribution curve.
- Returns:
fig : matplotlib.figure ax : matplotlib.axes
- plot_corner(sample_dict, parameter_dict, parameters_read, plot_best_fit=True, **kwargs)¶
Make the corner plots
- Args:
- samples_dictDict
Dictionary of samples from PMN outputs, with keys being retrieval names
- paramete` 1 1` Qar_dictDict
Dictionary of parameters for each of the retrievals to be plotted.
- parameters_readList
Used to plot correct parameters, as some in self.parameters are not free, and aren’t included in the PMN outputs
- kwargsdict
Each kwarg can be one of the kwargs used in corner.corner. These can be used to adjust the title_kwargs,label_kwargs,hist_kwargs, hist2d_kawargs or the contour kwargs. Each kwarg must be a dictionary with the arguments as keys and values as the values.
- plot_data()¶
- plot_contribution(samples_use, parameters_read, model_generating_func=None, log_scale_contribution=False, n_contour_levels=30)¶
Plot the contribution function from the best fit spectrum, the data from each dataset and the residuals between the two. Saves a file to OUTPUT_DIR/evaluate_RETRIEVAL_NAME/best_fit_spec.pdf
- Args:
- samples_usenumpy.ndarray
An array of the samples from the post_equal_weights file, used to find the best fit sample
- parameters_readlist
A list of the free parameters as read from the output files.
- model_generating_funmethod
A function that will take in the standard ‘model’ arguments (pRT_object, params, pt_plot_mode, AMR, resolution) and will return the wavlength and flux arrays as calculated by petitRadTrans. If no argument is given, it uses the method of the first dataset included in the retrieval.
- Returns:
- figmatplotlib.figure
The matplotlib figure, containing the data, best fit spectrum and residuals.
- axmatplotlib.axes
The upper pane of the plot, containing the best fit spectrum and data
- ax_rmatplotlib.axes
The lower pane of the plot, containing the residuals between the fit and the data
- plot_abundances(samples_use, parameters_read, species_to_plot=None, contribution=False)¶