petitRADTRANS.retrieval.data
============================

.. py:module:: petitRADTRANS.retrieval.data


Classes
-------

.. autoapisummary::

   petitRADTRANS.retrieval.data.Data
   petitRADTRANS.retrieval.data.PhotometryData
   petitRADTRANS.retrieval.data.TimeSeriesData


Module Contents
---------------

.. py:class:: Data(name: str, path_to_observations: str | None = None, data_resolution: float | None = None, model_resolution: float | None = None, system_distance: float | None = None, external_radtrans_reference: object | None = None, model_generating_function: Callable[Ellipsis, tuple[jax.typing.ArrayLike, jax.typing.ArrayLike, jax.typing.ArrayLike]] | None = None, wavelength_boundaries: tuple[float, float] | None = None, scale_flux: bool = False, scale_uncertainties: bool = False, fit_flux_offset: bool = False, fit_instrumental_resolution: bool = False, subtract_continuum: bool = False, wavelength_bin_widths: jax.typing.ArrayLike | None = None, line_opacity_mode: str = 'c-k', radtrans_grid: bool = False, radtrans_object: object = None, wavelengths: jax.typing.ArrayLike | None = None, spectrum: jax.typing.ArrayLike | None = None, uncertainties: jax.typing.ArrayLike | None = None, covariance: jax.typing.ArrayLike | None = None, fit_covariance: bool = False, covariance_mode: str = 'none', global_covariance_kernel: str = 'squared_exponential', local_covariance_kernel: str = 'squared_exponential', n_local_covariance_kernels: int = 0, covariance_jitter: float = 0.0, mask: jax.typing.ArrayLike | None = None)

   This class stores the spectral data to be retrieved from a single instrument or observation.

   Each dataset is associated with an instance of petitRADTRANS and an atmospheric model.
   The pRT instance can be overridden and associated with an existing pRT instance through the
   ``external_radtrans_reference`` parameter.
   This setup allows for joint or independent retrievals on multiple datasets.

   Args:
       name : str
           Identifier for this data set.
       path_to_observations : str
           Path to observations file, including filename. This can be a txt or dat file
           containing the wavelength, flux, transit depth and error, or a fits file
           containing the wavelength, spectrum and covariance matrix.
           Alternatively, the data information can be directly given by the wavelengths, spectrum, uncertainties, and
           mask attributes.
       data_resolution : float or jnp.ndarray
           Spectral resolution of the instrument. Optional, allows convolution of model to
           instrumental line width. If the data_resolution is an array, the resolution can
           vary as a function of wavelength. The array should have the same shape as
           the input wavelength array, and should specify the spectral resolution at each
           wavelength bin.
       model_resolution : float
           The resolution of the c-k opacity tables in pRT. ``None`` by default.
           Setting it generates a new c-k table using exo-k. The default (and maximum)
           correlated-k resolution in pRT is :math:`\lambda/\Delta \lambda > 1000` (R=500).
           Lowering the resolution will speed up the computation.
           If set to a positive integer while ``line_opacity_mode`` is ``'lbl'``, the
           high-resolution opacities are sampled at the specified resolution.
           This may be desired when medium-resolution spectra are
           required with a :math:`\lambda/\Delta \lambda > 1000`, but much smaller than
           :math:`10^6`, which is the resolution of the ``lbl`` mode. In this case it
           may make sense to carry out the calculations with line_by_line_opacity_sampling = 10e5,
           for example, and then re-bin to the final desired resolution, which may save time.
           The user should verify whether this leads to
           solutions which are identical to the re-binned results of the fiducial
           :math:`10^6` resolution. If not, this parameter must not be used.
           Note the difference between this parameter and the line_by_line_opacity_sampling
           parameter in the RadTrans class: the actual desired resolution should
           be set here.
       system_distance : float
           The distance to the object in CGS units. Defaults to a 10pc normalized distance.
       external_radtrans_reference : object
           An existing RadTrans object. Leave as ``None`` unless you are sure of what you are doing.
       model_generating_function : method
           A function, typically defined in run_definition.py that returns the model wavelength and spectrum
           (emission or transmission).
           This is the function that contains the physics of the model, and calls pRT in order to compute the
           spectrum.
       wavelength_boundaries : tuple, list
           Set the wavelength range of the pRT object. Defaults to a range +/-5% greater than that of the data.
           Must at least be equal to the range of the data.
       scale_flux : bool
           Turn on or off scaling the data by a constant factor. Set to True if scaling the data during the
           retrieval.
       scale_uncertainties : bool
           Turn on or off scaling the uncertainties by a constant factor. Set to True if scaling the
           uncertainties during the retrieval.
       fit_flux_offset : bool
           Turn on or off fitting a flux offset. Set to True if fitting a flux offset during the retrieval.
       wavelength_bin_widths : numpy.ndarray
           Set the wavelength bin width to bin the Radtrans object to the data. Defaults to the data bins.
       photometry : bool
           Set to True if using photometric data.
       photometric_transformation_function : method
           Transform the photometry (account for filter transmission etc.).
           This function must take in the wavelength and flux of a spectrum,
           and output a single photometric point (and optionally flux error).
       photometric_bin_edges : Tuple, numpy.ndarray
           The edges of the photometric bin in micron. [low,high]
       line_opacity_mode : str
           Should the retrieval be run using correlated-k opacities (default, 'c-k'),
           or line by line ('lbl') opacities? If 'lbl' is selected, it is HIGHLY
           recommended to set the model_resolution parameter. In general,
           'c-k' mode is recommended for retrievals of everything other than
           high-resolution (R>40000) spectra.
       radtrans_grid : bool
           Set to True if the data has been binned to a pRT c-k grid.
       radtrans_object:
           An instance of Radtrans object to be used to generate model spectra in retrievals.
       wavelengths:
           (um) Wavelengths of the data.
       spectrum:
           Spectrum of the data.
       uncertainties:
           Uncertainties of the data, in the same units as the spectrum.
       covariance:
           Covariance matrix of the data, in the same units as the spectrum squared.
       fit_covariance:
           If True, construct a fitted covariance matrix during retrieval using GP-like kernel terms.
           This is currently supported for 1D spectroscopic datasets only.
       covariance_mode:
           Covariance model to use. Supported values are ``'none'``, ``'local'``, ``'global'``, and
           ``'global_local'``.
       global_covariance_kernel:
           Global kernel name used when ``covariance_mode`` includes a global term.
       local_covariance_kernel:
           Local kernel name used when ``covariance_mode`` includes local terms.
       n_local_covariance_kernels:
           Number of local covariance kernels to include when using ``'global_local'`` mode.
       covariance_jitter:
           Small diagonal stabilization term added to the fitted covariance matrix.
       mask:
           Mask of the data.
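
   The fitted-covariance arguments above can be pictured with a small, dependency-free sketch.
   It assumes that the ``'squared_exponential'`` kernel has the usual form
   :math:`a^2 \exp\left(-(\lambda_i - \lambda_j)^2 / 2\ell^2\right)` and that ``covariance_jitter``
   is added to the diagonal. The function name, the ``amplitude`` and ``length_scale`` parameters,
   and the way the kernel is combined with the measured uncertainties are illustrative assumptions,
   not the library's implementation.

   .. code-block:: python

      import math


      def squared_exponential_covariance(wavelengths, uncertainties, amplitude,
                                         length_scale, jitter=0.0):
          """Hypothetical helper: diagonal noise plus one global kernel term."""
          n = len(wavelengths)
          covariance = [[0.0] * n for _ in range(n)]
          for i in range(n):
              for j in range(n):
                  separation = wavelengths[i] - wavelengths[j]
                  covariance[i][j] = amplitude ** 2 * math.exp(
                      -0.5 * (separation / length_scale) ** 2
                  )
              # White-noise variance and stabilizing jitter only enter on the diagonal.
              covariance[i][i] += uncertainties[i] ** 2 + jitter
          return covariance


      wavelengths = [1.00, 1.01, 1.02]  # microns
      uncertainties = [0.1, 0.1, 0.2]
      covariance = squared_exponential_covariance(
          wavelengths, uncertainties, amplitude=0.05, length_scale=0.02, jitter=1e-12
      )

   In this sketch the off-diagonal terms decay with wavelength separation, so neighbouring
   bins are more strongly correlated than distant ones.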


   .. py:attribute:: resolving_power_str
      :value: '.R'


   .. py:attribute:: name
      :type:  str


   .. py:attribute:: path_to_observations
      :type:  str | None
      :value: None


   .. py:attribute:: radtrans_object
      :type:  object
      :value: None


   .. py:attribute:: wavelengths
      :type:  jax.typing.ArrayLike | None
      :value: None


   .. py:attribute:: spectrum
      :type:  jax.typing.ArrayLike | None
      :value: None


   .. py:attribute:: uncertainties
      :type:  jax.typing.ArrayLike | None
      :value: None


   .. py:attribute:: mask
      :type:  jax.typing.ArrayLike


   .. py:attribute:: system_distance
      :type:  float


   .. py:attribute:: data_resolution
      :type:  float | None
      :value: None


   .. py:attribute:: data_resolution_array_model
      :type:  jax.typing.ArrayLike | None
      :value: None


   .. py:attribute:: model_resolution
      :type:  float | None
      :value: None


   .. py:attribute:: external_radtrans_reference
      :type:  object
      :value: None


   .. py:attribute:: model_generating_function
      :type:  Callable
      :value: None


   .. py:attribute:: line_opacity_mode
      :type:  str
      :value: 'c-k'


   .. py:attribute:: covariance
      :type:  jax.typing.ArrayLike | None
      :value: None


   .. py:attribute:: inv_cov
      :type:  jax.typing.ArrayLike | None
      :value: None


   .. py:attribute:: covariance_cholesky_factor
      :type:  jax.typing.ArrayLike | None
      :value: None


   .. py:attribute:: log_covariance_determinant
      :type:  float | None
      :value: None


   .. py:attribute:: fit_covariance
      :type:  bool
      :value: False


   .. py:attribute:: covariance_mode
      :type:  str
      :value: 'none'


   .. py:attribute:: global_covariance_kernel
      :type:  str
      :value: 'squared_exponential'


   .. py:attribute:: local_covariance_kernel
      :type:  str
      :value: 'squared_exponential'


   .. py:attribute:: n_local_covariance_kernels
      :type:  int
      :value: 0


   .. py:attribute:: covariance_jitter
      :type:  float


   .. py:attribute:: scale_flux
      :type:  bool
      :value: False


   .. py:attribute:: scale_uncertainties
      :type:  bool
      :value: False


   .. py:attribute:: fit_flux_offset
      :type:  bool
      :value: False


   .. py:attribute:: fit_instrumental_resolution
      :type:  bool
      :value: False


   .. py:attribute:: subtract_continuum
      :type:  bool
      :value: False


   .. py:attribute:: scale_factor
      :type:  float
      :value: 1.0


   .. py:attribute:: offset
      :type:  float
      :value: 0.0


   .. py:attribute:: bval
      :type:  float


   .. py:attribute:: wavelength_boundaries
      :type:  tuple[float, float] | None
      :value: None
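
      As a rough illustration of the default described for the ``wavelength_boundaries``
      argument, the sketch below pads the data range by 5% on each side. The helper name
      and the exact padding convention (scaling the end points by 0.95 and 1.05) are
      assumptions made for this example.

      .. code-block:: python

         def default_wavelength_boundaries(wavelengths, padding=0.05):
             """Hypothetical helper: widen the data range by ``padding`` on each side."""
             return (1.0 - padding) * min(wavelengths), (1.0 + padding) * max(wavelengths)


         low, high = default_wavelength_boundaries([1.0, 1.5, 2.0])  # microns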


   .. py:attribute:: wavelength_bin_widths
      :type:  jax.typing.ArrayLike | float | None
      :value: None


   .. py:attribute:: radtrans_grid
      :value: False


   .. py:method:: loadtxt(path: str, delimiter: str = ',', comments: str = '#') -> None

      Read a TXT or DAT file containing a header above 3 or 4 data columns representing a spectrum.
      Header lines should start with the symbol given in the ``comments`` argument.
      The 4 possible data columns are, from left to right:

      1. The wavelengths, in microns.
      2. [Optional] The wavelength bin widths, in microns.
      3. The flux or transit depths.
      4. The error on each data point.

      Alternatively, the wavelength column
      Checks will be performed to determine the correct delimiter, but the recommended format is to use a CSV file
      with columns for wavelength, flux and error.

      Args:
          path : str
              Directory and filename of the data.
          delimiter : string, int
              The string used to separate values. By default, commas act as delimiter.
              An integer or sequence of integers can also be provided as width(s) of each field.
          comments : string
              The character used to indicate the start of a comment.
              All the characters occurring on a line after a comment are discarded.
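
      The column layout described above can be sketched with the standard library alone.
      This is not the method's implementation: the function name and the returned dictionary
      are hypothetical, and the sketch does not attempt the delimiter checks or the
      fixed-width fields mentioned above.

      .. code-block:: python

         import csv
         import io


         def read_spectrum_text(text, delimiter=",", comments="#"):
             """Hypothetical reader for 3 or 4 column spectra; returns column lists by name."""
             # Drop everything after the comment character; empty lines are skipped below.
             lines = [line.split(comments, 1)[0].strip() for line in io.StringIO(text)]
             rows = [
                 [float(value) for value in row]
                 for row in csv.reader(lines, delimiter=delimiter)
                 if row
             ]
             columns = [list(column) for column in zip(*rows)]
             if len(columns) == 3:
                 names = ("wavelengths", "spectrum", "uncertainties")
             elif len(columns) == 4:
                 names = ("wavelengths", "wavelength_bin_widths", "spectrum", "uncertainties")
             else:
                 raise ValueError(f"expected 3 or 4 columns, found {len(columns)}")
             return dict(zip(names, columns))


         example = """# wavelength [micron], flux, error
         1.00, 2.5e-15, 1.0e-16
         1.10, 2.4e-15, 1.2e-16
         """
         data = read_spectrum_text(example)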


   .. py:method:: load_x1d_fits(path)

      Load an x1d FITS file as produced by the STScI JWST pipeline.
      Expects units of Jy for the flux and micron for the wavelength.

      Args:
          path : str
              Directory and filename of the data.


   .. py:method:: loadfits(path) -> None

      Load a Radtrans-formatted fits file.
      Must include extension SPECTRUM with fields WAVELENGTH, FLUX
      and COVARIANCE (or ERROR).

      Args:
          path : str
              Directory and filename of the data.


   .. py:method:: initialise_data_resolution(wavelengths_model: jax.typing.ArrayLike) -> None


   .. py:method:: update_bins(wavelengths: jax.typing.ArrayLike) -> None

      Update the wavelength bin widths based on the provided wavelength array.

      This method computes the bin width for each wavelength bin as the difference
      between consecutive wavelength values. The last bin width is set equal to the
      second-to-last bin width to maintain array length consistency.

      Args:
          wavelengths : ArrayLike
              Array of wavelength bin centers or edges.
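
      The rule described above can be sketched in a few lines of plain Python. The helper
      name is hypothetical and the sketch assumes at least two wavelength values.

      .. code-block:: python

         def bin_widths_from_wavelengths(wavelengths):
             """Hypothetical helper mirroring the description above."""
             widths = [upper - lower for lower, upper in zip(wavelengths[:-1], wavelengths[1:])]
             # Repeat the last difference so that there is one width per wavelength.
             widths.append(widths[-1])
             return widths


         widths = bin_widths_from_wavelengths([1.0, 1.5, 2.5, 4.5])
         # widths == [0.5, 1.0, 2.0, 2.0]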


   .. py:method:: scale_to_distance(new_distance: float) -> float

      Update the distance variable in the data class.

      This will rescale the flux to the new distance.

      Args:
          new_distance : float
              The distance to the object in CGS units.
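
      A minimal sketch of such a rescaling, assuming the usual inverse-square law
      (flux proportional to :math:`1/d^2`). The helper name and its return values are
      illustrative and are not taken from the method itself.

      .. code-block:: python

         def rescale_flux_to_distance(flux, old_distance, new_distance):
             """Hypothetical helper: inverse-square rescaling of a flux array."""
             factor = (old_distance / new_distance) ** 2
             return [value * factor for value in flux], factor


         PARSEC_CM = 3.0857e18  # one parsec in centimetres (CGS)
         flux_10pc = [4.0e-15, 2.0e-15]
         flux_20pc, factor = rescale_flux_to_distance(
             flux_10pc, 10.0 * PARSEC_CM, 20.0 * PARSEC_CM
         )

      Under the same assumption, uncertainties scale by the same factor and covariance
      terms by its square.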


   .. py:method:: line_b_uncertainty_scaling(parameters)

      This function implements the 10^b scaling from Line 2015, which allows
      us to account for underestimated uncertainties.

      We modify the standard error on the data point by the factor 10^b to account for
      underestimated uncertainties and/or unknown missing forward model physics
      (Foreman-Mackey et al. 2013, Hogg et al. 2010, Tremaine et al. 2002), e.g., imperfect fits.
      This results in a more generous estimate of the parameter uncertainties. Note that this
      is similar to inflating the error bars post-facto in order to achieve reduced chi-squares
      of unity, except that this approach is more formal because uncertainties in this parameter
      are properly marginalized into the other relevant parameters. Generally, the factor 10^b
      takes on values that fall between the minimum and maximum of the square of the data uncertainties.

      Args:
          parameters: Dict
              Dictionary of Parameters, should contain key 'uncertainty_scaling_b'.
              This can be done for all data sets, or specified with a tag at the end of
              the key to apply different factors to different datasets.
      Returns:
          b: float
              10**b error bar scaling factor.
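
      One common way to apply this factor is to add it to the variance of each data point,
      :math:`s_i^2 = \sigma_i^2 + 10^b`. The sketch below assumes that form; the helper name
      is hypothetical, and where exactly the library applies the factor is not shown here.

      .. code-block:: python

         import math


         def inflate_uncertainties(uncertainties, b):
             """Hypothetical helper: add the 10**b term to each variance."""
             return [math.sqrt(sigma ** 2 + 10.0 ** b) for sigma in uncertainties]


         inflated = inflate_uncertainties([3.0e-3, 4.0e-3], b=-5.0)

      Because the term is added in quadrature, the inflated uncertainties are never
      smaller than the original ones.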


   .. py:method:: _validate_covariance_configuration() -> None


   .. py:method:: to_observation_state(model_contract: str = 'legacy') -> petitRADTRANS.retrieval.runtime.ObservationState


   .. py:method:: _cache_covariance_factorization() -> None


   .. py:method:: refresh_covariance_bookkeeping() -> None


   .. py:method:: to_likelihood_state(n_free_parameters: int = 0, parameter_names: tuple[str, Ellipsis] = ()) -> petitRADTRANS.retrieval.runtime.LikelihoodState


   .. py:method:: apply_flux_scaling(input_spectrum, parameters_dict: dict) -> jax.typing.ArrayLike

      Applies multiplicative flux scaling based on parameters in parameters_dict.


   .. py:method:: offset_flux(input_spectrum, parameters_dict: dict) -> jax.typing.ArrayLike

      Applies additive flux offset based on parameters in parameters_dict.
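
      A minimal sketch of how a multiplicative scaling and an additive offset could be
      read from a parameter dictionary. The key names, the order of the two operations
      and the sign of the offset are assumptions made for illustration; the neutral
      defaults match the ``scale_factor`` and ``offset`` attribute defaults listed above.

      .. code-block:: python

         def scale_and_offset(spectrum, parameters, name):
             """Hypothetical helper: apply a per-dataset scale factor and additive offset."""
             # Illustrative key names; missing keys fall back to the neutral values.
             scale_factor = parameters.get(f"{name}_scale_factor", 1.0)
             offset = parameters.get(f"{name}_offset", 0.0)
             return [scale_factor * value + offset for value in spectrum]


         parameters = {"NIRSpec_scale_factor": 1.5, "NIRSpec_offset": 0.25}
         adjusted = scale_and_offset([1.0, 2.0], parameters, name="NIRSpec")
         # adjusted == [1.75, 3.25]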


   .. py:method:: scale_and_inflate_uncertainties(parameters_dict: dict) -> tuple[jax.typing.ArrayLike, jax.typing.ArrayLike]

      Applies scaling to self.uncertainties or self.covariance.
      Returns a tuple: (scaled_uncertainties, scaled_covariance).
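
      The scaling step can be sketched as follows, assuming a single constant factor:
      1-sigma uncertainties scale linearly with the factor, while covariance terms scale
      with its square. The helper name and signature are hypothetical, and the sketch
      covers only the scaling, not any additional inflation term.

      .. code-block:: python

         def scale_uncertainties(uncertainties, covariance, factor):
             """Hypothetical helper: scale errors by ``factor`` and covariance by ``factor**2``."""
             scaled_uncertainties = [factor * sigma for sigma in uncertainties]
             scaled_covariance = None
             if covariance is not None:
                 scaled_covariance = [
                     [factor ** 2 * element for element in row] for row in covariance
                 ]
             return scaled_uncertainties, scaled_covariance


         sigma, cov = scale_uncertainties([0.5, 1.0], [[0.25, 0.0], [0.0, 1.0]], factor=2.0)
         # sigma == [1.0, 2.0] and cov == [[1.0, 0.0], [0.0, 4.0]]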


.. py:class:: PhotometryData(name, photometric_transformation_function, model_generating_function, photometric_bin_edges=None, wavelength_boundaries=None, path_to_observations=None, data_resolution=None, model_resolution=None, system_distance=None, external_radtrans_reference=None, scale_flux=False, scale_uncertainties=False, fit_flux_offset=False, line_opacity_mode='c-k', radtrans_object=None, wavelengths=None)

   Bases: :py:obj:`Data`


   Data class for photometric observations.
   Handles single or multiple photometric bands.


   .. py:attribute:: photometric_transformation_function


   .. py:attribute:: photometric_bin_edges


   .. py:attribute:: wavelength_boundaries
      :value: None


   .. py:attribute:: wavelength_bin_widths
      :value: None


   .. py:method:: _build_photometry_metadata() -> dict[str, object]


   .. py:method:: to_observation_state(model_contract: str = 'legacy') -> petitRADTRANS.retrieval.runtime.ObservationState


   .. py:method:: to_likelihood_state(n_free_parameters: int = 0, parameter_names: tuple[str, Ellipsis] = ()) -> petitRADTRANS.retrieval.runtime.LikelihoodState


.. py:class:: TimeSeriesData(name, path_to_observations=None, filename_list=None, observation_times=None, data_resolution=None, model_resolution=None, system_distance=None, external_radtrans_reference=None, model_generating_function=None, wavelength_boundaries=None, scale_flux=False, scale_uncertainties=False, fit_flux_offset=False, line_opacity_mode='c-k', radtrans_grid=False, concatenate_flux_epochs_variability=False, atmospheric_column_flux_mixer=None, variability_atmospheric_column_model_flux_return_mode=False, radtrans_object=None, spectrum=None, wavelengths=None, wavelength_bin_widths=None, uncertainties=None, covariance=None, mask=None)

   Bases: :py:obj:`Data`


   Data class for time-series observations.
   Handles spectra observed at multiple epochs.


   .. py:attribute:: filename_list
      :value: None


   .. py:attribute:: concatenate_flux_epochs_variability
      :value: False


   .. py:attribute:: atmospheric_column_flux_mixer
      :value: None


   .. py:attribute:: variability_atmospheric_column_model_flux_return_mode
      :value: False


   .. py:attribute:: time_series_config
      :value: None


   .. py:method:: _collapse_shared_epoch_axis(values)
      :staticmethod:


   .. py:method:: to_observation_state(model_contract: str = 'legacy') -> petitRADTRANS.retrieval.runtime.ObservationState


   .. py:method:: load_single_spectrum_txt(delimiter=',', comments='#')


   .. py:method:: loadtxt(path, delimiter=',', comments='#')

      This function reads in a .txt or .dat file containing the spectrum. Headers should be commented out with '#',
      the first column must be the wavelength in micron, the second column the flux or transit depth,
      and the final column must be the error on each data point.
      Checks will be performed to determine the correct delimiter, but the recommended format is to use a
      csv file with columns for wavelength, flux and error.

      Args:
          path : str
              Directory and filename of the data.
          delimiter : string, int
              The string used to separate values. By default, commas act as delimiter.
              An integer or sequence of integers can also be provided as width(s) of each field.
          comments : string
              The character used to indicate the start of a comment.
              All the characters occurring on a line after a comment are discarded.


   .. py:method:: initialise_data_resolution(wavelengths_model: jax.typing.ArrayLike) -> None