petitRADTRANS.sbi.encoders#

The encoder is a learned function whose architecture directly controls what information the flow can condition on. The default deliberately separates spectral amplitude from shape so the posterior cannot collapse to the prior on shape-determined parameters (composition, temperature, clouds).

Classes#

`SpectralConvolutionEncoder`	The default and recommended encoder.
`PhotometryPointEncoder`	Learned photometry encoder using per-point MLP features and pooling.
`DatasetSetAggregator`	Learned permutation-invariant aggregator over block embeddings.
`ObservationEncoder`	Learned hierarchical encoder dispatching over modalities.

Module Contents#

class petitRADTRANS.sbi.encoders.SpectralConvolutionEncoder(embedding_dim: int = 64, n_wavelengths: int = 233, channels: int = 64, dilations: tuple[int, Ellipsis] = (1, 2, 4, 8, 16, 32), key: jax.Array | None = None)#

Bases: equinox.Module

The default and recommended encoder. Observable-agnostic spectral encoder that explicitly factors a spectrum into amplitude and shape, so the network can never collapse to an amplitude-only summary (the failure mode that left composition / temperature / cloud marginals at the prior).

The spectrum is split into:

amplitude features (low-dim): the global log(scale) plus the per-spectrum robust center / scale / sign statistics. For emission this carries the orders-of-magnitude flux scale; for transmission the baseline transit depth.
a shape tensor [(v-center)/scale, sigma/scale, normalized_lambda, mask] with the amplitude divided out.
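The split above can be illustrated with a minimal NumPy sketch. This page does not state which robust statistics are used, so the median and the median absolute deviation, the helper name `split_amplitude_shape`, and the ordering of the amplitude features are all assumptions made for illustration, not the library's implementation.

```python
import numpy as np

def split_amplitude_shape(values, sigma, wavelengths, mask):
    """Hypothetical helper: factor a spectrum into amplitude features and a shape tensor."""
    valid = mask.astype(bool)
    center = np.median(values[valid])                   # robust center (assumed: median)
    scale = np.median(np.abs(values[valid] - center))   # robust scale (assumed: MAD)
    scale = max(float(scale), np.finfo(np.float64).tiny)  # guard against division by zero
    # Low-dimensional amplitude features: log(scale), scaled center, sign of the center.
    amplitude = np.array([np.log(scale), center / scale, np.sign(center)])
    # Wavelength coordinate rescaled to [0, 1].
    lam = (wavelengths - wavelengths.min()) / (wavelengths.max() - wavelengths.min())
    m = mask.astype(np.float64)
    # Shape tensor with the amplitude divided out; masked points are zeroed.
    shape = np.stack([m * (values - center) / scale, m * sigma / scale, lam, m])
    return amplitude, shape

rng = np.random.default_rng(0)
wavelengths = np.linspace(1.0, 5.0, 233)
values = 1e-15 * (1.0 + 0.1 * rng.normal(size=233))   # emission-like flux scale
sigma = np.full(233, 1e-17)
mask = np.ones(233, dtype=bool)

amp_a, shape_a = split_amplitude_shape(values, sigma, wavelengths, mask)
# Rescaling the whole spectrum changes only the amplitude features.
amp_b, shape_b = split_amplitude_shape(1e3 * values, 1e3 * sigma, wavelengths, mask)
```

Multiplying the spectrum by a constant leaves the shape tensor unchanged and shifts only the log(scale) feature, which is the property the factorisation relies on.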

The shape tensor is encoded by a dilated 1-D convolution stack (dilations 1, 2, 4, …), giving a receptive field that spans the whole band. A learned-query attention pools the per-wavelength features into a global shape embedding. Amplitude enters through a separate, un-bottlenecked MLP, so the shape network never spends capacity re-encoding amplitude. The two are concatenated, fused by an MLP, and LayerNorm-normalised.
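A minimal NumPy sketch of learned-query attention pooling, as described above. The `1/sqrt(channels)` score scaling, the handling of masked points, and the function name `attention_pool` are assumptions; the weight matrices stand in for the `attention_key` and `attention_value` layers and are random here.

```python
import numpy as np

def attention_pool(features, mask, w_key, w_value, query):
    """Pool per-wavelength features (n_wavelengths, channels) into one (channels,) vector."""
    keys = features @ w_key                              # per-wavelength keys
    vals = features @ w_value                            # per-wavelength values
    scores = keys @ query / np.sqrt(query.shape[0])      # one score per wavelength
    scores = np.where(mask, scores, -np.inf)             # assumed: padded points get no weight
    scores = scores - scores.max()                       # numerically stable softmax
    weights = np.exp(scores)
    weights = weights / weights.sum()
    return weights @ vals, weights

rng = np.random.default_rng(0)
n_wavelengths, channels = 233, 64
features = rng.normal(size=(n_wavelengths, channels))
mask = np.arange(n_wavelengths) < 200                    # last 33 points are padding
w_key = rng.normal(size=(channels, channels)) / np.sqrt(channels)
w_value = rng.normal(size=(channels, channels)) / np.sqrt(channels)
query = rng.normal(size=channels)

pooled, weights = attention_pool(features, mask, w_key, w_value, query)

# Changing the padded entries must not change the pooled embedding.
perturbed = features.copy()
perturbed[~mask] += 100.0
pooled_perturbed, _ = attention_pool(perturbed, mask, w_key, w_value, query)
```

Because a single query is scored against every wavelength, the pooled vector can draw on the whole band regardless of how far the convolution stack reaches.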

Parameters#

embedding_dim:

Size of the spectral embedding returned for one observation block, i.e. the length of the summary vector handed to the flow. A larger value gives more capacity to describe the spectrum at the cost of more parameters to train. Defaults to 64.

n_wavelengths:

Fixed number of spectral points per observation block. Inputs are padded or truncated to this length (see _pad_to_fixed()), so it must match the observation schema of the SBI task.

channels:

Width of the convolution stack and attention projections, i.e. the number of feature channels carried through the shape branch.

dilations:

Dilation factor of each residual dilated convolution, applied in order. An ordinary convolution only looks at immediate neighbors; a dilated one spreads its taps out with gaps, so dilation 1 covers adjacent points, dilation 2 skips every other point, and so on. Each entry therefore controls how far along the spectrum the network can see at that layer, and the successive doubling (1, 2, 4, ...) grows the receptive field geometrically so the stack spans the whole band with few layers.

key:

Optional JAX random key used for weight initialization. Defaults to jax.random.PRNGKey(0) when omitted.
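To make the effect of the `dilations` parameter concrete, the receptive field of a stack of stride-1 dilated convolutions can be computed by hand. The kernel size is not given on this page, so it appears below as a hypothetical parameter and two values are shown.

```python
def receptive_field(dilations, kernel_size):
    """Receptive field (in spectral points) of stacked stride-1 dilated convolutions."""
    # Each layer widens the field by (kernel_size - 1) * dilation points.
    return 1 + (kernel_size - 1) * sum(dilations)

dilations = (1, 2, 4, 8, 16, 32)                      # the documented default
rf_k3 = receptive_field(dilations, kernel_size=3)     # 127 points
rf_k5 = receptive_field(dilations, kernel_size=5)     # 253 points
```

Whether the convolution stack alone covers all 233 default wavelength points therefore depends on the kernel size; the attention pooling that follows the stack sees every wavelength in either case.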

Attributes#

input_projection:: 1x1 equinox.nn.Conv1d lifting the 5 spectral input channels, which include [(v-center)/scale, sigma/scale, normalized_lambda, mask], to channels feature channels.

Shape branch

dilated_convs:: Tuple of equinox.nn.Conv1d layers (one per entry in dilations) applied as GELU residual blocks to build the wide-receptive-field shape representation.

attention_key:: equinox.nn.Linear producing the per-wavelength attention keys used by the learned-query pooling.
attention_value:: equinox.nn.Linear producing the per-wavelength attention values that are weighted and summed into the shape embedding.
attention_query:: Learned query vector of shape (channels,) scored against every wavelength’s key to focus pooling on informative band-heads.

Amplitude branch

amplitude_projection:: equinox.nn.MLP mapping the low-dimensional amplitude features (global log(scale) plus per-spectrum center/scale/sign statistics) into a channels-wide amplitude embedding, kept separate so the shape branch never re-encodes amplitude.

fusion:: equinox.nn.MLP fusing the concatenated shape and amplitude embeddings (2 * channels wide) down to embedding_dim.
norm:: equinox.nn.LayerNorm applied to the fused embedding so the flow conditioner receives a consistently scaled vector.
embedding_dim:: Static output embedding size (mirrors the constructor argument).
n_wavelengths:: Static fixed spectral length inputs are padded/truncated to.
channels:: Static convolution / attention channel width.

input_projection: equinox.nn.Conv1d#

dilated_convs: tuple#

attention_key: equinox.nn.Linear#

attention_value: equinox.nn.Linear#

attention_query: jax.numpy.ndarray#

amplitude_projection: equinox.nn.MLP#

fusion: equinox.nn.MLP#

norm: equinox.nn.LayerNorm#

embedding_dim: int#

n_wavelengths: int#

channels: int#

_forward(values: jax.numpy.ndarray, uncertainties: jax.numpy.ndarray, coordinates: jax.numpy.ndarray, mask: jax.numpy.ndarray, log_median_flux: jax.numpy.ndarray | None) → jax.numpy.ndarray#

_pad_to_fixed(arr: jax.numpy.ndarray) → jax.numpy.ndarray#
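The behaviour of `_pad_to_fixed()` is not documented beyond padding or truncating to `n_wavelengths`. A minimal NumPy sketch of that idea follows; trailing padding, a zero fill value, truncation from the end, and the name `pad_to_fixed` are assumptions.

```python
import numpy as np

def pad_to_fixed(arr, n_wavelengths=233, fill=0.0):
    """Hypothetical stand-in: pad or truncate a 1-D array to a fixed length."""
    arr = np.asarray(arr)
    n = arr.shape[0]
    if n >= n_wavelengths:
        return arr[:n_wavelengths]                     # drop trailing points
    padding = np.full(n_wavelengths - n, fill, dtype=arr.dtype)
    return np.concatenate([arr, padding])              # append fill values

short = pad_to_fixed(np.arange(200.0))                 # padded from 200 to 233 points
long = pad_to_fixed(np.arange(300.0))                  # truncated from 300 to 233 points
mask = pad_to_fixed(np.ones(200, dtype=bool))          # padded entries become False
```

Padding the mask with False lets later stages tell real points from padding.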

_prepare_arrays(values: jax.numpy.ndarray, uncertainties: jax.numpy.ndarray, coordinates: jax.numpy.ndarray, mask: jax.numpy.ndarray, n: int) → tuple[jax.numpy.ndarray, jax.numpy.ndarray, jax.numpy.ndarray, jax.numpy.ndarray]#

encode_block(block: petitRADTRANS.sbi.observation.ObservationBlock) → jax.numpy.ndarray#

Encode one spectral observation block.

_encode_block_raw(values: jax.numpy.ndarray, uncertainties: jax.numpy.ndarray, coordinates: jax.numpy.ndarray, mask: jax.numpy.ndarray, log_median_flux: jax.numpy.ndarray | None = None, absolute_values: jax.numpy.ndarray | None = None) → jax.numpy.ndarray#

Encode pre-processed block arrays. This variant takes plain arrays rather than an ObservationBlock, so it can be mapped over a batch with jax.vmap.

The robust amplitude/shape decomposition is computed directly from values.

class petitRADTRANS.sbi.encoders.PhotometryPointEncoder(embedding_dim: int = 64, hidden_dim: int = 96, key: jax.Array | None = None)#

Bases: equinox.Module

Learned photometry encoder using per-point MLP features and pooling.

Parameters#

embedding_dim:: Size of the returned photometric embedding.
hidden_dim:: Hidden width of the per-point MLP.
key:: Optional JAX random key used for initialization.

Notes#

Each photometric point is represented by value, uncertainty, coordinate, and an inferred width feature before permutation-invariant pooling.
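A minimal NumPy sketch of per-point features followed by permutation-invariant pooling. The pooling operator (a masked mean here), the tanh activation, and the function name `encode_points` are assumptions, and because this page does not say how the width feature is inferred, it is supplied as a made-up input array.

```python
import numpy as np

def encode_points(features, mask, w1, b1, w2, b2):
    """Apply the same small MLP to every point, then pool over valid points."""
    hidden = np.tanh(features @ w1 + b1)               # (n_points, hidden_dim)
    per_point = hidden @ w2 + b2                       # (n_points, embedding_dim)
    m = mask.astype(np.float64)[:, None]
    return (per_point * m).sum(axis=0) / max(m.sum(), 1.0)   # masked mean

rng = np.random.default_rng(1)
n_points, hidden_dim, embedding_dim = 6, 96, 64
values = rng.normal(size=n_points)
sigma = np.full(n_points, 0.1)
coords = np.array([0.9, 1.2, 1.6, 2.2, 3.6, 4.5])      # e.g. filter central wavelengths
width = np.array([0.1, 0.2, 0.3, 0.3, 0.7, 1.0])       # made-up filter widths
features = np.stack([values, sigma, coords, width], axis=1)
mask = np.array([True, True, True, True, True, False])

w1, b1 = rng.normal(size=(4, hidden_dim)), np.zeros(hidden_dim)
w2, b2 = rng.normal(size=(hidden_dim, embedding_dim)), np.zeros(embedding_dim)

emb = encode_points(features, mask, w1, b1, w2, b2)
# Shuffling the points (and their mask) gives the same embedding.
perm = rng.permutation(n_points)
emb_perm = encode_points(features[perm], mask[perm], w1, b1, w2, b2)
```

Since each point is processed independently and the pooling is symmetric, the order in which photometric points are listed does not matter.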

point_mlp: equinox.nn.MLP#

output_projection: equinox.nn.Linear#

embedding_dim: int#

encode_block(block: petitRADTRANS.sbi.observation.ObservationBlock) → jax.numpy.ndarray#

Encode one photometric observation block.

Parameters#

block:: Photometry-like observation block to encode.

Returns#

jnp.ndarray: Dense embedding for the supplied photometric block.

_encode_block_raw(values: jax.numpy.ndarray, uncertainties: jax.numpy.ndarray, coordinates: jax.numpy.ndarray, mask: jax.numpy.ndarray) → jax.numpy.ndarray#

Encode pre-processed block arrays. This variant takes plain arrays rather than an ObservationBlock, so it can be mapped over a batch with jax.vmap.

Parameters#

values, uncertainties, coordinates:: Float32 1-D arrays already produced by _as_vector. uncertainties and coordinates may have length 0.
mask:: Boolean 1-D array already produced by _safe_mask.

Returns#

jnp.ndarray: Dense photometric embedding of shape (embedding_dim,).

class petitRADTRANS.sbi.encoders.DatasetSetAggregator(embedding_dim: int = 128, hidden_dim: int = 128, key: jax.Array | None = None)#

Bases: equinox.Module

Learned permutation-invariant aggregator over block embeddings.

Parameters#

embedding_dim:: Target dimensionality of the aggregated observation embedding.
hidden_dim:: Hidden width of the block projection MLP.
key:: Optional JAX random key used for initialization.

block_projection: equinox.nn.MLP#

output_projection: equinox.nn.Linear#

embedding_dim: int#

aggregate(block_embeddings: list[jax.numpy.ndarray]) → jax.numpy.ndarray#

Aggregate a list of block embeddings into one observation embedding.

Parameters#

block_embeddings:: Encoded observation blocks for one retrieval target.

Returns#

jnp.ndarray: Permutation-invariant aggregated embedding.
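To illustrate why a pooled aggregate is permutation-invariant while a plain concatenation is not, here is a short NumPy sketch. The mean pooling, the tanh projection, the random weights, and the 64-wide block embeddings are assumptions; only the overall project, pool, project pattern follows the attributes listed above.

```python
import numpy as np

rng = np.random.default_rng(2)
w_block = rng.normal(size=(64, 128)) * 0.1     # stand-in for block_projection
w_out = rng.normal(size=(128, 128)) * 0.1      # stand-in for output_projection

def aggregate(block_embeddings):
    """Project each block, pool symmetrically, then project the pooled vector."""
    projected = [np.tanh(e @ w_block) for e in block_embeddings]
    pooled = np.mean(projected, axis=0)        # symmetric in the block order
    return pooled @ w_out

blocks = [rng.normal(size=64) for _ in range(3)]
agg = aggregate(blocks)
agg_reordered = aggregate(blocks[::-1])        # same blocks, reversed order

# Concatenation, by contrast, depends on the order of the blocks.
concat = np.concatenate(blocks)
concat_reordered = np.concatenate(blocks[::-1])
```

A symmetric pooling step also lets the same aggregator handle observations with different numbers of blocks.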

class petitRADTRANS.sbi.encoders.ObservationEncoder(embedding_dim: int = 128, spectrum_embedding_dim: int = 64, photometry_embedding_dim: int = 64, hidden_dim: int = 128, spectrum_encoder_type: str = 'convolution', n_wavelengths: int = 233, key: jax.Array | None = None)#

Bases: equinox.Module, petitRADTRANS.sbi.observation.AbstractObservationEncoder

Learned hierarchical encoder dispatching over modalities.

Parameters#

embedding_dim:: Size of the final joint observation embedding.
spectrum_embedding_dim:: Intermediate embedding size used by the spectral sub-encoder.
photometry_embedding_dim:: Intermediate embedding size used by the photometry sub-encoder.
hidden_dim:: Hidden width shared across the component encoders and aggregator.
key:: Optional JAX random key used to initialize all submodules.

Notes#

Spectrum and photometry blocks are handled by dedicated sub-encoders and then merged with a permutation-invariant aggregator. Unsupported modalities currently fall back to resized raw-value vectors.
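The dispatch-with-fallback pattern described in these notes might look roughly like the sketch below. The modality strings, the stand-in encoders, and the interpretation of "resized" as truncate-or-zero-pad are all assumptions for illustration.

```python
import numpy as np

def resize_raw(values, dim):
    """Assumed fallback: truncate or zero-pad the raw values to a fixed width."""
    out = np.zeros(dim)
    n = min(dim, values.shape[0])
    out[:n] = values[:n]
    return out

def encode_block(modality, values, encoders, dim=64):
    """Route a block to its modality-specific encoder, or fall back to raw values."""
    encoder = encoders.get(modality)
    if encoder is None:
        return resize_raw(values, dim)
    return encoder(values)

# Placeholder encoders; the real sub-encoders are learned modules.
encoders = {
    "spectrum": lambda v: np.full(64, v.mean()),
    "photometry": lambda v: np.full(64, v.max()),
}

raw = np.arange(10.0)
spec_emb = encode_block("spectrum", raw, encoders)
other_emb = encode_block("polarimetry", raw, encoders)   # no dedicated encoder
```

Every branch returns a vector of the same width, so the aggregator can treat all block embeddings uniformly.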

spectrum_encoder: equinox.Module#

photometry_encoder: PhotometryPointEncoder#

aggregator: DatasetSetAggregator#

embedding_dim: int#

_encode_block(block: petitRADTRANS.sbi.observation.ObservationBlock) → jax.numpy.ndarray#

Encode one observation block with modality-aware dispatch.

Parameters#

block:: Observation block to encode.

Returns#

jnp.ndarray: Fixed-width embedding for the supplied block.

encode(blocks: list[petitRADTRANS.sbi.observation.ObservationBlock]) → petitRADTRANS.sbi.observation.EncodedObservation#

Encode one structured observation made of multiple blocks.

Parameters#

blocks:: Observation blocks associated with one target system.

Returns#

EncodedObservation: Aggregated embedding and light metadata describing the block set.

encode_stacked_batch(blocks_batch: list[list[petitRADTRANS.sbi.observation.ObservationBlock]]) → jax.numpy.ndarray#

Encode a batch of identically-structured observations using vmap.

All observations must share the same block structure (same number of blocks, same array shapes per block). This is always the case for SBI tasks where every observation is produced by the same forward model.

Parameters#

blocks_batch:: Outer list indexes samples; inner list holds the per-block observation data for one sample.

Returns#

jnp.ndarray: Float32 array of shape (n_samples, embedding_dim).
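The requirement that all observations share one block structure is what makes `jax.vmap` applicable: per-sample arrays can be stacked along a leading axis and a single-sample function mapped over it. A minimal sketch, with a toy `encode_one` standing in for the real per-block encoder (an assumption):

```python
import jax
import jax.numpy as jnp

def encode_one(values, mask):
    """Toy per-sample encoder: masked mean and standard deviation."""
    m = mask.astype(values.dtype)
    n = jnp.maximum(m.sum(), 1.0)
    mean = (values * m).sum() / n
    var = (m * (values - mean) ** 2).sum() / n
    return jnp.stack([mean, jnp.sqrt(var)])

# Eight samples with identical shapes, stacked along a leading batch axis.
values_batch = jnp.stack([jnp.linspace(0.0, 1.0, 233) * (i + 1) for i in range(8)])
mask_batch = jnp.ones((8, 233), dtype=bool)

batched = jax.vmap(encode_one)(values_batch, mask_batch)          # shape (8, 2)
# Reference: encode each sample separately in a Python loop.
looped = jnp.stack([encode_one(v, m) for v, m in zip(values_batch, mask_batch)])
```

If the samples had different array shapes, the stacking step would fail before `vmap` is ever reached, which is why the identical-structure requirement exists.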

encode_from_prestacked(obs: Any) → jax.numpy.ndarray#

Encode a batch of observations from pre-stacked arrays.

Accepts a PreStackedObservations instance whose array fields have already been extracted from ObservationBlock objects outside the JAX JIT boundary. Only the vmapped XLA computation runs here, enabling the training step to be compiled once and reused across all batches.

Parameters#

obs:: Pre-stacked observation container with two fields. stacked_blocks holds one (values, uncertainties, coordinates, mask, log_scale, absolute_values) tuple per block; each array has shape (batch_size, n_wl), except the log_scale arrays, which are scalar per sample. modalities is a static tuple of modality value strings.

Returns#

jnp.ndarray: Float32 array of shape (batch_size, embedding_dim).
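The "compiled once and reused" behaviour can be sketched with `jax.jit`: when the array shapes stay fixed and the modality tuple is passed as a static argument, the function is traced a single time. The container is simplified here to `(values, mask)` pairs per block, and `encode_prestacked`, `make_batch`, and the mean-pooling body are hypothetical stand-ins.

```python
from functools import partial

import jax
import jax.numpy as jnp

trace_log = []   # appended to only while JAX traces the function

@partial(jax.jit, static_argnames="modalities")
def encode_prestacked(stacked_blocks, modalities):
    trace_log.append(modalities)   # Python side effect: runs at trace time only
    pooled = []
    for (values, mask), _modality in zip(stacked_blocks, modalities):
        m = mask.astype(values.dtype)
        # Toy per-block summary: masked mean over the wavelength axis.
        pooled.append((values * m).sum(axis=1) / jnp.maximum(m.sum(axis=1), 1.0))
    return jnp.stack(pooled, axis=1)   # (batch_size, n_blocks)

def make_batch(seed):
    """Build one batch of pre-stacked arrays with fixed shapes."""
    key_a, key_b = jax.random.split(jax.random.PRNGKey(seed))
    spec = (jax.random.normal(key_a, (8, 233)), jnp.ones((8, 233), dtype=bool))
    phot = (jax.random.normal(key_b, (8, 5)), jnp.ones((8, 5), dtype=bool))
    return (spec, phot)

modalities = ("spectrum", "photometry")
out_first = encode_prestacked(make_batch(0), modalities)
out_second = encode_prestacked(make_batch(1), modalities)   # reuses the compiled function
```

Extracting the arrays from the block objects before the jitted call keeps Python-level object handling out of the traced function, so only array shapes and the static modality tuple determine when recompilation happens.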