Internals

Typing & Validation

pyroed.typing.validate(schema: Dict[str, List[Optional[str]]], *, constraints: Optional[List[Callable]] = None, feature_blocks: Optional[List[List[str]]] = None, gibbs_blocks: Optional[List[List[str]]] = None, experiment: Optional[Dict[str, torch.Tensor]] = None, config: Optional[Dict[str, Any]] = None) → None

Validates a Pyroed problem specification. A usage sketch follows the parameter list below.

Parameters
  • schema (OrderedDict) – A schema dict mapping each choice name to its list of allowed values, where a None value marks a choice that may be left absent.

  • constraints (list) – An optional list of constraints.

  • feature_blocks (list) – An optional list of choice blocks for linear regression.

  • gibbs_blocks (list) – An optional list of choice blocks for Gibbs sampling.

  • experiment (dict) – An optional dict containing all old experiment data.

  • config (dict) – An optional dict of configuration options.
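
A minimal sketch of assembling and validating a specification. The choice names and values below are illustrative, not part of the API:

    from collections import OrderedDict

    from pyroed.typing import validate

    # A schema maps each choice name to its list of allowed values;
    # None marks a value that may be absent.
    SCHEMA = OrderedDict()
    SCHEMA["aa1"] = ["P", "L", None]
    SCHEMA["aa2"] = ["N", "Y"]

    # Feature blocks declare which single and cross features enter the
    # linear model; Gibbs blocks declare which choices are resampled
    # jointly during annealed Gibbs sampling.
    FEATURE_BLOCKS = [["aa1"], ["aa2"], ["aa1", "aa2"]]
    GIBBS_BLOCKS = [["aa1", "aa2"]]

    # Raises an exception if the specification is inconsistent.
    validate(
        SCHEMA,
        constraints=[],
        feature_blocks=FEATURE_BLOCKS,
        gibbs_blocks=GIBBS_BLOCKS,
    )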

Models

pyroed.models.linear_response(schema: Dict[str, List[Optional[str]]], coefs: Dict[Optional[Tuple[str, ...]], torch.Tensor], sequence: torch.Tensor, extra_features: Optional[torch.Tensor]) → torch.Tensor

Linear response function.

Parameters
  • schema (OrderedDict) – A schema dict.

  • coefs (dict) – A dictionary mapping feature tuples to coefficient tensors.

  • sequence (torch.Tensor) – A tensor representing a sequence.

  • extra_features (torch.Tensor) – An optional tensor of extra features, i.e. those computed by a custom feature_fn rather than standard cross features from FEATURE_BLOCKS.

Returns

The response.

Return type

torch.Tensor
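
Sequences are encoded as integer indices into the schema's value lists. The following hypothetical re-implementation sketches one plausible reading of the contract, inferred from the signature rather than taken from the actual source:

    import torch

    # Hypothetical sketch of linear_response(); the real implementation
    # may differ in details such as coefficient tensor shapes.
    def linear_response_sketch(schema, coefs, sequence, extra_features=None):
        names = list(schema)
        response = torch.zeros(sequence.shape[:-1])
        for block, coef in coefs.items():
            if block is None:
                # Extra features contribute via a plain inner product.
                response = response + extra_features @ coef
            else:
                # Index this block's coefficient tensor by the sequence's
                # choices within the block (assumes one axis per choice).
                idx = tuple(sequence[..., names.index(n)] for n in block)
                response = response + coef[idx]
        return response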

pyroed.models.model(schema: Dict[str, List[Optional[str]]], feature_blocks: List[List[str]], extra_features: Optional[torch.Tensor], experiment: Dict[str, torch.Tensor], *, max_batch_id: Optional[int] = None, response_type: str = 'unit_interval', likelihood_temperature: float = 1.0, quantization_bins: int = 100)

A Pyro model for Bayesian linear regression.

Parameters
  • schema (OrderedDict) – A schema dict.

  • feature_blocks (list) – A list of choice blocks for linear regression.

  • extra_features (torch.Tensor) – An optional tensor of extra features, i.e. those computed by a custom feature_fn rather than standard cross features from FEATURE_BLOCKS.

  • experiment (dict) – A dict containing all old experiment data.

  • response_type (str) – Type of response, one of: “real”, “unit_interval”.

  • quantization_bins (int) – Number of bins used to quantize responses when response_type == "unit_interval".

Returns

A dictionary mapping feature tuples to coefficient tensors.

Return type

dict
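
The model is usually bound to its specification and data, then handed to one of the inference routines below. A sketch reusing the names from the validate() example; the experiment keys "sequences", "responses", and "batch_ids" are an assumption here, not documented above:

    import functools

    import torch

    from pyroed.models import model

    # Assumed experiment layout: integer-encoded sequences, measured
    # responses in [0, 1], and a batch id per row (key names assumed).
    experiment = {
        "sequences": torch.tensor([[0, 1], [2, 0], [1, 1]]),
        "responses": torch.tensor([0.2, 0.7, 0.5]),
        "batch_ids": torch.zeros(3, dtype=torch.long),
    }

    # Bind the model so inference routines can call it with no arguments.
    bound_model = functools.partial(
        model,
        SCHEMA,
        FEATURE_BLOCKS,
        None,  # extra_features
        experiment,
        response_type="unit_interval",
    )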

Inference

pyroed.inference.fit_svi(model: Callable, *, lr: float = 0.02, num_steps: int = 501, jit_compile: bool = False, log_every: int = 100, plot: bool = False) → Callable[[], Dict[str, torch.Tensor]]

Fits a model via stochastic variational inference.

Parameters
  • model (callable) – A Bayesian regression model from pyroed.models.

  • lr (float) – Learning rate for stochastic variational inference.

  • num_steps (int) – Number of optimization steps.

  • jit_compile (bool) – Whether to use jit compilation during inference.

  • log_every (int) – Logging interval. To disable logging, set this to zero.

  • plot (bool) – Whether to plot the loss curve.

Returns

A variational distribution that can generate samples.

Return type

callable
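
Typical usage, continuing the sketch above:

    from pyroed.inference import fit_svi

    # Fit a variational approximation; the result is a callable that
    # draws one dict of posterior samples per invocation.
    guide_sampler = fit_svi(bound_model, num_steps=501, log_every=100)
    posterior_sample = guide_sampler()  # Dict[str, torch.Tensor]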

pyroed.inference.fit_mcmc(model: Callable, *, num_samples: int = 500, warmup_steps: int = 500, num_chains: int = 1, jit_compile: bool = True) → Callable[[], Dict[str, torch.Tensor]]

Fits a model via Hamiltonian Monte Carlo.

Parameters
  • model (callable) – A Bayesian regression model from pyroed.models.

  • num_samples (int) – Number of posterior samples to draw.

  • warmup_steps (int) – Number of MCMC warmup steps.

  • num_chains (int) – Number of MCMC chains to run.

  • jit_compile (bool) – Whether to use jit compilation during inference.

Returns

A sampler that draws from the empirical distribution.

Return type

Sampler
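
Usage mirrors fit_svi:

    from pyroed.inference import fit_mcmc

    # Slower but asymptotically exact; returns a Sampler over MCMC draws.
    mcmc_sampler = fit_mcmc(bound_model, num_samples=500, warmup_steps=500)
    posterior_sample = mcmc_sampler()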

class pyroed.inference.Sampler(samples: Dict[str, torch.Tensor])

Bases: object

Helper to sample from an empirical distribution. Instances are callable and return one dict of samples per call.

Parameters

samples (dict) – A dictionary of batches of samples.
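
A small sketch with fabricated samples; that each call returns one draw with the batch dimension stripped is inferred from fit_mcmc's return annotation, not stated explicitly:

    import torch

    from pyroed.inference import Sampler

    # 100 fabricated draws of a 5-dimensional sample site.
    sampler = Sampler({"coef": torch.randn(100, 5)})

    # Each call returns one draw from the stored batch.
    one_draw = sampler()  # e.g. {"coef": <tensor of shape (5,)>}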

Optimization

pyroed.optimizers.optimize_simulated_annealing(schema: Dict[str, List[Optional[str]]], constraints: List[Callable], gibbs_blocks: List[List[str]], coefs: dict, *, feature_fn: Optional[Callable[[torch.Tensor], torch.Tensor]] = None, temperature_schedule: torch.Tensor, max_tries=10000, log_every=100) → torch.Tensor

Finds an optimal sequence via annealed Gibbs sampling.

Parameters
  • schema (OrderedDict) – A schema dict.

  • constraints (list) – A list of constraints.

  • gibbs_blocks (list) – A list of choice blocks for Gibbs sampling.

  • coefs (dict) – A dictionary mapping feature tuples to coefficient tensors.

  • feature_fn (callable) – An optional callback to generate additional features.

  • temperature_schedule (torch.Tensor) – A schedule of annealing temperatures.

  • log_every (int) – Logging interval. To disable logging, set this to zero.

Returns

The single best found sequence.

Return type

torch.Tensor
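
One way to obtain coefs is to replay the model under a posterior draw using pyro.poutine.condition, a standard Pyro pattern (whether thompson_sample does exactly this internally is not documented here). A sketch continuing from the fit_svi example:

    import torch

    from pyro import poutine

    from pyroed.optimizers import optimize_simulated_annealing

    # Replay the model under one posterior sample to obtain concrete
    # coefficient tensors keyed by feature tuple.
    conditioned_model = poutine.condition(bound_model, data=guide_sampler())
    coefs = conditioned_model()

    # An illustrative annealing schedule, hot to cold.
    schedule = torch.logspace(0.0, -2.0, steps=1000)

    best_sequence = optimize_simulated_annealing(
        SCHEMA,
        [],  # constraints
        GIBBS_BLOCKS,
        coefs,
        temperature_schedule=schedule,
    )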

Experiment Design

pyroed.oed.thompson_sample(schema: Dict[str, List[Optional[str]]], constraints: List[Callable], feature_blocks: List[List[str]], gibbs_blocks: List[List[str]], experiment: Dict[str, torch.Tensor], *, design_size: int = 10, feature_fn: Optional[Callable[[torch.Tensor], torch.Tensor]] = None, response_type: str = 'unit_interval', inference: str = 'svi', mcmc_num_samples: int = 500, mcmc_warmup_steps: int = 500, mcmc_num_chains: int = 1, svi_num_steps: int = 501, svi_reparam: bool = True, svi_plot: bool = False, sa_num_steps: int = 1000, max_tries: int = 1000, thompson_temperature: float = 1.0, jit_compile: Optional[bool] = None, log_every: int = 100) → Set[Tuple[int, ...]]

Performs Bayesian optimization via Thompson sampling.

This fits a Bayesian model to existing experimental data, then draws Thompson samples with respect to that model. To draw each Thompson sample, it first samples parameters from the fitted posterior (with the likelihood annealed by thompson_temperature), then finds an optimal sequence with respect to those parameters via simulated annealing. An end-to-end usage sketch follows the parameter list below.

The Bayesian model can be fit either via stochastic variational inference (SVI, faster but less accurate) or Markov chain Monte Carlo (MCMC, slower but more accurate).

Parameters
  • schema (OrderedDict) – A schema dict.

  • constraints (list) – A list of constraints.

  • feature_blocks (list) – A list of choice blocks for linear regression.

  • gibbs_blocks (list) – A list of choice blocks for Gibbs sampling.

  • experiment (dict) – A dict containing all old experiment data.

  • design_size (int) – Number of designs to try to return (sometimes fewer designs are found).

  • feature_fn (callable) – An optional callback to generate additional features.

  • response_type (str) – Type of response, one of: “real”, “unit_interval”.

  • inference (str) – Inference algorithm, one of: “svi”, “mcmc”.

  • mcmc_num_samples (int) – If inference == "mcmc", this sets the number of posterior samples to draw from MCMC. Should be larger than design_size.

  • mcmc_warmup_steps (int) – If inference == "mcmc", this sets the number of warmup steps for MCMC. Should be the same order of magnitude as mcmc_num_samples.

  • mcmc_num_chains (int) – If inference == "mcmc", this sets the number of MCMC chains to run.

  • svi_num_steps (int) – If inference == "svi", this sets the number of steps to run stochastic variational inference.

  • svi_reparam (bool) – Whether to reparametrize SVI inference. This only works when thompson_temperature == 1.

  • svi_plot (bool) – If inference == "svi", whether to plot the loss curve.

  • sa_num_steps (int) – Number of steps to run simulated annealing, for each Thompson sample.

  • max_tries (int) – Number of extra Thompson samples to draw in search of novel sequences to add to the design.

  • thompson_temperature (float) – Likelihood annealing temperature at which Thompson samples are drawn. Defaults to 1. You may want to increase this if you have trouble finding novel designs, i.e. if this function returns fewer designs than you request.

  • jit_compile (bool) – Optional flag to force jit compilation during inference. Defaults to safe values for both SVI and MCMC inference.

  • log_every (int) – Logging interval for internal algorithms. To disable logging, set this to zero.

Returns

A design, as a set of tuples of integer choice ids into the schema.

Return type

set
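
An end-to-end sketch, reusing the specification and experiment dict from the examples above:

    from pyroed.oed import thompson_sample

    # Propose the next batch of designs to measure.
    design = thompson_sample(
        SCHEMA,
        [],  # constraints
        FEATURE_BLOCKS,
        GIBBS_BLOCKS,
        experiment,
        design_size=4,
        inference="svi",
    )

    # Decode integer choice ids back into schema values for inspection
    # (assumes each id indexes into the corresponding schema list).
    for ids in design:
        print([values[i] for values, i in zip(SCHEMA.values(), ids)])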