Internals
Typing & Validation
- pyroed.typing.validate(schema: Dict[str, List[Optional[str]]], *, constraints: Optional[List[Callable]] = None, feature_blocks: Optional[List[List[str]]] = None, gibbs_blocks: Optional[List[List[str]]] = None, experiment: Optional[Dict[str, torch.Tensor]] = None, config: Optional[Dict[str, Any]] = None) -> None
Validates a Pyroed problem specification.
- Parameters
schema (OrderedDict) – A schema dict.
constraints (list) – An optional list of constraints.
feature_blocks (list) – An optional list of choice blocks for linear regression.
gibbs_blocks (list) – An optional list of choice blocks for Gibbs sampling.
experiment (dict) – An optional dict containing all old experiment data.
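To make the shape of these arguments concrete, here is a minimal sketch of a toy problem specification with a couple of hand-written sanity checks. The feature names, the reading of None as an absent choice, and the helpers check_schema and check_blocks are all assumptions made for illustration; they are not Pyroed's own validation logic.

```python
from collections import OrderedDict

# Hypothetical toy problem; feature names and choices are illustrative only.
SCHEMA = OrderedDict()
SCHEMA["promoter"] = ["p1", "p2", None]  # assumption: None stands for "absent"
SCHEMA["gene"] = ["g1", "g2", "g3"]

# Blocks are lists of schema keys, matching List[List[str]] in the signature.
FEATURE_BLOCKS = [["promoter"], ["gene"], ["promoter", "gene"]]
GIBBS_BLOCKS = [["promoter"], ["gene"]]


def check_schema(schema):
    """Stand-in check: string keys, each with a list of unique str-or-None choices."""
    for name, choices in schema.items():
        if not isinstance(name, str):
            raise TypeError(f"feature name {name!r} is not a string")
        if not isinstance(choices, list) or not choices:
            raise TypeError(f"choices for {name!r} must be a non-empty list")
        if len(set(choices)) != len(choices):
            raise ValueError(f"choices for {name!r} contain duplicates")
        for choice in choices:
            if choice is not None and not isinstance(choice, str):
                raise TypeError(f"choice {choice!r} of {name!r} is not str or None")


def check_blocks(schema, blocks, label):
    """Stand-in check: every block is a non-empty list of known schema keys."""
    for block in blocks:
        if not block:
            raise ValueError(f"{label} contains an empty block")
        for name in block:
            if name not in schema:
                raise ValueError(f"{label} refers to unknown feature {name!r}")


check_schema(SCHEMA)
check_blocks(SCHEMA, FEATURE_BLOCKS, "feature_blocks")
check_blocks(SCHEMA, GIBBS_BLOCKS, "gibbs_blocks")
```

A block that names a feature missing from the schema, such as [["promoter", "enhancer"]], makes check_blocks raise ValueError.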
Models
- pyroed.models.linear_response(schema: Dict[str, List[Optional[str]]], coefs: Dict[Optional[Tuple[str, ...]], torch.Tensor], sequence: torch.Tensor, extra_features: Optional[torch.Tensor]) -> torch.Tensor
Linear response function.
- Parameters
schema (OrderedDict) – A schema dict.
coefs (dict) – A dictionary mapping feature tuples to coefficient tensors.
sequence (torch.Tensor) – A tensor representing a sequence.
extra_features (torch.Tensor) – An optional tensor of extra features, i.e. those computed by a custom features_fn rather than standard cross features from FEATURE_BLOCKS.
- Returns
The response.
- Return type
torch.Tensor
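The idea of a linear response over feature tuples can be sketched in plain Python. This is a stand-in under stated assumptions, not the library's implementation: it uses nested lists instead of tensors, assumes each coefficient table is indexed by the choice indices of the features in its key, and ignores extra_features and any None key. The name toy_linear_response and the toy values are hypothetical.

```python
from collections import OrderedDict


def toy_linear_response(schema, coefs, sequence):
    """Sum one coefficient per feature tuple, selected by the sequence's choices.

    schema:   ordered mapping of feature name -> list of choices
    coefs:    mapping of feature-name tuple -> nested list of coefficients
              (assumed layout: one nesting level per feature in the tuple)
    sequence: list of integer choice indices, one per schema feature
    """
    names = list(schema)
    total = 0.0
    for key, table in coefs.items():
        entry = table
        for name in key:
            # Walk into the table using the chosen index for this feature.
            entry = entry[sequence[names.index(name)]]
        total += entry
    return total


schema = OrderedDict([("promoter", ["p1", "p2"]), ("gene", ["g1", "g2", "g3"])])
coefs = {
    ("promoter",): [0.1, -0.2],
    ("gene",): [0.0, 0.5, 1.0],
    ("promoter", "gene"): [[0.0, 0.0, 0.25], [0.0, 0.0, 0.0]],
}

# Sequence [0, 2] picks promoter "p1" and gene "g3":
# 0.1 (promoter) + 1.0 (gene) + 0.25 (cross term), roughly 1.35.
response = toy_linear_response(schema, coefs, [0, 2])
```

Pairwise keys such as ("promoter", "gene") play the role of the cross features that the feature blocks describe.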
- pyroed.models.model(schema: Dict[str, List[Optional[str]]], feature_blocks: List[List[str]], extra_features: Optional[torch.Tensor], experiment: Dict[str, torch.Tensor], *, max_batch_id: Optional[int] = None, response_type: str = 'unit_interval', likelihood_temperature: float = 1.0, quantization_bins: int = 100)
A Pyro model for Bayesian linear regression.
- Parameters
schema (OrderedDict) – A schema dict.
feature_blocks (list) – A list of choice blocks for linear regression.
experiment (dict) – A dict containing all old experiment data.
response_type (str) – Type of response, one of: “real”, “unit_interval”.
quantization_bins (int) – Number of bins into which to quantize the response when response_type is “unit_interval”.
- Returns
A dictionary mapping feature tuples to coefficient tensors.
- Return type
dict
Inference
- pyroed.inference.fit_svi(model: Callable, *, lr: float = 0.02, num_steps: int = 501, jit_compile: bool = False, log_every: int = 100, plot: bool = False) -> Callable[[], Dict[str, torch.Tensor]]
Fits a model via stochastic variational inference.
- Parameters
model (callable) – A Bayesian regression model from pyroed.models.
- Returns
A variational distribution that can generate samples.
- Return type
callable
- pyroed.inference.fit_mcmc(model: Callable, *, num_samples: int = 500, warmup_steps: int = 500, num_chains: int = 1, jit_compile: bool = True) -> Callable[[], Dict[str, torch.Tensor]]
Fits a model via Hamiltonian Monte Carlo.
- Parameters
model (callable) – A Bayesian regression model from pyroed.models.
- Returns
A sampler that draws from the empirical distribution.
- Return type
callable
Optimization
- pyroed.optimizers.optimize_simulated_annealing(schema: Dict[str, List[Optional[str]]], constraints: List[Callable], gibbs_blocks: List[List[str]], coefs: dict, *, feature_fn: Optional[Callable[[torch.Tensor], torch.Tensor]] = None, temperature_schedule: torch.Tensor, max_tries=10000, log_every=100) -> torch.Tensor
Finds an optimal sequence via annealed Gibbs sampling.
- Parameters
- Returns
The single best found sequence.
- Return type
torch.Tensor
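As a rough illustration of annealed Gibbs sampling over blocks of discrete choices, the sketch below resamples one block at a time from a softmax of the score divided by the current temperature and keeps the best sequence seen. It is a plain-Python stand-in, not Pyroed's optimizer: it ignores constraints and feature_fn, takes an arbitrary score callable rather than coefs, and uses position indices rather than feature names. The name toy_annealed_gibbs and all toy values are hypothetical.

```python
import itertools
import math
import random


def toy_annealed_gibbs(num_choices, blocks, score, temperatures, seed=0):
    """Search for a high-scoring sequence by blockwise Gibbs updates.

    num_choices:  number of choices at each position
    blocks:       lists of position indices that are resampled jointly
    score:        callable mapping a tuple of choice indices to a float
    temperatures: positive floats, typically decreasing toward zero
    """
    rng = random.Random(seed)
    state = [rng.randrange(n) for n in num_choices]
    best, best_score = tuple(state), score(tuple(state))
    for temperature in temperatures:
        for block in blocks:
            # Enumerate every joint assignment of this block, rest held fixed.
            candidates = []
            for values in itertools.product(*(range(num_choices[i]) for i in block)):
                proposal = list(state)
                for i, value in zip(block, values):
                    proposal[i] = value
                candidates.append(proposal)
            scores = [score(tuple(c)) for c in candidates]
            # Sample from softmax(score / temperature); subtract max for stability.
            top = max(scores)
            weights = [math.exp((s - top) / temperature) for s in scores]
            state = rng.choices(candidates, weights=weights)[0]
            current = score(tuple(state))
            if current > best_score:
                best, best_score = tuple(state), current
    return best


# Toy additive score over two positions with 2 and 3 choices respectively.
TABLES = [[0.0, 1.0], [0.0, 0.5, 2.0]]


def toy_score(sequence):
    return sum(table[i] for table, i in zip(TABLES, sequence))


schedule = [2.0 * 0.9 ** step for step in range(60)]  # geometric cooling
best_sequence = toy_annealed_gibbs([2, 3], [[0], [1]], toy_score, schedule)
```

High early temperatures let the sampler move freely between choices; as the temperature falls, each block update concentrates on the locally best choice.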
Experiment Design
- pyroed.oed.thompson_sample(schema: Dict[str, List[Optional[str]]], constraints: List[Callable], feature_blocks: List[List[str]], gibbs_blocks: List[List[str]], experiment: Dict[str, torch.Tensor], *, design_size: int = 10, feature_fn: Optional[Callable[[torch.Tensor], torch.Tensor]] = None, response_type: str = 'unit_interval', inference: str = 'svi', mcmc_num_samples: int = 500, mcmc_warmup_steps: int = 500, mcmc_num_chains: int = 1, svi_num_steps: int = 501, svi_reparam: bool = True, svi_plot: bool = False, sa_num_steps: int = 1000, max_tries: int = 1000, thompson_temperature: float = 1.0, jit_compile: Optional[bool] = None, log_every: int = 100) -> Set[Tuple[int, ...]]
Performs Bayesian optimization via Thompson sampling.
This fits a Bayesian model to existing experimental data and draws Thompson samples with respect to that model. To draw each Thompson sample, this first samples parameters from the fitted posterior (with likelihood annealed by thompson_temperature), then finds an optimal sequence with respect to those parameters via simulated annealing.
The Bayesian model can be fit either via stochastic variational inference (SVI, faster but less accurate) or Markov chain Monte Carlo (MCMC, slower but more accurate).
- Parameters
schema (OrderedDict) – A schema dict.
constraints (list) – A list of constraints.
feature_blocks (list) – A list of choice blocks for linear regression.
gibbs_blocks (list) – A list of choice blocks for Gibbs sampling.
experiment (dict) – A dict containing all old experiment data.
design_size (int) – Number of designs to try to return (sometimes fewer designs are found).
feature_fn (callable) – An optional callback to generate additional features.
response_type (str) – Type of response, one of: “real”, “unit_interval”.
inference (str) – Inference algorithm, one of: “svi”, “mcmc”.
mcmc_num_samples (int) – If inference == "mcmc", this sets the number of posterior samples to draw from MCMC. Should be larger than design_size.
mcmc_warmup_steps (int) – If inference == "mcmc", this sets the number of warmup steps for MCMC. Should be the same order of magnitude as mcmc_num_samples.
svi_num_steps (int) – If inference == "svi", this sets the number of steps to run stochastic variational inference.
svi_reparam (bool) – Whether to reparametrize SVI inference. This only works when thompson_temperature == 1.
sa_num_steps (int) – Number of steps to run simulated annealing, at each Thompson sample.
svi_plot (bool) – If inference == "svi", whether to plot loss curve.
max_tries (int) – Number of extra Thompson samples to draw in search of novel sequences to add to the design.
thompson_temperature (float) – Likelihood annealing temperature at which Thompson samples are drawn. Defaults to 1. You may want to increase this if you are having trouble finding novel designs, i.e. if this function returns fewer designs than you request.
jit_compile (bool) – Optional flag to force jit compilation during inference. Defaults to safe values for both SVI and MCMC inference.
log_every (int) – Logging interval for internal algorithms. To disable logging, set this to zero.
- Returns
A design as a set of tuples of choices.
- Return type
set
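The outer loop described above (sample parameters, optimize a sequence for them, keep novel sequences until the design is full) can be sketched without any inference machinery. This is an illustrative stand-in, not Pyroed's code: sample_params and optimize are hypothetical callables standing in for the fitted posterior and the simulated-annealing step, and the budget of design_size + max_tries draws is an assumption based on the description of max_tries as the number of extra samples.

```python
import random


def toy_thompson_design(sample_params, optimize, design_size=10, max_tries=1000):
    """Collect up to design_size distinct sequences via Thompson-style draws."""
    design = set()
    for _ in range(design_size + max_tries):  # assumed total draw budget
        if len(design) >= design_size:
            break
        params = sample_params()             # one draw from the "posterior"
        design.add(tuple(optimize(params)))  # duplicates collapse in the set
    return design


rng = random.Random(0)
# Hypothetical posterior means for two features with 2 and 3 choices.
MEANS = [[0.0, 0.3], [0.1, 0.2, 0.0]]


def make_sampler(scale):
    """Return a sampler that perturbs MEANS with Gaussian noise of given scale."""
    def sample_params():
        return [[rng.gauss(mean, scale) for mean in row] for row in MEANS]
    return sample_params


def argmax_per_feature(params):
    """Trivial optimizer: pick the best choice independently for each feature."""
    return [max(range(len(row)), key=row.__getitem__) for row in params]


# A diffuse posterior yields varied sequences; a point-mass posterior repeats one.
wide = toy_thompson_design(make_sampler(1.0), argmax_per_feature, design_size=3, max_tries=100)
narrow = toy_thompson_design(make_sampler(0.0), argmax_per_feature, design_size=3, max_tries=100)
```

In this toy, the zero-noise sampler always produces the same sequence, so narrow holds a single design even though three were requested. This mirrors the note that fewer designs than design_size are sometimes returned.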