# zhusuan.distributions

## Base class

class Distribution(dtype, param_dtype, is_continuous, is_reparameterized, use_path_derivative=False, group_ndims=0, **kwargs)

Bases: object

The Distribution class is the base class for various probability distributions. Distributions support batch inputs, generating batches of samples, and evaluating probabilities at batches of given values.

The typical input shape for a Distribution is batch_shape + input_shape, where input_shape is the shape of a non-batch input parameter and batch_shape describes how many independent inputs are fed into the distribution.

Generated samples have shape ([n_samples] + )batch_shape + value_shape. The leading sample axis is omitted only when n_samples is None (the default), in which case one sample is generated. value_shape is the non-batch value shape of the distribution. For a univariate distribution, value_shape is [].

In some cases a batch of random variables is grouped into a single event so that their probabilities are computed together. This is achieved by setting the group_ndims argument, which defaults to 0. The last group_ndims axes of batch_shape are grouped into a single event. For example, Normal(..., group_ndims=1) treats the last axis of its batch_shape as a single event, i.e., a multivariate Normal with a diagonal covariance matrix.

When evaluating probabilities at given values, the given Tensor should be broadcastable to shape (... + )batch_shape + value_shape. The returned Tensor has shape (... + )batch_shape[:-group_ndims].
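The shape rules above can be sketched in plain Python. This is only an illustration of the shape arithmetic, not ZhuSuan code; the helper names `sample_shape` and `log_prob_shape` and the `leading_dims` argument (standing in for the optional `...` axes) are hypothetical.

```python
def sample_shape(batch_shape, value_shape, n_samples=None):
    """Shape of a sample: ([n_samples] + )batch_shape + value_shape."""
    lead = [] if n_samples is None else [n_samples]
    return lead + list(batch_shape) + list(value_shape)


def log_prob_shape(leading_dims, batch_shape, group_ndims=0):
    """Shape of a log_prob/prob result: (... + )batch_shape[:-group_ndims].

    In plain Python, batch_shape[:-0] is an empty list, so the
    group_ndims == 0 case is handled separately here.
    """
    batch_shape = list(batch_shape)
    kept = batch_shape if group_ndims == 0 else batch_shape[:-group_ndims]
    return list(leading_dims) + kept


print(sample_shape([5, 3], []))                     # [5, 3]
print(sample_shape([5, 3], [], n_samples=10))       # [10, 5, 3]
print(log_prob_shape([10], [5, 3], group_ndims=0))  # [10, 5, 3]
print(log_prob_shape([10], [5, 3], group_ndims=1))  # [10, 5]
```

With group_ndims=1 the last batch axis disappears from the result because the probabilities along that axis are combined into one event.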

For more details and examples, please refer to Basic Concepts in ZhuSuan.

For both discrete and continuous distributions, the dtype parameter is the type of samples. For discrete distributions it can be set by the user. For continuous distributions it is determined automatically from the parameter types.

The value type of prob and log_prob is param_dtype, which is deduced from the parameter(s) at initialization. dtype must be one of int16, int32, int64, float16, float32, and float64.

When two or more parameters are Tensors with different types, a TypeError is raised.

Parameters:

- dtype – The value type of samples from the distribution.
- param_dtype – The type of the distribution's parameter(s).
- is_continuous – Whether the distribution is continuous.
- is_reparameterized – A bool. Whether the gradients of samples can and are allowed to propagate back into inputs, using the reparametrization trick from (Kingma, 2013).
- use_path_derivative – A bool. Whether to block the gradients of the log-probability from propagating through the parameters of the distribution (False means they are propagated). This is based on the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.
- group_ndims – A 0-D int32 Tensor representing the number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. Default is 0, which means a single value is an event. See above for a more detailed explanation.

batch_shape

The shape showing how many independent inputs (which we call batches) are fed into the distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape. We borrow this concept from tf.contrib.distributions.

dtype

The sample type of the distribution.

get_batch_shape()

Static batch_shape.

Returns: A TensorShape instance.
get_value_shape()

Static value_shape.

Returns: A TensorShape instance.
group_ndims

The number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. See Distribution for more detailed explanation.

is_continuous

Whether the distribution is continuous.

is_reparameterized

Whether the gradients of samples can and are allowed to propagate back into inputs, using the reparametrization trick from (Kingma, 2013).
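As a rough illustration of the reparametrization idea for a Normal distribution, the sample can be written as a deterministic function of the parameters and an independent noise draw. The sketch below is plain Python, not ZhuSuan's implementation; the function name is hypothetical.

```python
import random


def reparameterized_normal_sample(mean, std, rng):
    """Draw z = mean + std * eps with eps ~ N(0, 1).

    Returns (z, eps). Once eps is fixed, z is a deterministic function
    of (mean, std), so dz/dmean = 1 and dz/dstd = eps.
    """
    eps = rng.gauss(0.0, 1.0)
    return mean + std * eps, eps


rng = random.Random(0)
z, eps = reparameterized_normal_sample(mean=1.0, std=2.0, rng=rng)
grad_mean, grad_std = 1.0, eps  # analytic gradients of z w.r.t. the parameters
print(z, grad_mean, grad_std)
```

Because the randomness lives entirely in eps, gradients of any function of z can flow back to mean and std, which is what is_reparameterized=True permits.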

log_prob(given)

Compute log probability density (mass) function at given value.

Parameters: given – A Tensor. The value at which to evaluate log probability density (mass) function. Must be able to broadcast to have a shape of (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
param_dtype

The parameter(s) type of the distribution.

path_param(param)

Automatically transforms a parameter based on use_path_derivative

prob(given)

Compute probability density (mass) function at given value.

Parameters: given – A Tensor. The value at which to evaluate probability density (mass) function. Must be able to broadcast to have a shape of (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
sample(n_samples=None)

Return samples from the distribution. When n_samples is None (by default), one sample of shape batch_shape + value_shape is generated. For a scalar n_samples, the returned Tensor has a new sample dimension with size n_samples inserted at axis=0, i.e., the shape of samples is [n_samples] + batch_shape + value_shape.

Parameters: n_samples – A 0-D int32 Tensor or None. How many independent samples to draw from the distribution.

Returns: A Tensor of samples.
use_path_derivative

Whether to block the gradients of the log-probability from propagating through the parameters of the distribution (False means they are propagated). This is based on the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.
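To make the distinction concrete, here is a small worked example for a Normal distribution with a reparameterized sample z = mean + std * eps. It is a plain-Python sketch with hand-derived gradients, not ZhuSuan code, and it assumes that the path derivative keeps only the gradient term that flows through the sample; all function names are hypothetical.

```python
import math


def normal_log_prob(z, mean, std):
    """Log density of Normal(mean, std) evaluated at z."""
    return (-0.5 * math.log(2.0 * math.pi) - math.log(std)
            - (z - mean) ** 2 / (2.0 * std ** 2))


def grad_mean_total(mean, std, eps):
    """d/dmean of log q(z; mean, std) with z = mean + std * eps.

    Sums the term flowing through the sample z and the term flowing
    directly through the parameter mean.
    """
    z = mean + std * eps
    through_sample = -(z - mean) / std ** 2  # dlogq/dz * dz/dmean, dz/dmean = 1
    through_param = (z - mean) / std ** 2    # direct dlogq/dmean
    return through_sample + through_param


def grad_mean_path(mean, std, eps):
    """Path derivative: keep only the term flowing through the sample z."""
    z = mean + std * eps
    return -(z - mean) / std ** 2


mean, std, eps = 0.5, 2.0, 1.5
h = 1e-6
# Finite-difference check of the path term: vary z only, hold the parameter fixed.
fd_path = (normal_log_prob(mean + h + std * eps, mean, std)
           - normal_log_prob(mean - h + std * eps, mean, std)) / (2 * h)

print(grad_mean_total(mean, std, eps))  # the two terms cancel
print(grad_mean_path(mean, std, eps))   # equals -eps / std
print(fd_path)
```

For the mean of a Normal the two terms cancel exactly, while the path term alone is -eps / std. This shows that the two settings can produce different gradient estimates for the same sample.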

value_shape

The non-batch value shape of a distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape.

## Univariate distributions

class Normal(mean=0.0, _sentinel=None, std=None, logstd=None, group_ndims=0, is_reparameterized=True, use_path_derivative=False, check_numerics=False, **kwargs)

The class of univariate Normal distribution. See Distribution for details.

Warning

The order of arguments logstd/std has changed to std/logstd since 0.3.1. Please use named arguments: Normal(mean, std=..., ...) or Normal(mean, logstd=..., ...).

Parameters:

- mean – A float Tensor. The mean of the Normal distribution. Should be broadcastable to match std or logstd.
- _sentinel – Used to prevent positional parameters. Internal, do not use.
- std – A float Tensor. The standard deviation of the Normal distribution. Should be positive and broadcastable to match mean.
- logstd – A float Tensor. The log standard deviation of the Normal distribution. Should be broadcastable to match mean.
- group_ndims – A 0-D int32 Tensor representing the number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. Default is 0, which means a single value is an event. See Distribution for a more detailed explanation.
- is_reparameterized – A bool. If True, gradients on samples from this distribution are allowed to propagate into inputs, using the reparametrization trick from (Kingma, 2013).
- use_path_derivative – A bool. Whether to block the gradients of the log-probability from propagating through the parameters of the distribution (False means they are propagated). This is based on the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.
- check_numerics – A bool. Whether to check for numeric issues.

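
The std and logstd arguments are two ways of specifying the same scale. The plain-Python sketch below (not ZhuSuan code) evaluates the Normal log density under either parameterization. Raising a ValueError unless exactly one of the two is given is an assumption made for this illustration.

```python
import math


def normal_log_prob(x, mean=0.0, std=None, logstd=None):
    """Log density of a univariate Normal given exactly one of std or logstd."""
    if (std is None) == (logstd is None):
        # Assumption for this sketch: exactly one scale argument is required.
        raise ValueError("Specify exactly one of std or logstd.")
    if logstd is None:
        logstd = math.log(std)
    precision = math.exp(-2.0 * logstd)  # 1 / std**2
    return (-0.5 * math.log(2.0 * math.pi) - logstd
            - 0.5 * precision * (x - mean) ** 2)


a = normal_log_prob(1.3, mean=0.5, std=2.0)
b = normal_log_prob(1.3, mean=0.5, logstd=math.log(2.0))
print(a, b)  # the two parameterizations agree
```

Working with logstd keeps the scale positive without constraints, which is often convenient when the scale is the output of a network.
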
batch_shape

The shape showing how many independent inputs (which we call batches) are fed into the distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape. We borrow this concept from tf.contrib.distributions.

dtype

The sample type of the distribution.

get_batch_shape()

Static batch_shape.

Returns: A TensorShape instance.
get_value_shape()

Static value_shape.

Returns: A TensorShape instance.
group_ndims

The number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. See Distribution for more detailed explanation.

is_continuous

Whether the distribution is continuous.

is_reparameterized

Whether the gradients of samples can and are allowed to propagate back into inputs, using the reparametrization trick from (Kingma, 2013).

log_prob(given)

Compute log probability density (mass) function at given value.

Parameters: given – A Tensor. The value at which to evaluate log probability density (mass) function. Must be able to broadcast to have a shape of (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
logstd

The log standard deviation of the Normal distribution.

mean

The mean of the Normal distribution.

param_dtype

The parameter(s) type of the distribution.

path_param(param)

Automatically transforms a parameter based on use_path_derivative

prob(given)

Compute probability density (mass) function at given value.

Parameters: given – A Tensor. The value at which to evaluate probability density (mass) function. Must be able to broadcast to have a shape of (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
sample(n_samples=None)

Return samples from the distribution. When n_samples is None (by default), one sample of shape batch_shape + value_shape is generated. For a scalar n_samples, the returned Tensor has a new sample dimension with size n_samples inserted at axis=0, i.e., the shape of samples is [n_samples] + batch_shape + value_shape.

Parameters: n_samples – A 0-D int32 Tensor or None. How many independent samples to draw from the distribution.

Returns: A Tensor of samples.
std

The standard deviation of the Normal distribution.

use_path_derivative

Whether to block the gradients of the log-probability from propagating through the parameters of the distribution (False means they are propagated). This is based on the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.

value_shape

The non-batch value shape of a distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape.

class FoldNormal(mean=0.0, _sentinel=None, std=None, logstd=None, group_ndims=0, is_reparameterized=True, use_path_derivative=False, check_numerics=False, **kwargs)

The class of univariate FoldNormal distribution. See Distribution for details.

Warning

The order of arguments logstd/std has changed to std/logstd since 0.3.1. Please use named arguments: FoldNormal(mean, std=..., ...) or FoldNormal(mean, logstd=..., ...).

Parameters:

- mean – A float Tensor. The mean of the FoldNormal distribution. Should be broadcastable to match std or logstd.
- _sentinel – Used to prevent positional parameters. Internal, do not use.
- std – A float Tensor. The standard deviation of the FoldNormal distribution. Should be positive and broadcastable to match mean.
- logstd – A float Tensor. The log standard deviation of the FoldNormal distribution. Should be broadcastable to match mean.
- group_ndims – A 0-D int32 Tensor representing the number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. Default is 0, which means a single value is an event. See Distribution for a more detailed explanation.
- is_reparameterized – A bool. If True, gradients on samples from this distribution are allowed to propagate into inputs, using the reparametrization trick from (Kingma, 2013).
- use_path_derivative – A bool. Whether to block the gradients of the log-probability from propagating through the parameters of the distribution (False means they are propagated). This is based on the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.
- check_numerics – A bool. Whether to check for numeric issues.

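
For orientation, a folded Normal is the distribution of |Y| for Y ~ Normal(mean, std). The sketch below is plain Python, not ZhuSuan code, and assumes this standard textbook definition, under which mean and std describe the underlying Normal rather than the folded variable. The function names are hypothetical.

```python
import math


def normal_pdf(x, mean, std):
    """Density of Normal(mean, std) at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def fold_normal_pdf(x, mean=0.0, std=1.0):
    """Density of |Y| for Y ~ Normal(mean, std); zero for negative x."""
    if x < 0:
        return 0.0
    return normal_pdf(x, mean, std) + normal_pdf(-x, mean, std)


print(fold_normal_pdf(0.5, mean=1.0, std=2.0))
print(fold_normal_pdf(-0.5, mean=1.0, std=2.0))  # no mass below zero
```

The density folds the mass of the negative half-line onto the positive one, so it still integrates to one over x >= 0.
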
batch_shape

The shape showing how many independent inputs (which we call batches) are fed into the distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape. We borrow this concept from tf.contrib.distributions.

dtype

The sample type of the distribution.

get_batch_shape()

Static batch_shape.

Returns: A TensorShape instance.
get_value_shape()

Static value_shape.

Returns: A TensorShape instance.
group_ndims

The number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. See Distribution for more detailed explanation.

is_continuous

Whether the distribution is continuous.

is_reparameterized

Whether the gradients of samples can and are allowed to propagate back into inputs, using the reparametrization trick from (Kingma, 2013).

log_prob(given)

Compute log probability density (mass) function at given value.

Parameters: given – A Tensor. The value at which to evaluate log probability density (mass) function. Must be able to broadcast to have a shape of (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
logstd

The log standard deviation of the FoldNormal distribution.

mean

The mean of the FoldNormal distribution.

param_dtype

The parameter(s) type of the distribution.

path_param(param)

Automatically transforms a parameter based on use_path_derivative

prob(given)

Compute probability density (mass) function at given value.

Parameters: given – A Tensor. The value at which to evaluate probability density (mass) function. Must be able to broadcast to have a shape of (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
sample(n_samples=None)

Return samples from the distribution. When n_samples is None (by default), one sample of shape batch_shape + value_shape is generated. For a scalar n_samples, the returned Tensor has a new sample dimension with size n_samples inserted at axis=0, i.e., the shape of samples is [n_samples] + batch_shape + value_shape.

Parameters: n_samples – A 0-D int32 Tensor or None. How many independent samples to draw from the distribution.

Returns: A Tensor of samples.
std

The standard deviation of the FoldNormal distribution.

use_path_derivative

Whether to block the gradients of the log-probability from propagating through the parameters of the distribution (False means they are propagated). This is based on the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.

value_shape

The non-batch value shape of a distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape.

class Bernoulli(logits, dtype=tf.int32, group_ndims=0, **kwargs)

The class of univariate Bernoulli distribution. See Distribution for details.

Parameters:

- logits – A float Tensor. The log-odds of probabilities of being 1. $\mathrm{logits} = \log \frac{p}{1 - p}$
- dtype – The value type of samples from the distribution. Can be int (tf.int16, tf.int32, tf.int64) or float (tf.float16, tf.float32, tf.float64). Default is int32.
- group_ndims – A 0-D int32 Tensor representing the number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. Default is 0, which means a single value is an event. See Distribution for a more detailed explanation.

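
The log-odds parameterization gives a compact expression for the log probability mass: with logits = log(p / (1 - p)), log p(x) = x * logits - log(1 + exp(logits)). A plain-Python sketch of that identity follows. It is not ZhuSuan code and the helper names are hypothetical.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def bernoulli_log_prob(x, logits):
    """log p(x) for x in {0, 1}, where logits = log(p / (1 - p))."""
    return x * logits - softplus(logits)


logits = math.log(0.8 / 0.2)  # corresponds to p = 0.8
print(math.exp(bernoulli_log_prob(1, logits)))  # probability of 1
print(math.exp(bernoulli_log_prob(0, logits)))  # probability of 0
```

Computing with logits avoids forming p explicitly, which helps when p is very close to 0 or 1.
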
batch_shape

The shape showing how many independent inputs (which we call batches) are fed into the distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape. We borrow this concept from tf.contrib.distributions.

dtype

The sample type of the distribution.

get_batch_shape()

Static batch_shape.

Returns: A TensorShape instance.
get_value_shape()

Static value_shape.

Returns: A TensorShape instance.
group_ndims

The number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. See Distribution for more detailed explanation.

is_continuous

Whether the distribution is continuous.

is_reparameterized

Whether the gradients of samples can and are allowed to propagate back into inputs, using the reparametrization trick from (Kingma, 2013).

log_prob(given)

Compute log probability density (mass) function at given value.

Parameters: given – A Tensor. The value at which to evaluate log probability density (mass) function. Must be able to broadcast to have a shape of (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
logits

The log-odds of probabilities of being 1.

param_dtype

The parameter(s) type of the distribution.

path_param(param)

Automatically transforms a parameter based on use_path_derivative

prob(given)

Compute probability density (mass) function at given value.

Parameters: given – A Tensor. The value at which to evaluate probability density (mass) function. Must be able to broadcast to have a shape of (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
sample(n_samples=None)

Return samples from the distribution. When n_samples is None (by default), one sample of shape batch_shape + value_shape is generated. For a scalar n_samples, the returned Tensor has a new sample dimension with size n_samples inserted at axis=0, i.e., the shape of samples is [n_samples] + batch_shape + value_shape.

Parameters: n_samples – A 0-D int32 Tensor or None. How many independent samples to draw from the distribution.

Returns: A Tensor of samples.
use_path_derivative

Whether to block the gradients of the log-probability from propagating through the parameters of the distribution (False means they are propagated). This is based on the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.

value_shape

The non-batch value shape of a distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape.

class Categorical(logits, dtype=tf.int32, group_ndims=0, **kwargs)

The class of univariate Categorical distribution. See Distribution for details.

Parameters:

- logits – An N-D (N >= 1) float32 or float64 Tensor of shape (…, n_categories). Each slice [i, j,…, k, :] represents the un-normalized log probabilities for all categories. $\mathrm{logits} \propto \log p$
- dtype – The value type of samples from the distribution. Can be float32, float64, int32, or int64. Default is int32.
- group_ndims – A 0-D int32 Tensor representing the number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. Default is 0, which means a single value is an event. See Distribution for a more detailed explanation.

A single sample is an (N-1)-D Tensor with tf.int32 values in range [0, n_categories).
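
Because the logits are un-normalized, the log probability of category k is the k-th entry of the log-softmax of the logits. The plain-Python sketch below illustrates this for one slice of logits. It is not ZhuSuan code and the helper names are hypothetical.

```python
import math


def log_softmax(logits):
    """Normalize un-normalized log probabilities in a numerically stable way."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def categorical_log_prob(k, logits):
    """log p(k) for a category index k in [0, n_categories)."""
    return log_softmax(logits)[k]


logits = [math.log(1.0), math.log(2.0), math.log(3.0)]
print([math.exp(categorical_log_prob(k, logits)) for k in range(3)])
```

Adding a constant to every logit leaves the probabilities unchanged, which is why only un-normalized values are required.
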

batch_shape

The shape showing how many independent inputs (which we call batches) are fed into the distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape. We borrow this concept from tf.contrib.distributions.

dtype

The sample type of the distribution.

get_batch_shape()

Static batch_shape.

Returns: A TensorShape instance.
get_value_shape()

Static value_shape.

Returns: A TensorShape instance.
group_ndims

The number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. See Distribution for more detailed explanation.

is_continuous

Whether the distribution is continuous.

is_reparameterized

Whether the gradients of samples can and are allowed to propagate back into inputs, using the reparametrization trick from (Kingma, 2013).

log_prob(given)

Compute log probability density (mass) function at given value.

Parameters: given – A Tensor. The value at which to evaluate log probability density (mass) function. Must be able to broadcast to have a shape of (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
logits

The un-normalized log probabilities.

n_categories

The number of categories in the distribution.

param_dtype

The parameter(s) type of the distribution.

path_param(param)

Automatically transforms a parameter based on use_path_derivative

prob(given)

Compute probability density (mass) function at given value.

Parameters: given – A Tensor. The value at which to evaluate probability density (mass) function. Must be able to broadcast to have a shape of (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
sample(n_samples=None)

Return samples from the distribution. When n_samples is None (by default), one sample of shape batch_shape + value_shape is generated. For a scalar n_samples, the returned Tensor has a new sample dimension with size n_samples inserted at axis=0, i.e., the shape of samples is [n_samples] + batch_shape + value_shape.

Parameters: n_samples – A 0-D int32 Tensor or None. How many independent samples to draw from the distribution.

Returns: A Tensor of samples.
use_path_derivative

Whether to block the gradients of the log-probability from propagating through the parameters of the distribution (False means they are propagated). This is based on the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.

value_shape

The non-batch value shape of a distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape.

Discrete
class Uniform(minval=0.0, maxval=1.0, group_ndims=0, is_reparameterized=True, check_numerics=False, **kwargs)

The class of univariate Uniform distribution. See Distribution for details.

Parameters:

- minval – A float Tensor. The lower bound on the range of the uniform distribution. Should be broadcastable to match maxval.
- maxval – A float Tensor. The upper bound on the range of the uniform distribution. Should be element-wise greater than minval.
- group_ndims – A 0-D int32 Tensor representing the number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. Default is 0, which means a single value is an event. See Distribution for a more detailed explanation.
- is_reparameterized – A bool. If True, gradients on samples from this distribution are allowed to propagate into inputs, using the reparametrization trick from (Kingma, 2013).
- check_numerics – A bool. Whether to check for numeric issues.

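
A Uniform sample can be written as minval + (maxval - minval) * u with u drawn from the unit interval, which is what makes it reparameterizable. The density is constant, 1 / (maxval - minval), inside the range. The plain-Python sketch below is not ZhuSuan code. Treating the range as half-open [minval, maxval) is an assumption made here, and the function names are hypothetical.

```python
import math
import random


def uniform_log_prob(x, minval=0.0, maxval=1.0):
    """Log density of Uniform(minval, maxval) at x.

    Assumption for this sketch: the support is [minval, maxval).
    """
    if minval <= x < maxval:
        return -math.log(maxval - minval)
    return float("-inf")


def uniform_sample(minval, maxval, rng):
    """Reparameterized draw: an affine transform of u ~ Uniform[0, 1)."""
    u = rng.random()
    return minval + (maxval - minval) * u


rng = random.Random(0)
x = uniform_sample(2.0, 4.0, rng)
print(x, uniform_log_prob(x, 2.0, 4.0))
```
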
batch_shape

The shape showing how many independent inputs (which we call batches) are fed into the distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape. We borrow this concept from tf.contrib.distributions.

dtype

The sample type of the distribution.

get_batch_shape()

Static batch_shape.

Returns: A TensorShape instance.
get_value_shape()

Static value_shape.

Returns: A TensorShape instance.
group_ndims

The number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. See Distribution for more detailed explanation.

is_continuous

Whether the distribution is continuous.

is_reparameterized

Whether the gradients of samples can and are allowed to propagate back into inputs, using the reparametrization trick from (Kingma, 2013).

log_prob(given)

Compute log probability density (mass) function at given value.

Parameters: given – A Tensor. The value at which to evaluate log probability density (mass) function. Must be able to broadcast to have a shape of (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
maxval

The upper bound on the range of the uniform distribution.

minval

The lower bound on the range of the uniform distribution.

param_dtype

The parameter(s) type of the distribution.

path_param(param)

Automatically transforms a parameter based on use_path_derivative

prob(given)

Compute probability density (mass) function at given value.

Parameters: given – A Tensor. The value at which to evaluate probability density (mass) function. Must be able to broadcast to have a shape of (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
sample(n_samples=None)

Return samples from the distribution. When n_samples is None (by default), one sample of shape batch_shape + value_shape is generated. For a scalar n_samples, the returned Tensor has a new sample dimension with size n_samples inserted at axis=0, i.e., the shape of samples is [n_samples] + batch_shape + value_shape.

Parameters: n_samples – A 0-D int32 Tensor or None. How many independent samples to draw from the distribution.

Returns: A Tensor of samples.
use_path_derivative

Whether to block the gradients of the log-probability from propagating through the parameters of the distribution (False means they are propagated). This is based on the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.

value_shape

The non-batch value shape of a distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape.

class Gamma(alpha, beta, group_ndims=0, check_numerics=False, **kwargs)

The class of univariate Gamma distribution. See Distribution for details.

Parameters:

- alpha – A float Tensor. The shape parameter of the Gamma distribution. Should be positive and broadcastable to match beta.
- beta – A float Tensor. The inverse scale parameter of the Gamma distribution. Should be positive and broadcastable to match alpha.
- group_ndims – A 0-D int32 Tensor representing the number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. Default is 0, which means a single value is an event. See Distribution for a more detailed explanation.
- check_numerics – A bool. Whether to check for numeric issues.

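
With alpha as the shape and beta as the inverse scale (rate), the Gamma log density at x > 0 is alpha * log(beta) - log Γ(alpha) + (alpha - 1) * log(x) - beta * x. A plain-Python sketch of that formula follows. It is not ZhuSuan code and the function name is hypothetical.

```python
import math


def gamma_log_prob(x, alpha, beta):
    """Log density of Gamma(shape=alpha, rate=beta) at x > 0."""
    return (alpha * math.log(beta) - math.lgamma(alpha)
            + (alpha - 1.0) * math.log(x) - beta * x)


# With alpha = 1 the Gamma reduces to an Exponential with rate beta.
print(gamma_log_prob(0.5, alpha=1.0, beta=2.0))
print(math.log(2.0) - 2.0 * 0.5)
```
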
alpha

The shape parameter of the Gamma distribution.

batch_shape

The shape showing how many independent inputs (which we call batches) are fed into the distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape. We borrow this concept from tf.contrib.distributions.

beta

The inverse scale parameter of the Gamma distribution.

dtype

The sample type of the distribution.

get_batch_shape()

Static batch_shape.

Returns: A TensorShape instance.
get_value_shape()

Static value_shape.

Returns: A TensorShape instance.
group_ndims

The number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. See Distribution for more detailed explanation.

is_continuous

Whether the distribution is continuous.

is_reparameterized

Whether the gradients of samples can and are allowed to propagate back into inputs, using the reparametrization trick from (Kingma, 2013).

log_prob(given)

Compute log probability density (mass) function at given value.

Parameters: given – A Tensor. The value at which to evaluate log probability density (mass) function. Must be able to broadcast to have a shape of (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
param_dtype

The parameter(s) type of the distribution.

path_param(param)

Automatically transforms a parameter based on use_path_derivative

prob(given)

Compute probability density (mass) function at given value.

Parameters: given – A Tensor. The value at which to evaluate probability density (mass) function. Must be able to broadcast to have a shape of (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
sample(n_samples=None)

Return samples from the distribution. When n_samples is None (by default), one sample of shape batch_shape + value_shape is generated. For a scalar n_samples, the returned Tensor has a new sample dimension with size n_samples inserted at axis=0, i.e., the shape of samples is [n_samples] + batch_shape + value_shape.

Parameters: n_samples – A 0-D int32 Tensor or None. How many independent samples to draw from the distribution.

Returns: A Tensor of samples.
use_path_derivative

Whether to block the gradients of the log-probability from propagating through the parameters of the distribution (False means they are propagated). This is based on the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.

value_shape

The non-batch value shape of a distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape.

class Beta(alpha, beta, group_ndims=0, check_numerics=False, **kwargs)

The class of univariate Beta distribution. See Distribution for details.

Parameters:

- alpha – A float Tensor. One of the two shape parameters of the Beta distribution. Should be positive and broadcastable to match beta.
- beta – A float Tensor. One of the two shape parameters of the Beta distribution. Should be positive and broadcastable to match alpha.
- group_ndims – A 0-D int32 Tensor representing the number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. Default is 0, which means a single value is an event. See Distribution for a more detailed explanation.
- check_numerics – A bool. Whether to check for numeric issues.

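
For reference, the Beta log density at 0 < x < 1 is (alpha - 1) * log(x) + (beta - 1) * log(1 - x) - log B(alpha, beta), where log B(alpha, beta) = log Γ(alpha) + log Γ(beta) - log Γ(alpha + beta). A plain-Python sketch follows. It is not ZhuSuan code and the function name is hypothetical.

```python
import math


def beta_log_prob(x, alpha, beta):
    """Log density of Beta(alpha, beta) at 0 < x < 1."""
    log_norm = math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(alpha + beta)
    return (alpha - 1.0) * math.log(x) + (beta - 1.0) * math.log(1.0 - x) - log_norm


print(beta_log_prob(0.3, 1.0, 1.0))  # Beta(1, 1) is uniform on (0, 1)
print(beta_log_prob(0.5, 2.0, 2.0))  # density 6 * x * (1 - x) at x = 0.5
```
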
alpha

One of the two shape parameters of the Beta distribution.

batch_shape

The shape showing how many independent inputs (which we call batches) are fed into the distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape. We borrow this concept from tf.contrib.distributions.

beta

One of the two shape parameters of the Beta distribution.

dtype

The sample type of the distribution.

get_batch_shape()

Static batch_shape.

Returns: A TensorShape instance.
get_value_shape()

Static value_shape.

Returns: A TensorShape instance.
group_ndims

The number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. See Distribution for more detailed explanation.

is_continuous

Whether the distribution is continuous.

is_reparameterized

Whether the gradients of samples can and are allowed to propagate back into inputs, using the reparametrization trick from (Kingma, 2013).

log_prob(given)

Compute log probability density (mass) function at given value.

Parameters: given – A Tensor. The value at which to evaluate log probability density (mass) function. Must be able to broadcast to have a shape of (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
param_dtype

The parameter(s) type of the distribution.

path_param(param)

Automatically transforms a parameter based on use_path_derivative

prob(given)

Compute probability density (mass) function at given value.

Parameters: given – A Tensor. The value at which to evaluate probability density (mass) function. Must be able to broadcast to have a shape of (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
sample(n_samples=None)

Return samples from the distribution. When n_samples is None (by default), one sample of shape batch_shape + value_shape is generated. For a scalar n_samples, the returned Tensor has a new sample dimension with size n_samples inserted at axis=0, i.e., the shape of samples is [n_samples] + batch_shape + value_shape.

Parameters: n_samples – A 0-D int32 Tensor or None. How many independent samples to draw from the distribution.

Returns: A Tensor of samples.
use_path_derivative

Whether to block the gradients of the log-probability from propagating through the parameters of the distribution (False means they are propagated). This is based on the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.

value_shape

The non-batch value shape of a distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape.

class Poisson(rate, dtype=tf.int32, group_ndims=0, check_numerics=False, **kwargs)

The class of univariate Poisson distribution. See Distribution for details.

Parameters:

- rate – A float Tensor. The rate parameter of the Poisson distribution. Must be positive.
- dtype – The value type of samples from the distribution. Can be int (tf.int16, tf.int32, tf.int64) or float (tf.float16, tf.float32, tf.float64). Default is int32.
- group_ndims – A 0-D int32 Tensor representing the number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. Default is 0, which means a single value is an event. See Distribution for more detailed explanation.
- check_numerics – A bool. Whether to check for numeric issues.

batch_shape

The shape showing how many independent inputs (which we call batches) are fed into the distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape. We borrow this concept from tf.contrib.distributions.

dtype

The sample type of the distribution.

get_batch_shape()

Static batch_shape.

Returns: A TensorShape instance.
get_value_shape()

Static value_shape.

Returns: A TensorShape instance.
group_ndims

The number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. See Distribution for more detailed explanation.

is_continuous

Whether the distribution is continuous.

is_reparameterized

Whether the gradients of samples can and are allowed to propagate back into inputs, using the reparametrization trick from (Kingma, 2013).

log_prob(given)

Compute the log probability density (mass) function at the given value.

Parameters: given – A Tensor. The value at which to evaluate the log probability density (mass) function. Must be broadcastable to shape (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
param_dtype

The parameter(s) type of the distribution.

path_param(param)

Automatically transforms a parameter based on the value of use_path_derivative.

prob(given)

Compute the probability density (mass) function at the given value.

Parameters: given – A Tensor. The value at which to evaluate the probability density (mass) function. Must be broadcastable to shape (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
rate

The rate parameter of Poisson.
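As an illustration of how the rate enters the mass function, here is a minimal scalar sketch of the standard Poisson log-pmf, $\log p(k) = k \log \lambda - \lambda - \log k!$. The function name poisson_log_pmf is hypothetical; this is not ZhuSuan's implementation, which operates on Tensors.

```python
import math


def poisson_log_pmf(k, rate):
    """log P(X = k) for X ~ Poisson(rate), with rate > 0 and integer k >= 0."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    # log(rate**k * exp(-rate) / k!), using lgamma(k + 1) == log(k!)
    return k * math.log(rate) - rate - math.lgamma(k + 1)


# The probabilities over k = 0, 1, 2, ... should sum to (almost exactly) 1.
total = sum(math.exp(poisson_log_pmf(k, 2.0)) for k in range(60))
print(abs(total - 1.0) < 1e-9)  # True
```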

sample(n_samples=None)

Return samples from the distribution. When n_samples is None (by default), one sample of shape batch_shape + value_shape is generated. For a scalar n_samples, the returned Tensor has a new sample dimension with size n_samples inserted at axis=0, i.e., the shape of samples is [n_samples] + batch_shape + value_shape.

Parameters: n_samples – A 0-D int32 Tensor or None. The number of independent samples to draw from the distribution.

Returns: A Tensor of samples.

use_path_derivative

Whether to block gradients of the log-probability from propagating through the parameters of the distribution. When False, the gradients are propagated. Based on the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.

value_shape

The non-batch value shape of a distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape.

class Binomial(logits, n_experiments, dtype=tf.int32, group_ndims=0, check_numerics=False, **kwargs)

The class of univariate Binomial distribution. See Distribution for details.

Parameters:

- logits – A float Tensor. The log-odds of probabilities. $\mathrm{logits} = \log \frac{p}{1 - p}$
- n_experiments – A 0-D int32 Tensor. The number of experiments for each sample.
- dtype – The value type of samples from the distribution. Can be int (tf.int16, tf.int32, tf.int64) or float (tf.float16, tf.float32, tf.float64). Default is int32.
- group_ndims – A 0-D int32 Tensor representing the number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. Default is 0, which means a single value is an event. See Distribution for more detailed explanation.
- check_numerics – A bool. Whether to check for numeric issues.

batch_shape

The shape showing how many independent inputs (which we call batches) are fed into the distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape. We borrow this concept from tf.contrib.distributions.

dtype

The sample type of the distribution.

get_batch_shape()

Static batch_shape.

Returns: A TensorShape instance.
get_value_shape()

Static value_shape.

Returns: A TensorShape instance.
group_ndims

The number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. See Distribution for more detailed explanation.

is_continuous

Whether the distribution is continuous.

is_reparameterized

Whether the gradients of samples can and are allowed to propagate back into inputs, using the reparametrization trick from (Kingma, 2013).

log_prob(given)

Compute the log probability density (mass) function at the given value.

Parameters: given – A Tensor. The value at which to evaluate the log probability density (mass) function. Must be broadcastable to shape (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
logits

The log-odds of probabilities.

n_experiments

The number of experiments.
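To make the role of logits and n_experiments concrete, the following scalar sketch evaluates the standard Binomial log-pmf directly from the log-odds, without first converting to a probability. The names softplus and binomial_log_pmf are hypothetical helpers for illustration and are not ZhuSuan functions.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def binomial_log_pmf(k, logits, n_experiments):
    """log P(X = k) for X ~ Binomial(n_experiments, p), p = sigmoid(logits)."""
    log_p = -softplus(-logits)            # log p = log sigmoid(logits)
    log_one_minus_p = -softplus(logits)   # log (1 - p) = log sigmoid(-logits)
    # log of the binomial coefficient C(n_experiments, k)
    log_comb = (math.lgamma(n_experiments + 1) - math.lgamma(k + 1)
                - math.lgamma(n_experiments - k + 1))
    return log_comb + k * log_p + (n_experiments - k) * log_one_minus_p


# logits = 0 corresponds to p = 0.5, so P(X = 2 | n = 4) = 6 / 16.
print(round(math.exp(binomial_log_pmf(2, 0.0, 4)), 6))  # 0.375
```

Working with softplus of the logits avoids computing log(0) when p is extremely close to 0 or 1.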

param_dtype

The parameter(s) type of the distribution.

path_param(param)

Automatically transforms a parameter based on the value of use_path_derivative.

prob(given)

Compute the probability density (mass) function at the given value.

Parameters: given – A Tensor. The value at which to evaluate the probability density (mass) function. Must be broadcastable to shape (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
sample(n_samples=None)

Return samples from the distribution. When n_samples is None (by default), one sample of shape batch_shape + value_shape is generated. For a scalar n_samples, the returned Tensor has a new sample dimension with size n_samples inserted at axis=0, i.e., the shape of samples is [n_samples] + batch_shape + value_shape.

Parameters: n_samples – A 0-D int32 Tensor or None. The number of independent samples to draw from the distribution.

Returns: A Tensor of samples.

use_path_derivative

Whether to block gradients of the log-probability from propagating through the parameters of the distribution. When False, the gradients are propagated. Based on the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.

value_shape

The non-batch value shape of a distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape.

class InverseGamma(alpha, beta, group_ndims=0, check_numerics=False, **kwargs)

The class of univariate InverseGamma distribution. See Distribution for details.

Parameters:

- alpha – A float Tensor. The shape parameter of the InverseGamma distribution. Should be positive and broadcastable to match beta.
- beta – A float Tensor. The scale parameter of the InverseGamma distribution. Should be positive and broadcastable to match alpha.
- group_ndims – A 0-D int32 Tensor representing the number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. Default is 0, which means a single value is an event. See Distribution for more detailed explanation.
- check_numerics – A bool. Whether to check for numeric issues.

alpha

The shape parameter of the InverseGamma distribution.

batch_shape

The shape showing how many independent inputs (which we call batches) are fed into the distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape. We borrow this concept from tf.contrib.distributions.

beta

The scale parameter of the InverseGamma distribution.
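As a sketch of how the shape (alpha) and scale (beta) parameters enter the density, the standard InverseGamma log-density is $\log p(x) = \alpha \log \beta - \log \Gamma(\alpha) - (\alpha + 1) \log x - \beta / x$ for $x > 0$. The scalar function below is a hypothetical illustration of that formula, not ZhuSuan's Tensor implementation.

```python
import math


def inverse_gamma_log_pdf(x, alpha, beta):
    """Log density of InverseGamma(alpha, beta) at x > 0."""
    if x <= 0 or alpha <= 0 or beta <= 0:
        raise ValueError("x, alpha and beta must all be positive")
    return (alpha * math.log(beta) - math.lgamma(alpha)
            - (alpha + 1.0) * math.log(x) - beta / x)


# With alpha = beta = 1 and x = 1 every term except -beta / x vanishes.
print(inverse_gamma_log_pdf(1.0, 1.0, 1.0))  # -1.0
```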

dtype

The sample type of the distribution.

get_batch_shape()

Static batch_shape.

Returns: A TensorShape instance.
get_value_shape()

Static value_shape.

Returns: A TensorShape instance.
group_ndims

The number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. See Distribution for more detailed explanation.

is_continuous

Whether the distribution is continuous.

is_reparameterized

Whether the gradients of samples can and are allowed to propagate back into inputs, using the reparametrization trick from (Kingma, 2013).

log_prob(given)

Compute the log probability density (mass) function at the given value.

Parameters: given – A Tensor. The value at which to evaluate the log probability density (mass) function. Must be broadcastable to shape (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
param_dtype

The parameter(s) type of the distribution.

path_param(param)

Automatically transforms a parameter based on the value of use_path_derivative.

prob(given)

Compute the probability density (mass) function at the given value.

Parameters: given – A Tensor. The value at which to evaluate the probability density (mass) function. Must be broadcastable to shape (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
sample(n_samples=None)

Return samples from the distribution. When n_samples is None (by default), one sample of shape batch_shape + value_shape is generated. For a scalar n_samples, the returned Tensor has a new sample dimension with size n_samples inserted at axis=0, i.e., the shape of samples is [n_samples] + batch_shape + value_shape.

Parameters: n_samples – A 0-D int32 Tensor or None. The number of independent samples to draw from the distribution.

Returns: A Tensor of samples.

use_path_derivative

Whether to block gradients of the log-probability from propagating through the parameters of the distribution. When False, the gradients are propagated. Based on the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.

value_shape

The non-batch value shape of a distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape.

class Laplace(loc, scale, group_ndims=0, is_reparameterized=True, use_path_derivative=False, check_numerics=False, **kwargs)

The class of univariate Laplace distribution. See Distribution for details.

Parameters:

- loc – A float Tensor. The location parameter of the Laplace distribution. Should be broadcastable to match scale.
- scale – A float Tensor. The scale parameter of the Laplace distribution. Should be positive and broadcastable to match loc.
- group_ndims – A 0-D int32 Tensor representing the number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. Default is 0, which means a single value is an event. See Distribution for more detailed explanation.
- is_reparameterized – A bool. If True, gradients on samples from this distribution are allowed to propagate into inputs, using the reparametrization trick from (Kingma, 2013).
- use_path_derivative – A bool. Whether to block gradients of the log-probability from propagating through the parameters of the distribution. When False, the gradients are propagated. Based on the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.
- check_numerics – A bool. Whether to check for numeric issues.

batch_shape

The shape showing how many independent inputs (which we call batches) are fed into the distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape. We borrow this concept from tf.contrib.distributions.

dtype

The sample type of the distribution.

get_batch_shape()

Static batch_shape.

Returns: A TensorShape instance.
get_value_shape()

Static value_shape.

Returns: A TensorShape instance.
group_ndims

The number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. See Distribution for more detailed explanation.

is_continuous

Whether the distribution is continuous.

is_reparameterized

Whether the gradients of samples can and are allowed to propagate back into inputs, using the reparametrization trick from (Kingma, 2013).

loc

The location parameter of the Laplace distribution.

log_prob(given)

Compute the log probability density (mass) function at the given value.

Parameters: given – A Tensor. The value at which to evaluate the log probability density (mass) function. Must be broadcastable to shape (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
param_dtype

The parameter(s) type of the distribution.

path_param(param)

Automatically transforms a parameter based on the value of use_path_derivative.

prob(given)

Compute the probability density (mass) function at the given value.

Parameters: given – A Tensor. The value at which to evaluate the probability density (mass) function. Must be broadcastable to shape (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
sample(n_samples=None)

Return samples from the distribution. When n_samples is None (by default), one sample of shape batch_shape + value_shape is generated. For a scalar n_samples, the returned Tensor has a new sample dimension with size n_samples inserted at axis=0, i.e., the shape of samples is [n_samples] + batch_shape + value_shape.

Parameters: n_samples – A 0-D int32 Tensor or None. The number of independent samples to draw from the distribution.

Returns: A Tensor of samples.
scale

The scale parameter of the Laplace distribution.
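One common way to obtain reparameterized Laplace samples is the inverse-CDF transform of uniform noise: the sample is a deterministic function of loc, scale and a noise variable that does not depend on the parameters. The sketch below assumes that construction; the function names are hypothetical, and the source does not state which transform ZhuSuan uses internally.

```python
import math
import random


def laplace_from_uniform(loc, scale, u):
    """Map noise u in (-0.5, 0.5) to a Laplace(loc, scale) sample.

    Given u, the result is differentiable in the parameters:
    d(sample)/d(loc) = 1 and d(sample)/d(scale) = (sample - loc) / scale.
    """
    return loc - scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))


def laplace_sample(loc, scale, rng):
    """Draw one sample using parameter-free uniform noise."""
    u = rng.random() - 0.5
    while abs(u) >= 0.5:  # redraw the endpoint to avoid log(0)
        u = rng.random() - 0.5
    return laplace_from_uniform(loc, scale, u)


def laplace_log_pdf(x, loc, scale):
    """Log density of Laplace(loc, scale) at x."""
    return -math.log(2.0 * scale) - abs(x - loc) / scale


rng = random.Random(0)
samples = [laplace_sample(1.0, 2.0, rng) for _ in range(20000)]
print(sum(samples) / len(samples))  # close to loc = 1.0
```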

use_path_derivative

Whether to block gradients of the log-probability from propagating through the parameters of the distribution. When False, the gradients are propagated. Based on the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.

value_shape

The non-batch value shape of a distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape.

class BinConcrete(temperature, logits, group_ndims=0, is_reparameterized=True, use_path_derivative=False, check_numerics=False, **kwargs)

The class of univariate BinConcrete distribution from (Maddison, 2016). It is the binary case of Concrete. See Distribution for details.

Parameters:

- temperature – A 0-D float Tensor. The temperature of the relaxed distribution. The temperature should be positive.
- logits – A float Tensor. The log-odds of probabilities of being 1. $\mathrm{logits} = \log \frac{p}{1 - p}$
- group_ndims – A 0-D int32 Tensor representing the number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. Default is 0, which means a single value is an event. See Distribution for more detailed explanation.
- is_reparameterized – A bool. If True, gradients on samples from this distribution are allowed to propagate into inputs, using the reparametrization trick from (Kingma, 2013).
- use_path_derivative – A bool. Whether to block gradients of the log-probability from propagating through the parameters of the distribution. When False, the gradients are propagated. Based on the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.
- check_numerics – A bool. Whether to check for numeric issues.

batch_shape

The shape showing how many independent inputs (which we call batches) are fed into the distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape. We borrow this concept from tf.contrib.distributions.

dtype

The sample type of the distribution.

get_batch_shape()

Static batch_shape.

Returns: A TensorShape instance.
get_value_shape()

Static value_shape.

Returns: A TensorShape instance.
group_ndims

The number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. See Distribution for more detailed explanation.

is_continuous

Whether the distribution is continuous.

is_reparameterized

Whether the gradients of samples can and are allowed to propagate back into inputs, using the reparametrization trick from (Kingma, 2013).

log_prob(given)

Compute the log probability density (mass) function at the given value.

Parameters: given – A Tensor. The value at which to evaluate the log probability density (mass) function. Must be broadcastable to shape (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
logits

The log-odds of probabilities.

param_dtype

The parameter(s) type of the distribution.

path_param(param)

Automatically transforms a parameter based on the value of use_path_derivative.

prob(given)

Compute the probability density (mass) function at the given value.

Parameters: given – A Tensor. The value at which to evaluate the probability density (mass) function. Must be broadcastable to shape (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
sample(n_samples=None)

Return samples from the distribution. When n_samples is None (by default), one sample of shape batch_shape + value_shape is generated. For a scalar n_samples, the returned Tensor has a new sample dimension with size n_samples inserted at axis=0, i.e., the shape of samples is [n_samples] + batch_shape + value_shape.

Parameters: n_samples – A 0-D int32 Tensor or None. The number of independent samples to draw from the distribution.

Returns: A Tensor of samples.
temperature

The temperature of BinConcrete.
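To illustrate the effect of the temperature, the sketch below follows the binary Concrete relaxation as described by Maddison et al.: logistic noise is added to the logits, the sum is divided by the temperature, and the result is passed through a sigmoid. Whether ZhuSuan returns this value or works in another space internally is not stated here, so treat the function names and the output range as assumptions.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def bin_concrete_from_uniform(logits, temperature, u):
    """Relaxed binary sample in (0, 1) from uniform noise u in (0, 1)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    logistic_noise = math.log(u) - math.log1p(-u)  # Logistic(0, 1) noise
    return sigmoid((logits + logistic_noise) / temperature)


# Lower temperatures push the relaxed sample towards 0 or 1.
for t in (1.0, 0.5, 0.1):
    print(t, bin_concrete_from_uniform(2.0, t, 0.5))
```

Because the noise u does not depend on logits or temperature, the sample is a differentiable function of both, which is what makes the reparametrization trick applicable.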

use_path_derivative

Whether to block gradients of the log-probability from propagating through the parameters of the distribution. When False, the gradients are propagated. Based on the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.

value_shape

The non-batch value shape of a distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape.

BinGumbelSoftmax

Alias of BinConcrete.

## Multivariate distributions¶

class MultivariateNormalCholesky(mean, cov_tril, group_ndims=0, is_reparameterized=True, use_path_derivative=False, check_numerics=False, **kwargs)

The class of multivariate normal distribution, where the covariance is parameterized by the lower triangular matrix $L$ in the Cholesky decomposition $LL^T = \Sigma$.

See Distribution for details.

Parameters:

- mean – An N-D float Tensor of shape […, n_dim]. Each slice [i, j, …, k, :] represents the mean of a single multivariate normal distribution.
- cov_tril – An (N+1)-D float Tensor of shape […, n_dim, n_dim]. Each slice [i, …, k, :, :] represents the lower triangular matrix in the Cholesky decomposition of the covariance of a single distribution.
- group_ndims – A 0-D int32 Tensor representing the number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. Default is 0, which means a single value is an event. See Distribution for more detailed explanation.
- is_reparameterized – A bool. If True, gradients on samples from this distribution are allowed to propagate into inputs, using the reparametrization trick from (Kingma, 2013).
- use_path_derivative – A bool. Whether to block gradients of the log-probability from propagating through the parameters of the distribution. When False, the gradients are propagated. Based on the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.
- check_numerics – A bool. Whether to check for numeric issues.

batch_shape

The shape showing how many independent inputs (which we call batches) are fed into the distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape. We borrow this concept from tf.contrib.distributions.

cov_tril

The lower triangular matrix in the Cholesky decomposition of the covariance.

dtype

The sample type of the distribution.

get_batch_shape()

Static batch_shape.

Returns: A TensorShape instance.
get_value_shape()

Static value_shape.

Returns: A TensorShape instance.
group_ndims

The number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. See Distribution for more detailed explanation.

is_continuous

Whether the distribution is continuous.

is_reparameterized

Whether the gradients of samples can and are allowed to propagate back into inputs, using the reparametrization trick from (Kingma, 2013).

log_prob(given)

Compute the log probability density (mass) function at the given value.

Parameters: given – A Tensor. The value at which to evaluate the log probability density (mass) function. Must be broadcastable to shape (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
mean

The mean of the normal distribution.
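The Cholesky parameterization makes both sampling and density evaluation cheap: a sample is $x = \mu + Lz$ with standard normal $z$, and the log density needs only a triangular solve plus the log of the diagonal of $L$. The sketch below works on plain Python lists for a single distribution; the function names are hypothetical and this is not ZhuSuan's batched implementation.

```python
import math


def mvn_cholesky_sample(mean, cov_tril, z):
    """Return mean + L z, where z is a vector of standard normal noise."""
    n = len(mean)
    return [mean[i] + sum(cov_tril[i][j] * z[j] for j in range(i + 1))
            for i in range(n)]


def mvn_cholesky_log_pdf(x, mean, cov_tril):
    """Log density of N(mean, L L^T) at x, with L lower triangular."""
    n = len(mean)
    # Forward substitution: solve L y = x - mean.
    y = []
    for i in range(n):
        if cov_tril[i][i] <= 0:
            raise ValueError("diagonal of cov_tril must be positive")
        s = x[i] - mean[i] - sum(cov_tril[i][j] * y[j] for j in range(i))
        y.append(s / cov_tril[i][i])
    # log det(L) equals half of log det(Sigma).
    log_det_l = sum(math.log(cov_tril[i][i]) for i in range(n))
    return (-0.5 * n * math.log(2.0 * math.pi) - log_det_l
            - 0.5 * sum(v * v for v in y))


L = [[2.0, 0.0], [1.0, 3.0]]
x = mvn_cholesky_sample([1.0, -1.0], L, [1.0, -1.0])
print(x)  # [3.0, -3.0]
print(mvn_cholesky_log_pdf(x, [1.0, -1.0], L))
```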

param_dtype

The parameter(s) type of the distribution.

path_param(param)

Automatically transforms a parameter based on the value of use_path_derivative.

prob(given)

Compute the probability density (mass) function at the given value.

Parameters: given – A Tensor. The value at which to evaluate the probability density (mass) function. Must be broadcastable to shape (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
sample(n_samples=None)

Return samples from the distribution. When n_samples is None (by default), one sample of shape batch_shape + value_shape is generated. For a scalar n_samples, the returned Tensor has a new sample dimension with size n_samples inserted at axis=0, i.e., the shape of samples is [n_samples] + batch_shape + value_shape.

Parameters: n_samples – A 0-D int32 Tensor or None. The number of independent samples to draw from the distribution.

Returns: A Tensor of samples.

use_path_derivative

Whether to block gradients of the log-probability from propagating through the parameters of the distribution. When False, the gradients are propagated. Based on the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.

value_shape

The non-batch value shape of a distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape.

class Multinomial(logits, n_experiments, normalize_logits=True, dtype=tf.int32, group_ndims=0, **kwargs)

The class of Multinomial distribution. See Distribution for details.

Parameters:

- logits – An N-D (N >= 1) float Tensor of shape […, n_categories]. Each slice [i, j, …, k, :] represents the log probabilities for all categories. By default (when normalize_logits=True), the probabilities may be un-normalized. $\mathrm{logits} \propto \log p$
- n_experiments – A 0-D int32 Tensor or None. When it is a 0-D int32 integer, it represents the number of experiments for each sample, which should be invariant among samples. In this situation the _sample function is supported. When it is None, the _sample function is not supported, and when calculating probabilities the number of experiments is inferred from given, so it may vary among samples.
- normalize_logits – A bool indicating whether logits should be normalized when computing probability. If you believe logits is already normalized, set it to False to speed up computation. Default is True.
- dtype – The value type of samples from the distribution. Can be int (tf.int16, tf.int32, tf.int64) or float (tf.float16, tf.float32, tf.float64). Default is int32.
- group_ndims – A 0-D int32 Tensor representing the number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. Default is 0, which means a single value is an event. See Distribution for more detailed explanation.

A single sample is an N-D Tensor with the same shape as logits. Each slice [i, j, …, k, :] is a vector of counts for all categories.

batch_shape

The shape showing how many independent inputs (which we call batches) are fed into the distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape. We borrow this concept from tf.contrib.distributions.

dtype

The sample type of the distribution.

get_batch_shape()

Static batch_shape.

Returns: A TensorShape instance.
get_value_shape()

Static value_shape.

Returns: A TensorShape instance.
group_ndims

The number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. See Distribution for more detailed explanation.

is_continuous

Whether the distribution is continuous.

is_reparameterized

Whether the gradients of samples can and are allowed to propagate back into inputs, using the reparametrization trick from (Kingma, 2013).

log_prob(given)

Compute the log probability density (mass) function at the given value.

Parameters: given – A Tensor. The value at which to evaluate the log probability density (mass) function. Must be broadcastable to shape (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
logits

The un-normalized log probabilities.

n_categories

The number of categories in the distribution.

n_experiments

The number of experiments for each sample.
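The interplay of logits, normalize_logits and the number of experiments can be sketched for a single count vector as follows. This is a hypothetical scalar illustration of the standard Multinomial log-pmf, not ZhuSuan code; the number of experiments is taken as the sum of the counts, mirroring the case described above where n_experiments is None.

```python
import math


def log_softmax(logits):
    """Normalize log probabilities so that their exponentials sum to 1."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum_exp for v in logits]


def multinomial_log_pmf(counts, logits, normalize_logits=True):
    """log P(counts) under a Multinomial with the given logits."""
    log_p = log_softmax(logits) if normalize_logits else list(logits)
    n = sum(counts)  # number of experiments inferred from the counts
    # log of the multinomial coefficient n! / prod_i counts[i]!
    log_coef = math.lgamma(n + 1) - sum(math.lgamma(k + 1) for k in counts)
    return log_coef + sum(k * lp for k, lp in zip(counts, log_p))


# Un-normalized logits [5, 5] describe the same distribution as [0, 0].
print(round(math.exp(multinomial_log_pmf([1, 1], [5.0, 5.0])), 6))  # 0.5
```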

param_dtype

The parameter(s) type of the distribution.

path_param(param)

Automatically transforms a parameter based on the value of use_path_derivative.

prob(given)

Compute the probability density (mass) function at the given value.

Parameters: given – A Tensor. The value at which to evaluate the probability density (mass) function. Must be broadcastable to shape (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
sample(n_samples=None)

Return samples from the distribution. When n_samples is None (by default), one sample of shape batch_shape + value_shape is generated. For a scalar n_samples, the returned Tensor has a new sample dimension with size n_samples inserted at axis=0, i.e., the shape of samples is [n_samples] + batch_shape + value_shape.

Parameters: n_samples – A 0-D int32 Tensor or None. The number of independent samples to draw from the distribution.

Returns: A Tensor of samples.

use_path_derivative

Whether to block gradients of the log-probability from propagating through the parameters of the distribution. When False, the gradients are propagated. Based on the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.

value_shape

The non-batch value shape of a distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape.

class UnnormalizedMultinomial(logits, normalize_logits=True, dtype=tf.int32, group_ndims=0, **kwargs)

The class of UnnormalizedMultinomial distribution. UnnormalizedMultinomial calculates probabilities differently from Multinomial: it treats the given bag-of-words as a statistic of an ordered result sequence and calculates the probability of that (imagined) ordered sequence. Hence it does not multiply by the term

$\binom{n}{k_1, k_2, \dots} = \frac{n!}{\prod_{i} k_i!}$

See Distribution for details.
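The difference from Multinomial described above can be checked numerically on a single count vector: the log probability of one ordered sequence is $\sum_i k_i \log p_i$, and adding the log of the multinomial coefficient recovers the Multinomial log-pmf. The helper names below are hypothetical and the code is an illustrative sketch rather than ZhuSuan's implementation.

```python
import math


def log_softmax(logits):
    """Normalize log probabilities so that their exponentials sum to 1."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum_exp for v in logits]


def unnormalized_multinomial_log_prob(counts, logits, normalize_logits=True):
    """Log probability of one ordered sequence with the given counts."""
    log_p = log_softmax(logits) if normalize_logits else list(logits)
    return sum(k * lp for k, lp in zip(counts, log_p))


def log_multinomial_coefficient(counts):
    """log(n! / prod_i counts[i]!): the term that is left out."""
    n = sum(counts)
    return math.lgamma(n + 1) - sum(math.lgamma(k + 1) for k in counts)


counts, logits = [2, 1], [0.0, 0.0]
ordered = unnormalized_multinomial_log_prob(counts, logits)
print(round(math.exp(ordered), 6))  # 0.125, i.e. 0.5 ** 3
print(round(math.exp(ordered + log_multinomial_coefficient(counts)), 6))  # 0.375
```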

Parameters:

- logits – An N-D (N >= 1) float Tensor of shape […, n_categories]. Each slice [i, j, …, k, :] represents the log probabilities for all categories. By default (when normalize_logits=True), the probabilities may be un-normalized. $\mathrm{logits} \propto \log p$
- normalize_logits – A bool indicating whether logits should be normalized when computing probability. If you believe logits is already normalized, set it to False to speed up computation. Default is True.
- dtype – The value type of samples from the distribution. Can be int (tf.int16, tf.int32, tf.int64) or float (tf.float16, tf.float32, tf.float64). Default is int32.
- group_ndims – A 0-D int32 Tensor representing the number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. Default is 0, which means a single value is an event. See Distribution for more detailed explanation.

A single sample is an N-D Tensor with the same shape as logits. Each slice [i, j, …, k, :] is a vector of counts for all categories.

batch_shape

The shape showing how many independent inputs (which we call batches) are fed into the distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape. We borrow this concept from tf.contrib.distributions.

dtype

The sample type of the distribution.

get_batch_shape()

Static batch_shape.

Returns: A TensorShape instance.
get_value_shape()

Static value_shape.

Returns: A TensorShape instance.
group_ndims

The number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. See Distribution for more detailed explanation.

is_continuous

Whether the distribution is continuous.

is_reparameterized

Whether the gradients of samples can and are allowed to propagate back into inputs, using the reparametrization trick from (Kingma, 2013).

log_prob(given)

Compute the log probability density (mass) function at the given value.

Parameters: given – A Tensor. The value at which to evaluate the log probability density (mass) function. Must be broadcastable to shape (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
logits

The un-normalized log probabilities.

n_categories

The number of categories in the distribution.

param_dtype

The parameter(s) type of the distribution.

path_param(param)

Automatically transforms a parameter based on the value of use_path_derivative.

prob(given)

Compute the probability density (mass) function at the given value.

Parameters: given – A Tensor. The value at which to evaluate the probability density (mass) function. Must be broadcastable to shape (... + )batch_shape + value_shape.

Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
sample(n_samples=None)

Return samples from the distribution. When n_samples is None (by default), one sample of shape batch_shape + value_shape is generated. For a scalar n_samples, the returned Tensor has a new sample dimension with size n_samples inserted at axis=0, i.e., the shape of samples is [n_samples] + batch_shape + value_shape.

Parameters: n_samples – A 0-D int32 Tensor or None. The number of independent samples to draw from the distribution.

Returns: A Tensor of samples.

use_path_derivative

Whether to block gradients of the log-probability from propagating through the parameters of the distribution. When False, the gradients are propagated. Based on the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.

value_shape

The non-batch value shape of a distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape.

BagofCategoricals

Alias of UnnormalizedMultinomial.

class OnehotCategorical(logits, dtype=tf.int32, group_ndims=0, **kwargs)

The class of one-hot Categorical distribution. See Distribution for details.

Parameters:

- logits – An N-D (N >= 1) float Tensor of shape (…, n_categories). Each slice [i, j, …, k, :] represents the un-normalized log probabilities for all categories. $\mathrm{logits} \propto \log p$
- dtype – The value type of samples from the distribution. Can be int (tf.int16, tf.int32, tf.int64) or float (tf.float16, tf.float32, tf.float64). Default is int32.
- group_ndims – A 0-D int32 Tensor representing the number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. Default is 0, which means a single value is an event. See Distribution for more detailed explanation.

A single sample is an N-D Tensor with the same shape as logits. Each slice [i, j, …, k, :] is a one-hot vector of the selected category.

batch_shape

The shape showing how many independent inputs (which we call batches) are fed into the distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape. We borrow this concept from tf.contrib.distributions.

dtype

The sample type of the distribution.

get_batch_shape()

Static batch_shape.

Returns: A TensorShape instance.
get_value_shape()

Static value_shape.

Returns: A TensorShape instance.
group_ndims

The number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. See Distribution for more detailed explanation.

is_continuous

Whether the distribution is continuous.

is_reparameterized

Whether the gradients of samples can and are allowed to propagate back into inputs, using the reparametrization trick from (Kingma, 2013).

log_prob(given)

Compute log probability density (mass) function at given value.

Parameters: given – A Tensor. The value at which to evaluate the log probability density (mass) function. Must be broadcastable to shape (... + )batch_shape + value_shape.
Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
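
The returned shape can be worked out with a short helper. This sketch ignores the optional leading (...) axes, and the name log_prob_shape is hypothetical. It also shows that the notation batch_shape[:-group_ndims] is informal: a literal Python slice with group_ndims = 0 would give an empty list, whereas the intended result is the full batch_shape.

```python
def log_prob_shape(batch_shape, group_ndims=0):
    """Shape of log_prob / prob output, ignoring any extra leading axes."""
    batch_shape = list(batch_shape)
    if group_ndims == 0:
        # batch_shape[:-0] would be [] in Python, so treat 0 explicitly:
        # no axes are grouped and the full batch_shape is returned.
        return batch_shape
    return batch_shape[:-group_ndims]


print(log_prob_shape([4, 3], group_ndims=0))  # every batch entry is its own event
print(log_prob_shape([4, 3], group_ndims=1))  # last batch axis grouped into one event
print(log_prob_shape([4, 3], group_ndims=2))  # whole batch grouped into one event
```
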
logits

The un-normalized log probabilities.

n_categories

The number of categories in the distribution.

param_dtype

The parameter(s) type of the distribution.

path_param(param)

Automatically transforms a parameter based on use_path_derivative.

prob(given)

Compute probability density (mass) function at given value.

Parameters: given – A Tensor. The value at which to evaluate the probability density (mass) function. Must be broadcastable to shape (... + )batch_shape + value_shape.
Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
sample(n_samples=None)

Return samples from the distribution. When n_samples is None (by default), one sample of shape batch_shape + value_shape is generated. For a scalar n_samples, the returned Tensor has a new sample dimension with size n_samples inserted at axis=0, i.e., the shape of samples is [n_samples] + batch_shape + value_shape.

Parameters: n_samples – A 0-D int32 Tensor or None. How many independent samples to draw from the distribution.
Returns: A Tensor of samples.
use_path_derivative

Whether to stop gradients of the log-probability from propagating through the parameters of the distribution. If True, the gradients are not propagated through the parameters; if False, they are. This follows the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.

value_shape

The non-batch value shape of a distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape.

OnehotDiscrete

Alias of OnehotCategorical.

class Dirichlet(alpha, group_ndims=0, check_numerics=False, **kwargs)

The class of Dirichlet distribution. See Distribution for details.

Parameters:
- alpha – An N-D (N >= 1) float Tensor of shape (…, n_categories). Each slice [i, j, …, k, :] represents the concentration parameter of a Dirichlet distribution. Should be positive.
- group_ndims – A 0-D int32 Tensor representing the number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. Default is 0, which means a single value is an event. See Distribution for more detailed explanation.

A single sample is an N-D Tensor with the same shape as alpha. Each slice [i, j, …, k, :] of the sample is a vector of probabilities of a Categorical distribution [x_1, x_2, … ], which lies on the simplex

$\sum_{i} x_i = 1, 0 < x_i < 1$
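
The simplex constraint can be checked numerically. The sketch below draws one Dirichlet vector by normalizing independent Gamma variates, a standard construction; it is an illustration only and is not claimed to be how ZhuSuan samples. The name sample_dirichlet is hypothetical.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet(alpha) vector by normalizing Gamma(alpha_i, 1) draws."""
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    return [g / total for g in gammas]


rng = random.Random(0)
x = sample_dirichlet([2.0, 3.0, 5.0], rng)
print(x)
print(sum(x))  # equals 1 up to floating-point rounding
```
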
alpha

The concentration parameter of the Dirichlet distribution.

batch_shape

The shape showing how many independent inputs (which we call batches) are fed into the distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape. We borrow this concept from tf.contrib.distributions.

dtype

The sample type of the distribution.

get_batch_shape()

Static batch_shape.

Returns: A TensorShape instance.
get_value_shape()

Static value_shape.

Returns: A TensorShape instance.
group_ndims

The number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. See Distribution for more detailed explanation.

is_continuous

Whether the distribution is continuous.

is_reparameterized

Whether the gradients of samples can and are allowed to propagate back into inputs, using the reparametrization trick from (Kingma, 2013).

log_prob(given)

Compute log probability density (mass) function at given value.

Parameters: given – A Tensor. The value at which to evaluate the log probability density (mass) function. Must be broadcastable to shape (... + )batch_shape + value_shape.
Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
n_categories

The number of categories in the distribution.

param_dtype

The parameter(s) type of the distribution.

path_param(param)

Automatically transforms a parameter based on use_path_derivative.

prob(given)

Compute probability density (mass) function at given value.

Parameters: given – A Tensor. The value at which to evaluate the probability density (mass) function. Must be broadcastable to shape (... + )batch_shape + value_shape.
Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
sample(n_samples=None)

Return samples from the distribution. When n_samples is None (by default), one sample of shape batch_shape + value_shape is generated. For a scalar n_samples, the returned Tensor has a new sample dimension with size n_samples inserted at axis=0, i.e., the shape of samples is [n_samples] + batch_shape + value_shape.

Parameters: n_samples – A 0-D int32 Tensor or None. How many independent samples to draw from the distribution.
Returns: A Tensor of samples.
use_path_derivative

Whether to stop gradients of the log-probability from propagating through the parameters of the distribution. If True, the gradients are not propagated through the parameters; if False, they are. This follows the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.

value_shape

The non-batch value shape of a distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape.

class ExpConcrete(temperature, logits, group_ndims=0, is_reparameterized=True, use_path_derivative=False, check_numerics=False, **kwargs)

The class of ExpConcrete distribution from (Maddison, 2016), obtained from Concrete by taking the logarithm of its samples. See Distribution for details.

Parameters:
- temperature – A 0-D float Tensor. The temperature of the relaxed distribution. Should be positive.
- logits – An N-D (N >= 1) float Tensor of shape (…, n_categories). Each slice [i, j, …, k, :] represents the un-normalized log probabilities for all categories: $\mathrm{logits} \propto \log p$.
- group_ndims – A 0-D int32 Tensor representing the number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. Default is 0, which means a single value is an event. See Distribution for more detailed explanation.
- is_reparameterized – A bool. If True, gradients on samples from this distribution are allowed to propagate into inputs, using the reparametrization trick from (Kingma, 2013).
- use_path_derivative – A bool. If True, gradients of the log-probability are not propagated through the parameters of the distribution; if False, they are. This follows the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.
- check_numerics – A bool. Whether to check numeric issues.

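
As a rough illustration of what an ExpConcrete sample looks like, the sketch below follows the construction in (Maddison, 2016): add Gumbel noise to the logits, divide by the temperature, and apply log-softmax. It handles a single slice in pure Python, is not ZhuSuan's code, and the helper names are hypothetical.

```python
import math
import random


def gumbel(rng):
    """Standard Gumbel noise, with the uniform draw clamped away from 0 and 1."""
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def log_softmax(xs):
    """Subtract log-sum-exp so that exp of the result sums to one."""
    m = max(xs)
    lse = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - lse for x in xs]


def sample_exp_concrete(temperature, logits, rng):
    """One ExpConcrete draw: log-softmax((logits + Gumbel noise) / temperature)."""
    noisy = [(l + gumbel(rng)) / temperature for l in logits]
    return log_softmax(noisy)


rng = random.Random(0)
y = sample_exp_concrete(0.5, [0.0, 1.0, 2.0], rng)
print(y)                             # entries are non-positive log-weights
print(sum(math.exp(v) for v in y))   # exponentiated entries sum to one
```

Working in log space avoids underflow at low temperatures, where the corresponding Concrete sample has entries very close to 0.
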
batch_shape

The shape showing how many independent inputs (which we call batches) are fed into the distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape. We borrow this concept from tf.contrib.distributions.

dtype

The sample type of the distribution.

get_batch_shape()

Static batch_shape.

Returns: A TensorShape instance.
get_value_shape()

Static value_shape.

Returns: A TensorShape instance.
group_ndims

The number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. See Distribution for more detailed explanation.

is_continuous

Whether the distribution is continuous.

is_reparameterized

Whether the gradients of samples can and are allowed to propagate back into inputs, using the reparametrization trick from (Kingma, 2013).

log_prob(given)

Compute log probability density (mass) function at given value.

Parameters: given – A Tensor. The value at which to evaluate the log probability density (mass) function. Must be broadcastable to shape (... + )batch_shape + value_shape.
Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
logits

The un-normalized log probabilities.

n_categories

The number of categories in the distribution.

param_dtype

The parameter(s) type of the distribution.

path_param(param)

Automatically transforms a parameter based on use_path_derivative.

prob(given)

Compute probability density (mass) function at given value.

Parameters: given – A Tensor. The value at which to evaluate the probability density (mass) function. Must be broadcastable to shape (... + )batch_shape + value_shape.
Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
sample(n_samples=None)

Return samples from the distribution. When n_samples is None (by default), one sample of shape batch_shape + value_shape is generated. For a scalar n_samples, the returned Tensor has a new sample dimension with size n_samples inserted at axis=0, i.e., the shape of samples is [n_samples] + batch_shape + value_shape.

Parameters: n_samples – A 0-D int32 Tensor or None. How many independent samples to draw from the distribution.
Returns: A Tensor of samples.
temperature

The temperature of ExpConcrete.

use_path_derivative

Whether to stop gradients of the log-probability from propagating through the parameters of the distribution. If True, the gradients are not propagated through the parameters; if False, they are. This follows the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.

value_shape

The non-batch value shape of a distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape.

ExpGumbelSoftmax

Alias of ExpConcrete.

class Concrete(temperature, logits, group_ndims=0, is_reparameterized=True, use_path_derivative=False, check_numerics=False, **kwargs)

The class of Concrete (or Gumbel-Softmax) distribution from (Maddison, 2016; Jang, 2016), which serves as a continuous relaxation of OnehotCategorical. See Distribution for details.

Parameters:
- temperature – A 0-D float Tensor. The temperature of the relaxed distribution. Should be positive.
- logits – An N-D (N >= 1) float Tensor of shape (…, n_categories). Each slice [i, j, …, k, :] represents the un-normalized log probabilities for all categories: $\mathrm{logits} \propto \log p$.
- group_ndims – A 0-D int32 Tensor representing the number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. Default is 0, which means a single value is an event. See Distribution for more detailed explanation.
- is_reparameterized – A bool. If True, gradients on samples from this distribution are allowed to propagate into inputs, using the reparametrization trick from (Kingma, 2013).
- use_path_derivative – A bool. If True, gradients of the log-probability are not propagated through the parameters of the distribution; if False, they are. This follows the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.
- check_numerics – A bool. Whether to check numeric issues.

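
The role of the temperature can be seen in a small experiment. The sketch below uses the Gumbel-Softmax construction from the cited papers on a single slice of logits, reusing the same random seed so that only the temperature differs between the two draws. It is an illustration under those assumptions, not ZhuSuan's implementation, and the helper names are hypothetical.

```python
import math
import random


def gumbel(rng):
    """Standard Gumbel noise, with the uniform draw clamped away from 0 and 1."""
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def sample_concrete(temperature, logits, rng):
    """One Concrete draw: softmax((logits + Gumbel noise) / temperature)."""
    noisy = [(l + gumbel(rng)) / temperature for l in logits]
    m = max(noisy)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


logits = [0.0, 1.0, 2.0]
# Same seed, hence the same Gumbel noise; only the temperature changes.
sharp = sample_concrete(0.1, logits, random.Random(0))
smooth = sample_concrete(10.0, logits, random.Random(0))
print(sharp)   # close to a one-hot vector
print(smooth)  # close to uniform
```

Lower temperatures push samples toward the vertices of the simplex, approaching one-hot samples, while higher temperatures flatten them.
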
batch_shape

The shape showing how many independent inputs (which we call batches) are fed into the distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape. We borrow this concept from tf.contrib.distributions.

dtype

The sample type of the distribution.

get_batch_shape()

Static batch_shape.

Returns: A TensorShape instance.
get_value_shape()

Static value_shape.

Returns: A TensorShape instance.
group_ndims

The number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. See Distribution for more detailed explanation.

is_continuous

Whether the distribution is continuous.

is_reparameterized

Whether the gradients of samples can and are allowed to propagate back into inputs, using the reparametrization trick from (Kingma, 2013).

log_prob(given)

Compute log probability density (mass) function at given value.

Parameters: given – A Tensor. The value at which to evaluate the log probability density (mass) function. Must be broadcastable to shape (... + )batch_shape + value_shape.
Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
logits

The un-normalized log probabilities.

n_categories

The number of categories in the distribution.

param_dtype

The parameter(s) type of the distribution.

path_param(param)

Automatically transforms a parameter based on use_path_derivative.

prob(given)

Compute probability density (mass) function at given value.

Parameters: given – A Tensor. The value at which to evaluate the probability density (mass) function. Must be broadcastable to shape (... + )batch_shape + value_shape.
Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
sample(n_samples=None)

Return samples from the distribution. When n_samples is None (by default), one sample of shape batch_shape + value_shape is generated. For a scalar n_samples, the returned Tensor has a new sample dimension with size n_samples inserted at axis=0, i.e., the shape of samples is [n_samples] + batch_shape + value_shape.

Parameters: n_samples – A 0-D int32 Tensor or None. How many independent samples to draw from the distribution.
Returns: A Tensor of samples.
temperature

The temperature of Concrete.

use_path_derivative

Whether to stop gradients of the log-probability from propagating through the parameters of the distribution. If True, the gradients are not propagated through the parameters; if False, they are. This follows the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.

value_shape

The non-batch value shape of a distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape.

GumbelSoftmax

Alias of Concrete.

class MatrixVariateNormalCholesky(mean, u_tril, v_tril, group_ndims=0, is_reparameterized=True, use_path_derivative=False, check_numerics=False, **kwargs)

The class of matrix variate normal distribution, where the covariances $U$ and $V$ are parameterized by the lower triangular matrices of their Cholesky decompositions,

$L_u \text{ s.t. } L_u L_u^T = U,\; L_v \text{ s.t. } L_v L_v^T = V$

See Distribution for details.

Parameters:
- mean – An N-D float Tensor of shape […, n_row, n_col]. Each slice [i, j, …, k, :, :] represents the mean of a single matrix variate normal distribution.
- u_tril – An N-D float Tensor of shape […, n_row, n_row]. Each slice [i, j, …, k, :, :] represents the lower triangular matrix in the Cholesky decomposition of the among-row covariance of a single matrix variate normal distribution.
- v_tril – An N-D float Tensor of shape […, n_col, n_col]. Each slice [i, j, …, k, :, :] represents the lower triangular matrix in the Cholesky decomposition of the among-column covariance of a single matrix variate normal distribution.
- group_ndims – A 0-D int32 Tensor representing the number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. Default is 0, which means a single value is an event. See Distribution for more detailed explanation.
- is_reparameterized – A bool. If True, gradients on samples from this distribution are allowed to propagate into inputs, using the reparametrization trick from (Kingma, 2013).
- use_path_derivative – A bool. If True, gradients of the log-probability are not propagated through the parameters of the distribution; if False, they are. This follows the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.
- check_numerics – A bool. Whether to check numeric issues.

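
To show how the two Cholesky factors enter, the sketch below draws one matrix with the textbook construction $X = M + L_u Z L_v^T$, where $Z$ has independent standard normal entries. This is an assumption about a standard way to sample, not a statement of ZhuSuan's implementation; matrices are nested Python lists and all helper names are hypothetical.

```python
import random


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def sample_matrix_normal(mean, u_tril, v_tril, rng):
    """One draw X = M + L_u Z L_v^T with Z filled with standard normal noise."""
    n_row, n_col = len(mean), len(mean[0])
    z = [[rng.gauss(0.0, 1.0) for _ in range(n_col)] for _ in range(n_row)]
    noise = matmul(matmul(u_tril, z), transpose(v_tril))
    return [[mean[i][j] + noise[i][j] for j in range(n_col)]
            for i in range(n_row)]


mean = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]           # n_row = 2, n_col = 3
u_tril = [[2.0, 0.0], [1.0, 1.0]]                   # among-row factor, 2 x 2
v_tril = [[1.0, 0.0, 0.0], [0.5, 1.0, 0.0], [0.0, 0.5, 1.0]]  # among-column factor, 3 x 3

u_cov = matmul(u_tril, transpose(u_tril))           # recovers U = L_u L_u^T
x = sample_matrix_normal(mean, u_tril, v_tril, random.Random(0))
print(u_cov)
print(x)
```
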
batch_shape

The shape showing how many independent inputs (which we call batches) are fed into the distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape. We borrow this concept from tf.contrib.distributions.

dtype

The sample type of the distribution.

get_batch_shape()

Static batch_shape.

Returns: A TensorShape instance.
get_value_shape()

Static value_shape.

Returns: A TensorShape instance.
group_ndims

The number of dimensions in batch_shape (counted from the end) that are grouped into a single event, so that their probabilities are calculated together. See Distribution for more detailed explanation.

is_continuous

Whether the distribution is continuous.

is_reparameterized

Whether the gradients of samples can and are allowed to propagate back into inputs, using the reparametrization trick from (Kingma, 2013).

log_prob(given)

Compute log probability density (mass) function at given value.

Parameters: given – A Tensor. The value at which to evaluate the log probability density (mass) function. Must be broadcastable to shape (... + )batch_shape + value_shape.
Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
mean

The mean of the matrix variate normal distribution.

param_dtype

The parameter(s) type of the distribution.

path_param(param)

Automatically transforms a parameter based on use_path_derivative.

prob(given)

Compute probability density (mass) function at given value.

Parameters: given – A Tensor. The value at which to evaluate the probability density (mass) function. Must be broadcastable to shape (... + )batch_shape + value_shape.
Returns: A Tensor of shape (... + )batch_shape[:-group_ndims].
sample(n_samples=None)

Return samples from the distribution. When n_samples is None (by default), one sample of shape batch_shape + value_shape is generated. For a scalar n_samples, the returned Tensor has a new sample dimension with size n_samples inserted at axis=0, i.e., the shape of samples is [n_samples] + batch_shape + value_shape.

Parameters: n_samples – A 0-D int32 Tensor or None. How many independent samples to draw from the distribution.
Returns: A Tensor of samples.
u_tril

The lower triangular matrix in the Cholesky decomposition of the among-row covariance.

use_path_derivative

Whether to stop gradients of the log-probability from propagating through the parameters of the distribution. If True, the gradients are not propagated through the parameters; if False, they are. This follows the paper “Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference”.

v_tril

The lower triangular matrix in the Cholesky decomposition of the among-column covariance.

value_shape

The non-batch value shape of a distribution. For batch inputs, the shape of a generated sample is batch_shape + value_shape.

## Distribution utils

log_combination(n, ks)

Compute the log combination function.

$\log \binom{n}{k_1, k_2, \dots} = \log n! - \sum_{i}\log k_i!$

Parameters:
- n – An N-D float Tensor. Can broadcast to match tf.shape(ks)[:-1].
- ks – An (N + 1)-D float Tensor. Each slice [i, j, …, k, :] is a vector of [k_1, k_2, …].

Returns: An N-D Tensor of the same type as n.
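
The formula can be checked on scalars with the log-gamma function, since $\log k! = \log \Gamma(k + 1)$. The sketch below is a pure-Python illustration for one n and one vector ks; the name log_multinomial_coefficient is hypothetical and this is not the library function itself.

```python
import math


def log_multinomial_coefficient(n, ks):
    """log n! - sum_i log k_i!, computed with lgamma(k + 1) = log k!."""
    return math.lgamma(n + 1) - sum(math.lgamma(k + 1) for k in ks)


# 4! / (2! * 1! * 1!) = 24 / 2 = 12
value = log_multinomial_coefficient(4, [2, 1, 1])
print(value)
print(math.exp(value))
```

Working in log space keeps the computation stable for large counts, where the factorials themselves would overflow.
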
explicit_broadcast(x, y, x_name, y_name)

Explicitly broadcast two Tensors to have the same shape.
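
For intuition about the common shape two Tensors end up with, the sketch below computes a broadcast shape under the usual NumPy-style rules (align from the last axis; sizes must match or one of them must be 1). That explicit_broadcast follows exactly these rules is an assumption here, and the helper broadcast_shape is hypothetical.

```python
def broadcast_shape(x_shape, y_shape):
    """Common shape of two shapes under NumPy-style broadcasting rules."""
    xs, ys = list(x_shape)[::-1], list(y_shape)[::-1]  # align from the last axis
    result = []
    for i in range(max(len(xs), len(ys))):
        a = xs[i] if i < len(xs) else 1  # missing leading axes act like size 1
        b = ys[i] if i < len(ys) else 1
        if a != b and a != 1 and b != 1:
            raise ValueError(
                "shapes %r and %r cannot be broadcast" % (x_shape, y_shape))
        result.append(b if a == 1 else a)
    return result[::-1]


print(broadcast_shape([3, 1], [4]))
print(broadcast_shape([5, 1, 2], [1, 6, 1]))
```
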

maybe_explicit_broadcast(x, y, x_name, y_name)
is_same_dynamic_shape(x, y)