tfp.glm.fit

Runs multiple Fisher scoring steps.
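
To make "Fisher scoring" concrete, here is a minimal NumPy sketch of the technique for a Bernoulli response with a logit link. It is an illustration under stated assumptions, not the code that tfp.glm.fit runs: the function name fisher_scoring_logit, the stopping rule, and the synthetic data are all hypothetical, and no offset, dispersion, regularization, or batching is handled. Each step solves a weighted least-squares problem built from the model's mean, gradient_mean, and variance.

```python
import numpy as np


def fisher_scoring_logit(model_matrix, response, max_steps=25, tolerance=1e-8):
    """Hypothetical helper: Fisher scoring for a Bernoulli-logit GLM."""
    w = np.zeros(model_matrix.shape[1])  # start from zero coefficients
    for step in range(1, max_steps + 1):
        eta = model_matrix @ w                # predicted linear response
        mean = 1.0 / (1.0 + np.exp(-eta))     # inverse logit link
        grad_mean = mean * (1.0 - mean)       # d mean / d eta
        variance = mean * (1.0 - mean)        # Bernoulli variance
        # Weighted least-squares form of one Fisher scoring step.
        weights = grad_mean ** 2 / variance
        working_response = eta + (response - mean) / grad_mean
        weighted_x = model_matrix * weights[:, None]
        w_next = np.linalg.solve(model_matrix.T @ weighted_x,
                                 weighted_x.T @ working_response)
        # Assumed stopping rule: small relative change in the coefficients.
        converged = (np.linalg.norm(w_next - w)
                     <= tolerance * (1.0 + np.linalg.norm(w)))
        w = w_next
        if converged:
            return w, step
    return w, max_steps


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))
w_true = np.array([0.5, -1.0, 0.25])
y = (rng.uniform(size=2000) < 1.0 / (1.0 + np.exp(-X @ w_true))).astype(float)

w_hat, steps = fisher_scoring_logit(X, y)
print(w_hat, steps)
```

Because the logit link is canonical, the weights reduce to the variance and the step coincides with Newton's method; for other links (for example probit) the same weighted least-squares form applies with that link's mean and gradient_mean.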

model_matrix (Batch of) float-like, matrix-shaped Tensor where each row represents a sample's features.
response (Batch of) vector-shaped Tensor where each element represents a sample's observed response (to the corresponding row of features). Must have same dtype as model_matrix.
model tfp.glm.ExponentialFamily-like instance which implicitly characterizes a negative log-likelihood loss by specifying the distribution's mean, gradient_mean, and variance.
model_coefficients_start Optional (batch of) vector-shaped Tensor representing the initial model coefficients, one for each column in model_matrix. Must have same dtype as model_matrix. Default value: Zeros.
predicted_linear_response_start Optional Tensor with shape, dtype matching response; represents offset-shifted initial linear predictions based on model_coefficients_start. Default value: offset if model_coefficients_start is None, and tf.linalg.matvec(model_matrix, model_coefficients_start) + offset otherwise.
l2_regularizer Optional scalar Tensor representing L2 regularization penalty, i.e., loss(w) = sum{-log p(y[i]|x[i],w) : i=1..n} + l2_regularizer ||w||_2^2. Default value: None (i.e., no L2 regularization).
dispersion Optional (batch of) Tensor representing response dispersion, i.e., p(y|theta) := exp((y theta - A(theta)) / dispersion). Must broadcast with rows of model_matrix. Default value: None (i.e., "no dispersion").
offset Optional Tensor representing constant shift applied to predicted_linear_response. Must broadcast to response. Default value: None (i.e., tf.zeros_like(response)).
convergence_criteria_fn Python callable taking: is_converged_previous, iter_, model_coefficients_previous, predicted_linear_response_previous, model_coefficients_next, predicted_linear_response_next, response, model, dispersion and returning a bool Tensor indicating that Fisher scoring has converged. See convergence_criteria_small_relative_norm_weights_change as an example function. Default value: None (i.e., convergence_criteria_small_relative_norm_weights_change).
learning_rate Optional (batch of) scalar Tensor used to dampen iterative progress. Typically only needed if optimization diverges; should be no larger than 1 and is typically very close to 1. Default value: None (i.e., 1).
fast_unsafe_numerics Optional Python bool indicating if faster, less numerically accurate methods can be employed for computing the weighted least-squares solution. Default value: True (i.e., "fast but possibly diminished accuracy").
maximum_iterations Optional maximum number of iterations of Fisher scoring to run; "and-ed" with result of convergence_criteria_fn. Default value: None (i.e., infinity).
l2_regularization_penalty_factor Optional (batch of) vector-shaped Tensor, representing a separate penalty factor to apply to each model coefficient, length equal to columns in model_matrix. Each penalty factor multiplies l2_regularizer to allow differential regularization. Can be 0 for some coefficients, which implies no regularization for those coefficients. The loss is then loss(w) = sum{-log p(y[i]|x[i],w) : i=1..n} + l2_regularizer ||w * l2_regularization_penalty_factor||_2^2. Default value: None (i.e., a factor of 1 for all coefficients, so no per-coefficient regularization).
name Python str used as name prefix to ops created by this function. Default value: "fit".
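
The regularized loss described under l2_regularizer and l2_regularization_penalty_factor can be written out directly. The following NumPy sketch evaluates it for a Bernoulli response with a logit link, following the formula as stated above. The function name penalized_nll and the toy data are hypothetical, and this is not how tfp.glm.fit computes or applies the penalty internally.

```python
import numpy as np


def penalized_nll(w, model_matrix, response, l2_regularizer=0.0,
                  penalty_factor=None):
    """Hypothetical helper: Bernoulli-logit loss plus the weighted L2 penalty."""
    if penalty_factor is None:
        penalty_factor = np.ones_like(w)  # factor of 1 for every coefficient
    eta = model_matrix @ w
    # sum{-log p(y[i] | x[i], w)} for a Bernoulli response with logit link.
    nll = np.sum(np.logaddexp(0.0, eta) - response * eta)
    # l2_regularizer * ||w * penalty_factor||_2^2
    penalty = l2_regularizer * np.sum((w * penalty_factor) ** 2)
    return nll + penalty


# Toy data: the first column acts as an intercept.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0.0, 1.0, 1.0])
w = np.array([1.0, 2.0])

unpenalized = penalized_nll(w, X, y)
uniform = penalized_nll(w, X, y, l2_regularizer=0.5)
# A factor of 0 on the first coefficient leaves the intercept unpenalized.
no_intercept_penalty = penalized_nll(
    w, X, y, l2_regularizer=0.5, penalty_factor=np.array([0.0, 1.0]))
print(unpenalized, uniform, no_intercept_penalty)
```

Setting a factor to 0 is a common way to exclude an intercept column from shrinkage while still regularizing the remaining coefficients.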

model_coefficients (Batch of) vector-shaped Tensor; represents the fitted model coefficients, one for each column in model_matrix.
predicted_linear_response response-shaped Tensor representing linear predictions based on new model_coefficients, i.e., tf.linalg.matvec(model_matrix, model_coefficients) + offset.
is_converged bool Tensor indicating that the returned model_coefficients met the convergence_criteria_fn criteria within the maximum_iterations limit.
iter_ int32 Tensor indicating the number of iterations taken.
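
The is_converged value is decided by convergence_criteria_fn, so a custom criterion only needs to accept the nine arguments listed for that parameter. The sketch below shows the shape of such a callable using NumPy so that it runs on its own. The factory name make_relative_change_criterion and the 1 + ||w|| denominator are assumptions, since this page does not state the exact formula the default uses. A version passed to tfp.glm.fit would need to use TensorFlow ops and return a bool Tensor.

```python
import numpy as np


def make_relative_change_criterion(tolerance=1e-5):
    """Hypothetical factory returning a relative-change convergence test."""

    def convergence_criteria_fn(
            is_converged_previous, iter_,
            model_coefficients_previous, predicted_linear_response_previous,
            model_coefficients_next, predicted_linear_response_next,
            response, model, dispersion):
        # Only the two coefficient vectors are used; the remaining arguments
        # are accepted to match the documented signature.
        change = np.linalg.norm(
            model_coefficients_previous - model_coefficients_next)
        scale = 1.0 + np.linalg.norm(model_coefficients_previous)
        return bool(change / scale < tolerance)

    return convergence_criteria_fn


criterion = make_relative_change_criterion(tolerance=1e-5)
w_prev = np.array([1.0, 2.0])
small_step = criterion(False, 3, w_prev, None,
                       w_prev + np.array([0.0, 1e-7]), None, None, None, None)
large_step = criterion(False, 3, w_prev, None,
                       w_prev + np.array([0.0, 0.5]), None, None, None, None)
print(small_step, large_step)
```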

Example

import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions

def make_dataset(n, d, link, scale=1., dtype=np.float32):
  model_coefficients = tfd.Uniform(
      low=np.array(-1, dtype),
      high=np.array(1, dtype)).sample(d, seed=42)
  radius = np.sqrt(2.)
  model_coefficients *= radius / tf.linalg.norm(model_coefficients)
  model_matrix = tfd.Normal(
      loc=np.array(0, dtype),
      scale=np.array(1, dtype)).sample([n, d], seed=43)
  scale = tf.convert_to_tensor(scale, dtype)
  linear_response = tf.tensordot(
      model_matrix, model_coefficients, axes=[[1], [0]])
  if link == 'linear':
    response = tfd.Normal(loc=linear_response, scale=scale).sample(seed=44)
  elif link == 'probit':
    response = tf.cast(
        tfd.Normal(loc=linear_response, scale=scale).sample(seed=44) > 0,
        dtype)
  elif link == 'logit':
    response = tfd.Bernoulli(logits=linear_response).sample(seed=44)
  else:
    raise ValueError('unrecognized true link: {}'.format(link))
  return model_matrix, response, model_coefficients

X, Y, w_true = make_dataset(n=int(1e6), d=100, link='probit')

w, linear_response, is_converged, num_iter = tfp.glm.fit(
    model_matrix=X,
    response=Y,
    model=tfp.glm.BernoulliNormalCDF())
log_likelihood = tfp.glm.BernoulliNormalCDF().log_prob(Y, linear_response)

print('is_converged: ', is_converged.numpy())
print('    num_iter: ', num_iter.numpy())
print('    accuracy: ', np.mean((linear_response > 0.) == tf.cast(Y, bool)))
print('    deviance: ', 2. * np.mean(log_likelihood))
print('||w0-w1||_2 / (1+||w0||_2): ', (np.linalg.norm(w_true - w, ord=2) /
                                       (1. + np.linalg.norm(w_true, ord=2))))

# ==>
# is_converged:  True
#     num_iter:  6
#     accuracy:  0.804382
#     deviance:  -0.820746600628
# ||w0-w1||_2 / (1+||w0||_2):  0.00619245105309