View source on GitHub |
Compute quantiles of x along axis.

    tfp.stats.quantiles(
        x, num_quantiles, axis=None, interpolation=None, keepdims=False,
        validate_args=False, name=None
    )

The quantiles of a distribution are cut points dividing the range into intervals with equal probabilities.
Given a vector x of samples, this function estimates the cut points by returning num_quantiles + 1 cut points, (c0, ..., cn), with n = num_quantiles, such that, roughly speaking, an equal number of sample points lie in each of the n intervals [c0, c1), [c1, c2), ..., [c_{n-1}, cn]. That is,
- About 1 / n fraction of the data lies in [c_{k-1}, c_k), k = 1, ..., n
- About k / n fraction of the data lies below c_k.
- c0 is the sample minimum and cn is the maximum.
The exact number of data points in each interval depends on the size of x (e.g. whether the size is divisible by n) and the interpolation kwarg.
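To make the role of the interpolation kwarg concrete, here is a minimal pure-Python sketch of how cut points can be picked from sorted samples. It is an illustration only, not the library's implementation: the helper name `quantile_cut_points` is hypothetical, and the index rule (place the k-th cut point at fractional position k * (N - 1) / n in the sorted data) is an assumption chosen because it reproduces the quartile values shown in the Examples on this page.

```python
import math


def quantile_cut_points(samples, num_quantiles, interpolation="nearest"):
    """Return num_quantiles + 1 cut points for a 1-D list of samples.

    Hypothetical helper for illustration; not part of tfp.
    """
    xs = sorted(samples)
    last = len(xs) - 1
    cuts = []
    for k in range(num_quantiles + 1):
        # Fractional index of the k-th cut point within the sorted samples.
        pos = k * last / num_quantiles
        lo, hi = math.floor(pos), math.ceil(pos)
        if interpolation == "lower":
            cuts.append(xs[lo])
        elif interpolation == "higher":
            cuts.append(xs[hi])
        elif interpolation == "nearest":
            # Python's round() sends exact halves to the even index.
            cuts.append(xs[round(pos)])
        elif interpolation == "linear":
            frac = pos - lo
            cuts.append(xs[lo] * (1 - frac) + xs[hi] * frac)
        else:
            raise ValueError(f"Unknown interpolation: {interpolation!r}")
    return cuts


x = [0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.]
nearest = quantile_cut_points(x, 4, "nearest")  # [0.0, 2.0, 5.0, 8.0, 10.0]
linear = quantile_cut_points(x, 4, "linear")    # [0.0, 2.5, 5.0, 7.5, 10.0]
lower = quantile_cut_points(x, 4, "lower")      # [0.0, 2.0, 5.0, 7.0, 10.0]
```

With 11 samples and 4 intervals, the interior cut points fall at fractional indices 2.5 and 7.5, which is why the three interpolation choices disagree there and agree at the minimum, median, and maximum.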
Raises | |
|---|---|
ValueError | If argument 'interpolation' is not an allowed type. |
ValueError | If interpolation type not compatible with dtype. |
Examples

    import tensorflow_probability as tfp

    # Get quartiles of x with various interpolation choices.
    x = [0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.]

    tfp.stats.quantiles(x, num_quantiles=4, interpolation='nearest')
    # ==> [ 0., 2., 5., 8., 10.]

    tfp.stats.quantiles(x, num_quantiles=4, interpolation='linear')
    # ==> [ 0. , 2.5, 5. , 7.5, 10. ]

    tfp.stats.quantiles(x, num_quantiles=4, interpolation='lower')
    # ==> [ 0., 2., 5., 7., 10.]

    # Get deciles of columns of an R x C data set.
    data = load_my_columnar_data(...)
    tfp.stats.quantiles(data, num_quantiles=10)
    # ==> Shape [11, C] Tensor
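
For readers who want to see where a [11, C] result shape comes from without loading real data, the following standard-library sketch computes per-column deciles of a small R x C table. It is a stand-in, not a call into tfp: the helper `column_deciles` and the toy `rows` data are made up for illustration, and it relies on `statistics.quantiles` with `method="inclusive"`, which interpolates linearly, so its values need not match what tfp returns under other interpolation settings.

```python
import statistics


def column_deciles(rows):
    """Return an 11 x C nested list of decile cut points, one column per input column.

    Hypothetical helper for illustration; not part of tfp.
    """
    columns = list(zip(*rows))
    per_column = []
    for col in columns:
        # statistics.quantiles returns only the 9 interior cut points,
        # so the sample minimum and maximum are added as c0 and c10.
        inner = statistics.quantiles(col, n=10, method="inclusive")
        per_column.append([min(col)] + inner + [max(col)])
    # Transpose so that row k holds the k-th cut point of every column.
    return [list(row) for row in zip(*per_column)]


# Toy R x C data set with R = 11 rows and C = 2 columns.
rows = [[float(i), 10.0 * i] for i in range(11)]
deciles = column_deciles(rows)

print(len(deciles), len(deciles[0]))  # 11 2
print(deciles[5])                     # [5.0, 50.0]
```

Each column contributes num_quantiles + 1 = 11 cut points, which gives the leading dimension of 11 regardless of the number of rows R.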