Quantification Metrics

This module implements some performance metrics for distribution parameterization

class qp.metrics.metrics.Grid(grid_values, cardinality, resolution, hist_bin_edges, limits)

cardinality: Alias for field number 1

grid_values: Alias for field number 0

hist_bin_edges: Alias for field number 3

limits: Alias for field number 4

resolution: Alias for field number 2
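
The field numbers above are positional indexes into a named tuple. As a minimal sketch, the stand-in below rebuilds a tuple with the same field order using collections.namedtuple. The values are made up for illustration, and the layout chosen for hist_bin_edges (edges centered on the grid points) is an assumption, not something stated above.

```python
from collections import namedtuple

# Stand-in with the same field order as the documented Grid class.
Grid = namedtuple(
    "Grid",
    ["grid_values", "cardinality", "resolution", "hist_bin_edges", "limits"],
)

# Illustrative values only: a three-point grid on [0, 1].
grid = Grid(
    grid_values=[0.0, 0.5, 1.0],
    cardinality=3,
    resolution=0.5,
    hist_bin_edges=[-0.25, 0.25, 0.75, 1.25],  # assumed layout, for illustration
    limits=(0.0, 1.0),
)

# Each field can be read by name or by its field number.
same_field = grid.cardinality == grid[1]
low, high = grid.limits
```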

qp.metrics.metrics.calculate_moment(p, N, limits, dx=0.01)

Calculates a moment of a qp.Ensemble object

Parameters:

p: qp.Ensemble object: the collection of PDFs whose moment will be calculated
N: int: order of the moment to be calculated
limits: tuple of floats: endpoints of integration interval over which to calculate moments
dx: float: resolution of integration grid

Returns:

M: float: value of the moment
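
As a rough illustration of what a moment on an integration grid means, the sketch below approximates the integral of x**N * p(x) with a Riemann sum over (limits, dx). It uses a plain Python normal density instead of a qp.Ensemble, and the helper names normal_pdf and grid_moment are hypothetical; this is not the library's implementation.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution, used here as a stand-in PDF."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def grid_moment(pdf, order, limits, dx=0.01):
    """Approximate the integral of x**order * pdf(x) over limits with a Riemann sum."""
    low, high = limits
    n_points = int(round((high - low) / dx)) + 1
    grid = [low + i * dx for i in range(n_points)]
    return sum(pdf(x) * x ** order for x in grid) * dx


def example_pdf(x):
    """Normal density with mean 1.0 and standard deviation 0.5."""
    return normal_pdf(x, mu=1.0, sigma=0.5)


mean = grid_moment(example_pdf, 1, (-4.0, 6.0))
second = grid_moment(example_pdf, 2, (-4.0, 6.0))
variance = second - mean ** 2  # central second moment from the two raw moments
```

The limits should cover essentially all of the probability mass; otherwise the sum underestimates the moment.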

qp.metrics.metrics.calculate_distribution_moment_bias_score(p_true_list, p_estimated_list, limits, N=2, dx=0.01)

Per-bin score for bias in estimated PDFs vs truth, using the mean or variance.

For each tomographic bin, computes the mean (N=1) or variance (N=2) of every PDF in the true and estimated qp.Ensemble objects on the same 1D grid (limits, dx). For N=2, the variance is calculate_moment(..., 2, ...) - mean**2, where mean is calculate_moment(..., 1, ...). It then takes the difference (estimated minus true) across realizations, and combines two terms:

within-bin spread: variance of those moment differences across PDFs in the bin;
between-bin structure: the mean squared moment bias across bins, added identically to every bin’s score before the square root.

Parameters:

p_true_list: sequence of qp.Ensemble: True PDF ensembles, one per bin. Must match p_estimated_list in length.
p_estimated_list: sequence of qp.Ensemble: Estimated PDF ensembles, one per bin.
limits: tuple of float: Integration interval (low, high) passed to calculate_moment().
N: int, optional: 1 for mean, 2 for variance (central second moment). Only N <= 2 is supported.
dx: float, optional: Grid spacing for moment integration (default 0.01).

Returns:

scores: ndarray of shape (n_bins,): Non-negative score per bin (square root of within-bin variance plus the shared between-bin mean squared bias term).

Raises:

NotImplementedError: If N > 2.
ValueError: If p_true_list and p_estimated_list differ in length or either is empty.
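
The way the two terms combine can be sketched as follows, starting from moment differences that have already been computed. This is an interpretation of the description above, not the library's code: the helper name moment_bias_scores is hypothetical, the within-bin term is assumed to be a population variance, and the per-bin bias is assumed to be the mean difference in that bin.

```python
import math
from statistics import fmean, pvariance


def moment_bias_scores(diffs_per_bin):
    """diffs_per_bin[b] holds the (estimated - true) moment differences for bin b."""
    if not diffs_per_bin or any(len(d) == 0 for d in diffs_per_bin):
        raise ValueError("need at least one bin, each with at least one difference")
    # Within-bin spread: variance of the differences inside each bin (assumed population variance).
    within = [pvariance(d) for d in diffs_per_bin]
    # Between-bin structure: mean over bins of the squared per-bin bias.
    bias = [fmean(d) for d in diffs_per_bin]
    between = fmean([b * b for b in bias])
    # The same between-bin term is added to every bin before the square root.
    return [math.sqrt(w + between) for w in within]


# Two bins with two PDFs each; the numbers are illustrative only.
scores = moment_bias_scores([[0.1, 0.3], [-0.2, -0.2]])
```

In this example the second bin has no within-bin spread, so its score equals the square root of the shared between-bin term.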

qp.metrics.metrics.calculate_kld(p, q, limits, dx=0.01)

Calculates the Kullback-Leibler Divergence between two qp.Ensemble objects.

Parameters:

p: Ensemble object: probability distribution closer to the truth
q: Ensemble object: probability distribution that approximates p
limits: tuple of floats: endpoints of integration interval in which to calculate KLD
dx: float: resolution of integration grid

Returns:

Dpq: float: the value of the Kullback-Leibler Divergence from q to p

Notes

TO DO: have this take number of points not dx!
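
For orientation, the Kullback-Leibler Divergence on a grid is the sum of p(x) * log(p(x) / q(x)) times dx. The sketch below applies that textbook definition to two plain Python normal densities; the names normal_pdf and grid_kld are hypothetical, and skipping points where either density is zero is an assumption made here to avoid log(0).

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution, used here as a stand-in PDF."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def grid_kld(p_pdf, q_pdf, limits, dx=0.01):
    """Riemann-sum approximation of the KLD from q to p over limits."""
    low, high = limits
    n_points = int(round((high - low) / dx)) + 1
    total = 0.0
    for i in range(n_points):
        x = low + i * dx
        p_x, q_x = p_pdf(x), q_pdf(x)
        # Assumption: skip points where either density vanishes.
        if p_x > 0.0 and q_x > 0.0:
            total += p_x * math.log(p_x / q_x)
    return total * dx


# Two unit-variance normals whose means differ by 1; the analytic KLD is 0.5.
kld = grid_kld(
    lambda x: normal_pdf(x, 0.0, 1.0),
    lambda x: normal_pdf(x, 1.0, 1.0),
    (-10.0, 10.0),
)
```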

qp.metrics.metrics.calculate_rmse(p, q, limits, dx=0.01)

Calculates the Root Mean Square Error between two qp.Ensemble objects.

Parameters:

p: qp.Ensemble object: probability distribution function treated as the truth, against which the approximation q is compared.
q: qp.Ensemble object: probability distribution function that approximates p and whose distance from p will be calculated.
limits: tuple of floats: endpoints of integration interval in which to calculate RMS
dx: float: resolution of integration grid

Returns:

rms: float: the value of the RMS error between q and p

Notes

TO DO: change dx to N

qp.metrics.metrics.calculate_rbpe(p, limits=(inf, inf))

Calculates the risk based point estimates of a qp.Ensemble object. The algorithm is defined in Section 4.2 of ‘Photometric redshifts for Hyper Suprime-Cam Subaru Strategic Program Data Release 1’ (Tanaka et al. 2018).

Parameters:

p: qp.Ensemble object: Ensemble of PDFs to be evaluated
limits: tuple: The limits at which to evaluate possible z_best estimates. If custom limits are not provided then all potential z values will be considered using the scipy.optimize.minimize_scalar function.

Returns:

rbpes: array of floats: The risk based point estimates of the provided ensemble.
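
A risk based point estimate minimizes an expected loss over candidate redshifts. The sketch below assumes the loss 1 - 1 / (1 + (dz / gamma)**2) with dz = (z_point - z) / (1 + z) and gamma = 0.15, which is how the estimator is usually quoted from Tanaka et al. (2018); treat both the loss and gamma as assumptions. It also replaces scipy.optimize.minimize_scalar with a plain grid search so that it runs with the standard library only, and all function names are hypothetical.

```python
import math

GAMMA = 0.15  # assumed loss-function width


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution, used here as a stand-in PDF."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def loss(delta_z, gamma=GAMMA):
    """Assumed loss: near 0 for small errors, approaching 1 for large ones."""
    return 1.0 - 1.0 / (1.0 + (delta_z / gamma) ** 2)


def risk(z_point, pdf, grid, dx):
    """Expected loss of reporting z_point when the redshift follows pdf."""
    return sum(pdf(z) * loss((z_point - z) / (1.0 + z)) for z in grid) * dx


def rbpe_grid_search(pdf, limits, dx=0.01):
    """Return the grid point in limits that minimizes the risk."""
    low, high = limits
    n_points = int(round((high - low) / dx)) + 1
    grid = [low + i * dx for i in range(n_points)]
    return min(grid, key=lambda z_point: risk(z_point, pdf, grid, dx))


# A narrow PDF centered on z = 0.5 should give an estimate close to 0.5.
z_best = rbpe_grid_search(lambda z: normal_pdf(z, 0.5, 0.05), (0.0, 1.0))
```

A grid search costs one risk integral per candidate, so a scalar minimizer is the better choice when the candidate range is wide.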

qp.metrics.metrics.calculate_brier(p, truth, limits, dx=0.01)

This function will do the following:

Generate an Mx1 grid based on limits and dx.
Produce an NxM array by evaluating the PDF of each of the N distribution objects in the Ensemble p on the grid.
Produce an NxM truth_array from the input truth and the generated grid. All values will be 0 or 1.
Create a Brier metric evaluation object.
Return the result of the Brier metric calculation.

Parameters:

p: qp.Ensemble object: Ensemble of N distributions whose probability distribution functions will be gridded and compared against truth.
truth: Nx1 sequence: the list of true values, 1 per distribution in p.
limits: 2-tuple of floats: endpoints of the grid on which to evaluate the PDFs of the distributions in p
dx: float: resolution of the grid. Defaults to 0.01.

Returns:

Brier_metric: float
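
The gridding steps listed above can be sketched with plain Python lists. This is an illustration under assumptions, not the library's code: each row of gridded PDF values is normalized to sum to 1, the true value is assigned to a bin by integer division (the library may bin differently), and the names brier_on_grid and normal_pdf are hypothetical.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution, used here as a stand-in PDF."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def brier_on_grid(pdfs, truths, limits, dx=0.01):
    """Grid each PDF, one-hot encode each truth, and average the squared differences."""
    low, high = limits
    n_bins = int(round((high - low) / dx))
    centers = [low + (j + 0.5) * dx for j in range(n_bins)]
    total = 0.0
    for pdf, true_value in zip(pdfs, truths):
        weights = [pdf(x) for x in centers]
        norm = sum(weights)
        prediction = [w / norm for w in weights]  # row sums to 1
        # Assumed binning rule: index of the bin that contains the true value.
        hot = min(int((true_value - low) / dx), n_bins - 1)
        total += sum(
            (p - (1.0 if j == hot else 0.0)) ** 2 for j, p in enumerate(prediction)
        )
    return total / len(truths)


def narrow_pdf(x):
    """Narrow normal density centered on 0.5."""
    return normal_pdf(x, 0.5, 0.05)


# The same PDF scores better when the truth lies under its peak.
score_near = brier_on_grid([narrow_pdf], [0.52], (0.0, 1.0), dx=0.1)
score_far = brier_on_grid([narrow_pdf], [0.92], (0.0, 1.0), dx=0.1)
```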

qp.metrics.metrics.calculate_brier_for_accumulation(p, truth, limits, dx=0.01)

qp.metrics.metrics.calculate_anderson_darling(p, scipy_distribution='norm', num_samples=100, _random_state=None)

This function is deprecated and will be completely removed in a later version. Please use calculate_goodness_of_fit instead.

Returns:

logger.warning

qp.metrics.metrics.calculate_cramer_von_mises(p, q, num_samples=100, _random_state=None, **kwargs)

This function is deprecated and will be completely removed in a later version. Please use calculate_goodness_of_fit instead.

Returns:

logger.warning

qp.metrics.metrics.calculate_kolmogorov_smirnov(p, q, num_samples=100, _random_state=None)

This function is deprecated and will be completely removed in a later version. Please use calculate_goodness_of_fit instead.

Returns:

logger.warning

qp.metrics.metrics.calculate_outlier_rate(p, lower_limit=0.0001, upper_limit=0.9999)

Fraction of outliers in each distribution

Parameters:

p: qp.Ensemble: A collection of N distributions. This implementation expects that Ensembles are not nested.
lower_limit: float, optional: Lower bound CDF for outliers, by default 0.0001
upper_limit: float, optional: Upper bound CDF for outliers, by default 0.9999

Returns:

[float]: 1xN array where each element is the fraction of outliers for a distribution in the Ensemble.

qp.metrics.metrics.calculate_goodness_of_fit(estimate, reference, fit_metric='ks', num_samples=100, _random_state=None)

This method calculates goodness of fit between the distributions in the estimate and reference Ensembles using the specified fit_metric.

Parameters:

estimate: Ensemble containing N distributions: Random variate samples will be drawn from this Ensemble
reference: Ensemble containing N or 1 distributions: The CDFs of the distributions in this Ensemble are used in the goodness of fit calculation.
fit_metric: str, optional: The goodness of fit metric to use. One of [‘ad’, ‘cvm’, ‘ks’], where ‘ad’ = Anderson-Darling, ‘cvm’ = Cramer-von Mises, and ‘ks’ = Kolmogorov-Smirnov. By default ‘ks’
num_samples: int, optional: Number of random variates to draw from each distribution in estimate, by default 100
_random_state: _type_, optional: Used for testing to create reproducible sets of random variates, by default None

Returns:

output: [float]: An array of floats where each element is the result of the statistic calculation.

Raises:

KeyError: If the requested fit_metric is not contained in the goodness_of_fit_metrics dictionary.

Notes

The calculation of the goodness of fit metrics is not symmetric. i.e. calculate_goodness_of_fit(p, q, ...) != calculate_goodness_of_fit(q, p, ...)

In the future, we should be able to do this directly from the PDFs without needing to take random variates from the estimate Ensemble.

The vectorized implementations of fit metrics are copied over (unmodified) from the developer branch of Scipy 1.10.0dev. When Scipy 1.10 is released, we can replace the copied implementation with the ones in Scipy.
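
The asymmetry noted above follows from the procedure: samples come only from the estimate, and only the CDF of the reference is used. The sketch below shows this for a one-sample Kolmogorov-Smirnov statistic written in plain Python. It is not the vectorized implementation the library uses, and the names ks_statistic and normal_cdf are hypothetical.

```python
import math
import random


def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a normal distribution, used here as a stand-in reference."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic(samples, reference_cdf):
    """Largest gap between the empirical CDF of samples and reference_cdf."""
    ordered = sorted(samples)
    n = len(ordered)
    d_max = 0.0
    for i, x in enumerate(ordered, start=1):
        f = reference_cdf(x)
        # Check the gap just after and just before each step of the empirical CDF.
        d_max = max(d_max, i / n - f, f - (i - 1) / n)
    return d_max


# Draw random variates from the "estimate" (a standard normal here).
rng = random.Random(42)
samples = [rng.gauss(0.0, 1.0) for _ in range(100)]

d_match = ks_statistic(samples, normal_cdf)  # reference equals the estimate
d_shifted = ks_statistic(samples, lambda x: normal_cdf(x, mu=3.0))  # shifted reference
```

A small statistic indicates that the samples are consistent with the reference CDF; the shifted reference gives a much larger value.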

This module implements metric calculations that are independent of qp.Ensembles

qp.metrics.array_metrics.quick_anderson_ksamp(p_random_variables, q_random_variables, **kwargs)

Calculate the k-sample Anderson-Darling statistic using scipy.stats.anderson_ksamp for two CDFs. For more details see: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.anderson_ksamp.html

Parameters:

p_random_variables: np.array: An array of random variables from the given distribution
q_random_variables: np.array: An array of random variables from the given distribution

Returns:

[Result objects]: An array of objects with attributes statistic, critical_values, and significance_level.

qp.metrics.array_metrics.quick_kld(p_eval, q_eval, dx=0.01)

Calculates the Kullback-Leibler Divergence between two evaluations of PDFs.

Parameters:

p_eval: numpy.ndarray, float: evaluations of probability distribution closer to the truth
q_eval: numpy.ndarray, float: evaluations of probability distribution that approximates p
dx: float: resolution of integration grid

Returns:

Dpq: float: the value of the Kullback-Leibler Divergence from q to p

qp.metrics.array_metrics.quick_moment(p_eval, grid_to_N, dx)

Calculates a moment of an evaluated PDF

Parameters:

p_eval: numpy.ndarray, float: the values of a probability distribution
grid_to_N: numpy.ndarray, float: the grid upon which p_eval was evaluated, with each point raised to the power N, where N is the order of the moment to be calculated
dx: float: the difference between regular grid points

Returns:

M: float: value of the moment

qp.metrics.array_metrics.quick_rmse(p_eval, q_eval, N)

Calculates the Root Mean Square Error between two evaluations of PDFs.

Parameters:

p_eval: numpy.ndarray, float: evaluation of the probability distribution function treated as the truth, against which the approximation q is compared.
q_eval: numpy.ndarray, float: evaluation of the probability distribution function that approximates p and whose distance from p will be calculated.
N: int: number of points at which PDFs were evaluated

Returns:

rms: float: the value of the RMS error between q and p
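
A minimal sketch of the calculation, assuming the usual definition of the root mean square error (square root of the summed squared differences divided by the number of points N). The function name rmse is hypothetical, and here N is taken from the input length instead of being passed in.

```python
import math


def rmse(p_eval, q_eval):
    """Root mean square difference between two equally sized evaluations."""
    if len(p_eval) != len(q_eval):
        raise ValueError("p_eval and q_eval must have the same length")
    n_points = len(p_eval)
    squared_error = sum((p - q) ** 2 for p, q in zip(p_eval, q_eval))
    return math.sqrt(squared_error / n_points)


# Only the last point differs (by 2), so the result is sqrt(4 / 3).
value = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```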

qp.metrics.array_metrics.quick_rbpe(pdf_function, integration_bounds, limits=(inf, inf))

Calculates the risk based point estimate of a qp.Ensemble object with npdf == 1.

Parameters:

pdf_function: python function: The function should calculate the value of a pdf at a given x value
integration_bounds: 2-tuple of floats: The integration bounds - typically (ppf(0.01), ppf(0.99)) for the given distribution
limits: tuple of floats: The limits at which to evaluate possible z_best estimates. If custom limits are not provided then all potential z values will be considered using the scipy.optimize.minimize_scalar function.

Returns:

rbpe: float: The risk based point estimate of the provided ensemble.

class qp.metrics.brier.Brier(prediction, truth)

Brier score based on https://en.wikipedia.org/wiki/Brier_score#Original_definition_by_Brier

Parameters:

prediction: NxM array, float: Predicted probability for N distributions to have a true value in one of M bins. The values along each of the N rows should sum to 1.
truth: NxM array, int: True values for N distributions, where the bin containing the true value has value 1 and all other bins have value 0.

evaluate()

Evaluate the Brier score.

Returns:

float: The result of calculating the Brier metric, a value in the interval [0,2]
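
The interval [0,2] follows from the original multi-category definition of the Brier score: the squared differences are summed over the M bins and averaged over the N rows. The sketch below applies that definition to hand-made arrays. The function name brier_score is hypothetical and this is not the class's own code.

```python
def brier_score(prediction, truth):
    """Mean over rows of the summed squared difference between prediction and truth."""
    n_rows = len(prediction)
    total = sum(
        (p - t) ** 2
        for p_row, t_row in zip(prediction, truth)
        for p, t in zip(p_row, t_row)
    )
    return total / n_rows


# One-hot truth for two distributions and three bins.
truth = [[0, 1, 0], [1, 0, 0]]

# All probability in the correct bin gives the lower bound.
best = brier_score([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]], truth)
# All probability in a wrong bin gives the upper bound.
worst = brier_score([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]], truth)
# A flat prediction lands in between.
flat = brier_score([[1 / 3, 1 / 3, 1 / 3], [1 / 3, 1 / 3, 1 / 3]], truth)
```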

class qp.metrics.pit.PIT(qp_ens, true_vals, eval_grid=DEFAULT_QUANTS)

Probability Integral Transform

Parameters:

qp_ens: Ensemble: A collection of N distribution objects
true_vals: [float]: An array-like sequence of N float values representing the known true value for each distribution
eval_grid: [float], optional: A strictly increasing array-like sequence in the range [0,1], by default DEFAULT_QUANTS

Returns:

PIT object: An object with an Ensemble containing the PIT distribution, and a full set of PIT samples.

property pit_samps

Returns the PIT samples, i.e. CDF(true_vals) for each distribution in the Ensemble used to initialize the PIT object.

Returns:

np.array: An array of floats
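
To make the definition concrete, the sketch below evaluates each distribution's CDF at its own true value, using normal distributions described by (mean, standard deviation) pairs as stand-ins for the distributions in an Ensemble. The name normal_cdf and the numbers are illustrative only.

```python
import math


def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution, used here as a stand-in distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


# (mean, standard deviation) for each of the N stand-in distributions.
params = [(0.0, 1.0), (1.0, 2.0), (-0.5, 0.5)]
# One known true value per distribution.
true_vals = [0.0, 3.0, -1.0]

# PIT sample i is the CDF of distribution i evaluated at true value i.
pit_samps = [
    normal_cdf(true_val, mu, sigma)
    for true_val, (mu, sigma) in zip(true_vals, params)
]
```

A true value at the median gives a PIT of 0.5; values one standard deviation above or below the mean give roughly 0.84 and 0.16. For well calibrated distributions the PIT samples are uniformly distributed on [0,1].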

property pit

Return the PIT Ensemble object

Returns:

qp.Ensemble: An Ensemble containing 1 qp.quant distribution.

calculate_pit_meta_metrics()

Convenience method that will calculate all of the PIT meta metrics and return them as a dictionary.

Returns:

dictionary: The collection of PIT statistics

evaluate_PIT_anderson_ksamp(pit_min=0.0, pit_max=1.0)

Use scipy.stats.anderson_ksamp to compute the Anderson-Darling statistic for the cdf(truth) values by comparing with a uniform distribution between 0 and 1. Up to the current version (1.9.3), scipy.stats.anderson does not support uniform distributions as reference for 1-sample test, therefore we create a uniform “distribution” and pass it as the second value in the list of parameters to the scipy implementation of k-sample Anderson-Darling. For details see: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.anderson_ksamp.html

Parameters:

pit_min: float, optional: Minimum PIT value to accept, by default 0.
pit_max: float, optional: Maximum PIT value to accept, by default 1.

Returns:

array: An array of objects with attributes statistic, critical_values, and significance_level. For details see: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.anderson_ksamp.html

evaluate_PIT_CvM()

Calculate the Cramer-von Mises statistic with scipy.stats.cramervonmises, comparing self._pit_samps to a uniform distribution. For more details see: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.cramervonmises.html

Returns:

array: An array of objects with attributes statistic and pvalue. For details see: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.cramervonmises.html

evaluate_PIT_KS()

Calculate the Kolmogorov-Smirnov statistic using scipy.stats.kstest. For more details see: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.kstest.html

Returns:

array: An array of objects with attributes statistic and pvalue. For details see: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.kstest.html

evaluate_PIT_outlier_rate(pit_min=0.0001, pit_max=0.9999)

Compute fraction of PIT outliers by evaluating the CDF of the distribution in the PIT Ensemble at pit_min and pit_max.

Parameters:

pit_min: float, optional: Lower bound for outliers, by default 0.0001
pit_max: float, optional: Upper bound for outliers, by default 0.9999

Returns:

float: The fraction of outliers in this distribution given the min and max bounds.
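
Evaluating a CDF at the two bounds gives the outlier rate as CDF(pit_min) + 1 - CDF(pit_max). The sketch below uses an empirical CDF of raw PIT samples as a stand-in for the CDF of the distribution held in the PIT Ensemble, so it illustrates the idea only. The function names and the sample values are hypothetical.

```python
def empirical_cdf(samples, x):
    """Fraction of samples that are less than or equal to x."""
    return sum(1 for s in samples if s <= x) / len(samples)


def pit_outlier_rate(pit_samps, pit_min=0.0001, pit_max=0.9999):
    """Probability mass below pit_min plus probability mass above pit_max."""
    below = empirical_cdf(pit_samps, pit_min)
    above = 1.0 - empirical_cdf(pit_samps, pit_max)
    return below + above


# Ten illustrative PIT samples: two fall below pit_min and two above pit_max.
pit_samps = [0.0, 0.00005, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.99995, 1.0]
rate = pit_outlier_rate(pit_samps)
```

With four of the ten samples outside the bounds, the rate is about 0.4.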