Core functionality
Ensemble
- class qp.Ensemble(the_class: Pdf_gen, data: Mapping, ancil: Mapping | None = None, method: str | None = None)[source]
An object comprised of one or more distributions with the same parameterization.
The Ensemble allows you to perform operations on the group of parameterizations as a whole. An Ensemble has three main data components, the last of which is optional:
The metadata: this contains information about the parameterization, and the coordinates of the parameterization.
The object data: this contains the data that is unique to each distribution, for example the values that correspond to the coordinates.
The ancillary data (optional): this contains data points where there is one data point for each distribution in the ensemble. There can be many of these columns or arrays in the ancillary data table.
- Parameters:
- the_class
Pdf_gen subclass The class to use to parameterize the distributions
- data
Mapping Dictionary with data used to construct the ensemble. The keys required vary for different parameterizations.
- ancil
Optional[Mapping] Dictionary with ancillary data, by default None
- method
Optional[str] The key for the creation method to use, by default None
Examples
>>> import qp
>>> import numpy as np
>>> data = {'bins': [0,1,2,3,4,5],
...     'pdfs': np.array([[0,0.1,0.1,0.4,0.2],[0.05,0.09,0.2,0.3,0.15]])}
>>> ancil = {'ids': [105, 108]}
>>> ens = qp.Ensemble(qp.hist,data,ancil)
>>> ens.metadata
{'pdf_name': array([b'hist'], dtype='|S4'),
 'pdf_version': array([0]),
 'bins': array([[0, 1, 2, 3, 4, 5]])}
- property gen_func
Return the function used to create the distribution object for this ensemble
- property gen_class
Return the class used to generate distributions for this ensemble
- property dist
Return the scipy.stats.rv_continuous object that generates distributions for this ensemble
- property kwds
Return the kwds associated to the frozen object for this ensemble
- property gen_obj
Return the scipy.stats.rv_continuous object that generates distributions for this ensemble
- property frozen
Return the scipy.stats.rv_frozen object that encapsulates the distributions for this ensemble
- x_samples(min: float = 0.0, max: float = 5.0, n: int | None = 1000) ndarray[float][source]
Return an array of x values that can be used to plot all the distributions in the Ensemble.
This is meant to plot the characteristic distribution for an Ensemble of discrete data. For example, for an ensemble of histograms that would be the PDF, and for an ensemble of quantiles that would be the CDF.
Analytic parameterizations like mixmod or scipy.stats.norm will just return np.linspace(min, max, n). For these it is recommended that you supply the values yourself, because the defaults are the same for all analytic distributions.
- Parameters:
- min
float, optional The minimum x value to be used if the parameterization doesn’t have an x_samples method or is analytic, by default 0.
- max
float, optional The maximum x value to be used if the parameterization doesn’t have an x_samples method or is analytic, by default 5.
- n
Optional[int], optional The number of points to be used if the parameterization doesn’t have an x_samples method or is analytic, by default 1000
- Returns:
- xs
np.ndarray[float] The array of points to use.
- convert_to(to_class: Pdf_gen, **kwargs: str) Ensemble[source]
Convert this ensemble to the given parameterization class. To see the available conversion methods for your chosen parameterization and their required arguments, check the docstrings for qp.to_class. If the parameterization class doesn’t have a conversion methods table, then it will not be possible to convert to that class.
- Parameters:
- to_class
Pdf_gen subclass Parameterization class to convert to
- **kwargs
Keyword arguments that are passed to the output class constructor
- Returns:
- ens
Ensemble Ensemble of distributions of type to_class using the data from this object
- Other Parameters:
- method
str Optional argument to specify a non-default conversion algorithm
Examples
>>> import qp
>>> import numpy as np
>>> ens_h = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([[0,0.1,0.1,0.4,0.2],[0.05,0.09,0.2,0.3,0.15]]))
>>> ens_i = ens_h.convert_to(qp.interp, xvals=np.linspace(0,5,10))
>>> ens_i.metadata
{'pdf_name': array([b'interp'], dtype='|S6'),
 'pdf_version': array([0]),
 'xvals': array([0. , 0.55555556, 1.11111111, 1.66666667, 2.22222222,
        2.77777778, 3.33333333, 3.88888889, 4.44444444, 5. ])}
- update(data: Mapping, ancil: Mapping | None = None) None[source]
Update the frozen distribution object with the given data, and set the ancillary data table with ancil if given.
- Parameters:
Examples
>>> import qp
>>> import numpy as np
>>> ens_h = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([0,0.1,0.1,0.4,0.2]))
>>> ens_h.update(data={'bins': np.array([1,2,3,4,5]), 'pdfs': np.array([0.1,0.1,0.4,0.2])})
>>> ens_h.metadata
{'pdf_name': array([b'hist'], dtype='|S4'),
 'pdf_version': array([0]),
 'bins': array([[1, 2, 3, 4, 5]])}
- update_objdata(data: Mapping, ancil: Mapping | None = None) None[source]
Updates the objdata in the frozen distribution, and sets the ancillary data table if given.
- Parameters:
Examples
>>> import qp
>>> import numpy as np
>>> ens_h = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([0,0.1,0.1,0.4,0.2]))
>>> ens_h.objdata
{'pdfs': array([0. , 0.125, 0.125, 0.5 , 0.25 ])}
>>> ens_h.update_objdata(data={'pdfs': np.array([0.05,0.09,0.2,0.3,0.15])})
>>> ens_h.objdata
{'pdfs': array([[0.06329114, 0.11392405, 0.25316456, 0.37974684, 0.18987342]])}
- property metadata: Mapping
Return the metadata for this ensemble. Metadata are elements that are the same for all the distributions in the ensemble. These include the name and version of the distribution generation class.
- Returns:
- metadata
Mapping The dictionary of the metadata.
- property objdata: Mapping
Return the data for this ensemble. These are the elements that differ for each distribution in the ensemble. For example, the data points that correspond to each of the coordinates given in the metadata.
- Returns:
- objdata
Mapping The object data
Notes
If the distribution normalized the data (which many do by default), this will return the normalized data and not the original input data.
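The normalization mentioned in this note can be illustrated with a minimal standard-library sketch. It assumes a histogram parameterization and rescales the bin heights so that the area under them is 1. The helper name `normalize_hist` is hypothetical and not part of qp. Applied to the input `[0, 0.1, 0.1, 0.4, 0.2]` used throughout these examples, it gives the values shown for `objdata` in the `update_objdata` example.

```python
def normalize_hist(bins, pdfs):
    """Scale bin heights so that the histogram integrates to 1.

    Hypothetical helper for illustration only; not part of qp.
    """
    widths = [hi - lo for lo, hi in zip(bins[:-1], bins[1:])]
    area = sum(height * width for height, width in zip(pdfs, widths))
    return [height / area for height in pdfs]


raw = [0, 0.1, 0.1, 0.4, 0.2]
normalized = normalize_hist([0, 1, 2, 3, 4, 5], raw)
print(normalized)  # approximately [0.0, 0.125, 0.125, 0.5, 0.25]
```

The returned heights differ from the input even though the shape is unchanged, which is why `objdata` may not match what was passed in.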
- set_ancil(ancil: Mapping) None[source]
Set the ancillary data dictionary. The arrays in this dictionary must have one row for each of the distributions, which means that the length of these arrays (or the first dimension) must be the same as the number of distributions in the ensemble.
- Parameters:
- ancil
Mapping The ancillary data dictionary.
- Raises:
IndexError If the length of the arrays in ancil does not match the number of distributions in the Ensemble.
Examples
>>> import qp
>>> import numpy as np
>>> ens_h = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([[0,0.1,0.1,0.4,0.2],[0.05,0.09,0.2,0.3,0.15]]))
>>> ancil = {'ids': np.array([5,7])}
>>> ens_h.set_ancil(ancil)
>>> ens_h.ancil
{'ids': array([5, 7])}
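The length requirement described for `set_ancil` can be sketched as a simple check. This is a standard-library illustration of the rule only: the helper `check_ancil_lengths` and its error message are assumptions, not qp's actual implementation.

```python
def check_ancil_lengths(ancil, npdf):
    """Raise IndexError if any ancillary column does not have one row per distribution.

    Hypothetical helper for illustration only; not part of qp.
    """
    for key, column in ancil.items():
        if len(column) != npdf:
            raise IndexError(
                f"ancil column '{key}' has length {len(column)}, expected {npdf}"
            )


# Two distributions and two ids: the check passes silently.
check_ancil_lengths({"ids": [5, 7]}, npdf=2)

# Three ids for two distributions: the check fails.
try:
    check_ancil_lengths({"ids": [5, 7, 9]}, npdf=2)
except IndexError as err:
    message = str(err)
    print(message)
```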
- add_to_ancil(to_add: Mapping) None[source]
Add additional columns to the ancillary data dictionary. The ancil dictionary must already exist. If it does not, use set_ancil.
If any of these columns have the same name as already existing ancillary data columns, the new columns will overwrite the old ones.
- Parameters:
- to_add
Mapping The columns to add to the ancillary data dict
- Raises:
IndexError If the length of the arrays in to_add does not match the number of distributions in the Ensemble
Examples
>>> import qp
>>> import numpy as np
>>> ancil = {'ids': np.array([5,7])}
>>> ens_h = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([[0,0.1,0.1,0.4,0.2],[0.05,0.09,0.2,0.3,0.15]]), ancil=ancil)
>>> ens_h.add_to_ancil({'means':np.array([0.2,0.25])})
>>> ens_h.ancil
{'ids': array([5, 7]), 'means': array([0.2 , 0.25])}
- append(other_ens: Ensemble) None[source]
Append another ensemble to this ensemble. The ensembles must be of the same parameterization, or this will not work. They must also have the same metadata, so for example if they are both histograms they must also have the same bins.
For the ancillary data to be appended as well, both ensembles must have an ancillary data dictionary. If one ensemble has an ancillary data dictionary and the other does not, this will set the ancillary data dictionary to None.
- Parameters:
- other_ens
Ensemble The ensemble to append to this one.
- Raises:
KeyError Raised if the two ensembles do not have matching metadata.
Examples
>>> import qp
>>> import numpy as np
>>> ens_1 = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([0,0.1,0.1,0.4,0.2]))
>>> ens_2 = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([0.5,0.15,0.25,0.45,0.1]))
>>> ens_1.append(ens_2)
>>> ens_1.npdf
2
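The rules described for `append` (matching metadata, and ancillary data kept only when both sides have it) can be sketched on plain dictionaries. This is a hedged illustration only: `append_tables` and the dictionary layout with `meta`, `data` and `ancil` keys are stand-ins chosen for the sketch, and the sketch further assumes that both ancillary dictionaries share the same columns.

```python
def append_tables(first, second):
    """Combine two table dictionaries following the rules described for append.

    Hypothetical helper for illustration only; not part of qp.
    """
    if first["meta"] != second["meta"]:
        raise KeyError("ensembles do not have matching metadata")
    merged = {
        "meta": first["meta"],
        # One row per distribution, so rows are simply concatenated.
        "data": {key: first["data"][key] + second["data"][key] for key in first["data"]},
    }
    a, b = first.get("ancil"), second.get("ancil")
    # If only one side has ancillary data, the result has none.
    merged["ancil"] = None if a is None or b is None else {k: a[k] + b[k] for k in a}
    return merged


meta = {"pdf_name": "hist", "bins": [0, 1, 2, 3, 4, 5]}
ens_1 = {"meta": meta, "data": {"pdfs": [[0, 0.125, 0.125, 0.5, 0.25]]}, "ancil": {"ids": [5]}}
ens_2 = {"meta": dict(meta), "data": {"pdfs": [[0.05, 0.09, 0.2, 0.3, 0.15]]}}

merged = append_tables(ens_1, ens_2)
print(len(merged["data"]["pdfs"]), merged["ancil"])
```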
- build_tables(encode: bool = False, ext: str | None = None) Mapping[source]
Returns a dictionary of dictionaries of numpy arrays for the metadata, object data, and the ancillary data (if it exists) for this ensemble.
- Parameters:
- Returns:
- data
Mapping, tables_io.TableDict-like The dictionary with the data. Has the keys: meta for metadata, data for object data, and optionally ancil for ancillary data.
- norm()[source]
Normalizes the input distribution data if it represents a PDF and can be normalized.
- Raises:
AttributeError Raised if the parameterization doesn’t have a normalization method.
- mode(grid: ArrayLike) ArrayLike[source]
Return the mode of each ensemble distribution, evaluated on the given grid.
- gridded(grid: ArrayLike) tuple[ArrayLike, ArrayLike][source]
Build, cache and return the PDF values at the given grid points. If the given grid matches the already cached grid, then this just returns the cached value.
- write_to(filename: str) None[source]
Write this ensemble to a file.
The file type can be any of those supported by tables_io. The file type is indicated by the suffix of the given file name. Allowed formats are: ‘hdf5’, ‘h5’, ‘hf5’, ‘hd5’, ‘fits’, ‘fit’, ‘pq’, ‘parq’, ‘parquet’.
If writing to parquet files, a file will be written for the metadata, the object data, and the ancillary data if it exists, where the identifying key is added to the filename.
- Parameters:
- filename
str
Examples
>>> import qp
>>> import numpy as np
>>> ens_1 = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([0,0.1,0.1,0.4,0.2]))
>>> ens_1.write_to("hist-ensemble.hdf5")
- pdf(x: ArrayLike) ArrayLike[source]
Evaluates the probability density function (PDF) for each of the distributions in the ensemble
- Parameters:
- x
ArrayLike Location(s) at which to evaluate the PDF for each distribution.
- Returns:
- pdf
ArrayLike The PDF value(s) at the given location(s).
Examples
>>> import qp
>>> import numpy as np
>>> ens_h = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([[0,0.1,0.1,0.4,0.2],[0.05,0.09,0.2,0.3,0.15]]))
>>> ens_h.pdf(np.linspace(3,6,6))
array([[0.5 , 0.5 , 0.25 , 0.25 , 0. , 0. ],
       [0.37974684, 0.37974684, 0.18987342, 0.18987342, 0. , 0. ]])
- logpdf(x: ArrayLike) ArrayLike[source]
Evaluates the log of the probability density function (PDF) for each of the distributions in the ensemble.
- Parameters:
- x
ArrayLike Location(s) at which to do the evaluations
- Returns:
- logpdf
ArrayLike The log of the PDF at the given location(s)
Examples
>>> import qp
>>> import numpy as np
>>> ens_h = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([[0,0.1,0.1,0.4,0.2],[0.05,0.09,0.2,0.3,0.15]]))
>>> ens_h.logpdf(np.linspace(3,6,6))
array([[-0.69314718, -0.69314718, -1.38629436, -1.38629436, -inf, -inf],
       [-0.96825047, -0.96825047, -1.66139765, -1.66139765, -inf, -inf]])
- cdf(x: ArrayLike) ArrayLike[source]
Evaluates the cumulative distribution function (CDF) for each of the distributions in the ensemble.
- Parameters:
- x
ArrayLike Location(s) at which to do the evaluations
- Returns:
- cdf
ArrayLike The CDF at the given location(s)
Examples
>>> import qp
>>> import numpy as np
>>> ens_h = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([[0,0.1,0.1,0.4,0.2],[0.05,0.09,0.2,0.3,0.15]]))
>>> ens_h.cdf(np.linspace(3,6,6))
array([[0.25 , 0.55 , 0.8 , 0.95 , 1. , 1. ],
       [0.43037975, 0.65822785, 0.84810127, 0.96202532, 1. , 1. ]])
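To see where the first row of the `pdf` and `cdf` outputs comes from, the following standard-library sketch evaluates a piecewise-constant density and its cumulative area by hand. It is an illustration under stated assumptions, not qp's implementation: the function names are hypothetical, the heights are the normalized values of the first distribution, and bins are treated as half-open on the right.

```python
from bisect import bisect_right

BINS = [0, 1, 2, 3, 4, 5]
HEIGHTS = [0, 0.125, 0.125, 0.5, 0.25]  # normalized heights of the first distribution


def hist_pdf(x, bins=BINS, heights=HEIGHTS):
    """Height of the bin containing x, and 0 outside the binned range."""
    if x < bins[0] or x >= bins[-1]:
        return 0.0
    return heights[bisect_right(bins, x) - 1]


def hist_cdf(x, bins=BINS, heights=HEIGHTS):
    """Area under the piecewise-constant density to the left of x."""
    total = 0.0
    for lo, hi, height in zip(bins[:-1], bins[1:], heights):
        total += height * max(0.0, min(x, hi) - lo)
    return total


xs = [3 + 0.6 * i for i in range(6)]  # same grid as np.linspace(3, 6, 6)
pdf_values = [hist_pdf(x) for x in xs]
cdf_values = [hist_cdf(x) for x in xs]
print(pdf_values)
print(cdf_values)
```

For example, the CDF at 3.6 is the full area of the bins below 3 (0.25) plus 0.6 of the bin with height 0.5, which gives 0.55.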
- logcdf(x: ArrayLike) ArrayLike[source]
Evaluates the log of the cumulative distribution function (CDF) for each of the distributions in the ensemble.
- Parameters:
- x
ArrayLike Location(s) at which to do the evaluations
- Returns:
- cdf
ArrayLike The log of the CDF at the given location(s)
Examples
>>> import qp
>>> import numpy as np
>>> ens_h = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([[0,0.1,0.1,0.4,0.2],[0.05,0.09,0.2,0.3,0.15]]))
>>> ens_h.logcdf(np.linspace(3,6,6))
array([[-1.38629436, -0.597837 , -0.22314355, -0.05129329, 0. , 0. ],
       [-0.84308733, -0.41820413, -0.16475523, -0.03871451, 0. , 0. ]])
- ppf(q: ArrayLike) ArrayLike[source]
Evaluates the percentage point function (PPF) for each of the distributions in the ensemble.
- Parameters:
- q
ArrayLike Location(s) at which to do the evaluations
- Returns:
- ppf
ArrayLike The PPF at the given location(s)
Examples
>>> import qp
>>> import numpy as np
>>> ens_h = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([[0,0.1,0.1,0.4,0.2],[0.05,0.09,0.2,0.3,0.15]]))
>>> ens_h.ppf(0.5)
array([[3.5 ],
       [3.18333333]])
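The PPF is the inverse of the CDF. For a histogram this can be sketched by walking through the bins until the accumulated probability reaches `q`, then interpolating linearly inside that bin. The helper `hist_ppf` below is a hypothetical standard-library illustration, not qp's implementation, and it assumes normalized bin heights and 0 < q < 1.

```python
def hist_ppf(q, bins, heights):
    """Invert the CDF of a normalized histogram for 0 < q < 1.

    Hypothetical helper for illustration only; not part of qp.
    """
    cumulative = 0.0
    for lo, hi, height in zip(bins[:-1], bins[1:], heights):
        mass = height * (hi - lo)
        if mass > 0 and cumulative + mass >= q:
            # q falls inside this bin: interpolate linearly within it.
            return lo + (q - cumulative) / height
        cumulative += mass
    return bins[-1]


bins = [0, 1, 2, 3, 4, 5]
first = [0, 0.125, 0.125, 0.5, 0.25]
raw_second = [0.05, 0.09, 0.2, 0.3, 0.15]
second = [h / sum(raw_second) for h in raw_second]  # unit-width bins, so sum == area

median_first = hist_ppf(0.5, bins, first)
median_second = hist_ppf(0.5, bins, second)
print(median_first, median_second)
```

Both results agree with the `ppf(0.5)` output in the example to the printed precision.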
- sf(q: ArrayLike) ArrayLike[source]
Evaluates the survival function (SF) for each of the distributions in the ensemble.
- Parameters:
- q
ArrayLike Location(s) at which to evaluate the distributions
- Returns:
- sf
ArrayLike The SF at the given location(s)
Examples
>>> import qp
>>> import numpy as np
>>> ens_h = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([[0,0.1,0.1,0.4,0.2],[0.05,0.09,0.2,0.3,0.15]]))
>>> ens_h.sf(0.5)
array([[1. ],
       [0.96835443]])
- logsf(q: ArrayLike) ArrayLike[source]
Evaluates the log of the survival function (SF) for each of the distributions in the ensemble.
- Parameters:
- q
ArrayLike Location(s) at which to evaluate the distributions
- Returns:
- sf
ArrayLike The log of the SF at the given location(s)
Examples
>>> import qp
>>> import numpy as np
>>> ens_h = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([[0,0.1,0.1,0.4,0.2],[0.05,0.09,0.2,0.3,0.15]]))
>>> ens_h.logsf(0.5)
array([[ 0. ],
       [-0.03215711]])
- isf(q: ArrayLike) ArrayLike[source]
Evaluates the inverse of the survival function (SF) for each of the distributions in the ensemble.
- Parameters:
- q
ArrayLike Location(s) at which to evaluate the distributions
- Returns:
- sf
ArrayLike The inverse of the survival function at the given location(s)
Examples
>>> import qp
>>> import numpy as np
>>> ens_h = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([[0,0.1,0.1,0.4,0.2],[0.05,0.09,0.2,0.3,0.15]]))
>>> ens_h.isf(0.5)
array([[3.5 ],
       [3.18333333]])
- rvs(size: int = 1, random_state: None | int | Generator = None) ArrayLike[source]
Generate samples from the distributions in this ensemble.
The returned samples are of shape (npdf, size), where size is the number of samples per distribution.
- Parameters:
- size
int, optional Number of samples to return, by default 1.
- random_state
int,numpy.random.Generator,None, optional The random state to use. Can be provided with a random seed for consistency. By default None.
- Returns:
- samples
ArrayLike The array of samples for each distribution in the ensemble, shape (npdf,size)
Examples
>>> import qp
>>> import numpy as np
>>> ens_h = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([[0,0.1,0.1,0.4,0.2],[0.05,0.09,0.2,0.3,0.15]]))
>>> ens_h.rvs(size=2)
array([[3.12956247, 3.72090937],
       [4.96783836, 3.24016123]])
- stats(moments: str = 'mv') tuple[ArrayLike, ...][source]
Return some statistics for each of the distributions in this ensemble.
The moments to be returned are determined by the string given to moments, where each letter represents a specific moment. The options are: “m” = mean, “v” = variance, “s” = (Fisher’s) skew, “k” = (Fisher’s) kurtosis.
- Parameters:
- moments
str, optional Which moments to include, by default “mv”
- Returns:
Examples
>>> import qp
>>> import numpy as np
>>> ens_h = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([[0,0.1,0.1,0.4,0.2],[0.05,0.09,0.2,0.3,0.15]]))
>>> ens_h.stats()
(array([[3.375 ],
        [3.01898734]]),
 array([[0.859375 ],
        [1.23698125]]))
- median() ArrayLike[source]
Return the median for each of the distributions in this ensemble.
- Returns:
- medians
ArrayLike The median for each distribution, returns a float if there is only one distribution, or the shape of the array is (npdf, 1)
Examples
>>> import qp
>>> import numpy as np
>>> ens_h = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([[0,0.1,0.1,0.4,0.2],[0.05,0.09,0.2,0.3,0.15]]))
>>> ens_h.median()
array([[3.5 ],
       [3.18333333]])
- mean() ArrayLike[source]
Return the mean for each of the distributions in this ensemble.
- Returns:
- means
ArrayLike The mean for each distribution, returns a float if there is only one distribution, or the shape of the array is (npdf, 1)
Examples
>>> import qp
>>> import numpy as np
>>> ens_h = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([[0,0.1,0.1,0.4,0.2],[0.05,0.09,0.2,0.3,0.15]]))
>>> ens_h.mean()
array([[3.375 ],
       [3.01898734]])
- var() ArrayLike[source]
Return the variance for each of the distributions in this ensemble.
- Returns:
- variances
ArrayLike The variance for each distribution, returns a float if there is only one distribution, or the shape of the array is (npdf, 1)
Examples
>>> import qp
>>> import numpy as np
>>> ens_h = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([[0,0.1,0.1,0.4,0.2],[0.05,0.09,0.2,0.3,0.15]]))
>>> ens_h.var()
array([[0.859375 ],
       [1.23698125]])
- std() ArrayLike[source]
Return the standard deviation for each of the distributions in this ensemble.
- Returns:
- stds
ArrayLike The standard deviations for each distribution, the shape of the array is (npdf, 1)
Examples
>>> import qp
>>> import numpy as np
>>> ens_h = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([[0,0.1,0.1,0.4,0.2],[0.05,0.09,0.2,0.3,0.15]]))
>>> ens_h.std()
array([[0.92702481],
       [1.11219659]])
- moment(n: int) ArrayLike[source]
Return the nth moment for each of the distributions in this ensemble.
- Parameters:
- n
int The order of the moment
- Returns:
- moments
ArrayLike The nth moment for each distribution, the shape of the array is (npdf, 1)
Examples
>>> import qp
>>> import numpy as np
>>> ens_h = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([[0,0.1,0.1,0.4,0.2],[0.05,0.09,0.2,0.3,0.15]]))
>>> ens_h.moment(2)
array([[12.25 ],
       [10.35126582]])
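The values shown for `mean`, `var` and `moment(2)` in these examples are numerically consistent with a midpoint approximation, in which each bin's probability is placed at the bin centre. The sketch below reproduces them for the first distribution using only the standard library. Whether qp computes moments this way is not stated in this reference, so treat `midpoint_moment` as a hypothetical helper for checking the numbers. An exact second moment of a piecewise-uniform density would add width²/12 per bin, weighted by that bin's probability.

```python
def midpoint_moment(n, bins, heights):
    """n-th raw moment with each bin's probability placed at the bin midpoint.

    Hypothetical helper for illustration only; not part of qp.
    """
    total = 0.0
    for lo, hi, height in zip(bins[:-1], bins[1:], heights):
        total += height * (hi - lo) * ((lo + hi) / 2) ** n
    return total


bins = [0, 1, 2, 3, 4, 5]
heights = [0, 0.125, 0.125, 0.5, 0.25]  # normalized heights of the first distribution

mean = midpoint_moment(1, bins, heights)
second_moment = midpoint_moment(2, bins, heights)
variance = second_moment - mean ** 2
print(mean, second_moment, variance)
```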
- entropy() ArrayLike[source]
Return the differential entropy for each of the distributions in this ensemble.
- Returns:
- entropy
ArrayLike The entropy for each distribution, the shape of the array is (npdf, 1)
Examples
>>> import qp
>>> import numpy as np
>>> ens_h = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([[0,0.1,0.1,0.4,0.2],[0.05,0.09,0.2,0.3,0.15]]))
>>> ens_h.entropy()
array([[1.21300757],
       [1.45307405]])
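For a piecewise-constant density the differential entropy, defined as the integral of -p(x) ln p(x), reduces to a sum over bins. The following standard-library sketch evaluates that sum for the first distribution. `hist_entropy` is a hypothetical helper, not qp's implementation.

```python
import math


def hist_entropy(bins, heights):
    """Differential entropy of a normalized piecewise-constant density.

    Hypothetical helper for illustration only; not part of qp.
    """
    total = 0.0
    for lo, hi, height in zip(bins[:-1], bins[1:], heights):
        if height > 0:  # empty bins contribute nothing
            total -= height * (hi - lo) * math.log(height)
    return total


entropy_first = hist_entropy([0, 1, 2, 3, 4, 5], [0, 0.125, 0.125, 0.5, 0.25])
print(entropy_first)
```

The result agrees with the first row of the `entropy()` output above to the printed precision.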
- interval(alpha: ArrayLike) tuple[ArrayLike, ...][source]
Return the intervals corresponding to a confidence level of alpha for each of the distributions in this ensemble.
- Parameters:
- alpha
ArrayLike The array of values to return intervals for. These should be the probability that a random variable will be drawn from the returned range. Each value should be in the range [0,1].
- Returns:
Examples
>>> import qp
>>> import numpy as np
>>> ens_h = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([[0,0.1,0.1,0.4,0.2],[0.05,0.09,0.2,0.3,0.15]]))
>>> ens_h.interval(alpha=[0,0.5,0.9])
(array([[1.4 , 3. , 3.5 ],
        [0.79 , 2.2875 , 3.18333333]]),
 array([[3.5 , 4. , 4.8 ],
        [3.18333333, 3.84166667, 4.73666667]]))
- histogramize(bins: ArrayLike) tuple[ArrayLike][source]
Computes integrated histogram bin values for all distributions in the ensemble.
- Parameters:
- bins
ArrayLike Array of N+1 endpoints of N bins
- Returns:
Examples
>>> import qp
>>> import numpy as np
>>> ens_h = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([[0,0.1,0.1,0.4,0.2],[0.05,0.09,0.2,0.3,0.15]]))
>>> ens_h.histogramize(bins=np.array([1,2,3,4,5]))
(array([1, 2, 3, 4, 5]),
 array([[0.125 , 0.125 , 0.5 , 0.25 ],
        [0.11392405, 0.25316456, 0.37974684, 0.18987342]]))
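An integrated bin value is the integral of the PDF across that bin. The standard-library sketch below computes such integrals for a histogram density by summing the overlap of each requested interval with the original bins, and reproduces the first row of the output above. The same idea applies to integrating between arbitrary limits. The function names are hypothetical and this is not qp's implementation.

```python
def integrate_hist(lower, upper, bins, heights):
    """Integral of a piecewise-constant density between lower and upper.

    Hypothetical helper for illustration only; not part of qp.
    """
    total = 0.0
    for lo, hi, height in zip(bins[:-1], bins[1:], heights):
        overlap = min(upper, hi) - max(lower, lo)
        if overlap > 0:
            total += height * overlap
    return total


def histogramize_sketch(new_bins, bins, heights):
    """Integrated value of the density in each of the N bins given by N+1 edges."""
    return [
        integrate_hist(a, b, bins, heights)
        for a, b in zip(new_bins[:-1], new_bins[1:])
    ]


values = histogramize_sketch(
    [1, 2, 3, 4, 5], [0, 1, 2, 3, 4, 5], [0, 0.125, 0.125, 0.5, 0.25]
)
print(values)
```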
- integrate(limits: tuple[float | ArrayLike, float | ArrayLike]) ArrayLike[source]
Computes the integral under the probability density functions (PDFs) of the distributions in the ensemble between the given limits.
- Parameters:
- limits
tuple[Union[float,ArrayLike],Union[float,ArrayLike]] A tuple with the limits of integration, where the first object in the tuple is the lower limit, and the second object is the upper limit. The limit objects can be floats or arrays, where the number of limits is the length of those arrays, or nlimits.
- Returns:
- integral:
ArrayLike Value of the integral(s), with the shape (npdf, nlimits)
- mix_mod_fit(comps=5)[source]
Fits the parameters of a given functional form to an approximation
- Parameters:
- Returns:
- self.mix_mod:
list[qp.Composite] List of qp.Composite objects approximating the PDFs
Notes
Currently only supports mixture of Gaussians
- moment_partial(n: int, limits: tuple, dx: float = 0.01) ArrayLike[source]
Return the nth moment over a particular range for each of the distributions in this ensemble.
- Parameters:
- Returns:
ArrayLike Array of the moments for each of the distributions, with shape (npdf,)
- plot(key: int | slice = 0, **kwargs: str)[source]
Plot the selected distribution as a curve.
- Parameters:
- Returns:
- axes
Axes The plot axes
- Other Parameters:
- plot_native(key: int | slice = 0, **kwargs: str)[source]
Plot the selected distribution in the default format for this parameterization. To find what arguments are required for specific parameterizations, you can check the docstrings of qp.[parameterization].plot_native, where [parameterization] is the parameterization class for the current ensemble.
- initializeHdf5Write(filename: str, npdf: int, comm=None) tuple[dict[str, File | Group], File][source]
Set up the output write for an ensemble, but set size to npdf rather than the size of the ensemble, as the “initial chunk” will not contain the full data
- Parameters:
- Returns:
- group
dict[str,h5py.File|h5py.Group] A dictionary of the groups to write to.
- fout
h5py.File The output file object that has been created.
- writeHdf5Chunk(fname: h5py.File | h5py.Group, start: int, end: int) None[source]
Write a chunk of the ensemble data to file. This will write the data for the distributions in the slice from [start:end] to the file. This includes the ancillary data table.
- Parameters:
- fname
h5py.File|h5py.Group The file or group object to write to
- start
int Starting index of data to write in the h5py file
- end
int Ending index of data to write in the h5py file
- finalizeHdf5Write(filename: h5py.File | h5py.Group) None[source]
Write ensemble metadata to the output file and close the file.
- Parameters:
- filename
h5py.File|h5py.Group The file or group object to complete writing and close.
Factory
- class qp.factory.Factory
Factory that creates and manages Ensembles of distributions.
- add_class(the_class: Pdf_gen) None
Add a parameterization class to the factory dictionary, so that it is included in the set of known parameterization classes. It includes an entry both for the actual class name, which ends in _gen, and the parameterization name that is also aliased to the class.
- Parameters:
- the_class
Pdf_gen subclass The parameterization class we are adding, which must inherit from Pdf_gen.
- create(class_name: str | Pdf_gen, data: Mapping, method: str | None = None, ancil: Mapping | None = None) Ensemble
Make an Ensemble of a particular type of distribution. The data dictionary will need different keys depending on what parameterization you have chosen.
If you are unsure which keys are required, try qp.[parameterization].create_ensemble?, where [parameterization] is the class of ensemble you wish to create. This will output a docstring with the necessary inputs (and this function can also be used to create an Ensemble).
- Parameters:
- class_name
str or class The name of the parameterization to make a distribution from.
- data
Mapping Dictionary of values passed to the parameterization create function.
- method
str|None, optional Used to select which creation method to invoke if there are multiple.
- ancil
Mapping, optional Dictionary with ancillary data, by default None
- Returns:
- ens
Ensemble The newly created Ensemble
Examples
>>> import qp
>>> import numpy as np
>>> data = {'bins': [0,1,2,3,4,5],
...     'pdfs': np.array([[0,0.1,0.1,0.4,0.2],[0.05,0.09,0.2,0.3,0.15]])}
>>> ens_h = qp.create('hist', data=data)
>>> ens_h.metadata
{'pdf_name': array([b'hist'], dtype='|S4'),
 'pdf_version': array([0]),
 'bins': array([[0, 1, 2, 3, 4, 5]])}
- from_tables(tables: Mapping, decode: bool = False, ext: str | None = None) Ensemble
Build this Ensemble from a dictionary of tables, where the metadata has key meta, and the data has key data. If there is an ancillary data table, it should have the key ancil.
The function will create the ensemble with the parameterization given in the meta table, and will use any other information in the meta table necessary to figure out how to construct the ensemble (i.e. construction method).
- Parameters:
- Returns:
- ens
Ensemble The ensemble constructed from the data in the tables.
Examples
>>> import qp
>>> import numpy as np
>>> meta = {'pdf_name': np.array(['hist'.encode()]), 'pdf_version': np.array([0]),
...     'bins':np.array([0,1,2,3,4,5])}
>>> data = {'pdfs': np.array([[0. , 0.1 , 0.1 , 0.4 , 0.2 ],
...     [0.05, 0.09, 0.2 , 0.3 , 0.15]])}
>>> ens = qp.from_tables({'meta': meta, 'data': data})
>>> ens.metadata
{'pdf_name': array([b'hist'], dtype='|S4'),
 'pdf_version': array([0]),
 'bins': array([[0, 1, 2, 3, 4, 5]])}
- read_metadata(filename: str) Mapping
Read an ensemble’s metadata from a file, without loading the full data. The file must have multiple tables, one of which is called ‘meta’.
- Parameters:
- filename
str The full path to the file.
- Returns:
- meta
Mapping Returns the metadata table as a dictionary of numpy arrays.
Examples
>>> import qp
>>> qp.read_metadata("hist-ensemble.hdf5")
{'pdf_name': array([b'hist'], dtype='|S4'),
 'pdf_version': array([0]),
 'bins': array([[0, 1, 2, 3, 4, 5]])}
- is_qp_file(filename: str) bool
Test if a file is a qp file. The file must have at least a table called ‘meta’, and that ‘meta’ table must have a property ‘pdf_name’.
Examples
>>> import qp
>>> qp.is_qp_file("test-qpfile.hdf5")
True
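The condition described for `is_qp_file` can be illustrated on an in-memory dictionary of tables. The sketch below uses only the standard library and checks the same two requirements: a `meta` table that contains `pdf_name`. The helper `looks_like_qp_tables` is hypothetical. It does not read any file and is not how qp itself performs the test.

```python
from collections.abc import Mapping


def looks_like_qp_tables(tables):
    """True if `tables` has a 'meta' table that contains 'pdf_name'.

    Hypothetical helper for illustration only; not part of qp.
    """
    meta = tables.get("meta")
    return isinstance(meta, Mapping) and "pdf_name" in meta


good = {"meta": {"pdf_name": [b"hist"], "pdf_version": [0]}, "data": {"pdfs": []}}
bad = {"data": {"pdfs": []}}
print(looks_like_qp_tables(good), looks_like_qp_tables(bad))
```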
- read(filename: str, fmt: str | None = None, read_slice: slice | None = None) Ensemble
Read this ensemble from a file. The file must be a qp file.
The function will create the ensemble with the parameterization given in the metadata table, and will use any other information in the metadata table necessary to figure out how to construct the ensemble (i.e. construction method).
- Parameters:
- Returns:
- ens
Ensemble The ensemble constructed from the data in the file.
Examples
>>> import qp
>>> ens = qp.read("test-qpfile.hdf5")
- data_length(filename: str) int
Get the size of data in a file. The file must be a qp file, which means it must contain an Ensemble with a metadata table.
- Parameters:
- filename
str The path to the file with the data.
- Returns:
- nrows
int The length of the data, or the number of distributions in the data.
Examples
>>> import qp
>>> qp.data_length("hist-ensemble.hdf5")
2
- iterator(filename: str, chunk_size: int = 100000, rank: int = 0, parallel_size: int = 1) Iterator[int, int, Ensemble]
Iterates through a given Ensemble file and yields a chunk of the ensemble data at a time. This means that the returned Ensemble contains the distributions from the returned start index to the returned stop index. If there is an ancillary data table, the Ensemble will also contain any ancillary data for those distributions.
- Parameters:
- Yields:
- Raises:
Examples
To iterate through an HDF5 Ensemble file, we can use the following code:
>>> import qp
>>> data_file = "./test.hdf5"
>>> for start, end, ens_chunk in qp.iterator(data_file, chunk_size=11):
...     print(f"Indices are: ({start}, {end})")
...     print(ens_chunk)
Indices are: (0, 11)
Ensemble(the_class=mixmod,shape=(11, 3))
Indices are: (11, 22)
Ensemble(the_class=mixmod,shape=(11, 3))
Indices are: (22, 33)
Ensemble(the_class=mixmod,shape=(11, 3))
Indices are: (33, 44)
Ensemble(the_class=mixmod,shape=(11, 3))
Indices are: (44, 55)
Ensemble(the_class=mixmod,shape=(11, 3))
Indices are: (55, 66)
Ensemble(the_class=mixmod,shape=(11, 3))
Indices are: (66, 77)
Ensemble(the_class=mixmod,shape=(11, 3))
Indices are: (77, 88)
Ensemble(the_class=mixmod,shape=(11, 3))
Indices are: (88, 99)
Ensemble(the_class=mixmod,shape=(11, 3))
Indices are: (99, 100)
Ensemble(the_class=mixmod,shape=(1, 3))
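The start and end indices printed in this example follow a simple chunking pattern: each chunk covers `chunk_size` distributions, and the final chunk is shorter when the total is not a multiple of the chunk size. A standard-library sketch of that pattern for a single process is given below. It ignores `rank` and `parallel_size`, and `chunk_bounds` is a hypothetical helper rather than qp's implementation.

```python
def chunk_bounds(n_total, chunk_size):
    """Yield (start, end) index pairs that cover n_total items in order.

    Hypothetical helper for illustration only; not part of qp.
    """
    for start in range(0, n_total, chunk_size):
        yield start, min(start + chunk_size, n_total)


# 100 distributions read 11 at a time, as in the example above.
bounds = list(chunk_bounds(100, 11))
print(bounds[0], bounds[-1], len(bounds))
```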
- convert(in_dist: Ensemble, class_name: str, **kwds) Ensemble
Convert an ensemble to a different parameterization. Keyword arguments are required to convert to a different parameterization, but the specific keyword arguments required will vary. To check the available conversion methods and their associated arguments refer to the docstrings for qp.class_name of the parameterization you are converting to. If the class does not have a conversion methods table, then it will not be possible to convert to that parameterization.
- Parameters:
- Returns:
- ens
Ensemble The ensemble we converted to
Examples
The following example demonstrates converting from a histogram parameterization to an interpolation parameterization. The arguments given will not be the same when converting between other parameterizations.
>>> import qp
>>> import numpy as np
>>> ens_h = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([[0,0.1,0.1,0.4,0.2],[0.05,0.09,0.2,0.3,0.15]]))
>>> ens_i = qp.convert(ens_h, "interp", xvals=np.linspace(0,5,10))
>>> ens_i.metadata
{'pdf_name': array([b'interp'], dtype='|S6'),
 'pdf_version': array([0]),
 'xvals': array([0. , 0.55555556, 1.11111111, 1.66666667, 2.22222222,
        2.77777778, 3.33333333, 3.88888889, 4.44444444, 5. ])}
- pretty_print(stream=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>) None
Print a level of the conversion dictionary in a human-readable format
- Parameters:
- stream
stream The stream to print to
- static concatenate(ensembles: list[Ensemble]) Ensemble
Concatenate a list of Ensembles into one Ensemble. The Ensembles must be of the same parameterization and have the same metadata.
- Parameters:
- Returns:
- ens
Ensemble The output
Examples
>>> import qp
>>> import numpy as np
>>> ens_1 = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([0,0.1,0.1,0.4,0.2]))
>>> ens_1.npdf
1
>>> ens_2 = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([[0.05,0.09,0.2,0.3,0.15]]))
>>> ens_2.npdf
1
>>> ens_all = qp.concatenate([ens_1, ens_2])
>>> ens_all.npdf
2
- static write_dict(filename: str, ensemble_dict: Mapping[str, Ensemble], **kwargs)
Writes out a dictionary of Ensembles to an HDF5 file. Each Ensemble in the dictionary will be written to a group, and within each Ensemble group there will be subgroups for the metadata, data, and (optional) ancillary data tables.
- Parameters:
- Raises:
ValueError Raised if the dictionary contains any values that are not Ensembles.
Examples
>>> import qp
>>> import numpy as np
>>> ens_h = qp.hist.create_ensemble(bins= np.array([0,1,2,3,4,5]),
...     pdfs = np.array([0,0.1,0.1,0.4,0.2]))
>>> ens_i = qp.interp.create_ensemble(xvals= np.array([0,1,2,3,4]),
...     yvals = np.array([[0.05,0.09,0.2,0.3,0.15]]))
>>> qp.write_dict("qp-ensembles.hdf5",{"ens_h": ens_h, "ens_i": ens_i})
- static read_dict(filename: str) Mapping[str, Ensemble]
Reads in one or more Ensembles from an HDF5 file to a dictionary of Ensembles. The file should contain one top-level group per ensemble. Each Ensemble group should have subgroups that are the metadata, data, and (optional) ancillary data tables.
- Parameters:
- filename
str The path to the HDF5 file to read in.
- Returns:
Examples
>>> import qp
>>> ens_dict = qp.read_dict("qp-ensembles.hdf5")