API Reference
Modifies PyMC ModelBuilder to work with the Rating model class.
- class ratingcurve.ratingmodel_builder.RatingModelBuilder(**kwargs)
Parent class for other rating models.
Implements the PyMC ModelBuilder class functions that the base class leaves unimplemented. Additionally, adjusts other ModelBuilder functions to better suit rating curve fitting.
Update ModelBuilder.__init__ to only configure the model.
Configuring the sampler now occurs in the .fit() call.
- Parameters:
**kwargs (dict) – Keyword arguments to pass to the model configuration.
- fit(h: ArrayLike, q: ArrayLike, q_sigma: ArrayLike = None, method: str = 'advi', progressbar: bool = True, random_seed: RandomState = None, **kwargs: Any) -> InferenceData
Update ModelBuilder.fit to accept q_sigma.
ModelBuilder.fit takes two inputs, x and y, so it is redefined here to also accept q_sigma. Fits the model using the given data and algorithm, and sets attrs on the model's inference data.
- Parameters:
h (array_like) – Input training array of gage height (h) observations.
q (array_like) – Target discharge (q) values.
q_sigma (array_like, optional) – Discharge uncertainty in units of discharge.
method (str, optional) – The method (algorithm) used to fit the data, options are ‘advi’ or ‘nuts’.
progressbar (bool, optional) – Specifies whether the fit progressbar should be displayed.
random_seed (RandomState, optional) – Provides sampler with initial random seed for obtaining reproducible samples.
**kwargs (dict) – Algorithm settings can be provided in form of keyword arguments.
- Returns:
self – Arviz InferenceData object containing posterior samples of model parameters of the fitted model.
- Return type:
InferenceData
- get_default_sampler_config(n: int = 200000, abs_tol: float = 0.002, rel_tol: float = 0.002, adam_learn_rate: float = 0.001, draws: int = None, tune: int = 2000, chains: int = 4, target_accept: float = 0.95, **kwargs) -> dict
Create sampler configuration dictionary.
Generates a sampler_config dictionary with all the sampler configuration parameters needed to sample/fit the model. It will be passed to the class instance when fitting. Any sampler parameters not specified in the .fit() call will be set to their defaults.
- Parameters:
n (int) – The number of iterations. (Only used in ADVI algorithm.)
abs_tol (float) – Convergence criterion for algorithm. Termination of fitting occurs when the absolute tolerance between two consecutive iterates is at most abs_tol. (Only used in ADVI algorithm.)
rel_tol (float) – Convergence criterion for algorithm. Termination of fitting occurs when the relative tolerance between two consecutive iterates is at most rel_tol. (Only used in ADVI algorithm.)
adam_learn_rate (float) – The learning rate for the ADAM Optimizer. (Only used in ADVI algorithm.)
draws (int) – The number of samples to draw. (Used in both algorithms.)
tune (int) – Number of iterations to tune. Samplers adjust the step sizes, scalings or similar during tuning. Tuning samples are discarded. (Only used in NUTS algorithm.)
chains (int) – The number of chains to sample. (Only used in NUTS algorithm.)
target_accept (float) – The step size is tuned such that we approximate this acceptance rate. (Only used in NUTS algorithm.)
- Returns:
sampler_config – A dictionary containing all the required sampler configuration parameters.
- Return type:
dict
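The split between ADVI-only, NUTS-only, and shared settings described above can be sketched as a plain dictionary merge. This is only an illustration: the names `DEFAULTS`, `USED_BY`, and `sampler_settings` are hypothetical and not part of ratingcurve; the default values are copied from the documented signature.

```python
# Hypothetical names for illustration; defaults copied from the documented signature.
DEFAULTS = {
    "n": 200_000,
    "abs_tol": 0.002,
    "rel_tol": 0.002,
    "adam_learn_rate": 0.001,
    "draws": None,
    "tune": 2000,
    "chains": 4,
    "target_accept": 0.95,
}

# Which settings each algorithm reads, following the parameter notes above.
USED_BY = {
    "advi": {"n", "abs_tol", "rel_tol", "adam_learn_rate", "draws"},
    "nuts": {"draws", "tune", "chains", "target_accept"},
}


def sampler_settings(method="advi", **overrides):
    """Merge user overrides into the defaults and keep the keys `method` uses."""
    if method not in USED_BY:
        raise ValueError(f"method must be 'advi' or 'nuts', got {method!r}")
    config = {**DEFAULTS, **overrides}
    return {key: value for key, value in config.items() if key in USED_BY[method]}


nuts_settings = sampler_settings("nuts", chains=2)
advi_settings = sampler_settings("advi", n=50_000)
print(nuts_settings)
print(advi_settings)
```

Anything not overridden falls back to its default, which mirrors the behavior described for parameters omitted from the .fit() call.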
- classmethod load(fname: str)
Update ModelBuilder.load() to accept q_sigma.
ModelBuilder takes two inputs, x and y, so load is redefined to also handle q_sigma. Creates a ModelBuilder instance from a file and loads the inference data for the model.
- Parameters:
fname (str) – File name with path to saved model.
- Return type:
An instance of ModelBuilder.
- property output_var: str
Name of the output (dependent) variable.
- predict_posterior(X_pred: ArrayLike, extend_idata: bool = True, combined: bool = True, **kwargs) -> ArrayLike
Update ModelBuilder.predict_posterior to remove data validation.
Generate posterior predictive samples on unseen data. Exclude any data validation as it requires X_pred to be a 2D array-like object.
- Parameters:
X_pred (array_like) – The input data used for prediction.
extend_idata (bool, optional) – Determines whether the predictions should be added to inference data object. Defaults to True.
combined (bool, optional) – Combine chain and draw dims into sample. Won’t work if a dim named sample already exists. Defaults to True.
**kwargs (dict) – Additional arguments to pass to pymc.sample_posterior_predictive.
- Returns:
y_pred – Posterior predictive samples for each input X_pred. Shape of array is (n_pred, chains * draws) if combined is True, otherwise (chains, draws, n_pred).
- Return type:
ndarray
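The two output shapes described above can be illustrated with nested lists. This is a rough sketch, not the library's code: `combine_chains` is a hypothetical helper, and the chain-major ordering of the combined sample axis is an assumption.

```python
def combine_chains(samples):
    """Reshape nested lists from (chains, draws, n_pred) to (n_pred, chains * draws)."""
    n_pred = len(samples[0][0])
    combined = [[] for _ in range(n_pred)]
    for chain in samples:  # assumed chain-major order: all draws of chain 0 first
        for draw in chain:
            for i, value in enumerate(draw):
                combined[i].append(value)
    return combined


# 2 chains, 3 draws per chain, 2 prediction points
samples = [
    [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]],
    [[4.0, 40.0], [5.0, 50.0], [6.0, 60.0]],
]
combined = combine_chains(samples)
print(len(combined), len(combined[0]))
```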
- residuals() -> ArrayLike
Compute residuals of rating model.
- Returns:
residuals – Log residuals of rating model.
- Return type:
array_like
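As a minimal sketch of what a log residual looks like, assuming it is defined as the natural log of observed discharge minus the natural log of predicted discharge (the reference above does not state the exact definition, and `log_residuals` is a hypothetical helper):

```python
import math


def log_residuals(q_obs, q_pred):
    """Return ln(q_obs) - ln(q_pred) for each pair (assumed definition)."""
    return [math.log(obs) - math.log(pred) for obs, pred in zip(q_obs, q_pred)]


q_obs = [10.0, 50.0, 200.0]
q_pred = [10.0, 55.0, 180.0]
res = log_residuals(q_obs, q_pred)
print(res)
```

Under this definition a residual of zero means the rating reproduces the observation, and equal over- and under-prediction ratios give residuals of equal magnitude.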
- sample_model(**kwargs) -> InferenceData
Update ModelBuilder.sample_model with other fitting algorithms.
- Parameters:
**kwargs (dict) – Additional keyword arguments to pass to the PyMC sampler.
- Returns:
self – Arviz InferenceData object containing posterior samples of model parameters of the fitted model.
- Return type:
InferenceData
- sample_posterior_predictive(X_pred, extend_idata, combined, **kwargs)
Update ModelBuilder.sample_posterior_predictive to output untransformed q.
- sample_prior_predictive(X_pred, y_pred=None, samples: int = None, extend_idata: bool = False, combined: bool = True, **kwargs)
Update ModelBuilder.sample_prior_predictive to output untransformed q.
- table(h: ArrayLike = None, step: float = 0.01, extend: float = 1.1) -> pd.DataFrame
Return stage-discharge rating table.
- Parameters:
h (array_like, optional) – Stage values to compute rating table. If None, then use the range of observations.
step (float, optional) – Step size for stage values.
extend (float, optional) – Extend range of discharge values by this factor.
- Returns:
rating_table – Rating table with columns stage, mean discharge, median discharge, and gse (geometric standard error [1]).
- Return type:
DataFrame
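The summary columns of one rating table row can be sketched from a set of discharge samples at a single stage. This assumes the geometric standard error is the exponential of the standard deviation of log discharge, which is one common definition and may differ from the one the library uses; `summarize_discharge` is a hypothetical helper.

```python
import math
import statistics


def summarize_discharge(samples):
    """Summarize discharge samples at one stage (gse definition is an assumption)."""
    logs = [math.log(q) for q in samples]
    return {
        "mean": statistics.fmean(samples),
        "median": statistics.median(samples),
        # Assumed: exp of the population standard deviation of ln(discharge).
        "gse": math.exp(statistics.pstdev(logs)),
    }


row = summarize_discharge([90.0, 100.0, 110.0, 100.0])
print(row)
```

Under this definition the gse is a multiplicative factor of at least 1, with 1 meaning no spread among the samples.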
References
Streamflow rating models using PyMC.
- class ratingcurve.ratings.PowerLawRating(**kwargs)
Multi-segment power law rating using Heaviside parameterization.
Update ModelBuilder.__init__ to only configure the model.
Configuring the sampler now occurs in the .fit() call.
- Parameters:
**kwargs (dict) – Keyword arguments to pass to the model configuration.
- build_model(h: ArrayLike, q: ArrayLike, q_sigma: ArrayLike = None, **kwargs)
Creates the PyMC model.
- Parameters:
h (array_like) – Input array of gage height (h) observations.
q (array_like) – Input array of discharge (q) observations.
q_sigma (array_like, optional) – Input array of discharge uncertainty in units of discharge.
- static get_default_model_config(segments: int = 2, prior: dict = {'distribution': 'uniform'}, **kwargs) -> dict
Create model configuration dictionary.
Generate a model_config dictionary with all the required model configuration parameters needed to build the model. It will be passed to the class instance on initialization, in case the user doesn’t provide any model configuration settings of their own.
- Parameters:
segments (int) – Number of segments in the rating.
prior (dict) – Prior knowledge of breakpoint locations. Must contain the key distribution, which can be set to either 'uniform' or 'normal'. If 'normal', the mean mu and width sigma must also be given.
Examples:
prior = {'distribution': 'uniform'} (with any number of segments)
prior = {'distribution': 'normal', 'mu': [1, 2], 'sigma': [1, 1]} (with segments = 2)
prior = {'distribution': 'normal', 'mu': [1, 2, 5, 9], 'sigma': [1, 1, 1, 1]} (with segments = 4)
Note that the number of means and widths must equal the number of segments. Additionally, the first mean must be less than the lowest observed stage, as it defines the stage of zero flow.
- Returns:
model_config – A dictionary containing all the required model configuration parameters.
- Return type:
dict
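The rules for the prior dictionary can be expressed as a small check. This is a sketch under stated assumptions: `check_breakpoint_prior` and its `min_stage` argument are hypothetical, and the library may validate the prior differently or not at all.

```python
def check_breakpoint_prior(prior, segments, min_stage):
    """Raise ValueError if `prior` breaks the rules described above."""
    distribution = prior.get("distribution")
    if distribution == "uniform":
        return  # a uniform prior needs no further settings
    if distribution != "normal":
        raise ValueError("prior['distribution'] must be 'uniform' or 'normal'")
    if "mu" not in prior or "sigma" not in prior:
        raise ValueError("a normal prior needs both 'mu' and 'sigma'")
    if len(prior["mu"]) != segments or len(prior["sigma"]) != segments:
        raise ValueError("'mu' and 'sigma' need one entry per segment")
    if prior["mu"][0] >= min_stage:
        raise ValueError("the first mean must be below the lowest observed stage")


# Both calls pass silently: lowest observed stage is 1.5, first mean is 1.
check_breakpoint_prior({"distribution": "uniform"}, segments=3, min_stage=1.5)
check_breakpoint_prior(
    {"distribution": "normal", "mu": [1, 2], "sigma": [1, 1]},
    segments=2,
    min_stage=1.5,
)
```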
- class ratingcurve.ratings.SplineRating(**kwargs)
Natural spline rating.
Update ModelBuilder.__init__ to only configure the model.
Configuring the sampler now occurs in the .fit() call.
- Parameters:
**kwargs (dict) – Keyword arguments to pass to the model configuration.
- build_model(h: ArrayLike, q: ArrayLike, q_sigma: ArrayLike = None, **kwargs)
Creates the PyMC model.
- Parameters:
h (array_like) – Input array of gage height (h) observations.
q (array_like) – Input array of discharge (q) observations.
q_sigma (array_like, optional) – Input array of discharge uncertainty in units of discharge.
- static get_default_model_config(mean: float = 0, sd: float = 1, df: int = 5, **kwargs) -> dict
Create model configuration dictionary.
Generate a model_config dictionary with all the required model configuration parameters needed to build the model. It will be passed to the class instance on initialization, in case the user doesn’t provide any model configuration settings of their own.
- Parameters:
mean (float) – Mean of the normal prior for the spline coefficients.
sd (float) – Standard deviation of the normal prior for the spline coefficients.
df (int) – Degrees of freedom for the spline coefficients.
- Returns:
model_config – A dictionary containing all the required model configuration parameters.
- Return type:
dict
Plotting functions
- class ratingcurve.plot.PlotMixin
Mixin class for plotting rating models.
- plot(ax: Axes = None) -> None
Plots gagings and the fitted rating curve.
- Parameters:
ax (Axes, optional) – Pre-defined matplotlib axes.
- plot_gagings(ax: Axes = None) -> None
Plot gagings with uncertainty.
- Parameters:
ax (Axes, optional) – Pre-defined matplotlib axes.
- class ratingcurve.plot.PowerLawPlotMixin
Mixin class for plotting power law rating models.
- class ratingcurve.plot.RatingMixin
Parent class for other rating-related mixins.
- summary(var_names: list = None) -> DataFrame
Summary of rating model parameters.
- Parameters:
var_names (list of str, optional) – List of variables to include in summary. If no names are given, then a summary of all variables is returned.
- Returns:
df – DataFrame summary of rating model parameters.
- Return type:
DataFrame
Data transformations to improve optimization
- class ratingcurve.transform.Dmatrix(stage: ArrayLike, df: int, form: str = 'cr')
Transform for spline design matrix.
Create a Dmatrix object.
Create a design matrix for a natural cubic spline, which is a cubic spline that is additionally constrained to be linear at the boundaries. Due to this constraint, the total degrees of freedom equals the number of knots minus 1.
- Parameters:
stage (array_like) – Stage data
df (int) – Degrees of freedom
form (str) – Spline form
- class ratingcurve.transform.LogZTransform(x: ArrayLike)
Log-transforms the data, then takes the z-score.
Create a LogZTransform for x.
- Parameters:
x (array_like) – Data that defines the transform.
- class ratingcurve.transform.Transform(x)
Transformation class.
All children of Transform must have transform and untransform methods.
Create an empty Transform object.
- class ratingcurve.transform.UnitTransform(x: ArrayLike)
Transforms data to the unit (0 to 1) interval.
Create a UnitTransform of an array.
- Parameters:
x (array_like) – Data that defines the transform.
- class ratingcurve.transform.ZTransform(x: ArrayLike)
Z-transforms data to have zero mean and unit variance.
Create a ZTransform object.
- Parameters:
x (array_like) – Data that defines the transform.
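The transform and untransform pairing that the transform classes share can be sketched with the standard library alone. The classes `ZSketch` and `LogZSketch` below are hypothetical stand-ins rather than the library's implementations, and the use of the population standard deviation is an assumption.

```python
import math
import statistics


class ZSketch:
    """Z-score transform whose mean and spread come from the data given at creation."""

    def __init__(self, x):
        self.mean = statistics.fmean(x)
        self.std = statistics.pstdev(x)  # population std; an assumption

    def transform(self, x):
        return [(value - self.mean) / self.std for value in x]

    def untransform(self, z):
        return [value * self.std + self.mean for value in z]


class LogZSketch(ZSketch):
    """Take the natural log first, then apply the z-score."""

    def __init__(self, x):
        super().__init__([math.log(value) for value in x])

    def transform(self, x):
        return super().transform([math.log(value) for value in x])

    def untransform(self, z):
        return [math.exp(value) for value in super().untransform(z)]


q = [10.0, 100.0, 1000.0]
log_z = LogZSketch(q)
z = log_z.transform(q)
q_back = log_z.untransform(z)
print(z)
print(q_back)
```

Because the transform stores the statistics of the data that defined it, new values can be mapped into the same scaled space and predictions mapped back to original units.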
- ratingcurve.transform.compute_knots(minimum: float, maximum: float, n: int) -> ArrayLike
Return list of spline knots.
- Parameters:
minimum (float) – Minimum stage (h) observation.
maximum (float) – Maximum stage (h) observation.
n (int) – Number of knots.
- Returns:
List of spline knots.
- Return type:
ArrayLike
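One simple way to place knots is to space them evenly between the minimum and maximum stage. The sketch below assumes that scheme, which the reference above does not confirm, and `evenly_spaced_knots` is a hypothetical helper rather than the library function.

```python
def evenly_spaced_knots(minimum, maximum, n):
    """Return n knots spaced evenly from minimum to maximum (assumed scheme)."""
    if n < 2:
        raise ValueError("need at least two knots")
    step = (maximum - minimum) / (n - 1)
    return [minimum + i * step for i in range(n)]


knots = evenly_spaced_knots(1.0, 3.0, 5)
print(knots)
```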
Example datasets for rating curve analysis.
- ratingcurve.data.describe(name) -> str
Describes a tutorial dataset.
- Parameters:
name (str) – Name of the dataset, e.g., 'green channel'.
- Returns:
Description of the dataset
- Return type:
str