econometrics — Statistical Testing
Statistical testing and regression tools. Run these tests before building any price model to ensure appropriate model selection.
Standard pre-modelling test sequence:
adfTest+kpssTest→ stationarity decisionljungBox→ serial correlation checkgrangerCausality→ proxy/factor selectionols→ factor regression and grade surfaces
Key Functions
ols(y, X)OLS regression with robust (HC3) standard errors. Returns
{coefficients, intercept, rSquared, tStats, pValues, residuals}.adfTest(series)Augmented Dickey-Fuller unit root test. Returns
{adfStat, pValue, isStationary}.isStationary=True→ use OU/ARMA;False→ use GBM or difference the series.kpssTest(series)KPSS stationarity test (null = stationary). Returns
{kpssStat, pValue, isStationary}. Use together with ADF: ADF rejects unit root AND KPSS fails to reject → strong stationarity evidence.grangerCausality(y, x, maxlag=4)Tests whether
xGranger-causesy. Returns{fStat, pValue, granger_causes}per lag. Use for proxy selection: choose proxies that Granger-cause the target series.ljungBox(series, lags=10)Ljung-Box autocorrelation test. Returns
{qStats, pValues}at each lag. Significant autocorrelation → use ARMA or include lagged terms in regression.whiteTest(y, X)White’s heteroskedasticity test. Returns
{testStat, pValue, isHomoskedastic}.breuschPaganTest(y, X)Breusch-Pagan heteroskedasticity test.
durbinWatson(residuals)Durbin-Watson autocorrelation statistic. Values near 2.0 indicate no autocorrelation.
import sipQuant as sq
import numpy as np
prices = np.array([182.0, 184.5, 187.0, 186.0, 185.5, 183.0, 187.5, 189.0])
# Stationarity tests
adf = sq.econometrics.adfTest(prices)
kpss = sq.econometrics.kpssTest(prices)
print(f"ADF stationary: {adf['isStationary']} (p={adf['pValue']:.4f})")
print(f"KPSS stationary: {kpss['isStationary']} (p={kpss['pValue']:.4f})")
# Autocorrelation check
returns = np.diff(np.log(prices))
lb = sq.econometrics.ljungBox(returns, lags=5)
print(f"Ljung-Box p-values: {lb['pValues'].round(4)}")
# Proxy selection via Granger causality
# e.g. test if CME Corn Granger-causes local feed grain price
corn_prices = np.array([420.0, 425.0, 422.0, 430.0, 428.0, 435.0, 432.0, 440.0])
gc = sq.econometrics.grangerCausality(prices, corn_prices, maxlag=2)
print(f"Corn Granger-causes hay? {gc[1]['granger_causes']} (p={gc[1]['pValue']:.4f})")
# Grade surface regression
# Regress adjusted price on grade factors
moisture = np.array([14.0, 15.0, 13.0, 16.0, 14.5, 15.5, 13.5, 14.0])
X = np.column_stack([moisture])
result = sq.econometrics.ols(prices, X)
print(f"Moisture coefficient: {result['coefficients'][0]:.4f}")
print(f"R-squared: {result['rSquared']:.4f}")