Linear regression
The extensive Linear regression module offers unusually rich commented text and graphical diagnostics for many types of linear regression models, including polynomials, multivariate Taylor models, and general transformed regression. The module goes beyond least squares: it offers several robust methods such as Lp-norm regression, M-estimates, quantile regression, bounded-influence regression, all-possible-subsets regression, and much more. Diagnostics cover the model, the data, and the regression method itself, making the results as reliable as possible.

Linear regression - PDF manual

  • Regression model coefficients
  • Curve fitting
  • Variable prediction
  • Find most suitable model (Stepwise, All possible subsets)
  • Calibration, validation
  • Response surface optimization function
  • Robust methods for "real world" data
  • Extensive diagnostic tools will find any problems and peculiarities in data or model

  • Plain linear model
  • Polynomial models
  • Taylor quadratic hypersurfaces
  • User-defined models
  • Weighted models
  • Quasilinear response correction
  • Implicit models
  • All possible subsets, find the best model
  • Least squares
  • Rank correction/regularization
  • Quantile regression
  • Lp-norm regression
  • Least median of Squares (LMS)
  • M-estimates
  • Robust/resistant BIR method
  • Stepwise/All model selection

Text output:
  • Basic analysis
  • Correlations in X
  • Multicollinearity
  • Eigenvalues analysis
  • Analysis of variance (ANOVA)
  • Regression coefficients statistics
  • Confidence intervals
  • Statistical residual analysis
  • R, RSS, AIC, MEP, etc.
  • Classical residuals
  • Dependences in residuals
  • Model/Data testing
  • Tests for data and residuals
  • Tests of model
  • Predicted statistics
  • Influential points analysis
  • Jackknife residuals
  • Hat Matrix
  • Cook distance
  • Atkinson distance
  • Andrews-Pregibon statistics
  • Likelihood distances
  • Prediction
Graphical output:
  • Regression curve
  • Y-prediction
  • Residuals vs. prediction
  • Absolute residuals
  • Squared residuals
  • Residual QQ-plot
  • Autocorrelation plot
  • Heteroscedasticity plot
  • Jackknife residuals
  • Predicted residuals
  • Partial regression plots
  • Partial residual plots
  • Hat matrix diagonal plot
  • Predicted residuals QQ-plot
  • Pregibon statistic plot
  • Williams statistic plot
  • McCulloh statistic plot
  • L-R plot
  • Cook distance
  • Atkinson distance
  • Studentized residuals
  • Andrews plot
  • Jackknife residual QQ-plot

Main panel of Linear regression

Output specification panel

User-defined model panel

Example outputs:
Regression line with identified cases (numbers instead of points) and confidence band (red):
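For reference, a pointwise confidence band like the red one shown is conventionally computed from the t-distribution of the fitted mean. A minimal NumPy/SciPy sketch on synthetic data (not the module's own implementation; it assumes normally distributed errors):

```python
import numpy as np
from scipy import stats

def confidence_band(x, y, x_new, alpha=0.05):
    """Pointwise 1-alpha confidence band for a fitted straight line."""
    X = np.column_stack([np.ones_like(x), x])               # design matrix [1, x]
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n, p = X.shape
    s2 = resid @ resid / (n - p)                            # residual variance
    XtX_inv = np.linalg.inv(X.T @ X)
    X0 = np.column_stack([np.ones_like(x_new), x_new])
    # standard error of the fitted mean at each new x: sqrt(s2 * x0' (X'X)^-1 x0)
    se = np.sqrt(s2 * np.sum((X0 @ XtX_inv) * X0, axis=1))
    t = stats.t.ppf(1 - alpha / 2, n - p)
    fit = X0 @ beta
    return fit, fit - t * se, fit + t * se

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 30)
y = 1.0 + 2.0 * x + rng.normal(0, 0.4, 30)
# band at the mean of x (5.0) versus at the edge of the data (10.0)
fit, lo, hi = confidence_band(x, y, np.array([5.0, 10.0]))
```

As expected, the band is narrowest at the mean of x and widens toward the edges of the data.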

Comparison of two models - a model-building example (showing how important regression diagnostics are):

The first model considered was derived theoretically; it has 5 parameters and fits the data well.
Model: [Rate] ~ Abs + [pH-value] + [pH-value]^2 + Ln([pH-value]) + Exp([pH-value])
However, its prediction capability is low, and three of the five parameters proved insignificant, which makes this model unusable.

Variable         Estimate   Std dev         Conclusion     P-value     Lower C.I.  Upper C.I.
Abs               2.1789    0.47815         Significant    2.00E-005    1.226       3.131
[pH-value]       -0.251     0.64328         Insignificant  0.69727     -1.532       1.03
[pH-value]^2      0.00641   0.03973         Insignificant  0.872108    -0.072       0.085
Ln([pH-value])    1.5023    1.1092          Insignificant  0.17974     -0.707       3.712
Exp([pH-value])   0.000261  3.1980627E-005  Significant    6.29E-012    0.00019     0.0003

Residual variability              12.21360278
F-statistic                      359.6264707
Multiple correlation R             0.9752305217
Determination coefficient R^2      0.9510745705
Predicted correlation coef. Rp     0.9418693616
Mean error of prediction MEP       0.1836906899
Akaike information criterion    -137.4849057

In the second model we dropped one of the insignificant parameters. The model still fits the data very well.
Model: [Rate] ~ Abs + [pH-value] + [pH-value]^2 + Exp([pH-value])
Its prediction capability outside the interval of measured data has improved. Moreover, all the model parameters are now significant, so we can compute physical constants from the data. Notice the much higher F-statistic, which indicates higher overall significance of the model.

Variable         Estimate   Std dev     Conclusion   P-value      Lower C.I.  Upper C.I.
Abs               1.61910   0.241700    Significant  3.3881E-009   1.13761     2.1005
[pH-value]        0.605230  0.118673    Significant  2.4763E-006   0.36881     0.8416
[pH-value]^2     -0.04453   0.0128498   Significant  0.000877     -0.07013    -0.018
Exp([pH-value])   0.000290  2.356E-005  Significant  0             0.000243    0.00033

Residual variability              12.51634119
F-statistic                      473.622371
Multiple correlation R             0.9746085658
Determination coefficient R^2      0.9498618565
Predicted correlation coef. Rp     0.9417364687
Mean error of prediction MEP       0.1841106266
Akaike information criterion    -137.5506086
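The nested-model comparison above can be reproduced in outline with plain NumPy. The data below are synthetic (the document's data set is not included), so the numbers are illustrative only; the point is that a reduced submodel can never have a smaller residual sum of squares than the full model, yet may be preferable on parsimony criteria such as AIC or MEP:

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares: coefficients, RSS, R^2 and AIC (up to a constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    rss = float(resid @ resid)
    tss = float(((y - y.mean()) ** 2).sum())
    n, k = X.shape
    aic = n * np.log(rss / n) + 2 * k
    return beta, rss, 1.0 - rss / tss, aic

rng = np.random.default_rng(0)
ph = rng.uniform(1.0, 6.0, 80)               # hypothetical pH values
absorb = rng.uniform(0.0, 2.0, 80)           # hypothetical 'Abs' regressor
rate = (1.6 * absorb + 0.6 * ph - 0.045 * ph**2
        + 3e-4 * np.exp(ph) + rng.normal(0, 0.05, 80))

# 5-parameter model (with the Ln term) and the reduced 4-parameter model
X5 = np.column_stack([absorb, ph, ph**2, np.log(ph), np.exp(ph)])
X4 = np.column_stack([absorb, ph, ph**2, np.exp(ph)])
_, rss5, r2_5, aic5 = fit_ols(X5, rate)
_, rss4, r2_4, aic4 = fit_ols(X4, rate)
```

Because X4's columns are a subset of X5's, rss5 <= rss4 always holds; the diagnostics (parameter significance, MEP, AIC) decide which model to keep.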

All possible subsets regression
This method can search up to 8000 regression submodels to select the one that best describes the given data. Criteria for the selection are the F-statistic, the mean error of prediction (MEP), and the Akaike information criterion (AIC).
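The exhaustive-search idea can be sketched in a few lines, here using AIC as the single selection criterion and illustrative regressor names x1..x3 (synthetic data, not the module's implementation):

```python
import itertools
import numpy as np

def best_subset(X, y, names):
    """Fit every non-empty subset of regressors by OLS; return the lowest-AIC one."""
    n = len(y)
    best = None
    for r in range(1, X.shape[1] + 1):
        for cols in itertools.combinations(range(X.shape[1]), r):
            Xs = X[:, cols]
            beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
            rss = float(((y - Xs @ beta) ** 2).sum())
            aic = n * np.log(rss / n) + 2 * r        # AIC up to an additive constant
            if best is None or aic < best[0]:
                best = (aic, [names[c] for c in cols])
    return best

rng = np.random.default_rng(1)
x1, x2, x3 = rng.normal(size=(3, 60))
y = 2.0 * x1 - 1.0 * x3 + rng.normal(0, 0.1, 60)     # x2 carries no signal
X = np.column_stack([x1, x2, x3])
aic, chosen = best_subset(X, y, ["x1", "x2", "x3"])
```

With p candidate regressors there are 2^p - 1 subsets, which is why the module caps the search (8000 submodels corresponds to roughly 13 candidate terms).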
Quantile regression
This method finds the regression quantile curve with a given probability of data lying below it. This is very important, for example, when modelling reliability.
Quantile 15%: Y=1.950+1.961*X-0.121*X^2
Quantile 50%: Y=1.074+2.106*X-0.136*X^2

Quantile 90%: Y=-0.382+2.079*X-0.126*X^2
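For illustration, a regression quantile can be computed as a linear program (the classical Koenker-Bassett formulation). The sketch below uses SciPy on synthetic data and is not the module's own solver:

```python
import numpy as np
from scipy.optimize import linprog

def quantile_fit(X, y, tau):
    """Regression quantile via linear programming:
    minimise tau*sum(u) + (1-tau)*sum(v)  s.t.  X@b + u - v = y,  u, v >= 0,
    with b split as b_pos - b_neg so every LP variable is non-negative."""
    n, p = X.shape
    c = np.concatenate([np.zeros(2 * p), tau * np.ones(n), (1 - tau) * np.ones(n)])
    A_eq = np.hstack([X, -X, np.eye(n), -np.eye(n)])
    res = linprog(c, A_eq=A_eq, b_eq=y, method="highs")
    return res.x[:p] - res.x[p:2 * p]

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 200)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5, 200)
b50 = quantile_fit(X, y, 0.5)    # median (50% quantile) line
b90 = quantile_fit(X, y, 0.9)    # roughly 90% of the data lies below this line
```

The tau = 0.5 case is median (LAD) regression; other tau values tilt the loss so the fitted curve leaves the requested fraction of points below it.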

Robust methods
Robust methods are useful when the data may contain gross errors, bad measurements, etc.

Ordinary Least Squares regression (wrong)


Robust M-estimate regression (correct)
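The OLS-versus-robust contrast above can be reproduced with a minimal Huber M-estimate fitted by iteratively reweighted least squares. This is a sketch on synthetic data with planted outliers, not the module's M-estimate or BIR implementation:

```python
import numpy as np

def huber_irls(X, y, c=1.345, n_iter=50):
    """Huber M-estimate via iteratively reweighted least squares (IRLS).
    Observations whose scaled residual exceeds c are down-weighted."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)      # start from the OLS fit
    for _ in range(n_iter):
        r = y - X @ beta
        s = np.median(np.abs(r)) / 0.6745 + 1e-12     # robust scale estimate (MAD)
        u = np.abs(r) / s
        w = np.where(u <= c, 1.0, c / u)              # Huber weights
        sw = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

rng = np.random.default_rng(3)
x = rng.uniform(0, 10, 100)
y = 1.0 + 2.0 * x + rng.normal(0, 0.3, 100)
y[np.argsort(x)[:10]] += 30.0                         # ten gross outliers at low x
X = np.column_stack([np.ones_like(x), x])
b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)         # OLS is dragged by outliers
b_rob = huber_irls(X, y)                              # M-estimate stays near the truth
```

On this data the OLS slope is pulled far from the true value of 2 by the outliers, while the M-estimate recovers it closely, which is exactly the effect illustrated in the two plots above.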

Rich diagnostic plots and statistics