Science Inventory

Advances in variable selection methods I: Causal selection methods versus stepwise regression and principal component analysis on data of known and unknown functional relationships

Citation:

Ssegane, H., E. W. Tollner, Y. M. MOHAMOUD, T. C. Rasmussen, AND J. F. Dowd. Advances in variable selection methods I: Causal selection methods versus stepwise regression and principal component analysis on data of known and unknown functional relationships. JOURNAL OF HYDROLOGY. Elsevier Science Ltd, New York, NY, 438-439(17):16-25, (2012).

Impact/Purpose:

see description

Description:

Hydrological predictions at a watershed scale are commonly based on extrapolation and upscaling of hydrological behavior at plot and hillslope scales. Yet dominant hydrological drivers at a hillslope may not be as dominant at the watershed scale because of the heterogeneity of watershed characteristics. With the availability of quantifiable watershed data (watershed descriptors and streamflow indices), variable selection can provide insight into the dominant watershed descriptors that drive different streamflow regimes. Stepwise regression and principal component analysis have long been used to select descriptive variables for relating runoff to climate and watershed descriptors, but questions have remained regarding the robustness of the selected descriptors. This paper evaluates five new approaches: Grow-Shrink (GS); a variant of Incremental Association Markov Boundary (interIAMBnPC); Local Causal Discovery (LCD2); HITON Markov Blanket (HITON-MB); and First-Order Utility (FOU). We demonstrate their performance by quantifying their accuracy, consistency, and predictive potential compared to stepwise regression and principal component analysis on two known functional relationships. The results show that the variables selected by HITON-MB and FOU are the most accurate, while variables selected by stepwise regression, although not accurate, have a high predictive potential. Therefore, a model with high predictive power may not necessarily represent the underlying hydrological processes of a watershed system.
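
For readers unfamiliar with the baseline method, the following is a minimal sketch of forward stepwise selection on a known functional relationship. It is not the procedure or data used in the paper: the synthetic predictors, the response y = 2*x0 - 3*x1 + noise, the proxy variable x2, the function names, and the 5% stopping threshold (min_gain) are all assumptions chosen for illustration.

```python
import numpy as np


def rss(X, y, cols):
    """Residual sum of squares of an OLS fit on the given columns plus an intercept."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)


def forward_stepwise(X, y, min_gain=0.05):
    """Greedy forward selection.

    At each step, add the column that lowers the RSS the most. Stop when the
    best candidate reduces the current RSS by less than `min_gain` (a
    hypothetical threshold, expressed as a fraction of the current RSS).
    """
    selected = []
    remaining = list(range(X.shape[1]))
    current = rss(X, y, selected)  # intercept-only model
    while remaining:
        scores = {c: rss(X, y, selected + [c]) for c in remaining}
        best = min(scores, key=scores.get)
        if current - scores[best] < min_gain * current:
            break
        selected.append(best)
        remaining.remove(best)
        current = scores[best]
    return selected


# Synthetic data with a known relationship (assumed for illustration only).
rng = np.random.default_rng(0)
n = 500
x0 = rng.normal(size=n)                    # true driver
x1 = rng.normal(size=n)                    # true driver
x2 = x0 + 0.5 * rng.normal(size=n)         # proxy correlated with x0, not a driver
x3 = rng.normal(size=n)                    # irrelevant
x4 = rng.normal(size=n)                    # irrelevant
X = np.column_stack([x0, x1, x2, x3, x4])
y = 2.0 * x0 - 3.0 * x1 + 0.1 * rng.normal(size=n)

selected = forward_stepwise(X, y)
print("selected columns:", selected)
```

Comparing the selected columns against the known drivers (columns 0 and 1 here) gives a simple accuracy check, separate from how well the fitted model predicts y. With noisier data or stronger proxies, a greedy search of this kind can retain a correlated stand-in such as x2 and still predict well, which is the distinction between predictive potential and accuracy that the abstract draws.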

Record Details:

Record Type: DOCUMENT (JOURNAL/PEER REVIEWED JOURNAL)
Product Published Date: 05/01/2012
Record Last Revised: 09/30/2013
OMB Category: Other
Record ID: 232675