# Grey Box

The main objective of the thesis is to combine these two different approaches. This is done by formulating the model so that it retains the significant dynamic behaviour of the physically based model while still allowing statistical methods and tools to be used for estimating the model parameters and assessing the uncertainties in the model structure. Hydrological systems are usually highly complex: because of the law of conservation, many interconnected processes are needed to describe the evolution of the system in both time and space. Hydrological processes are physical phenomena whose dynamics are often best formulated in continuous time, using ordinary or partial differential equations. For a deterministic hydrological model to describe these dynamics well, a thorough understanding of the system and all influential subprocesses is required, which calls for a rather detailed description of all the processes involved. Consequently, the number of parameters that need to be estimated is typically large. Models built on such a purely deterministic formulation are often referred to as white box models.

However, in real-world applications the ideal hydrological model, in which the system and all subprocesses are well described, does not exist. All hydrological models are only approximations of the true process. A model is considered to describe the system behaviour adequately when the residual series, i.e. the difference between the model predictions and the measurements, is minimised and behaves like a series of white noise terms. If the deterministic model does not include all the necessary influential factors, the residual series will show a systematic pattern and the model output will depart from the measured output. The model then becomes unfit for its purpose. Detailed white box hydrological models are usually applied for simulation purposes, where the objective is to determine the long-term response of the system to predefined input sequences.
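The white noise criterion above can be checked numerically. The following is a minimal sketch, not a method taken from the thesis: it computes the sample autocorrelation of a residual series and reports the fraction of lags that fall outside the approximate 95% band of ±2/√N. The function names and the default of 20 lags are illustrative assumptions.

```python
import numpy as np


def sample_autocorrelation(residuals, max_lag):
    """Sample autocorrelation of a residual series for lags 1..max_lag."""
    r = np.asarray(residuals, dtype=float)
    r = r - r.mean()                      # remove the mean before correlating
    denom = np.dot(r, r)                  # lag-0 sum of squares
    return np.array(
        [np.dot(r[:-k], r[k:]) / denom for k in range(1, max_lag + 1)]
    )


def fraction_outside_band(residuals, max_lag=20):
    """Fraction of lags whose autocorrelation exceeds the +/- 2/sqrt(N) band.

    For white noise roughly 5% of the lags are expected outside the band;
    a much larger fraction suggests a systematic pattern in the residuals.
    """
    n = len(residuals)
    acf = sample_autocorrelation(residuals, max_lag)
    bound = 2.0 / np.sqrt(n)
    return float(np.mean(np.abs(acf) > bound))
```

A residual series with a slow periodic component, for example, places essentially every lag outside the band, which signals that the deterministic part of the model is missing an influential factor.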

Statistical models, on the other hand, are attractive for short-term predictions, since statistical methods make it possible to use rigorous stochastic dynamical models that provide a measure of the inherent uncertainty of the model predictions. However, statistical models are discrete-time models that do not normally contain any physical knowledge about the system, and the physical parameters are partly hidden in the discrete-time parameterisation. Thus, for long-term predictions of physical phenomena that are based solely on statistical models, the model output is not adjusted towards the physical drift embedded in the physical knowledge of the system. Because the model structure lacks physics and is formulated only from input-output data and statistical methods, statistical models are called black box models.

To maintain the physical interpretation of the model, it is suitable to use a formulation, and to apply an estimation method, in which the parameterisation is kept in continuous time. The model proposed in the thesis is based on the most important physical knowledge of the hydrological process, but includes an additional stochastic term to cope with uncertainties in the model formulation and in the observations. The parameters in the model are physically interpretable and are estimated by applying statistical methods. The approach is called grey box modelling, since the basic model structure is inherited from white box models, usually in the form of ordinary differential equations, while the parameter estimation and the uncertainty assessment are obtained using statistical methods. The grey box concept is illustrated in the accompanying diagram, which shows the contributions from the white box and the black box approaches.
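To make the idea of an ordinary differential equation extended with a stochastic term more concrete, the sketch below simulates a single linear reservoir, dS = (u - k·S) dt + σ dW, with noisy observations of the outflow k·S. This is an illustrative assumption and not the model used in the thesis: the reservoir structure, the parameter names (`k`, `sigma`, `obs_sd`) and the Euler-Maruyama discretisation are all chosen here for the example only.

```python
import numpy as np


def simulate_grey_box_reservoir(inflow, k, sigma, obs_sd, dt=1.0, s0=0.0, seed=0):
    """Euler-Maruyama simulation of a hypothetical linear reservoir.

    State equation:       dS = (u - k*S) dt + sigma dW
    Observation equation: y  = k*S + e,  e ~ N(0, obs_sd^2)

    Returns the storage path (length len(inflow) + 1) and the observed
    outflow (length len(inflow)).
    """
    rng = np.random.default_rng(seed)
    inflow = np.asarray(inflow, dtype=float)
    storage = np.empty(len(inflow) + 1)
    storage[0] = s0
    for i, u in enumerate(inflow):
        drift = (u - k * storage[i]) * dt                # physical (white box) part
        diffusion = sigma * np.sqrt(dt) * rng.normal()   # stochastic term
        storage[i + 1] = storage[i] + drift + diffusion
    # Noisy measurements of the outflow at the end of each step.
    observed = k * storage[1:] + obs_sd * rng.normal(size=len(inflow))
    return storage, observed
```

Setting `sigma` and `obs_sd` to zero recovers the deterministic white box model, while `k` keeps its physical meaning as an outflow rate in both cases. Note that additive noise of this kind allows the simulated storage to become negative, which is one reason the formulation of the uncertainty deserves attention.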

In the proposed grey box approach, the recommendation is to adopt a simple model to describe the system dynamics. The reason is the law of parsimony: the simplest adequate model is preferred, so that the model and its parameters are identifiable from data. Such a simple model can then be extended based on the estimation results, where the parameter estimates and their variances play a central role in identifying lack of fit in the system formulation; the parameter estimation also provides estimates of the uncertainty of the parameters. One way to improve the model structure is therefore to address the lack of fit in the deterministic part of the model, where extensions call for more deterministic equations in order to remove the unwanted uncertainty. Alternatively, the uncertainty itself can be formulated in accordance with the knowledge about the hydrological system. The former method has been applied in many fields, e.g. chemical engineering (Kristensen et al., 2004), dynamic models for air temperature (Soegaard, 1993) and heat dynamics of buildings (Bacher and Madsen, 2011). The latter approach, however, has not received much attention until recently (Moeller et al., 2010a,b; Philipsen et al., 2010). This thesis is dedicated to the uncertainty part of the grey box model, where simple models are adopted to represent the dynamic behaviour of the hydrological system, and the uncertainty is formulated to obtain reasonable prediction intervals for the model output. Thus, the grey box approach provides adequate and operational models for the system. The models are not only physically interpretable but also depend on real-time measurements, and are therefore useful for both short-term and long-term predictions in connection with online control and optimisation.
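One simple way to see how a stochastic term translates into prediction intervals is to propagate an ensemble of simulated paths and read off empirical quantiles at each step. The sketch below does this for a hypothetical linear reservoir, dS = (u - k·S) dt + σ dW. It is an assumption made for illustration only: the thesis does not state that intervals are computed this way, and the function name, parameters and Monte Carlo approach are all chosen here.

```python
import numpy as np


def ensemble_prediction_interval(s0, inflow, k, sigma, dt=1.0,
                                 n_paths=2000, coverage=0.95, seed=0):
    """Monte Carlo prediction interval for the storage of a linear reservoir.

    Every path follows an Euler-Maruyama step of dS = (u - k*S) dt + sigma dW.
    Returns arrays (lower, upper) with one entry per inflow step.
    """
    rng = np.random.default_rng(seed)
    inflow = np.asarray(inflow, dtype=float)
    storage = np.full(n_paths, float(s0))   # all paths start from the same state
    lower = np.empty(len(inflow))
    upper = np.empty(len(inflow))
    tail = (1.0 - coverage) / 2.0
    for i, u in enumerate(inflow):
        noise = sigma * np.sqrt(dt) * rng.normal(size=n_paths)
        storage = storage + (u - k * storage) * dt + noise
        # Empirical quantiles across the ensemble give the interval bounds.
        lower[i], upper[i] = np.quantile(storage, [tail, 1.0 - tail])
    return lower, upper
```

With `sigma = 0` the interval collapses to the deterministic prediction. With a positive `sigma` the interval widens with the prediction horizon until the outflow term balances the accumulated noise.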
