Summary

To reduce operational costs and to ensure security of supply, gas distribution networks require accurate forecasts of the demand for gas. Among domestic and commercial customers, demand relates primarily to the weather and patterns of life and work. Public holidays have a pronounced effect which often spreads into neighbouring days. We call this spread the ‘proximity effect’. Traditionally, the days over which the proximity effect is felt are prespecified in fixed windows around each holiday, allowing no uncertainty in their identification. We are motivated by an application to modelling daily gas demand in two large British regions. We introduce a novel model which does not fix the days on which the proximity effect is felt. Our approach uses a four-state, non-homogeneous hidden Markov model, with cyclic dynamics, where the classification of days as public holidays is observed, but the assignment of days as ‘pre-holiday’, ‘post-holiday’ or ‘normal’ days is unknown. The number of days to the preceding and succeeding holidays guide transitions between states. We apply Bayesian inference and illustrate the benefit of our modelling approach. A version of the model is now being used by one of the UK's regional distribution networks.

1 Introduction

The energy sector in the UK is changing. To comply with the UK Climate Change Act 2008, greenhouse gas emissions must be reduced to 80% of their 1990 levels by the year 2050 (Her Majesty's Parliament, 2008). Pragmatically, these targets must be met in a manner which does not compromise the provision of affordable prices to consumers or the competitiveness of UK industry. At present, natural gas provides a comparatively low cost source of energy which, unlike some renewable energy sources such as wind power, can also offer reliability in supply and storability (Chu and Majumdar, 2012). In the future, decarbonization of the gas network is likely to involve connecting carbon neutral biomethane plants to the transmission system or converting the distribution network to transport hydrogen rather than natural gas (Dodds and McDowall, 2013). Therefore, both now and over the years to come, gas has a vital role to play in the energy mix and it is more important than ever that the gas distribution network operates as efficiently as possible.

National Grid is the sole owner and operator of the gas transmission infrastructure in the UK. Gas in the national transmission system leaves the network at high pressure at 49 points across the country. After being odorized for safety it is transported, ultimately at lower pressure, through eight regional distribution networks to individual customers. National Grid works closely with the regional distribution networks to ensure that the local supply of gas meets the demand at all times.

Demand forecasts, over a range of horizons, are required by both National Grid and the distribution companies for reasons including safety and security of supply and investment and operational planning (National Grid, 2016). The method that is used by distribution companies involves forecasts A^u,j for the annual demand Au, j in year u from a subset j of customers, j = 1,…, J. These are supported by a report, produced by an external company, taking into account economic and other external factors. Forecasts for demand Y~t,j for subset j on day t are then rescaled to match this annual forecast; see National Grid (2016). Thus what is required is inference about a set of scale factors exp (kt, j) for the days of the year so that the forecast mean of Y~t,j, for example, will be E{exp(kt,j)}A^u,j, since the scale factors may reasonably be treated as independent of the annual demand level.

In this paper we describe a model for demand using the transformation Yt,j=ln(Y~t,j)=cu(t),j+kt,j+et,j where u(t) is the year in which day t falls. Here et, j represents random fluctuation and the value of the additive constant cu(t), j can be adjusted in forecasts to match A^u(t),j, without affecting the log-scale-factors kt, j. It is these factors, incorporating weather, seasonal and calendar effects, about which inference is required.

There is a large body of work in the scientific literature concerned with modelling and forecasting the demand for natural gas; see, for example, Soldo (2012) for an extensive review. In broad terms, consumers of gas can be divided into three types: residential, commercial and industrial. We focus on the first two of these groups in this paper. Although economic factors like gas prices and national income are important drivers of gas consumption by big industrial users, their effects are generally less important for residential and commercial customers for whom gas is predominantly used for heating (including water) and cooking. As a result, the demand by these groups is strongly related to the weather and patterns of life and work. Models for residential and commercial gas consumption therefore generally allow for weather-related predictors and seasonal and calendar effects. Sometimes models also incorporate interactions between the two to allow the effect of the weather and, in particular, temperature to vary with periodic changes in fixed or timed heating schedules. After modelling the weather, seasonal and calendar effects, it is common for remaining errors to exhibit considerable auto-correlation. This is generally modelled directly, for instance by assuming auto-regressive moving average models for the residuals (Lyness, 1984; Akouemo and Povinelli, 2016).

Weather-related predictors in models for gas consumption generally include primitive variables, like temperature and wind speed, or derived variables, constructed to have a simple relationship with demand. In the literature, considerable attention has been devoted to modelling the non-linear effect of temperature which manifests at low and high temperatures when, for example, many customers switch their heating on or off. In some cases this is achieved by fitting non-linear regression models (for example see Brabec et al. (2009) and Gascón and Sánchez-Úbeda (2018)) whereas, in others, the effect is captured through bespoke, derived variables such as ‘heating degree days’ (e.g. Akouemo and Povinelli (2016)). The latter measure the temperature difference from some fixed threshold but adopt 0-values when the temperature exceeds that threshold, so that the temperature has no effect when heating is not needed.

Seasonal and calendar effects often include a smoothed effect for the day of the year (e.g. Brabec et al. (2009) and Gascón and Sánchez-Úbeda (2018)) and fixed effects to represent the day of the week. These are needed because gas demand follows a weekly cycle, with clear differences between weekdays and weekends (Lyness, 1984). Similarly, because demand is affected markedly by public holidays, models usually include a holiday factor (Brabec et al., 2015) or treat public holidays like days of the weekend (e.g. Charlton and Singleton (2014)). In addition to effects on the days of the holidays themselves, models for energy consumption sometimes include a protracted effect which extends into neighbouring days. This allows for the influence of changes in cooking and travel arrangements, and for the commercial and industrial slow-down that typically occurs around holidays. For example, in their model for gas consumption in the Czech Republic, Brabec et al. (2009) included Christmas and Easter effects which apply over a fixed range of consecutive days around Christmas Day and Good Friday. Taking a different approach, Brabec et al. (2010) defined a ‘daytype’ factor with five levels corresponding to different classifications of the current, previous and following days as working days or otherwise. This allows public holidays and neighbouring days to have different effects depending on where they fall in the week. In their models for electricity consumption, Hor et al. (2006) and Pardo et al. (2002) allowed demand to differ on proximity days within a short fixed window around each holiday.

Some public holidays, particularly those around Christmas and often Easter, occur during periods where the demand for gas is at its highest. At such times, it is essential to have accurate forecasts and a correct quantification of uncertainty. Yet, in terms of allowing for a proximity effect, an aspect that is common to all of the approaches above is that the dates of the proximity days are fixed, often arbitrarily.

In this paper, we are motivated by an application involving daily gas consumption in the two large geographical areas in northern England served by the regional operator Northern Gas Networks (NGN). We propose a novel model for daily gas demand which, to our knowledge, is the first approach which does not fix the days on which the proximity effect of a public holiday is felt. The results of a preliminary version of this model are already being used by the company in its annual medium-term forecasting exercise. Our approach is based on a four-state, non-homogeneous hidden Markov model with cyclic dynamics. In this model the classification of days as public holidays is observed, but the assignment of days as ‘pre-holiday’, ‘post-holiday’ or ‘normal’ days is unknown but guided by the number of days from the preceding and to succeeding public holidays. We allow for auto- and cross-correlation in the bivariate time series by modelling the logarithm of gas demand in the two regions, conditionally on the states, using a bivariate auto-regression of order 1, with a symmetric auto-regressive coefficient matrix, that allows stationarity to be imposed through simple constraints on the parameter space. Taking a Bayesian approach to inference, we use a hierarchical structure to encapsulate structural prior information about similarities between the two regions. In our application, we illustrate the benefit of allowing for a proximity effect of unknown magnitude and duration.

The remainder of this paper is structured as follows. The data for our analysis are described in Section 2. In Section 3 we propose our model and describe it in detail. Section 4 gives the structure of our prior distribution and Section 5 discusses computation of the posterior distribution. We give the results of applying our model and inferential procedures in Section 6 and draw some conclusions in Section 7.

2 The data

NGN, one of the eight regional distribution networks in the UK, is responsible for gas distribution to 2.7 million homes and businesses across a 25000-km2 region in the north-east of England, northern Cumbria and Yorkshire.

In the UK, medium- and long-term demand forecasts are produced for each of 13 local distribution zones (LDZs). NGN is responsible for generating the demand forecasts for the neighbouring Northern, NO, and North-East, NE, LDZs. The former covers northern Cumbria and the north-east of England and the latter encompasses much of Yorkshire. Within each LDZ, every point at which gas is taken from the network by supply pipe to a consumer is categorized according to its load band, which is based on how much gas it uses. Depending on whether or not meters are read daily, load bands are classified as daily metered or non-daily metered (NDM). Daily metered load bands comprise large industrial premises which typically have the highest demand for gas. NDM load bands are subdivided into four categories, roughly containing domestic, commercial and small and medium-sized industrial customers. In this paper we focus on load bands 1–3, the definitions of which are shown in Table 1.

Table 1

Classification of NDM load bands 1–3

IndexLoad bandExample
(MWh year−1)
10–73A single house
273–732A large block of flats or commercial premises
3732–5860Small industrial premises
IndexLoad bandExample
(MWh year−1)
10–73A single house
273–732A large block of flats or commercial premises
3732–5860Small industrial premises
Table 1

Classification of NDM load bands 1–3

IndexLoad bandExample
(MWh year−1)
10–73A single house
273–732A large block of flats or commercial premises
3732–5860Small industrial premises
IndexLoad bandExample
(MWh year−1)
10–73A single house
273–732A large block of flats or commercial premises
3732–5860Small industrial premises

For both LDZs and each of the three NDM load bands, daily gas consumption data are available for the 9-year period from January 1st, 2008, to February 18th, 2017. The accompanying weather data take the form of a derived variable called the composite weather variable (CWV), defined by National Grid, which takes a single daily value for each LDZ. The CWV is based primarily on air temperature. However, to strengthen the linear association between this temperature-based variable and gas consumption, its construction accommodates adjustments at high and low temperatures and allowance for the effects of other weather variables, such as wind speed. Further details can be found in National Grid (2016).

3 A non-homogeneous hidden Markov model

Denote by Y~t,j the gas demand in tenths of a gigawatt-hour pertaining to a given NDM load band on day t in region j, where j = 1 and j = 2 correspond to the Northern NO and North-East NE LDZs respectively. Rather than modelling the raw demand, we instead work on a log-scale, defining Yt,j=ln(Y~t,j) and Yt = (Yt,1, Yt,2) for t = 1,…, T. This helps to make the variance more stable across seasons and gives fixed effects in our additive model a multiplicative effect on the original scale. The CWV on day t is denoted by wt = (wt,1, wt,2). We introduce further explanatory variables ntN0={0,1,2,} and ptN0 which indicate the number of days to the next and since the previous public holiday respectively. Clearly day t is a public holiday, known colloquially in England as a bank holiday, if and only if nt=pt=0. Finally, we introduce the categorical explanatory variable rt{1,2,3} which indicates the ‘type’ of the nearest public holiday according to the calendar season in which it falls. The labelling is given in Table 2. Note that if a holiday, such as Christmas Day, falls at the weekend, it is a neighbouring weekday that will be observed as the corresponding bank holiday.

Table 2

Categorization of UK public holidays

TypeNameDays
1‘Easter’Good Friday and Easter Monday
2‘Other’May day, spring bank holiday, summer bank holiday and one-off holidays such as the Queen's Jubilee
3‘Christmas’Christmas Day, Boxing Day and New Year's Day
TypeNameDays
1‘Easter’Good Friday and Easter Monday
2‘Other’May day, spring bank holiday, summer bank holiday and one-off holidays such as the Queen's Jubilee
3‘Christmas’Christmas Day, Boxing Day and New Year's Day

If day t is equidistant between two holidays of different types, the type rt of the later holiday is used.

Table 2

Categorization of UK public holidays

TypeNameDays
1‘Easter’Good Friday and Easter Monday
2‘Other’May day, spring bank holiday, summer bank holiday and one-off holidays such as the Queen's Jubilee
3‘Christmas’Christmas Day, Boxing Day and New Year's Day
TypeNameDays
1‘Easter’Good Friday and Easter Monday
2‘Other’May day, spring bank holiday, summer bank holiday and one-off holidays such as the Queen's Jubilee
3‘Christmas’Christmas Day, Boxing Day and New Year's Day

If day t is equidistant between two holidays of different types, the type rt of the later holiday is used.

As discussed in Section 1, public holidays often have a protracted effect on gas consumption, which extends into neighbouring days. The simplest way of modelling this effect would be to classify days within a fixed window around each public holiday deterministically as proximity days. However, this approach is inflexible, relying on an arbitrary decision about the existence, size and position of the windows. Instead, we allow the number of proximity days on either side of each public holiday to be unknown. This is achieved by introducing a discrete-valued stochastic process {St:t = 0,1,…, T} with StSs={1,2,3,4} in which state 2 can apply to public holidays only and is observable, whereas states 1, 3 and 4 are ‘hidden’ and cannot be observed. Of these, states 1 and 3 allow for pre-holiday and post-holiday proximity days, whereas state 4 is a baseline for normal days. To support this we define a distribution over the states so that state 1 can be entered only from state 4 and left via state 2, whereas state 3 can be entered only from state 2.

The directed acyclic graph in Fig. 1 illustrates the assumptions of conditional independence in our hidden Markov model. A non-homogeneous Markov chain for the states will be described in Section 3.1, whereas the model for gas demand, conditional on the states, will be discussed in Section 3.2.

Directed acyclic graph illustrating the factorization of the joint density of the observations and the hidden states conditional on the time series of explanatory variables (nt,pt,rt,wt), for t = 1,2,…: in a directed acyclic graph random variables are represented by nodes; these are connected by directed arrows which indicate the order of conditioning when factorizing their joint probability density
Fig. 1

Directed acyclic graph illustrating the factorization of the joint density of the observations and the hidden states conditional on the time series of explanatory variables (nt,pt,rt,wt), for t = 1,2,…: in a directed acyclic graph random variables are represented by nodes; these are connected by directed arrows which indicate the order of conditioning when factorizing their joint probability density

3.1 Modelling the states

The challenge in modelling the joint distribution of the states arises because state 2, representing public holidays, is observable and the dates of public holidays are always known in advance. In the hidden Markov model framework, we can build this information into our prior distribution for the states St in two ways. The more direct approach would be for the states to evolve according to a non-homogeneous Markov chain with transition probabilities that vary over time. Transition into state 2 can occur with certainty if day t is a public holiday, and transition into the pre-holiday state (state 1) can become more likely in the days leading to a holiday. Alternatively we can specify the joint prior distribution for the states by updating an ‘initial’ prior, representing a homogeneous Markov chain, by conditioning on the observation that St = 2 if day t is a public holiday or St ≠ 2 otherwise, for t = 0,…, T. Of course, the resulting process is no longer Markovian.

We choose to adopt the more direct approach, based on a non-homogeneous Markov chain, and model the transition probabilities as functions of the number of days to the next (nt), and since the previous (pt), public holiday. This approach allows a flexible, non-geometric distribution for the pre- and post-holiday state sojourn times which could, for example, rapidly decay after 2 days, in keeping with views from the literature (see Section 1) and the expert judgement of engineers at NGN (see Section 3.1.2). Moreover, use of covariates nt and pt in the non-homogeneous model enables straightforward experimentation with the joint prior induced for the states, which aids in constructing a distribution with a sensible concentration of prior mass (see Section 6.1).

3.1.1 Public holidays

Suppose that the states St follow a first-order, non-homogeneous Markov chain with
for t = 1,2,…, T, and initial distribution
Since state 2 is observable, it follows that, if day 0 is a public holiday, then n0 = p0 = 0 and l2(0,0) = 1. Similarly, for any jSs, if day t is a public holiday, then nt=pt=0 and λj,2(0,0) = 1, as depicted in Table 3.
Table 3

Transition matrix for the state process when day t is (a) not a public holiday and (b) a public holiday

St − 1St
1234
(a) Not public holiday
11000
200λ2, 3(·)1 − λ2, 3(·)
3001 − λ3, 4(·)λ3, 4(·)
4λ4, 1(·)001 − λ4, 1(·)
(b) Public holiday
10100
20100
30100
40100
St − 1St
1234
(a) Not public holiday
11000
200λ2, 3(·)1 − λ2, 3(·)
3001 − λ3, 4(·)λ3, 4(·)
4λ4, 1(·)001 − λ4, 1(·)
(b) Public holiday
10100
20100
30100
40100
Table 3

Transition matrix for the state process when day t is (a) not a public holiday and (b) a public holiday

St − 1St
1234
(a) Not public holiday
11000
200λ2, 3(·)1 − λ2, 3(·)
3001 − λ3, 4(·)λ3, 4(·)
4λ4, 1(·)001 − λ4, 1(·)
(b) Public holiday
10100
20100
30100
40100
St − 1St
1234
(a) Not public holiday
11000
200λ2, 3(·)1 − λ2, 3(·)
3001 − λ3, 4(·)λ3, 4(·)
4λ4, 1(·)001 − λ4, 1(·)
(b) Public holiday
10100
20100
30100
40100

3.1.2 Other days

In the remainder of this section we consider the initial distribution Pr(S0=k|n0,p0) and transition probabilities Pr(St=k|St1=j,nt,pt) characterizing the state process if day t is not a public holiday. In such cases (nt,pt)N02\{(0,0)} and state 2 cannot be assigned. Dropping the t-subscript for brevity, it follows that l2(n, p) = 0 and λj,2(n, p) = 0 for all jSs. To complete our initial distribution we assign lk(n,p)=13 for kSs\{2}. Thereafter, we impose cyclic dynamics on the evolution of the state by assuming that all the remaining transition probabilities λj, k(n, p) for n > 0, where jSs and kSs\{2}, are 0 except the transitions from the normal state to the pre-holiday state, λ4, 1(n, p), transitions from the holiday state to the post-holiday or normal states, λ2, 3(n, p) and λ2, 4(n, p), transitions from the post-holiday state to the normal state, λ3, 4(n, p), and the self-transitions λ1, 1(n, p), λ3, 3(n, p) and λ4, 4(n, p). Necessarily this implies that λ2, 4(n, p) = 1 − λ2, 3(n, p), λ3, 3(n, p) = 1 − λ3, 4(n, p), λ4, 4(n, p) = 1 − λ4, 1(n, p) and λ1, 1(n, p) = 1, leaving us to define λ2, 3(n, p), λ3, 4(n, p) and λ4, 1(n, p). The resulting transition matrix is depicted in Table 3. It follows from the formulation of our model that an uninterrupted spell of proximity days between two public holidays will be deemed post-, rather than pre-, holiday days. This is to avoid difficulties in identifying a time of transition from the post- to the pre-holiday state.

Expert judgement from engineers at NGN suggested that a proximity effect, if it was felt, was likely to last for 1 or 2 days on either side of a public holiday. Longer periods of protracted holiday behaviour were thought to be very unlikely. If a pair of public holidays was separated by a small number of ordinary days, transitions into the proximity state, and transitions sustaining it, were expected to be more likely. To build the first of these ideas into our model, we allow the probability of transition into the pre-holiday state, λ4, 1(n, p), to decay quickly to 0 as n becomes large by modelling the logit of the transition probability as
(1)
We expect that ν4, 1, 2 < 0. Here the offset allows the parameter ν4, 1, 1 to be interpreted as the logit probability of transition into the pre-holiday state when the following day is a holiday. Taking the square root of n − 1 helps to prevent the prior variance of λ4, 1(n, p) from becoming too large as n grows. In turn, this prevents an implausible U-shaped prior for any feasible n. Given the values that are taken by n, the division of (n − 1)1/2 by 10 enables the coefficient ν4, 1, 2 to be interpreted on a scale that is comparable with that of a binary predictor, which will be introduced when modelling the other two transition probabilities, λ3, 4(n, p) and λ2, 3(n, p).
For the probability of transition out of the post-holiday state, one could assume that λ3, 4(n, p) = λ3, 4 for all (n,p)N02\{(0,0)} and appeal to the property of (homogeneous) first-order Markov chains that the sojourn time in state 3 would have a geometric distribution with mean λ3,41. However, a geometric distribution of the sojourn time in the post-holiday state, having only one parameter, is not sufficiently flexible to be reconciled with the expert judgement of NGN engineers. We therefore use the information that is encoded in the explanatory variable p, the number of days since the previous holiday, and model the logit of the transition probability as
(2)
in which I(n=k) takes the value 1 if n = k and 0 otherwise. Clearly the transition probability λ3, 4(n, p) will approach 1 as p becomes large if ν3, 4, 2 > 0. The offset in the square-root term enables the parameter ν3, 4, 1 to be interpreted as the logit probability of transition out of the post-holiday state after a stay of 1 day. In the UK, there is at least one weekend per year—the Easter weekend—with public holidays on the neighbouring Friday and Monday. We include the indicator term to allow the transition probability to differ over such weekends, in accordance with the views of industry experts at NGN. An analogous term is included in the logit probability of transition into the post-holiday state, which we model as
(3)

We have used a logit link function. As with any choice of a functional form in a regression model, there are other possibilities. However, in our application we found that the posterior means of the transition probabilities tended either to be very small or to lie in a fairly central range, roughly between 0.25 and 0.65, over which the probit link function, for example, can match the logit function closely and even an asymmetric link, such as the complementary log–log-function, can match quite closely. The greatest discrepancy between these three functions would then occur in the case of the complementary log–log-function deviating from the other two for larger probabilities, which tended not to occur. Nevertheless, there would be scope for future examination of this choice, given sufficient data.

3.2 Modelling the conditional demand for gas

Analysis involving an earlier version of the model, where observations were assumed to be conditionally independent given the states, revealed substantial auto-correlation between residuals, causing difficulties in identifying the three unknown states. Tunnicliffe-Wilson et al. (2016), section 3.10.4, also noted a carry-over effect in gas demand data. We therefore model the logarithm Yt of the demand for gas, conditionally on the state on the current and previous days, as a first-order vector auto-regression, or VAR(1), with density p(yt|yt1,St1=st1,St=st,nt,pt,rt,wt) for t = 2,…, T and, at time t = 1,
(4)
We constrain our conditional model for gas demand to be conditionally stationary, i.e. we can regard the vectors Yt as being the results of transforming values from a stationary VAR(1) process. The transformation applied on day t depends on the day of the year, day of the week, composite weather variable and state. Given the relatively short timescales of interest, such a stationarity assumption is reasonable and, in fact, non-stationarity of a non-negligible degree seems implausible in this application.
The conditional model for (Yt|Yt1=yt1,St1=st1,St=st) is given by
(5)
for t = 2,…, T, whereas, for the initial distribution of (Y1|S1=s1) in equation (4), we write
(6)
In both cases the dependence on the states comes through both the time-dependent mean μt and the precision matrix Ωt. The terms ε1,…,εT form a sequence of independent bivariate normal random vectors with zero mean, ΨSψ is a 2 × 2 real-valued matrix, with elements Ψj, k, constrained to satisfy the stationarity condition of a bivariate AR(1) process, and each Ωt is a 2 × 2 symmetric, positive definite matrix. The stationarity region Sψ is defined as the subset of 2 × 2 matrices with real-valued entries whose eigenvalues are less than 1 in modulus (Tunnicliffe-Wilson et al., 2016). The term V(Ψ,Ω1) is the stationary variance of the mean-centred errors {Ytμt:t = 1,2,…} that would prevail if the precision matrix remained equal to Ω1 at all future times, i.e. the matrix V which solves the equation V=ΨVΨ+Ω11. A solution, which is symmetric and positive definite, is guaranteed by our assumptions on the support of Ψ and Ω1. Specification of a well-behaved prior distribution over the stationarity region Sψ is very challenging because of its complex geometric constraints. Fortunately, the problem simplifies greatly if we assume that Ψ1, 1 = Ψ2, 2 = Ψon and Ψ1, 2 = Ψ2, 1 = Ψoff, in which case the necessary and sufficient conditions for stationarity are that |Ψon+Ψoff|<1 and |Ψon-Ψoff|<1. We therefore adopt this simplification to the model.
The time-dependent mean μt in expressions (5) and (6) includes terms to allow for the influence of the CWV and various seasonal- and calendar-related effects. As we are working on the logarithmic scale, each has a multiplicative effect on the demand for gas. Conditionally on the state, St = st, the mean for LDZ j is given by
(7)
for j = 1,2, where αj provides an intercept. Because of interactions with the temperature and differences in consumer habits, the effect of a public holiday may differ according to whether it falls over the Christmas period, at Easter or over the summer. We therefore allow the three types of public holiday, which were defined earlier in Table 2, to have different effects on μt, j, represented by βj,1, βj,2 and βj,3. The mean depends on the state on day t through the term Bt,j,st which controls whether or not a holiday effect is included. It is defined by
where ρβ,j(0,1) so that the holiday effect βj,rt is scaled down on proximity days (states 1 and 3) by a factor which decays to 0 with increasing separation. Since the other terms in equation (7) do not depend on the state on day t, the mean μt, j in the pre- or post-holiday state is a weighted average of the mean in the normal and holiday states. The weights depend on the number of days separating day t from the next or closest public holiday.

In equation (7), the covariate w~t,j is defined by w~t,j=wt,jmd(t),j, where md(t), j is a smoothed average for the CWV in region j on day t where d(t) ∈ {1,2,…,366}. We include both the mean-centred CWV, w~t,j, and its interaction with the raw CWV, wt, j, to allow the effect of above or below average temperatures to differ according to absolute weather conditions. The structure of this term is supported by the on-line supplementary Fig. S1 which shows the effect of wt, j on the relationship between log(gas demand) and the mean-centred CWV.

The term Δt, j is constructed to give a day of the week effect, whereas Γt, j provides a seasonal component, after allowing for the CWV. Both terms are composed by using Fourier series with
(8)
and
(9)
The six unconstrained parameters δj=(δj,1,1,,δj,2,3)R6 in equation (8) provide fixed effects for each day of the week which, by construction, sum to 0. The advantages of this parameterization are that the six elements of δj are unconstrained and, by treating them as exchangeable a priori, we can induce a prior for the seven fixed effects which is symmetric with respect to the day of the week. This parameterization and prior do not appear to have been used before in the gas demand literature. The Fourier series for the seasonal term in equation (9) is truncated at a modest number Kγ of harmonics beyond which we assume that their contribution is negligible. We note that an alternative representation of Γt, j using cyclic cubic regression splines proved to be less successful at explaining the seasonal variation when using a small number of degrees of freedom. The choice of Kγ will be discussed further in Section 6.
For the precision matrix Ωt in equations (5) and (6) we adopt an approach which we believe is novel in the gas demand literature. We allow Ωt to depend on St and hence to vary over time, with different values for the holiday and normal states when St = 2 and St = 4. Adopting an approach that is similar to that taken for the mean μt, we model the precision matrix on proximity days (states 1 and 3) by interpolating between these two values, with weights that depend on the number of days to or from the neighbouring public holiday. This is most natural when working in R3, rather than the constrained space of 2 × 2 covariance matrices. We therefore reparameterize the precision matrix Ωt in terms of its square-root-free Cholesky decomposition Ωt=TΩtDΩt1TΩt (Pourahmadi, 1999), in which TΩt is a unit lower triangular matrix with (2,1)-element − ϕt and DΩt=diag(τt,11,τt,21). The new parameters have a convenient interpretation in terms of the auto-regression of εt,2 on εt,1, with ϕtR representing the auto-regressive coefficient and τt,2 > 0 the associated conditional precision. The parameter τt,1 > 0 represents the marginal precision of εt,1. For the coefficient, ωt,1 = ϕt, and the log-precisions, ωt,2 = ln (τt,1) and ωt,3 = ln (τt,2), we adopt linear models with, for i = 1,2,3,
(10)
in which ηi is an intercept and θi allows for the effect of a public holiday. Again, we assume that days before and after a holiday have the same effect and define
where ρθ(0,1), which scales down the holiday effect θi on proximity days. The term Kt, i in equation (10) provides a seasonal component. A seasonal component was deemed necessary after analysis with an earlier version of the model, in which Kt, i had been omitted, revealed seasonal variation in the residual variance, which caused confounding between the state allocation and low frequency seasonal change. As with the seasonal component Γt, j in the time-varying mean, we use a truncated Fourier series to represent Kt, i, taking
(11)
The choice of truncation point Kκ will be discussed further in Section 6.

4 Prior distribution

Denoting the unknown parameters of the Markov model for state evolution by Λ and the parameters of the conditional model for demand by Π, we adopt a prior distribution in which Λ and Π are independent. The two independent components of our prior are described in the sections which follow.

4.1 Transition probabilities

The parameters Λ=(ν4, 1, 1, ν4, 1, 2, ν3, 4, 1, ν3, 4, 2, ν3, 4, 3, ν2, 3, 1, ν2, 3, 2) comprise the linear coefficients from the logit probabilities (1)–(3). In the absence of a justification for correlation in either direction in our prior beliefs about the values of these parameters, we adopt a prior with independence between the νj, k, i and take
(12)
The symmetric logit normal distribution, created by assigning a normal N(0, v) distribution to the logit transformation of a probability-valued random variable, becomes bimodal when v > 1. U-shaped priors for the transition probabilities λj, k(n, p) were not consistent with our prior beliefs and so we moderated our choice of prior variances vj, k, i to avoid bimodality. To avoid the complicated likelihood leading to implausible sets of parameter values having undue weight in the posterior, a process of trial and improvement, guided by the interpretation of the νj, k, i, was then used to choose the hyperparameters. For a recent discussion of such issues see, for example, Gelman et al. (2017).

4.2 Parameters of the conditional demand model

For region j = 1,2, let βj = (βj,1,…, βj,3), ζj = (ζj,1, ζj,2), γj=(γj,1,1,,γj,1,Kγ,γj,2,1,,γj,2,Kγ) and, similarly, δj = (δj,1,1,…, δj,2,3). Define β=(β1,β2), with corresponding concatenated vectors for the other region-specific parameters. In the model for the time-varying precision matrix, let η=(η1, η2, η3), θ=(θ1, θ2, θ3), κi=(κi,1,1,,κi,2,Kκ) and κ=(κ1,κ2,κ3). Finally, let ρ=(ρβ,1, ρβ,2, ρθ) for the parameters governing the rate of decay of the holiday effects in the time-varying mean and precision matrix. The model parameters then comprise Π={Ψ,α,β,γ,δ,ρ,ζ,η,θ,κ}. We impose prior independence between these parameter blocks.

As discussed in Section 3.2 we assume that the auto-regressive coefficient matrix Ψ is composed of a common diagonal element Ψon and a common off-diagonal element Ψoff so that the stationarity condition reduces to |Ψon+Ψoff|<1 and |Ψon-Ψoff|<1. Our novel approach makes it easier to impose this constraint. For this, we first define χ1 = Ψon + Ψoff and χ2 = Ψon−Ψoff and then reparameterize Ψ in terms of new parameters ξi=(χi+1)/2(0,1) for i = 1,2. To construct our prior, we make ξ1 and ξ2 independent and give them beta distributions, ξi∼beta(aξ, i, bξ, i).

For the parameters in the time-varying mean μt governing the intercept, influence of the CWV, and the seasonal, day of the week and public holiday effects, we adopt hierarchical priors which allow information to be shared between the NO (j = 1) and NE (j = 2) LDZs. For the intercept we take αj|μαIIDN{μα,(1-rα)vα}, for j = 1,2, with μαN(mα, rγvα), in which the correlation between sites is rα, the marginal variance is vα and the marginal mean is mα. Similarly, independently for the coefficients of the mean-centred CWV (i = 1) and its interaction with the raw value (i = 2), we choose ζj,i|μαIIDN{μζ,i,(1-rζ,i)vζ,i}, for j = 1,2, with μζ, iN(mζ, i, rζ, ivζ, i). For the day of the week effects, independently for m = 1,2 and k = 1,2,3 we choose δj,m,k|μδ,m,kIIDN{μδ,m,k,(1-rδ)vδ}, for j = 1,2, with μδ, m, kN(0, rδvδ). For the seasonal effects, independently for m = 1,2 and k = 1,…, Kγ, we take γj,m,k|μγ,m,kIIDN{μγ,m,k,(1-rγ)vγ,k}, for j = 1,2, with μγ, m, kN(0, rγvγ, k), in which vγ,1vγ,2vγ,Kγ. We note that the assignment of independent, zero-mean normal distributions N(0, vγ, k) to a pair of Fourier coefficients γj,1, k and γj,2, k is equivalent to assigning a uniform distribution to the phase of the kth harmonic and, independently, a Rayleigh distribution to its amplitude, with scale parameter √vγ, k. Our prior therefore conveys the idea that the size of the seasonal harmonics will decay as their frequency increases.

For the public holiday effects, we adopt a prior of the form βj|μβIIDN3{μβ,(1-rβ)Vβ}, For j = 1,2, with μβN3(0, rβVβ). Here Vβ is a 3 × 3 compound symmetric matrix with non-zero off-diagonal elements, allowing positive correlation between the effects of each type of public holiday so that information can be pooled across types. This offers a compromise between an impractical assumption that the βj, i are independent for each site j, which would limit the information that is available for inference, and an inflexible assumption that they are all equal.

The factor ρβ, j acts on proximity days to scale down the holiday effect in the time-varying mean μt, j for LDZ j, j = 1,2. Similarly, the factor ρθ scales down the holiday effect in our reparameterization of the time-varying precision matrix, Ωt, of the errors εt. The factors ρβ,1 and ρβ,2 in the means for the NO and NE LDZs have the same function and we expect them to be very similar. Although we would also expect the ρβ, j to be informative about ρθ, we believe that they will carry less information because they relate to the rate of decay of parameters which play different roles. We therefore construct an asymmetric hierarchical prior by taking ρ~·=logit(ρ·)=ln{ρ·/(1ρ·)}, and then choosing ρ~β,j|ρ~βIIDN{ρ~β,(1-rρ~,1)vρ~} in the means for LDZs j = 1,2 and then ρ~β|μρ~N{μρ~,(1rρ~,2)rρ~,1vρ~} and ρ~θ|μρ~N{μρ~,(1rρ~,2)rρ~,1vρ~} and finally μρ~N(mρ~,rρ~,1rρ~,2vρ~). The marginal prior moments for the logit parameters are E(ρ~β,j)=E(ρ~θ)=mρ~, var(ρ~β,j)=vρ~, corr(ρ~β,j,ρ~θ)=rρ~,1rρ~,2, for j = 1,2, and var(ρ~θ)=rρ~,1vρ~ and corr(ρ~β,1,ρ~β,2)=rρ~,1 where rρ~,1,rρ~,2(0,1). We can therefore choose the correlation between the ρ~β,j to be greater than their correlation with ρ~θ.

Finally, the intercept and holiday effects in our reparameterized time-varying precision matrix are given independent normal distributions, taking ηiN(mη, i, vη, i) and θiN(0, vθ, i) for i = 1,2,3. Similarly, the seasonal effects are given independent normal priors, with κi, m, kN(0, vκ, i, k), m = 1,2, k = 1,…, Kκ, for i = 1,2,3.

Our choices for the hyperparameters in each component of our prior distribution are detailed in Section 6.1.

5 Posterior inference

The posterior distribution of the model parameters follows from Bayes theorem as
in which y represents the complete time series, here y=y1:T, where the general notation xi:j for i<j indicates a sequence (xi, xi+1,…, xj − 1, xj). The term p(y|Π,Λ,n,p,r,w) is the observed data likelihood given by
in which the sum is taken over all possible sequences of states, s=s0:T. It can be calculated efficiently by using a filtering algorithm which computes Pr(St=k,y1:t|Π,Λ,n,p,r,w), kSs, recursively for t = 0,…, T and finally
The lagged dependence of Yt+1 on St in the directed acyclic graph in Fig. 1 complicates recursive algorithms for quantifying posterior uncertainty in the hidden states. However, the model can be reformulated to simplify its conditional independence structure by defining an augmented state on day t by S~t=(St1,St) for t = 1,…, T. The demand Yt is then conditionally independent of the augmented state on the previous day S~t1 given the augmented state on the current day S~t and the previous observation Yt − 1. As some transitions in our original state process have zero probability, there are only 11, rather than 42=16, possible values for the augmented state S~t and we denote this set of values by Ss~. The mapping from Ss×Ss to Ss~ is detailed in Table S1 of the on-line supplementary materials. The transition matrix for the new hidden process is sparse and can be represented in terms of the original transition probabilities λj, k(n, p), as indicated in supplementary Table S2. The initial distribution Pr(S~1|Λ,p0:1,n0:1) can similarly be formed by taking an appropriate product of terms, lj(n0, p0) and λj, k(n1, p1). Filtering algorithms for hidden Markov models of this simple form are standard. See, for example, chapter 2 of MacDonald and Zucchini (1997). Our algorithm is given in section S2.2 of the supplementary materials.

The posterior distribution π(Π,Λ|y,n,p,r,w) cannot be evaluated in closed form and so we build a numerical approximation using a Hamiltonian Monte Carlo (HMC) sampling scheme. We note that the likelihood p(y|Π,Λ,n,p,r,w) is not symmetric with respect to the labels of the hidden states. As a result the state-specific parameters are all identifiable in the posterior, and so methods to force an identified sample, such as those discussed in Stephens (2000), are not required.

5.1 Approximating the posterior for the model parameters

HMC sampling is a special case of the Metropolis algorithm; for example, see Neal (2011) or Betancourt (2017) for an introduction. It relies on the introduction of auxiliary variables that are interpreted as the momentum of a particle whose position in space is represented by the parameter values. This enables efficient proposals to be generated by exploiting Hamiltonian dynamics and modelling the movement of the sampler around the joint posterior of the momentum and position variables as the motion of the particle through an unbounded, frictionless space. We use rstan (Stan Development Team, 2016), the R interface to the Stan software (Carpenter et al., 2017), to implement the HMC algorithm. Stan requires users to write a program in the probabilistic Stan modelling language, the role of which is to provide instructions for computing the logarithm of the kernel of the posterior density function. The Stan software then automatically tunes and runs a Markov chain simulation to sample from the resulting posterior.

5.2 Approximating the posterior for the hidden states

Although the hidden states St are not sampled as part of the HMC scheme, their marginal posteriors can be approximated by Rao–Blackwellization through
(13)
for t = 0,…, T where Π[i] and Λ[i] denote the ith posterior samples of the model parameters Π and Λ for i = 1,…, M. The full sample smoothed probabilities Pr(S~t=k|y,Π,Λ,n,p,r,w) can be computed in a backward recursion, starting at time t = T, using the forward probabilities Pr(S~t=k,y1:t|Π,Λ,n,p,r,w) calculated when computing the observed data likelihood. We can then marginalize over St − 1 to compute the probabilities in our original four-state model Pr(St=k|y,Π,Λ,n,p,r,w), kSs, for use in expression (13). The complete algorithm is given in section S2.3 of the on-line supplementary materials.

6 Application to daily demand data

The daily gas consumption data from NGN were introduced in Section 2. Using the algorithm that was described in Section 5, we fitted our hierarchical model to data from each of the three NDM load bands. When including a large number of Fourier components in the model for mean log-demand, the amplitude (γj,1,k2+γj,2,k2)1/2 of the kth harmonic in expression (9) for each site j was negligible when k > 6 and so we truncated the series at Kγ = 6. Applying similar reasoning, we truncated the Fourier series in expression (11) for the seasonally varying precision at Kκ = 12.

6.1 Prior specification

The structure of the prior distribution for the model parameters was outlined in Section 4. It began with an assumption of independence between the parameters Λ of the marginal model for the hidden states and the parameters Π of the conditional model for gas demand given the states. The moments in the prior for Π were chosen according to the specification in section S3.1 of the on-line supplementary materials and the densities were generally very flat.

Completing our prior for Λ requires choices of means and variances in the normal distributions (12) for the logit coefficients. Our general procedure was described in Section 4.1. By several iterations of calculation and inspection we arrived at the specification in section S3.2 of the supplementary materials which yielded the distribution for the states shown in Fig. 5 later in Section 6.3.2 over a representative year. The pointwise prior mode for the entire state sequence is displayed in supplementary Fig. S2.

6.2 Hamiltonian Monte Carlo implementation

The Stan program representing our hierarchical model is available from

https://academic.oup.com/jrsssa/issue.

For the data from each load band, using the rstan interface to the Stan software, we ran four HMC chains initialized at different starting points for 10000 iterations, half of which were discarded as burn-in. It was convenient to compute the posterior samples for the smoothed probabilities Pr(St=l|y,Π,Λ,n,p,r,w) in expression (13) on line. To reduce the associated storage overheads, we therefore thinned the output to retain every 10th posterior draw. The usual graphical and numerical diagnostics gave no evidence of any lack of convergence and the effective sample size was at least 1565 for every parameter.

6.3 Posterior inference

6.3.1 Parameter inference

Fig. 2 shows the fixed effects βj, k in the mean μt, j (7) for each type of public holiday in each LDZ. As expected, there is evidence that the mean takes larger values on public holidays in load band 1, which represents domestic customers, and smaller values in load bands 2 and 3, which broadly represent commercial and small industrial customers, whose business activity typically stops on holidays. The effect seems to be the largest among industrial customers (load band 3) and smallest among domestic users. In general, the effect of a public holiday is less pronounced during the public holidays around Easter. For each load band, if we compare the effects across the two LDZs, we can see that the holiday effects in load band 1 are nearly identical, which suggests that any differences in load bands 2 and 3 arise due to differences in the mix of customers that they represent. Different types of, for example, commercial, educational and industrial customers may have different patterns of closure. In fact some, such as shops, may remain open on certain public holidays. The mix of such, especially industrial, customers will differ between the two LDZs. Moreover, there may be differences in customary closure patterns between, say, the West Yorkshire conurbation (NE) and Tyneside (NO). It is not surprising, therefore, that we observe differences between LDZs in the holidays effects in load bands 2 and 3, most notably the ‘Easter’ effect in load band 2 and the ‘other’ effect in load band 3. This is exemplified in the on-line supplementary Fig. S3 which plots the marginal posterior densities of the differences, β1, kβ2, k, in the effects of each type, k = 1,2,3, of public holiday for each load band.

Marginal posterior densities for the fixed effects βj, k in the mean μt, j for each type, k = 1,2,3, of public holiday in each LDZ, j = 1,2 for (a) Easter, (b) other and (c) Christmas, and (d) marginal posterior density for the parameter ρβ, j governing the rate of decay of the holiday effect on proximity days in each LDZ, j = 1,2: , load band 1, NO; , load band 1, NE; , load band 2, NO; , load band 2, NE; , load band 3, NO; , load band 3, NE
Fig. 2

Marginal posterior densities for the fixed effects βj, k in the mean μt, j for each type, k = 1,2,3, of public holiday in each LDZ, j = 1,2 for (a) Easter, (b) other and (c) Christmas, and (d) marginal posterior density for the parameter ρβ, j governing the rate of decay of the holiday effect on proximity days in each LDZ, j = 1,2: graphic, load band 1, NO; graphic, load band 1, NE; graphic, load band 2, NO; graphic, load band 2, NE; graphic, load band 3, NO; graphic, load band 3, NE

Fig. 2(d) shows the marginal posterior density for the parameter ρβ, j which governs the rate of decay of the holiday effect in the pre- and post-holiday states. The posteriors for each LDZ in load band 3 seem to support larger values than those for load bands 1 and 2, suggesting a slower rate of decay.

The corresponding plots for the fixed effects θi in the parameters ωt, i (10) of the square-root-free Cholesky decomposition of the precision matrix Ωt are shown in Fig. 3. For all three load bands, there is strong evidence of a positive effect on the auto-regressive coefficient ωt,1 = ϕt, suggesting a stronger positive correlation between the residuals in the two LDZs on public holidays. For the logarithms of the marginal error precision ωt,2 = ln (τt,1) and the conditional error precision ωt,3 = ln (τt,2), there is little evidence of a public holiday effect, except for a negative relationship with the log-conditional-precision in load band 3. This suggests that public holidays are associated with higher residual variance for log(gas demand) by industrial customers.

Marginal posterior densities for the fixed effects θi in the parameters ωt, i of the square-root-free decomposition of the precision matrix Ωt, ωt,1 = ϕt, (a) ωt,3 =  ln (τt,2), the auto-regressive component, (b) ωt,2 =  ln (τt,1) the log-marginal precision, (c) and (d) the parameter ρθ governing the rate of decay of the holiday effect on proximity days: , load band 1; , load band 2; , load band 3
Fig. 3

Marginal posterior densities for the fixed effects θi in the parameters ωt, i of the square-root-free decomposition of the precision matrix Ωt, ωt,1 = ϕt, (a) ωt,3 = ln (τt,2), the auto-regressive component, (b) ωt,2 = ln (τt,1) the log-marginal precision, (c) and (d) the parameter ρθ governing the rate of decay of the holiday effect on proximity days: graphic, load band 1; graphic, load band 2; graphic, load band 3

6.3.2 State inference

For load bands 1–3 respectively, Figs 4(a)–4(c) show the pointwise posterior mode for the state sequence S1,…, ST. Different patterns are evident across the different load bands. In load band 1, there is little evidence of a proximity effect, with the posterior probability of assignment to the pre- and post-holiday states, 1 and 3, being less than 0.5 in the neighbourhood of most public holidays. For load bands 2 and 3, Figs 4(b) and 4(c) provide clear evidence of a proximity effect around the Christmas period and over the Easter weekend, with additional evidence of a post-holiday effect after the late spring and summer public holidays in load band 2. The sojourn times in the proximity states are longest around Christmas, most notably before Christmas Day in load band 3 and after Boxing Day in load band 2.

Pointwise posterior mode for the state sequence St for load bands (a) 1, (b) 2 and (c) 3: , state 1; , state 2; , state 3; , state 4
Fig. 4

Pointwise posterior mode for the state sequence St for load bands (a) 1, (b) 2 and (c) 3: graphic, state 1; graphic, state 2; graphic, state 3; graphic, state 4

Clearly the pointwise posterior mode for the state sequence cannot give any indication of the uncertainty in the state allocation. Therefore, using load band 3 as an example, Fig. 5 shows the posterior distribution for the states St in a representative year (2015), with the prior distribution overlaid. There is a marked difference between the prior and posterior in the period around a public holiday. Although the prior treats all public holidays as if they are (nearly) exchangeable, there is strong evidence of differences between public holidays in the posterior. This shows the advantage of allowing the extent of proximity days to be unknown rather than periods of common, fixed duration around each public holiday. The corresponding plots for load bands 1 and 2 are given in Figs S4 and S5 of the on-line supplementary materials. Whereas load band 2 shows similar patterns to load band 3, there is very little difference between the prior and posterior for load band 1, though the prior does assign slightly more mass to the pre- and post-holiday states in the neighbourhood of public holidays. This suggests that there is little evidence about the presence, or otherwise, of a proximity effect in the data for load band 1.

Prior and posterior for the state sequence St for load band 3 in an illustrative year (2015): , state 1, prior; , state 1, posterior; , state 2, prior; , state 2, posterior; , state 3, prior; , state 3, posterior; , state 4, prior; , state 4, posterior
Fig. 5

Prior and posterior for the state sequence St for load band 3 in an illustrative year (2015): graphic, state 1, prior; graphic, state 1, posterior; graphic, state 2, prior; graphic, state 2, posterior; graphic, state 3, prior; graphic, state 3, posterior; graphic, state 4, prior; graphic, state 4, posterior

6.3.3 Posterior predictive inference

Consider a simplified version of the model that was described in Section 3 which ignores the proximity effect and omits the pre- and post-holiday states (states 1 and 3) so that each day is classified either as a public holiday (state 2) or not (state 4). As discussed in Section 1, many models from the literature are structured in this way, allowing for public holidays but not any protracted effect on neighbouring days. To illustrate the benefit of incorporating the proximity effect, we use the framework of posterior predictive checks (Gelman et al., 2013) to compare inferences under our four-state non-homogeneous Markov model with those computed under the simplified two-state model. In this framework, the basic idea is to measure the extent to which a model captures some data summary of interest by comparing its posterior predictive distribution with the value that was observed. In our case the (approximate) posterior predictive distribution is computed numerically on the basis of a Markov chain Monte Carlo sample from the posterior of the model parameters by simulating replicated data sets in one-to-one correspondence with the posterior draws. If the model can capture adequately the aspect of the data summary of interest, the observed value will look plausible under its posterior predictive distribution.

On public holidays and days which are not in the neighbourhood of a public holiday, both the two-state and the four-state models seem to provide a good fit to the data for all load bands. For instance, in load band 3, for all the days during the observation period which were public holidays, Fig. 6 compares the posterior predictive distribution for log(gas demand) under each model with the values that were observed. Similarly, Fig. S6 in the on-line supplementary materials provides an analogous comparison for days which were 10 days away from a public holiday. In these plots, points lying above the unit diagonal indicate dates for which predicted log-demand exceeds observed log-demand, and conversely for points below the diagonal. Good model fit is indicated by the majority of points lying close to the unit diagonal, suggesting that most observations fall within the main body of their posterior predictive distribution. This is broadly true for both models in each LDZ, with only 1.15–6.67% of observations lying outside the central 95% of their associated posterior predictive distribution. Interestingly, in Fig. 6, most of the observations that are flagged as lying in the (upper) tail of their posterior predictive distribution arise from just two dates: the Royal wedding on April 29th, 2011, and the Diamond Jubilee of Queen Elizabeth II on May 6th, 2012. These were one-off public holidays, rather than annual events, and it appears that they were associated with less of a drop in the demand for gas in load band 3 than in other public holidays occurring around that time of the year. This may have been because the holidays were not observed by all industrial customers. In contrast, in load bands 1 and 2, the demand for natural gas on these public holidays was modelled well.

For load band 3 and (a) LDZ NO and (b) LDZ NE, posterior predictive means versus observed log(gas demand) for each public holiday in the observation period (upper panels refer to the simple two-state model; lower panels to the four-state non-homogeneous Markov model): , , 2.5% and 97.5% points in the posterior predictive distributions; , observation inside the central 95% of the posterior predictive distribution; , observation outside the central 95% of the posterior predictive distribution
Fig. 6

For load band 3 and (a) LDZ NO and (b) LDZ NE, posterior predictive means versus observed log(gas demand) for each public holiday in the observation period (upper panels refer to the simple two-state model; lower panels to the four-state non-homogeneous Markov model): graphic, graphic, 2.5% and 97.5% points in the posterior predictive distributions; graphic, observation inside the central 95% of the posterior predictive distribution; graphic, observation outside the central 95% of the posterior predictive distribution

The advantage of allowing a proximity effect becomes clear when we consider days that are close to a public holiday. For example, Fig. 7 shows the corresponding posterior predictive checks for all the days which were 1 day from a public holiday in load band 3. For each LDZ, it appears that the two-state model systematically overestimates the demand for gas, with 9.63% and 10.67% of observations lying outside (and mostly below) the central 95% of their posterior predictive distributions in the NO and NE LDZs respectively. In contrast, the posterior predictive densities under the four-state model generally support smaller values and are more diffuse, reflecting the additional uncertainty in the demand for gas that typically accompanies the period around public holidays. Correspondingly, the observations appear more plausible under the posterior predictive distributions, with only 5.19% and 4.44% lying outside the central 95% of their posterior predictive distributions in the two LDZs. A similar effect, albeit less marked, is observed for days which are 2 or 3 days from a public holiday. Very similar conclusions can be drawn based on the analysis of the data from load band 2, plots for which are provided in Figs S7–S9 of the on-line supplementary materials. Not surprisingly, there is no notable difference between the two- and four-state models for load band 1, where our analysis suggested little evidence of a proximity effect. Supplementary Table S3 contains a summary of the results that are described in this section for all load bands.

For load band 3 and (a) LDZ NO and (b) LDZ NE, posterior predictive means versus observed log(gas demand) for each day in the observation period which was 1 day from a public holiday (upper panels are based on the simple two-state model; lower panels on the four-state non-homogeneous Markov model): , , 2.5% and 97.5% points in the posterior predictive distributions; , observation inside the central 95% of the posterior predictive distribution; , observation outside the central 95% of the posterior predictive distribution
Fig. 7

For load band 3 and (a) LDZ NO and (b) LDZ NE, posterior predictive means versus observed log(gas demand) for each day in the observation period which was 1 day from a public holiday (upper panels are based on the simple two-state model; lower panels on the four-state non-homogeneous Markov model): graphic, graphic, 2.5% and 97.5% points in the posterior predictive distributions; graphic, observation inside the central 95% of the posterior predictive distribution; graphic, observation outside the central 95% of the posterior predictive distribution

7 Discussion

The energy sector in the UK is changing to harness ecologically more sound technologies in the face of growing economic, environmental, societal and public health concerns. As a comparatively low cost source of energy, natural gas has an important role to play in this changing energy mix. Yet, clearly, its contribution is contingent on efficient operation of the gas industry. One of the ways in which operational efficiency and decision making, more generally, can be enhanced is through improved forecasting of the demand for gas.

We have presented a novel model for daily gas demand which allows for a protracted effect of public holidays which extends into neighbouring days. A key feature of the model is that we allow the existence, duration and location of the proximity days to be unknown. This is achieved by modelling the data by using a four-state non-homogeneous Markov model with cyclic dynamics, and states that represent pre- and post-holiday days, as well as (observable) public holidays and normal days. The dates of public holidays are always known in advance and we allow for this by making our model non-homogeneous, with transition probabilities that depend on the number of days to the next, and since the previous, public holiday. We establish an interpretable parameterization for the logit transition probabilities and illustrate how this can be used to assign a prior distribution for the states which represents a flexible compromise between a model which does not allow for a proximity effect and a model which fixes the dates over which a proximity effect is felt.

Conditionally on the states, we model the natural logarithm of gas demand over two large geographical regions by using a first-order vector auto-regression, which is conditionally stationary. A novel feature of this model is the assumption of a symmetric, though non-diagonal, auto-regressive coefficient matrix which greatly simplifies the geometry of the stationarity region, allowing it to be expressed as a unit square. Other features of our model which are, to the best of our knowledge, novel in the gas demand literature include the symmetric treatment of the day of the week effect and the incorporation of seasonal and holiday effects in the variance of the residuals.

We applied the model to data from NGN, which is responsible for the distribution of gas to most of the north-east of England, northern Cumbria and Yorkshire. An assessment of model fit by using posterior predictive checks highlighted the improvement that is gained by allowing for a proximity effect among commercial and small industrial customers. Similarly, the posterior distribution for the states revealed different patterns in the identification of proximity days between public holidays and across different groups of customers. This highlights the advantage of allowing uncertainty in the identification of proximity days compared with an inflexible approach that treats them as periods of common fixed duration around each public holiday. The results of a preliminary version of the model are already being used by NGN in their annual medium-term forecasting exercise, with plans to extend the approach to consider the demand for gas by larger industrial customers.

Possible enhancements to the model include allowing the transition probabilities to depend, not just on the proximity of the preceding and following holidays, but also on the type of holiday and seasonal or temperature effects. With the limited number of observed holidays in our data, inference for the additional parameters would be difficult but this remains a topic for future research.

Acknowledgements

The authors thank NGN for providing the data and for many helpful discussions and comments during the research project. We also extend our gratitude to Matt Linsley, Manager of the Industrial Statistics Research Unit at Newcastle University, for his important role in organizing and facilitating meetings with NGN. We are also grateful to two reviewers for comments which have led to improvements in the paper.

References

Akouemo
,
H. N.
and
Povinelli
,
R. J.
(
2016
)
Probabilistic anomaly detection in natural gas time series
.
Int. J. Forecast.
,
32
,
948
956
.

Betancourt
,
M.
(
2017
)
A conceptual introduction to Hamiltonian Monte Carlo
. Preprint arXiv:1701.02434. Applied Statistics Center, Columbia University, New York.

Brabec
,
M.
,
Konár
,
O.
,
Malý
,
M.
,
Kasanický
,
I.
and
Pelikán
,
E.
(
2015
)
Statistical models for disaggregation and reaggregation of natural gas consumption data
.
J. Appl. Statist.
,
42
,
921
937
.

Brabec
,
M.
,
Konár
,
O.
,
Malý
,
M.
,
Pelikán
,
E.
and
Vondráček
,
J.
(
2009
)
A statistical model for natural gas standardized load profiles
.
Appl. Statist.
,
58
,
123
139
.

Brabec
,
M.
,
Malý
,
M.
,
Pelikán
,
E.
and
Konár
,
O.
(
2010
) Statistical model of segment-specific relationship between natural gas consumption and temperature in daily and hourly resolution. In
Natural Gas
(ed.
P.
Potocnik
), pp.
393
416
.
London
:
Intech Open
.

Carpenter
,
B.
,
Gelman
,
A.
,
Hoffman
,
M. D.
,
Lee
,
D.
,
Goodrich
,
B.
,
Betancourt
,
M.
,
Brubaker
,
M. A.
,
Guo
,
J.
,
Li
,
P.
and
Riddell
,
A.
(
2017
)
Stan: a probabilistic programming language
.
J. Statist. Softwr.
,
76
,
1
32
.

Charlton
,
N.
and
Singleton
,
C.
(
2014
)
A refined parametric model for short term load forecasting
.
Int. J. Forecast.
,
30
,
364
368
.

Chu
,
S.
and
Majumdar
,
A.
(
2012
)
Opportunities and challenges for a sustainable energy future
.
Nature
,
488
,
294
303
.

Dodds
,
P. E.
and
McDowall
,
W.
(
2013
)
The future of the UK gas network
.
En. Poly
,
60
,
305
316
.

Gascón
,
A.
and
Sánchez-Úbeda
,
E. F.
(
2018
)
Automatic specification of piecewise linear additive models: application to forecasting natural gas demand
.
Statist. Comput.
,
28
,
201
217
.

Gelman
,
A.
,
Carlin
,
J. B.
,
Stern
,
H. S.
,
Dunson
,
D. B.
,
Vehtari
,
A.
and
Rubin
,
D. B.
(
2013
)
Bayesian Data Analysis
, 3rd edn.
Boca Raton
:
Chapman and Hall–CRC
.

Gelman
,
A.
,
Simpson
,
D.
and
Betancourt
,
M.
(
2017
)
The prior can often only be understood in the context of the likelihood
.
Entropy
,
19
, article
555
.

Her Majesty's Parliament
(
2008
)
The Climate Change Act 2008
.
London
:
Stationery Office
.

Hor
,
C.-L.
,
Watson
,
S. J.
and
Majithia
,
S.
(
2006
)
Daily load forecasting and maximum demand estimation using ARIMA and GARCH. 9th Int. Conf. Probabilistic Methods Applied to Power Systems, Stockholm, June 11th–15th.

Lyness
,
F. K.
(
1984
)
Gas demand forecasting
.
Statistician
,
33
,
9
21
.

MacDonald
,
I. L.
and
Zucchini
,
W.
(
1997
)
Hidden Markov and Other Models for Discrete-valued Time Series
.
London
:
Chapman and Hall
.

National Grid
(
2016
)
Gas demand forecasting methodology
. National Grid, Warwick.

Neal
,
R. M.
(
2011
) MCMC using Hamiltonian dynamics. In
Handbook of Markov Chain Monte Carlo
(eds
S.
Brooks
,
A.
Gelman
,
G.
Jones
and
X.-L.
Meng
), pp.
113
162
.
Boca Raton
:
Chapman and Hall–CRC
.

Pardo
,
A.
,
Meneu
,
V.
and
Valor
,
E.
(
2002
)
Temperature and seasonality influences on Spanish electricity load
.
En. Econ.
,
24
,
55
70
.

Pourahmadi
,
M.
(
1999
)
Joint mean-covariance models with applications to longitudinal data: unconstrained parameterisation
.
Biometrika
,
86
,
677
690
.

Soldo
,
B.
(
2012
)
Forecasting natural gas consumption
.
Appl. En.
,
92
,
26
37
.

Stan Development Team
(
2016
)
RStan: the R interface to Stan
. R Package Version 2.12.1.

Stephens
,
M.
(
2000
)
Dealing with label switching in mixture models
.
J. R. Statist. Soc.
B,
62
,
795
809
.

Tunnicliffe-Wilson
,
G.
,
Reale
,
M.
and
Haywood
,
J.
(
2016
)
Models for Dependent Time Series
.
Boca Raton
:
CRC Press
.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)