
An adaptive model for the autonomous monitoring and management of water end use

Abstract

Most pattern classification systems are developed by training on historical data, so their performance relies heavily on the amount of information collected. In many cases, however, data collection is costly, which limits both the efficiency and the widespread implementation of the final model. In this context, the paper presents an advanced universal water management system that can interface with both water consumers and utilities via smartphone and web applications. Originally, Autoflow©, a prototype tool for disaggregating total water consumption into end-use categories, was developed and achieved an accuracy ranging from 74 to 94%. However, a drawback of this model was that it was trained with data collected only in Australia; accuracy reductions would therefore likely be observed when the system is implemented in countries with very different water-using appliances and behaviour patterns. To avoid the costly data collection process for model calibration when operating in new regions, this research study introduces an enhanced model, namely AutoflowU (where U stands for Universal). This new tool can be applied in residential properties globally to autonomously disaggregate water consumption into the seven main water end-use categories, namely shower, toilet, tap, clothes washer, dishwasher, evaporative air cooler and irrigation, without the need to collect new regional end-use data for model calibration. To develop this new tool, Decision Trees, Dynamic Time Warping (DTW), Self Organising Map (SOM) and Hidden Markov Model (HMM) techniques were utilised. Test results obtained from 230 properties in Australia and the US showed that AutoflowU achieved 72–93% accuracy.

Introduction

Urbanised regions in Australia have been facing a series of complex problems involving the supply and demand management of water resources (Sahin et al. 2014a). In the last few years, many urban areas (e.g. Melbourne, Sydney, Southeast Queensland or Adelaide) have endured severe droughts in conjunction with high population growth. In response, water demand management programs and policies were instituted to ensure that current urban water demand could be met and to sustain the water resource for future use. However, the limited understanding of how water was used has markedly reduced the effectiveness of the applied water demand management schemes (Sahin et al. 2014b). Recent advances in sensor technology have allowed for the development of intelligent water management systems that present new approaches to enhancing water security by capturing, analysing and providing real-time water consumption data and by creating an inter-connection between water utilities and customers (Stewart et al. 2013; Beal and Stewart 2011), as well as by changing customers’ water use behaviour (Stewart et al. 2013).

Existing autonomous water end use classification applications

A number of different mathematical models have been proposed to autonomously assign unclassified water consumption patterns to appropriate end use categories. The first-generation approaches (e.g. Trace Wizard and Identiflow), which employed a simple Decision Tree method, were resource intensive and required significant data analysis based on manually populated templates to achieve the disaggregation task (Stewart et al. 2010). The second-generation approach (e.g. HydroSense), which required a sensor network on all water end use appliances, achieved higher accuracy. However, this approach was cost intensive and intrusive, as it requires many sensing devices to be attached to water appliances at the properties (Froehlich et al. 2009, 2011) and can artificially influence water use behaviour. The latest third-generation approach (e.g. Nguyen et al. 2011), which requires only one smart meter installed at the property boundary, helps overcome the deficiencies of the first two. This approach utilises intelligent machine learning algorithms for the end use disaggregation task. Notable studies include Pastor-Jabaloyes et al. (2018), Cardell-Oliver et al. (2016), BuntBrainForEndUses® by Arregui (2015), REU2016 by Vitter and Webber (2016, 2018), SmartH2O by Cominola et al. (2015, 2018) and Autoflow by Nguyen et al. (2015) and Stewart et al. (2018). The Autoflow software uses a combination of pattern recognition and data mining techniques, including HMM, ANN, DTW, histogram analysis and a time-of-day probability function, to automate the end use analysis process. Further, Autoflow©, a software tool, was developed to provide a user-friendly platform to aid this process. The novelty of this application in comparison with former software tools in this area, including Trace Wizard, Identiflow and HydroSense, is presented in Table 1.

Table 1 Residential end use analysis techniques comparative analysis matrix

The next generation of water management system - Autoflow U

The most advanced autonomous water end use classification system, namely Autoflow© (version 2.1), was first developed by Nguyen et al. (2013a, 2013b, 2013c, 2014, 2015). It is able to disaggregate overall water consumption into eight end-use categories (shower, faucet, clothes washer, dishwasher, evaporative air cooler, toilet, irrigation and bathtub) with an average accuracy of 92%. Recently, a more advanced Autoflow© version (v3.1) was developed (Yang et al. 2018), which improved the accuracy to 93.5%. To develop Autoflow©, water consumption data was collected from up to approximately 1000 households in different regions across Australia. Rigorous testing has shown that the system operates effectively in regions whose data was used for model development, but a noticeable accuracy drop has been observed when it is applied to the US and to some remote Australian regions whose data has not previously been presented to the model. Therefore, innovative techniques are required to overcome the dependence on the existing prototype resources. In that context, this study proposes the development of an enhanced model that can be applied to residential dwellings across the world without the need to collect new data for model calibration. The overall water consumption is disaggregated into seven distinct categories, namely shower, tap, irrigation, clothes washer, dishwasher, toilet and evaporative air cooler. The graphical user interface of this new application, AutoflowU, is presented in Fig. 1.

Fig. 1

Smart water monitoring application (AutoflowU)

AutoflowU was developed with the purpose of being a universal tool, which is capable of assigning most major end-use events available in a residential household into appropriate categories without relying on any pre-trained model. The basic differences between Autoflow©, which includes two major versions, v.2.1 and v.3.1 (Yang et al. 2018), and AutoflowU are presented in Table 2.

Table 2 Basic differences between Autoflow© and AutoflowU

Overview of applied techniques

The overall classification process used in AutoflowU is presented in Fig. 2. The process starts with the disaggregation of all mechanised end-uses in the order of clothes washer, dishwasher, evaporative air cooler and toilet using Dynamic Time Warping algorithm. Once all events of these categories have been correctly classified, the last step is to assign the remaining end-uses as shower, irrigation and tap using Decision Tree and Self Organising Map (SOM) methods. The application of these techniques into the overall classification process is described in the following sections.

Fig. 2

Classification process in AutoflowU

Dynamic Time Warping algorithm (DTW)

One of the primary mathematical tools applied in this study is the Dynamic Time Warping (DTW) algorithm, which is widely used for measuring the similarity between two time series of different lengths. In general, this task is performed by finding an optimal alignment between the two series under certain restrictions. The sequences are stretched or compressed in the time dimension to determine a measure of their similarity that is independent of certain non-linear variations in time (Myers and Rabiner 1981). This technique has been widely applied in prototype selection (e.g. Nguyen et al. 2011), pattern recognition (e.g. Myers and Rabiner 1981; Muller 2007; Rabiner and Juang 1993; Sakoe and Chiba 1978; and Marquez 2001) and word image searching (Manmatha and Rath 2002). All technical details and applications of DTW in the development of the Autoflow© system were presented in Nguyen et al. (2015), while the adaptation of DTW to classifying clothes washer, dishwasher, toilet and evaporative air cooler events in the AutoflowU system is presented in the “Classification process for each end use category” section.
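To make the alignment idea concrete, the following is a minimal sketch of the textbook dynamic-programming form of DTW. It is an illustration only and not the Autoflow© or AutoflowU implementation: the function name `dtw_distance`, the absolute-difference local cost and the absence of any warping-window restriction are assumptions made here for brevity.

```python
from math import inf


def dtw_distance(a, b):
    """Accumulated DTW cost between two numeric sequences (hypothetical helper).

    Assumptions: local cost is the absolute difference between samples and
    no warping-window restriction is applied.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments
            cost[i][j] = local + min(cost[i - 1][j],      # a[i-1] matched again
                                     cost[i][j - 1],      # b[j-1] matched again
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]


# Two flow traces (L/min) with the same shape but different durations
short_trace = [0, 8, 8, 8, 0]
long_trace = [0, 8, 8, 8, 8, 8, 0]
print(dtw_distance(short_trace, long_trace))  # 0.0: identical up to time warping
```

Because samples may be matched more than once, a longer event with the same flow profile incurs no extra cost, which is the property that makes DTW suitable for comparing events of different durations.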

Self Organising Map (SOM)

SOM is the main technique applied for the classification of user-dependent categories including shower, irrigation and tap. In general, this technique is commonly used to group patterns having similar characteristics together. In the context of this study, unlike DTW, where the grouping process was based on the analysis of the actual shape pattern of each event, SOM relies on physical features, such as volume, duration and flow rate, to gather similar events. SOM was adopted for this task because, although tap, shower and irrigation categories could possess an infinite number of end-use patterns, there are still significant differences in volume and flow rate between these end uses. A technical overview of this technique and its application to the task of grouping similar patterns together are described in detail below.

There are two different models for the self-organising map: the Willshaw-von der Malsburg model and the Kohonen model. In both models the output neurons are placed in a 2D lattice. They differ in the way the input is given. In the Willshaw-von der Malsburg model, the input is also a 2D lattice with an equal number of neurons, while in the Kohonen model there is no input lattice but an array of input neurons. In the context of this study, the Kohonen model was adopted (Kohonen 1982). Given a set of M water events that need to be grouped into N clusters, the adaptation of SOM to the task of clustering these events is as follows:

  • Step 1: The weights of each cluster, Wj(0), are initialised randomly.

  • Step 2: An input vector from the data set is chosen randomly. In this study, each input vector contains three values describing three physical features of a water event (volume, duration and maximum flow rate) and is denoted by:

    • X = [x1, x2, x3]

    • There are N clusters in the grouping process, and hence N neurons are required in the SOM model. For any neuron, the synaptic weight vector is denoted by:

    • \( {W}_j=\left[{w}_{j1},{w}_{j2},{w}_{j3}\right],\quad j=1,2,\dots ,N \)

  • Step 3: Every cluster is examined to find the one whose weights (Wj) are most similar to the input water event X. Euclidean distance was chosen as the criterion. If the index g(x) identifies the cluster that is closest to the input vector X, then g(x) can be determined as:

    $$ g(x)=\arg \min_{j}\left\Vert X-{W}_j\right\Vert, \quad j=1,2,\dots, N $$
    (1)
  • Step 4: The winning cluster locates the centre of a topological neighbourhood of cooperating clusters. The neighbourhood of this winning cluster, denoted by hj,g(x), is calculated by:

    $$ {h}_{j,g(x)}=\exp \left(-\frac{d_{j,g}^2}{2{\sigma}^2}\right) $$
    (2)
    $$ {d}_{j,g}^2={\left\Vert {r}_j-{r}_g\right\Vert}^2 $$
    (3)

    where rj is the position of the excited cluster j, rg is the position of the winning cluster g, and σ is the width of the neighbourhood function.

  • Step 5: Adjust the weights of the winning cluster and its neighbours, using the following rule:

    $$ {w}_j\left(n+1\right)={w}_j(n)+\eta (n){h}_{j,g(x)}(n)\left(x-{w}_j(n)\right) $$
    (4)

    where η is the learning rate parameter of the algorithm, which should be time-varying and decay towards zero:

    $$ \eta (n)={\eta}_o\exp \left(-\frac{n}{\delta}\right) $$
    (5)

In this study, the initial learning rate ηo and time constant δ are set as 0.1 and 1000 respectively.
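The five steps above can be sketched in a few lines of plain Python. This is a hedged illustration rather than the AutoflowU code: the function names, the arrangement of the N neurons on a one-dimensional lattice (so that rj = j), the neighbourhood width `sigma`, the iteration count and the assumption that features are pre-scaled to [0, 1] are all choices made here. Only η0 = 0.1 and δ = 1000 are taken from the text.

```python
import math
import random


def best_matching_unit(weights, x):
    """Eq. (1): index of the neuron whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda j: math.dist(x, weights[j]))


def train_som(events, n_clusters, n_iter=2000, eta0=0.1, delta=1000.0,
              sigma=1.0, seed=0):
    """Train a small Kohonen map on feature vectors scaled to [0, 1].

    Assumptions (not from the paper): 1-D lattice, fixed sigma, n_iter.
    """
    rng = random.Random(seed)
    dim = len(events[0])
    # Step 1: random initial weights
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_clusters)]
    for n in range(n_iter):
        x = rng.choice(events)                    # Step 2: random input vector
        g = best_matching_unit(weights, x)        # Step 3: winning cluster
        eta = eta0 * math.exp(-n / delta)         # Eq. (5): learning rate
        for j, w in enumerate(weights):
            d2 = (j - g) ** 2                     # Eq. (3) on the 1-D lattice
            h = math.exp(-d2 / (2 * sigma ** 2))  # Eq. (2): neighbourhood
            for k in range(dim):
                w[k] += eta * h * (x[k] - w[k])   # Eq. (4): weight update
    return weights


# Hypothetical, already-normalised events: [volume, duration, max flow rate]
events = [[0.10, 0.10, 0.10], [0.12, 0.08, 0.10],
          [0.90, 0.85, 0.90], [0.88, 0.90, 0.92]]
weights = train_som(events, n_clusters=2)
labels = [best_matching_unit(weights, e) for e in events]
```

After training, each event is assigned to the cluster returned by `best_matching_unit`. Scaling the features beforehand matters in practice, since volume in litres would otherwise dominate the Euclidean distance.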

In addition to DTW and SOM, a clear understanding of physical features of each end-use category plays an important role in deciding the overall accuracy of the developed model. “Overview of basic characteristics of the end-use categories” section presents an overview of all particular characteristics of major end uses available in this study including clothes washer, dishwasher, toilet, evaporative air cooler, tap, shower and irrigation. The typical shape patterns of these end uses are clearly illustrated through the figures presented in (Nguyen et al. 2013c).

Overview of basic characteristics of the end-use categories

Clothes washer and dishwasher

Different clothes washers and dishwashers have different trace characteristics. Some models, which permit adjustment for load size, can exhibit various trace characteristics at different times. The only feature common to all clothes washers is that they operate in cycles, including wash, spin and rinse cycles. However, the number of each type of cycle within an operation is not fixed and depends on the clothes washer model.

Evaporative air conditioner

Several end-use studies (Beal and Stewart 2011; Gan and Redhead 2013) have revealed that there are two types of evaporative air cooler patterns. Type 1 evaporative cooler is the most common one covering a wide range of event durations (from 2 to 20 mins) with a low flow rate of less than 1 L/min. This type of cooler can be identified easily at the beginning of the classification process by searching for any event that possesses these characteristics with an accuracy of up to 95%.
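The Type 1 screening rule described above amounts to a simple threshold test. The sketch below is illustrative only; the function name is hypothetical, and it assumes that the "low flow rate" refers to the maximum flow rate of the event, which the text does not specify.

```python
def is_type1_evap_cooler(duration_min, max_flow_l_per_min):
    """Screen for Type 1 evaporative cooler events.

    Thresholds from the text: duration of 2 to 20 min and flow rate below
    1 L/min. Applying the limit to the *maximum* flow rate is an assumption.
    """
    return 2.0 <= duration_min <= 20.0 and max_flow_l_per_min < 1.0
```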

For the Type 2 evaporative cooler, the classification process is much more complicated, as this end use has similar patterns to clothes washer and dishwasher events and also operates in a cycle sequence. The only characteristic that can help distinguish the Type 2 evaporative cooler from the other mechanised end-use events is the number of cycles in each operation, which is significantly greater than that of a clothes washer or dishwasher.

Toilet

Toilet is one of the mechanised categories and possesses a relatively consistent pattern. The two popular toilet cisterns available on the Australian market are the half-flush and the full-flush cisterns, whose volumes usually range from 3–6 L and 6–12 L, respectively. These fixtures can be categorised with great confidence owing to their distinct flow pattern and restricted volume.

Shower

Generally, there are two types of shower events: (1) those that occur in separate shower stalls, and (2) those that occur in shower/bathtub combinations. Each type of shower event produces a distinct pattern. A shower event from a shower/bathtub combination typically begins with a high flow rate. Then, the diverter valve is tripped and the flow is throttled down to a much lower rate for the duration of the shower. A standard shower stall event does not exhibit the high flow rate at the beginning or end of the event; instead, it tends to have a relatively consistent flow rate throughout. Volume and the most frequent flow rate (mode) are two critical features for identifying a shower event.

Tap

Tap usage was the most common water use encountered while undertaking the residential analysis. Because humans determine the flow rate and the duration of each tap event, a wide variety of tap uses were encountered with each trace. Fortunately, kitchen and bathroom taps were seldom capable of flow rates above 13.2 L/min, while most tap usages were relatively brief. Small volume and short duration are the key points to identify this end use.

Irrigation

Irrigation is a difficult category to identify because of its varied patterns. However, there are two main types with markedly different volume ranges: short manual irrigation events, which are similar to tap events and usually have a volume ranging from 2 to 10 L, and long irrigation events using an automatic sprinkler, which often have a volume over 150 L. In the context of this study, all short manual irrigation events are considered as tap.

Classification process for each end use category

Clothes washer event classification

As illustrated in Fig. 2, clothes washer event classification is the first step of the overall process. Table 3 lists all possible features that a clothes washer wash could have, and the classification procedure built on them is shown in Fig. 3.

Table 3 Possible features of a clothes washer wash
Fig. 3

Clothes washer event classification procedure

From the collected data, four basic physical parameters of each event, including volume (L), duration (second), maximum flow rate (L/min) and most frequent flow rate (L/min) can be determined. From these obtained features, the classification process is as follows:

  • Step 1: Select all samples that concurrently satisfy the following constraints: (i) flow rate from 4 to 35 L/min, (ii) duration from 0.1 to 10 min, and (iii) volume from 0.5 to 120 L. The selected samples from Step 1 are presented in Fig. 4. This step aims to remove all events that are unlikely to belong to clothes washer.

  • Step 2: With the time associated with each recorded sample, the main task in Step 2 is to disaggregate all samples in Fig. 4 into different clusters in which the time interval of any two consecutive samples does not exceed 2 h (i.e. based on Feature 5, the maximum time interval between two consecutive cycles should be less than 2 h). This step aims to ensure that a full clothes washer operation will be entirely contained within one cluster. At the end of this step, 32 clusters were established, three of which were randomly selected, and are presented in Fig. 5.

  • Step 3: In Step 3, all samples within a cluster that have approximately the same flow rate were grouped together. However, only groups that satisfy Feature 1 (i.e. the overall volume of the group must be at least 20 L) were retained. Figure 6 displays one of the groups extracted from Cluster 2 presented in Fig. 5. All other samples that could not be grouped together were removed. It should be noted that the use of Feature 1 can also help eliminate most of the typical dishwasher events. Where groups of dishwasher samples satisfy all requirements in Step 3, the identification of these groups is presented in Step 5.

  • Step 4: At Step 4, all groups from all clusters that contain samples with approximately the same flow rate are put together in one set (Fig. 7).

    Figure 7 presents two of four sets containing groups of samples with a similar flow rate obtained from Step 3 (i.e. Set 1 and 2 were formed by all groups that contain events with the most frequent flow rate ranging from 11.5–13.5 and 6.5–8.5 L/min respectively). From the visual judgement, it can be seen that the events in Set 1 are very likely to belong to clothes washer while those in Set 2 are from different user-dependent categories. However, more evidence based on mathematical analysis is required to confirm the above statement.

  • Step 5: Given the fact that time interval between each cycle in a clothes washer operation usually follows a certain time pattern, the identification of this type of end use can be done by searching for sets obtained in Step 4 whose time sequence of events within each group follows a particular trend (i.e. time sequence shows the time interval between consecutive events in a group). This task is undertaken using DTW to estimate the similarity of time sequence of each group. Figure 8 shows the time pattern of all groups contained in Set 1 (Fig. 7a) and Set 2 (Fig. 7b).

    To determine whether the time patterns of the selected set follow any certain trend, the sample grouping process presented in the “Overview of applied techniques” section using DTW was applied. If more than 50% of the groups exhibited a similar time pattern, then that set was assigned to clothes washer. The threshold value of 50% was chosen to allow for variation in the time sequence due to the selection of different washing settings by the users (e.g. normal wash, hand wash or quick wash modes of the same washing machine will result in different time sequences). The verification process has also indicated that, for a set that contains groups of user-dependent samples, the grouping rate is almost always less than 10%. In this example, three out of four time sequences (75%) shown in Fig. 8a were grouped together, which confirms that the samples contained in Set 1 are from the clothes washer category.

  • Step 6: In this step all clothes washers will be identified by adding all samples in the selected set together. Clothes washer events of the tested homes in this example are displayed in Fig. 9.

  • Step 7: The final step is to refine the classified clothes washer events obtained in Step 6. This task is undertaken by applying the sample grouping technique using DTW as mentioned in the “Classification process for each end use category” section. By carrying out this process, all clothes washer events that have similar patterns are grouped together, and those that remain ungrouped are removed as they belong to other categories. Figure 9 shows that the first event in the top-right group in Fig. 7a was removed, as it has a different pattern from the remaining samples.
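Steps 1 to 3 of the procedure above are essentially a filter followed by two grouping passes, which can be sketched as below. This is a simplified illustration under stated assumptions, not the AutoflowU implementation: the `Event` record and function names are hypothetical, the Step 1 flow rate is taken to be the most frequent flow rate, gaps are measured between start times, the flow-rate tolerance of 1 L/min is a guess at what "approximate flow rate" means, and a group is required to hold at least two samples.

```python
from dataclasses import dataclass


@dataclass
class Event:
    start_s: float       # start time in seconds (hypothetical field)
    volume_l: float      # event volume, L
    duration_min: float  # event duration, min
    flow_l_min: float    # most frequent flow rate, L/min


def step1_candidates(events):
    """Step 1: keep events inside the clothes washer ranges given in the text."""
    return [e for e in events
            if 4 <= e.flow_l_min <= 35
            and 0.1 <= e.duration_min <= 10
            and 0.5 <= e.volume_l <= 120]


def step2_clusters(events, max_gap_s=2 * 3600):
    """Step 2: split into clusters wherever consecutive events are > 2 h apart."""
    clusters = []
    for e in sorted(events, key=lambda ev: ev.start_s):
        if clusters and e.start_s - clusters[-1][-1].start_s <= max_gap_s:
            clusters[-1].append(e)
        else:
            clusters.append([e])
    return clusters


def step3_groups(cluster, flow_tol=1.0, min_volume_l=20.0):
    """Step 3: group events of similar flow rate; keep groups of >= 20 L total.

    flow_tol and the minimum group size of two are assumptions.
    """
    groups = []
    for e in sorted(cluster, key=lambda ev: ev.flow_l_min):
        if groups and e.flow_l_min - groups[-1][0].flow_l_min <= flow_tol:
            groups[-1].append(e)
        else:
            groups.append([e])
    return [g for g in groups
            if len(g) >= 2 and sum(ev.volume_l for ev in g) >= min_volume_l]
```

The groups returned by `step3_groups` for every cluster would then be pooled by flow rate (Step 4) and their inter-event time sequences compared with DTW (Step 5), applying the 50% grouping threshold described in the text.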

Fig. 4

Selected samples from Step 1

Fig. 5

Example of three randomly extracted clusters from 32 achieved clusters. a Cluster 1. b Cluster 2. c Cluster 25

Fig. 6

One selected group from Cluster 2

Fig. 7

Combining groups of similar flow rate. a Set 1 of groups containing samples of approximate flow rate. b Set 2 of groups containing samples of approximate flow rate

Fig. 8

Time interval between any two consecutive samples. a Corresponding time sequences of all groups in Set 1. b Corresponding time sequences of all groups in Set 2

Dishwasher event classification

The classification of dishwasher events is undertaken after all clothes washer events have been classified. The features presented in Table 4 cover almost all dishwasher models available on the market. To deal with this end use category, the same techniques as in clothes washer event classification were applied. However, as a dishwasher is not present in all households, it is important to establish whether this end use exists in the tested property. This can be done at Step 5 of the classification procedure: if no set can be identified that contains groups of similar samples following a certain time pattern, it can be concluded that there is no dishwasher in the tested home.

Table 4 Basic features of dishwasher event

Evaporative air cooler classification

Evaporative air conditioner is an uncommon category that is just present in a few regions in Australia (e.g. Melbourne, Adelaide, Perth, etc.). As mentioned in “Overview of basic characteristics of the end-use categories” section, there are two typical types of evaporative air cooler, where Type 2 is the main subject of the analysis in this study because it possesses similar characteristics to clothes washer and dishwasher. This section aims to categorise this type of evaporative cooler events that still remain unclassified after the disaggregation of clothes washer and dishwasher where they exist. The same method as in clothes washer event classification is applied; however modification is required as presented below.

The analysis starts with the disaggregation of the remaining samples into clusters where time interval between two consecutive samples in each cluster should not exceed 30 min (Step 1) (Gan and Redhead 2013). In Step 2, all events that have similar flow rate, volume and duration will be grouped together as almost 95% of Type 2 evaporative cooler events possess these characteristics. Step 3 aims to gather all extracted groups in Step 2 that have similar flow rates together into different sets. At this step, most of the evaporative cooler events will be visually identified. However, there is still a possibility that a set of toilet events also exists, as this category also has similar flow rate, duration and volume. The disaggregation of these two end uses, in case they appear together, is undertaken in Step 4 by finding the time sequence pattern of samples in each group. Sets containing more groups whose samples follow a certain time pattern will be assigned to evaporative cooler and the other one will be classified as toilet. The last step of the overall classification process is to refine all samples that do not belong to evaporative air cooler category through a process as presented in Step 7 of clothes washer classification.

Toilet event classification

Toilet event classification is the next step in the overall process when clothes washer, dishwasher and evaporative air cooler events have been identified. As toilet is a mechanised category whose volume and pattern are relatively deterministic, the categorisation of this end use can be undertaken in these four steps and illustrated through the following examples.

  • Step 1: From the remaining unclassified samples, search for all events whose volumes are between 2 and 20 L (Fig. 10).

  • Step 2: Apply the sample grouping technique developed in “Classification process for each end use category” section to group all events that have similar patterns together. At the end of this step, there could be many groups created. Figure 11 below shows that events having similar patterns have been put together in two different groups.

  • Step 3: Select groups that contain events having approximate volumes. In this example, volumes of events within each group are approximate, which indicates that events in these two groups all belong to toilet category.

  • Step 4: The last step of toilet event classification is to remove any event from other categories that has been misplaced into a toilet group. This task can be performed by determining the most frequent volume of each group (i.e. the typical volume of a toilet event). Any sample whose volume differs from this typical volume by more than 2 L is removed. In this example, no such event exists in either group, and all events in both groups are eventually classified as toilet.
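Steps 1 and 4 of the toilet procedure can be illustrated with the short sketch below (Step 2, the DTW-based pattern grouping, is omitted). The function names and the 0.5 L bin used to find the most frequent volume are assumptions introduced here; the 2–20 L window and the 2 L tolerance come from the text.

```python
from collections import Counter


def toilet_candidates(volumes_l):
    """Step 1: keep events whose volume lies between 2 and 20 L."""
    return [v for v in volumes_l if 2.0 <= v <= 20.0]


def refine_toilet_group(volumes_l, bin_l=0.5, tol_l=2.0):
    """Step 4: drop events far from the group's most frequent volume.

    Volumes are rounded to bins of bin_l litres (an assumed resolution)
    before the mode is taken; events more than tol_l from it are removed.
    """
    binned = [round(v / bin_l) * bin_l for v in volumes_l]
    typical = Counter(binned).most_common(1)[0][0]
    kept = [v for v in volumes_l if abs(v - typical) <= tol_l]
    return typical, kept


# Hypothetical group of full-flush events with one misplaced 9.5 L event
typical, kept = refine_toilet_group([5.9, 6.1, 6.0, 6.2, 9.5])
print(typical, kept)  # 6.0 [5.9, 6.1, 6.0, 6.2]
```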

Fig. 9

Classified clothes washer events of the tested home

Fig. 10

Events that are likely to belong to the toilet category

Shower, irrigation and tap event classification

Once all mechanised end-uses have been successfully classified, the next task is to deal with user-dependent events including shower, irrigation and tap as explained in the next sub-sections. In this study, all irrigation events having volume less than 15 l will be considered as tap.

Tap event classification

Tap is a typical user-dependent category whose patterns could vary unpredictably; however, it can be easily picked up thanks to the small volume and short duration. The tap classification process can be performed by selecting any events whose volumes are less than 15 l from the remaining samples. At the end of this step, all remaining events will belong to shower and irrigation as presented in Fig. 12.
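The tap rule is a single volume threshold applied to whatever remains after the mechanised end uses have been removed. A minimal sketch, with a hypothetical function name and events represented only by their volumes, is:

```python
def split_tap(volumes_l, tap_max_l=15.0):
    """Separate tap events (volume < 15 L) from the remaining events.

    Returns (tap_events, remaining_events); the remainder is what would be
    passed on to the shower/irrigation separation.
    """
    taps = [v for v in volumes_l if v < tap_max_l]
    remaining = [v for v in volumes_l if v >= tap_max_l]
    return taps, remaining
```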

Fig. 11

Similar events grouping process

Shower and irrigation event classification

To separate shower from irrigation, SOM was employed as the main technique. In this study, two main features, volume and flow rate, were used as the input to the SOM model for the similar event grouping process. Figure 13 presents the two subgroups that were obtained after the separation process using SOM. The final task is to assign each subgroup to shower or irrigation. Given that irrigation events usually have a significantly larger volume and a lower occurrence frequency than shower events, these two categories can be distinguished through the following steps:

  • Step 1: Determine the most frequent volume of each group.

  • Step 2: Determine the average daily occurrence of each group.

  • Step 3: Assign the group that has the larger most frequent volume and the lower daily occurrence to irrigation, and the remaining group to shower.
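The three labelling steps can be sketched as follows. This is an illustration only: the function names, the 5 L bin used to estimate the most frequent volume, and the choice to return `None` when the two criteria disagree are assumptions, since the text does not say how such a conflict is handled.

```python
from collections import Counter


def most_frequent_volume(volumes_l, bin_l=5.0):
    """Step 1: centre of the most populated volume bin (bin width assumed)."""
    bins = Counter(int(v // bin_l) for v in volumes_l)
    top_bin = bins.most_common(1)[0][0]
    return (top_bin + 0.5) * bin_l


def label_groups(group_a, group_b, n_days):
    """Steps 2-3: label two SOM subgroups as shower or irrigation.

    Returns a dict of labels, or None if the volume and frequency criteria
    point in different directions (an assumed fallback).
    """
    vol_a, vol_b = most_frequent_volume(group_a), most_frequent_volume(group_b)
    rate_a, rate_b = len(group_a) / n_days, len(group_b) / n_days
    if vol_a > vol_b and rate_a < rate_b:
        return {"a": "irrigation", "b": "shower"}
    if vol_b > vol_a and rate_b < rate_a:
        return {"a": "shower", "b": "irrigation"}
    return None


# Hypothetical two-week record: daily showers and three sprinkler events
showers = [42, 48, 44, 51, 47, 43, 46, 55, 41, 49, 45, 52, 44, 47]
sprinkler = [210, 220, 215]
print(label_groups(showers, sprinkler, n_days=14))
```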

Fig. 12

Remaining events for shower and irrigation classification

Model verification

Data collection for model verification

In this study, two verification processes were conducted. To estimate the efficiency of the proposed model on different data patterns, the first test was carried out on 200 homes in Australia using both Autoflow© and AutoflowU, to evaluate the efficiency of the new model in comparison with the former one. This dataset was collected in 2011 using a smart meter (72 pulses/litre) and a data logger with a sampling interval of 10 s from residential dwellings located in Melbourne and the urban south-east corner of the State of Queensland, Australia. In the second test, data from 30 homes in the United States of America (USA) was sourced from the 110 homes in the Denver SF Flow trace Data Set. Each home in this USA data set contains approximately 2 weeks of data and provides flow at a resolution of 0.01 gal on a ten-second interval. With the raw flow trace data available, end-use analysis was conducted manually using both water audits and diaries from participants to obtain accurately labelled end use data for model verification.
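The meter resolution and logging interval quoted above determine how raw pulse counts translate into the four physical parameters used throughout the classification (volume, duration, maximum flow rate and most frequent flow rate). The sketch below shows one way this conversion could look. The function names are hypothetical, duration is taken as the number of logged intervals times 10 s, and the mode is computed after rounding flow rates to 0.1 L/min; both are assumptions rather than details given in the paper.

```python
from collections import Counter

PULSES_PER_LITRE = 72  # smart meter resolution quoted in the text
INTERVAL_S = 10        # data logger sampling interval quoted in the text


def pulses_to_flow(pulse_counts):
    """Convert per-interval pulse counts to flow rates in L/min."""
    return [p / PULSES_PER_LITRE * (60 / INTERVAL_S) for p in pulse_counts]


def event_features(pulse_counts):
    """Derive the four physical parameters of one event from its pulses."""
    flows = pulses_to_flow(pulse_counts)
    mode_flow = Counter(round(f, 1) for f in flows).most_common(1)[0][0]
    return {
        "volume_l": sum(pulse_counts) / PULSES_PER_LITRE,
        "duration_s": len(pulse_counts) * INTERVAL_S,
        "max_flow_l_min": max(flows),
        "mode_flow_l_min": mode_flow,
    }


# Hypothetical 50-second event: 72 pulses in 10 s corresponds to 6 L/min
print(event_features([72, 144, 144, 144, 72]))
```

For this hypothetical event the result is a volume of 8 L over 50 s, with both the maximum and the most frequent flow rate equal to 12 L/min.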

Discussion

Table 5 shows that this model compares well with Autoflow© owing to its independence from the collected database. The first test compares AutoflowU with the original model using data collected from 200 homes in Australia. The accuracies in Table 5 show that the new model obtained an average accuracy above 89.9%, with a maximum of 92.3% for clothes washer and a minimum of 72.1% for irrigation. In comparison with the classic Autoflow©, the new application resulted in slightly lower accuracies: 92.3% compared to 92.5% for clothes washer, 91.8% compared to 92.5% for dishwasher, 90.1% compared to 91.2% for toilet, 91.9% compared to 93.5% for shower, and 72.1% compared to 76.3% for irrigation.

Table 5 First model testing on 200 homes

These slight accuracy drops arise because Autoflow© performs end-use classification based on pre-trained models with data collected across Australia; as a result, its maximum efficiency is obtained when it is tested against Australian data. The working mechanism of AutoflowU, on the other hand, is independent of any existing collected data, which may lead to some misclassification errors when dealing with properties having complicated flow rate patterns. However, the most significant advantage of the new AutoflowU package lies in classifying the Type 2 evaporative cooler pattern, where the achieved accuracy is higher than that obtained using HMM and ANN (85.1% compared to 82.6%). The verification process showed that, because the varied patterns of this complicated end use can be similar to tap, toilet, dishwasher and clothes washer, the efficiency of the advanced HMM and ANN model in Autoflow© dropped considerably when analysing any home whose evaporative cooler patterns possess features similar to the above-mentioned end uses (i.e. this technique performs the classification by assessing the flow rate pattern and physical characteristics of each event, without inspecting the time pattern). In summary, the first test clearly indicates that the proposed enhanced system is able to address most of the end-use classification problems without the need to collect data for model development and calibration.

The second verification test used water consumption data from the US to further illustrate the analysis capability of the proposed system. In this test, apart from the shower events, the accuracies achieved by AutoflowU were higher than those of the classic model in all remaining categories, as its working mechanism is location independent (see Table 6). This result indicates that the proposed method is very promising for regions where no new data has been collected for model training and calibration. Comparing AutoflowU with Autoflow©, an accuracy increase was recorded for all water end-use categories: 6.6% (clothes washer), 5.5% (dishwasher), 9.4% (evap-cooler), 7.4% (toilet), 1.2% (tap), 1.6% (shower) and 1.7% (irrigation).

Table 6 Model testing on 30 homes from USA with complicated untrained patterns

The verification process also found that most of the errors occurred when separating clothes washer, dishwasher and evaporative air cooler events, mainly because these machines sometimes have similar cycle times. More specifically, for clothes washer, 5.3% of the events were misclassified as dishwasher, 2.1% as evaporative air cooler, and the remaining 1.4% as tap and toilet. For dishwasher and evaporative air cooler, the rates of misclassification as clothes washer were 8.5% and 7.9%, respectively. Figure 14 presents the entire set of model testing results using two different metrics. Fig. 14a is a confusion matrix detailing the total number of events of each category used for testing, as well as the number of correctly classified events for each end use. Fig. 14b expresses the results of Fig. 14a through two more informative accuracy measures, namely precision and recall. For example, for clothes washer, a precision of 0.91 and a recall of 0.92 imply that of the 1729 actual clothes washer events present in the validation process, 92% were correctly classified (recall), and of the 1760 events classified as clothes washer, 91% were actual clothes washer events (precision). It should also be noted that, because the proposed model allows for a very wide range of characteristics across all mechanised end-use categories available on the market, model calibration is very unlikely to be required when deploying it in different regions.
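To make the relationship between a confusion matrix and the precision and recall indices concrete, the following minimal Python sketch derives both measures per category. It assumes that rows hold actual end uses and columns hold predicted end uses. The function name `precision_recall` and the event counts are hypothetical and chosen for illustration only; they are not the values reported in Fig. 14a.

```python
def precision_recall(confusion, labels):
    """Return {label: (precision, recall)} from a square confusion matrix.

    confusion[i][j] is the number of events whose actual class is labels[i]
    and whose predicted class is labels[j] (assumed layout).
    """
    scores = {}
    for k, label in enumerate(labels):
        true_pos = confusion[k][k]
        actual_total = sum(confusion[k])                    # row sum: all actual events of this class
        predicted_total = sum(row[k] for row in confusion)  # column sum: all events assigned to this class
        precision = true_pos / predicted_total if predicted_total else 0.0
        recall = true_pos / actual_total if actual_total else 0.0
        scores[label] = (precision, recall)
    return scores


# Hypothetical counts for illustration only; not the data behind Fig. 14.
labels = ["clothes washer", "dishwasher", "evaporative cooler"]
confusion = [
    [92, 5, 3],   # actual clothes washer events
    [8, 88, 4],   # actual dishwasher events
    [1, 7, 92],   # actual evaporative cooler events
]

scores = precision_recall(confusion, labels)
for label, (precision, recall) in scores.items():
    print(f"{label}: precision={precision:.2f}, recall={recall:.2f}")
```

In this layout, recall divides the diagonal entry by its row total (all actual events of the category), whereas precision divides it by its column total (all events assigned to the category), which mirrors the clothes washer interpretation given above.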

Fig. 13 Final classification. a Classified irrigation events. b Classified shower events

Fig. 14 Overall testing result. a Confusion matrix of testing results. b Precision and recall of testing results

Conclusion

The establishment of an integrated water management system, which employs smart water metering in conjunction with a series of intelligent algorithms to automate the flow trace analysis process, is becoming feasible due to the development of Autoflow©. The first version of this software tool offers a robust pattern recognition procedure through a hybrid combination of customised HMM and ANN algorithms, which successfully assigned most unclassified samples to the appropriate categories, with average accuracies ranging from 76 to 93% when tested on over 200 homes in Australia. However, when the model is applied in an international context, the unlimited number of untrained patterns is expected to be the main cause of a significant reduction in the efficiency of the existing combined HMM and ANN model in Autoflow©; it is therefore crucial to develop a dynamic tool that can work with all data patterns without any reliance on a pre-trained model. This study pursued that goal through the development of AutoflowU, which is designed specifically for single residential dwellings and can classify common categories autonomously based on their physical pattern features and working mechanisms. The accuracy of 74–92% achieved when testing the model on a large number of complicated untrained samples demonstrates its superior performance compared with the existing combined HMM-ANN model, and its promising efficiency in dealing with these types of end uses worldwide.

The final analytical stage, to be completed in future research, is to extend AutoflowU’s capability to commercial properties. New techniques are required to allow the model to analyse data in real time without relying on previously trained knowledge. The high level of accuracy obtained through the verification process on US data is a promising outcome for this study. However, for AutoflowU to become a highly accurate, adaptable and autonomous software tool with worldwide commercial application, further training, testing and validation using samples from independent homes in various urban areas within different countries will also need to be carried out to confirm the accuracy of this application in various situational contexts (e.g. country, region, etc.).

References

  • Arregui F (2015) New software tool for water end-uses studies. Proceedings of 8th IWA International Conference on Water Efficiency and performance Assessment of Water Services, Cincinnati, p 20–24

  • Beal C, Stewart RA (2011) South East Queensland residential end use study: final report. Technical Report No. 47 for Urban Water Security Research Alliance. Griffith University and Smart Water Research Centre, January 2012

  • Cardell-Oliver R, Wang J, Gigney H (2016) Smart meter analytics to pinpoint opportunities for reducing household water use. J of Water Resour Plann & Manage. ASCE 142(6):223–234

  • Cominola A, Giuliani M, Castelletti A, Rosenberg DE, Abdallah AM (2018) Implications of data sampling resolution on water use simulation, end-use disaggregation, and demand management. Environ Model Softw 102:199–212

  • Cominola A, Giuliani M, Piga D, Castelletti A, Rizzoli AE (2015) Benefits and challenges of using smart meters for advancing residential water demand modeling and management: a review. Environ Model Softw 72:198–214

  • Froehlich J, Larson E, Saba E, Campell T, Atlas L, Fogarty J, Patel S (2011) A longitudinal study of pressure sensing to infer real-world water usage events in the home. In: Lyons K, Hightower J, Huang EM. (eds) Pervasive Computing. Pervasive 2011. Lecture notes in computer science, vol 6696. Springer, Berlin

  • Froehlich JE, Larson E et al (2009) HydroSense: Infrastructure-mediated single-point sensing of whole-home water activity. Proceedings of UbiComp 2009, Orlando, pp 235–244

  • Gan K, Redhead M (2013) Melbourne residential water use studies

  • Kohonen T (1982) Self-organized formation of topologically correct feature maps. Biol Cybern 43(1):59–69

  • Manmatha R, Rath TM (2002) Word image matching using dynamic time warping. Multi-Media Indexing and Retrieval Group, Center for Intelligent Information Retrieval, University of Massachusetts, USA, Technical report

  • Marquez JP (2001) Pattern recognition: concepts, methods and applications. Springer ISBN: 3-540-422978

  • Muller M (2007) Information retrieval for music and motion, chapter 4. Springer ISBN 978-3-540-74047-6

  • Myers CS, Rabiner LR (1981) A comparative study of several dynamic time-warping algorithms for connected word recognition. Bell Syst Tech J 60(2):1389–1409

  • Nguyen KA, Stewart RA, Zhang H (2013a) Development of an intelligent model to categorise residential water end use events. J Hydro Environ Res 7(3):182–201

  • Nguyen KA, Stewart RA, Zhang H (2013c) Development of an autonomous and intelligent system for residential water end-use classification, PhD Thesis. Griffith University, Australia

  • Nguyen KA, Stewart RA, Zhang H (2014) An autonomous and intelligent expert system for residential water end-use classification. J Exp Syst Appl 41(2):342–356

  • Nguyen KA, Stewart RA, Zhang H, Jones C (2015) Intelligent autonomous system for residential water end use classification: autoflow. Appl Soft Comput 31(4):118–131

  • Nguyen KA, Zhang H, Stewart RA (2011) Application of dynamic time warping algorithm in prototype selection for the disaggregation of domestic water flow data into end use events. Proceeding of the 34th World Congress of the Int. Ass. for Hydro-Environment Engineering and Research, Brisbane, pp 2137–2144 26 June–1 July, 2011

  • Nguyen KA, Zhang H, Stewart RA (2013b) Intelligent pattern recognition model to automate the categorisation of residential water end-use events. J Environ Model Softw 47(5):108–127

  • Pastor-Jabaloyes L, Arregui FJ, Cobacho (2018) Water end use disaggregation using soft computing techniques. Water 10(1):46

  • Rabiner L, Juang B (1993) Fundamentals of speech recognition. Prentice-Hall, Inc., Chapter 4

  • Rabiner LR (1990) A tutorial on hidden Markov models and selected applications in speech recognition. Readings in speech recognition. Morgan Kaufmann Publishers Inc, USA

  • Sahin O, Siems RS, Stewart RA, Porter MG (2014b) Paradigm shift to enhanced water supply planning through augmented grids, scarcity pricing and adaptive factory water: a system dynamics approach. Environ Model Softw 32(1):45–56

  • Sahin O, Stewart RA, Porter MG (2014a) Water security through scarcity pricing and reverse osmosis: a system dynamics approach. J Clean Prod 88(3):160–171

  • Sakoe H, Chiba S (1978) Dynamic programming algorithm optimization for spoken word recognition, acoustics, speech and signal processing. IEEE Trans Electr Insul 26(1):43–49

  • Stewart RA, Nguyen AK, Beal C, Zhang H, Sahin O, Bertone E, Vieira AS, Castelletti A, Cominola A, Giuliani M, Giurco D, Blumenstein M, Turner A, Liu A, Kenway S, Savic DA, Makropoulos C, Kossieris P (2018) Integrated intelligent water-energy metering systems and informatics: visioning a digital multi-utility service provider. J Environ Model Softw 105:94–117

  • Stewart RA, Willis RM, Giurco D, Panuwatwanich K, Capati B (2010) Web-based knowledge management system: linking smart metering to the future of urban water planning. Aust Plann 47(2):66–74

  • Stewart RA, Willis RM, Panuwatwanich K, Sahin O (2013) Showering behavioural response to alarming visual display monitors: longitudinal mixed method study. Behav Inform Technol 32(7):695–711

  • Vitter JS, Webber M (2016) Water event categorization using sub-metered water and coincident electricity data. Water (20734441) 10(6):714–721

  • Vitter JS, Webber M (2018) A non-intrusive approach for classifying residential water events using coincident electricity data. J Environ Model Softw 100:302–313

  • Yang A, Zhang H, Stewart RA, Nguyen KA (2018) Enhancing residential water end use pattern recognition accuracy using self-Organising map and K-means clustering techniques: Autoflow v3.1. Water 10(9):1221–1229

Funding

This project received funding from the Australian Research Council (ARC) through the ARC Linkage Project scheme (Grant # LP160100215). Industry partners, including Yarra Valley Water, City West Water, Southeast Water and Aquiba, contributed funding to this project.

Availability of data and materials

Australian Dataset:

USA Dataset:

Author information

Contributions

Dr. KN; Professor RAS; Professor HZ; and Dr. OS. KN developed the main algorithm for the adaptive autonomous water end use disaggregation. RAS provided expert advice on smart metering technology as well as particular features of each end use category required for the disaggregation process. HZ provided expert advice on the development of the smart algorithms for grouping of similar patterns together. OS performed model verification using Australian and US data. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Rodney A. Stewart.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Nguyen, K.A., Stewart, R.A., Zhang, H. et al. An adaptive model for the autonomous monitoring and management of water end use. Smart Water 3, 5 (2018). https://doi.org/10.1186/s40713-018-0012-7

  • DOI: https://doi.org/10.1186/s40713-018-0012-7

Keywords