Energy Disaggregation built on plain mathematical functions might not cut it in production-grade pipelines, where generalizability and superior performance matter. Modern Deep Learning architectures have consistently outperformed basic mathematical and classical Machine Learning implementations. They also take care of feature extraction themselves, so we do not have to handcraft any features. And since the data size is in the magnitude of hundreds of GBs, Deep Learning methodologies become all the more imperative for robust Energy Disaggregation.
In our previous blog post on Energy Disaggregation and Non-Intrusive Load Monitoring of EVs from Smart Meter Data, we took an in-depth look at Non-Intrusive Load Monitoring (NILM), its applications, and how to get started in NILM. Now, it is time to look at how we can leverage Deep Learning for this problem statement. For data like the Pecan Street dataset, we need to look at sequence modeling and use RNNs, LSTMs, or the recent go-to architecture “Transformers.” The first approach anyone can take is binary/multi-label classification for each appliance given an aggregate grid consumption value. Think of it like this – given a sequence of aggregated grid values, can we predict which appliances were on or off in that time sequence?
This becomes a multi-label classification problem where X is of shape (1, sequence_len) and Y is of shape (1, no_of_appliances). A single example might look something like this:
X: [4.5, 4.6, 4.65, 4.7] (grid values in sequence)
Y: [1, 0, 1, 0, 0] (an illustrative on/off state for each appliance over that window)
These aggregated grid values are sampled at 1/60 Hz, i.e., one reading per minute. When we were building prototypes for this approach, the major hurdle we faced was preparing the dataset described above, and the windowing feature of TensorFlow’s tf.data.Dataset API helped us tremendously in preparing the data for training. Once batched, the data had the shapes X: (batch size, 1, sequence length) and Y: (batch size, number of appliances).
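As a rough illustration, here is a minimal sketch of that windowing step, assuming synthetic toy data (the array sizes and names are illustrative, not from our production pipeline). It emits windows shaped (seq_len, 1), the per-timestep layout Keras recurrent layers consume; the (batch, 1, seq_len) layout described above is simply a transpose of this.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy data standing in for Pecan Street: an aggregate grid
# signal sampled at 1/60 Hz and per-timestep on/off labels for 5 appliances.
SEQ_LEN = 60
N_APPLIANCES = 5
grid = np.random.rand(10_000).astype("float32")
labels = np.random.randint(0, 2, size=(10_000, N_APPLIANCES)).astype("float32")

ds = tf.data.Dataset.from_tensor_slices((grid, labels))

# window() yields nested datasets; flat_map + batch(SEQ_LEN) turns each
# window back into a dense (x, y) pair.
ds = ds.window(SEQ_LEN, shift=SEQ_LEN, drop_remainder=True)
ds = ds.flat_map(
    lambda x, y: tf.data.Dataset.zip((x.batch(SEQ_LEN), y.batch(SEQ_LEN)))
)

# Keep the appliance states at the last timestep of each window as the
# multi-label target, and add a feature axis for the recurrent layers.
ds = ds.map(lambda x, y: (tf.expand_dims(x, -1), y[-1]))
ds = ds.batch(32)  # -> X: (32, SEQ_LEN, 1), Y: (32, N_APPLIANCES)
```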
Time to build a model!
We can use any type of recurrent layer – Vanilla RNN, Long Short-Term Memory (LSTM), or Gated Recurrent Unit (GRU) – and stack the layers. The model must be set up in a Multiple-Input Multiple-Output configuration.
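Below is a minimal sketch of such a stacked recurrent model, reusing the windowed dataset from the previous snippet; the layer sizes are illustrative assumptions rather than tuned values. With one sigmoid unit per appliance and binary cross-entropy, a single output vector already behaves as a multi-label head; the Keras Functional API can split it into explicitly separate outputs if a true multi-output setup is preferred.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Illustrative layer sizes; tune for your own data.
model = keras.Sequential([
    layers.Input(shape=(SEQ_LEN, 1)),          # (timesteps, features)
    layers.LSTM(64, return_sequences=True),    # could be SimpleRNN or GRU
    layers.LSTM(32),
    layers.Dense(N_APPLIANCES, activation="sigmoid"),  # one on/off score per appliance
])

# Multi-label classification => sigmoid outputs + binary cross-entropy.
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=[keras.metrics.BinaryAccuracy()])

model.fit(ds, epochs=5)  # ds from the windowing snippet above
```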
Here are a few things to keep in mind while training:
Pay attention to the magnitude of the readings and always standardize/normalize the data (see the sketch after this list)
Pay particular attention to these appliances/power sources – Solar, Grid, Car1, AC, and Heater; spend more time performing EDA on them
Make sure to label the data correctly, and keep in mind that while doing so you are automatically inducing a bias (labelling bias)
Try injecting minor noise from other appliances to increase the generalizability and robustness of the model (also sketched below)
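For the first and last points above, here is a hedged sketch of standardization and noise injection; the helper names are ours, not from the original pipeline.

```python
import numpy as np

# Hypothetical preprocessing helpers, for illustration only.
def standardize(x, mean=None, std=None):
    """Z-score the signal; reuse the training-set stats at inference time."""
    mean = x.mean() if mean is None else mean
    std = x.std() if std is None else std
    return (x - mean) / (std + 1e-8), mean, std

def add_appliance_noise(x, noise_scale=0.05, seed=None):
    """Inject small Gaussian perturbations to mimic unmodelled appliances."""
    rng = np.random.default_rng(seed)
    return (x + rng.normal(0.0, noise_scale, size=x.shape)).astype(x.dtype)

grid_std, mu, sigma = standardize(grid)  # grid from the earlier snippet
grid_aug = add_appliance_noise(grid_std)
```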
Apart from using LSTMs in a multi-label classification setup, we can also use them in a generation setup – generating the consumption signal of a particular appliance given the aggregated signal. This can be achieved using the TimeDistributed layer in TensorFlow. 1D Convolutional Neural Networks have also proven remarkably effective and efficient (faster to train than LSTMs) for sequence modeling tasks, given the right number of filters and filter sizes. We can also perform model stacking for better overall performance.
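A hedged sketch of that generation setup, reusing the imports above: a Conv1D front end for fast local feature extraction, an LSTM for temporal context, and a TimeDistributed head emitting one reconstructed consumption value per timestep. The architecture and sizes are illustrative assumptions, not our production configuration.

```python
seq2seq = keras.Sequential([
    layers.Input(shape=(SEQ_LEN, 1)),
    layers.Conv1D(32, kernel_size=5, padding="same", activation="relu"),
    layers.LSTM(64, return_sequences=True),
    layers.TimeDistributed(layers.Dense(1)),  # one value per timestep
])
seq2seq.compile(optimizer="adam", loss="mse")  # regress the appliance signal
```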
Framing NILM as a classic classification problem is just one way of looking at things. Another is to turn it into a Denoising Autoencoder problem that treats the other appliances as noise, filters it out, and reconstructs the required appliance’s consumption signal from the aggregated signal. This is quite interesting to work on; you can find more in our upcoming blog post on the usage of Autoencoders and Variational Autoencoders for NILM.
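As a small teaser ahead of that post, here is a minimal denoising-autoencoder sketch under the same assumptions as before (illustrative layer sizes; aggregate signal in, a single target appliance’s signal out):

```python
# Everything other than the target appliance is treated as noise to filter out.
dae = keras.Sequential([
    layers.Input(shape=(SEQ_LEN, 1)),
    layers.Conv1D(16, 4, strides=2, padding="same", activation="relu"),  # encode
    layers.Conv1D(8, 4, strides=2, padding="same", activation="relu"),
    layers.Conv1DTranspose(16, 4, strides=2, padding="same", activation="relu"),  # decode
    layers.Conv1DTranspose(1, 4, strides=2, padding="same"),
])
dae.compile(optimizer="adam", loss="mse")
```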
There is plenty of scope for improvement here, and research in this domain is moving forward at a quick pace. We hope this article helped you start conceptualizing your approaches and implementation plans. Build amazing models and share your experiences with the community!