The Air Quality Monitor – Data processing

After implementing and deploying the DSM501a based Air quality monitor, data was collected into an InfluxDB database and consumed by a Grafana Dashboard:

DSM501a and BMP180 Grafana Dashboard

While we can see that Temperature (yes it’s hot!) and Pressure looks ok, data collected from the DSM501a sensor is just a mess. It’s just a bunch of samples jumping around across several values, and so doesn’t look very promising that we can take meaningful data out of it.

So we might need to process the data so that it makes sense, and for that based on the fact that:

  1. Data sample is slow: 2 samples per minute
  2. We don’t want high variations, but smooth out the data

I’ve choosen to filter data using an IIR (Infinite Response Filter) LPF (Low pass filter) to remove any high frequency data/noise, and hence obtain a smoother output.
Did it work? Yes it did:

Original data vs IIR LPF filtered data

As we can see for each collected particle size of 1.0 and 2.5 we’ve filtered it with an IIR LPF that smoothed out any wild transitions while keeping the fundamental and underlying data validation.

IIR implementation is quite simple since it is only a set of additions, subtractions and multiplications with some factors that define the behavior of the filter.

IIR Filter

(The picture was taken from here that also explains nicely what is an IIR and FIR filters).

The input x[]n is DSM501a sample time at t=0, t=30s, t=60s, … and so on, and y[]n is the corresponding output. The b0,b1,b2 and a0,a1 and a2 are the filter factors, that define the filter response. For testing purposes I’ve just choose factors for a 1KHz Low Pass filter and tested it during several days, and hence the above output that can be seen on the Grafana dashboard.

The IIR filtering process is done on Node-Red but it can be done easily also on the ESP8266 since there is no complicated math/algorithms involved.

Node-Red IIR LPF filter

The function that implements the IIR LPF filter is (Note that on the code I use the a’s as the input factors and b’s as the output factors which is the contrary of the above IIR picture):

// IIR LPF factors
  f_a0 = 0.0010227586546542474;     // Input factors
  f_a1 = 0.002045517309308495;
  f_a2 = 0.0010227586546542474;
  f_b1 = -1.9066459797557103;       // Output factors
  f_b2 = 0.9107370143743273;

// PPM 1.0 input variables
var i0_c10 = msg.payload.cPM10;
var i1_c10 = context.get('i1_c10') || 0;
var i2_c10 = context.get('i2_c10') || 0;

// PPM 1.0 output variables
var o0_c10 = context.get('o0_c10') || 0;
var o1_c10 = context.get('o1_c10') || 0;

// Calculate the IIR
var lpf =   i0_c10 * f_a0 + 
            i1_c10 * f_a1 + 
            i2_c10 * f_a2 -         // We add the negative output factors
            o0_c10 * f_b1 - 
            o1_c10 * f_b2;
// Memorize the variables
context.set( 'i2_c10' , i1_c10 );
context.set( 'i1_c10' , i0_c10 );

context.set( 'o1_c10' , o0_c10 );
context.set( 'o0_c10' , lpf );

// PPM 2.5 input variables
var i0_c25 = msg.payload.cPM25;
var i1_c25 = context.get('i1_c25') || 0;
var i2_c25 = context.get('i2_c25') || 0;

// PPM 1.0 output variables
var o0_c25 = context.get('o0_c25') || 0;
var o1_c25 = context.get('o1_c25') || 0;

// Calculate the IIR
var lpf25 =   i0_c25 * f_a0 + 
              i1_c25 * f_a1 + 
              i2_c25 * f_a2 -         // We add the negative output factors
              o0_c25 * f_b1 - 
              o1_c25 * f_b2;
// Memorize the variables
context.set( 'i2_c25' , i1_c25 );
context.set( 'i1_c25' , i0_c25 );

context.set( 'o1_c25' , o0_c25 );
context.set( 'o0_c25' , lpf25 );

msg.payload = {}
msg.payload.cfP10 = lpf;
msg.payload.cfP25 = lpf25;

return msg;

We maintain the filter state (the two previous samples from the sensor) on Node-Red global variables (which will be reset if Node-red is restarted), and calculate for each PM1.0 and PM2.5 sample the filtered value, which depends on the previous samples. The final output is then fed to an InfluxDB sink node which saves the filtered data.
The complete code is at this gist.

While still this being a test by using a probably LPF filter that is not adequate to the sampled data (it was designed for Audio at 96Khz sample rate), it shows that we can do some simple processing to clean up inbound data so that it makes more sense. This mechanisms of using digital filtering signals (DSP) are applied widely in other systems such as electrocardiogram sensors or other biometric sensors the remove or damp signal noise. In this case we can see that after the filtering data looks much more promising to be processed and so be used to calculate the Air Quality Index without the index jumping around as the samples jump.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.