Hazy numbers

When it rains, it pours. As a PSI reading of 100 becomes the new normal in Singapore, people have started to question whether they can trust the readings from NEA, since what they feel is sometimes worse than the published number. Well, the answer is yes, but with a caveat. The PSI number published by NEA is a 3-hour average, which means there will always be a lag in the number. Depending on the trend, the published reading can under-estimate the ‘current’ PSI value, but it can also over-estimate it. This is illustrated in the figure below. I tried to back-calculate the 1-hour readings using several assumptions, and you can see that the 3-hour average readings always lag the 1-hour estimates. So, for example, the peak on 19/06 was not at 10 pm, when the 3-hour PSI reached 321, but an hour before that, when the 3-hour PSI reading was 290 but the 1-hour estimate was around 450.

PSI reading from 17/06/13 5am to 20/06/13 1pm. Most of the PSI 3-hour readings are given by NEA while the rest (e.g., those between 1AM-5AM) and the 1-hour readings are estimated by back-calculation with assumptions.
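For readers who want to try this kind of back-calculation themselves, here is a minimal Python sketch. It assumes the 3-hour reading is a plain mean of the three most recent 1-hour values and that the first two 1-hour values are known or guessed. Both are simplifying assumptions, not necessarily the ones behind the figure. The function names and the sample numbers are made up for illustration and are not NEA data.

```python
def three_hour_average(hourly):
    """Trailing mean of the latest three 1-hour values, from the third hour on."""
    return [sum(hourly[i - 2:i + 1]) / 3 for i in range(2, len(hourly))]


def back_calculate(three_hour, first_two):
    """Recover 1-hour values from 3-hour means.

    Assumes each 3-hour value is the plain mean of three 1-hour values,
    and that the first two 1-hour values (first_two) are known or guessed.
    """
    hourly = list(first_two)
    for avg in three_hour:
        # avg = (h[t-2] + h[t-1] + h[t]) / 3, so solve for h[t].
        hourly.append(3 * avg - hourly[-1] - hourly[-2])
    return hourly


# Made-up hourly values, for illustration only.
true_hourly = [120, 150, 210, 330, 450, 300]
averages = three_hour_average(true_hourly)
estimate = back_calculate(averages, first_two=true_hourly[:2])
print([round(x) for x in estimate])  # [120, 150, 210, 330, 450, 300]
```

Note that under this scheme a wrong guess for the first two values does not die out. It carries through every later estimate, and so does any rounding in the published 3-hour numbers.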

This 3-hour average reading is fine if there is not much movement in the data, and it is actually good statistical practice (acquiring more samples), since it reduces the noise in the data. The problem, of course, arises when there is indeed a large change in the data. In that case the 1-hour reading might give you a more current picture, although the reading will be noisier. In the figure above, you will find more fluctuations in the 1-hour estimates than in the 3-hour readings. Also note that both numbers here are averages. Although the 1-hour estimate might be more ‘current’, it also comes with a higher standard deviation and hence less confidence and reliability.
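The trade-off can be seen with a tiny Python sketch. It assumes the 3-hour reading is a plain trailing mean of three 1-hour values, and both series below are made-up numbers chosen for illustration, not actual PSI data. The helper name `trailing_mean_3` is likewise hypothetical.

```python
def trailing_mean_3(hourly):
    """3-hour trailing mean; the first two hours lack a full window and are skipped."""
    return [sum(hourly[i - 2:i + 1]) / 3 for i in range(2, len(hourly))]


# Made-up series 1: a steady level of about 100 with small hour-to-hour wiggles.
steady = [95, 105, 100, 95, 105, 100]
smoothed_steady = trailing_mean_3(steady)
print(smoothed_steady)  # [100.0, 100.0, 100.0, 100.0]

# Made-up series 2: a sudden jump from 100 to 400.
jump = [100, 100, 100, 400, 400, 400]
smoothed_jump = trailing_mean_3(jump)
print(smoothed_jump)  # [100.0, 200.0, 300.0, 400.0]
```

In the steady case the averaging irons out the wiggles completely. In the jump case the 1-hour value is already at 400 in the fourth hour, while the 3-hour mean only gets there two hours later.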

Back-calculating the 1-hour estimates is also useful for predicting future readings, since each 1-hour value is still used in the 3-hour readings for up to 2 more hours. So, for example, the 20/06 3-hour reading at 1 pm is 371, which consists of 1-hour estimates of 256 at 11 am, 454 at 12 pm, and 403 at 1 pm. The 3-hour reading at 2 pm will consist of 454 at 12 pm, 403 at 1 pm, and the 1-hour estimate at 2 pm. Assuming that the 1-hour estimate at 2 pm goes down further (let's say to 350), the estimated 3-hour PSI reading at 2 pm will be 402. The number might sound even scarier, but if we look at the 1-hour estimates, we have actually already gone through the worst hour (1-hour estimate of 454 at 12 pm). To put it differently, for the 3-hour PSI number to go down from 371, we need the 1-hour reading at 2 pm to be below 256 (the 11 am value that drops out of the average), which is pretty drastic since the current 1-hour estimate is 403. But such a drastic change is not unprecedented. The 1-hour estimate at 11 am today was 256 while it was 454 at 12 pm, a change of almost 200.
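The arithmetic in this paragraph can be reproduced with a short Python sketch. It uses the 1-hour estimates quoted above and assumes, as the paragraph does, that the 3-hour reading is a plain mean of three 1-hour values. The function names are hypothetical, and the 350 at 2 pm is the same guess as in the text, not a measurement.

```python
def next_three_hour(last_two_hourly, assumed_next):
    """3-hour reading for the next hour: the two latest 1-hour estimates plus an assumed next one."""
    return (sum(last_two_hourly) + assumed_next) / 3


def break_even_hourly(current_three_hour, last_two_hourly):
    """Next 1-hour value that would leave the 3-hour reading unchanged."""
    return 3 * current_three_hour - sum(last_two_hourly)


# 1-hour estimates on 20/06, as quoted in the post.
estimates = {"11 am": 256, "12 pm": 454, "1 pm": 403}

current = sum(estimates.values()) / 3                      # 3-hour reading at 1 pm
forecast = next_three_hour([454, 403], assumed_next=350)   # 350 is a guess
threshold = break_even_hourly(current, [454, 403])

print(current)          # 371.0
print(round(forecast))  # 402
print(threshold)        # 256.0
```

The break-even value is simply the 11 am estimate that leaves the window: the 3-hour number only falls if the incoming hour is lower than the hour it replaces.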

The point of all this is not to have some mathematical fun with the data, but to recognize the limitation that the 3-hour readings inherently have. I still think that in an extraordinary situation like this we also need the 1-hour estimates, as the 3-hour readings, although more reliable, are not fast enough to capture the movement of the data.

(Update to this post: Here.)

