Tuesday, October 14, 2008

problems with the "blue sky day" metric

Late last month, consultant Steven Q. Andrews published an excellent report with a detailed analysis of the publicly reported daily API (Air Pollution Index) data for Beijing. His key findings, which were first published as an op-ed in the Wall Street Journal (and subsequently covered by the New York Times, Time, and others), are as follows (quoted from the report's abstract):
Here I show that reported improvements in air quality [in Beijing] for 2006–2007 over 2002 levels can be attributed to (a) a shift in reported daily PM10 concentrations from just above to just below the national standard, and (b) a shift of monitoring stations in 2006 to less polluted areas.
Many people, including Mr. Andrews, have been asking me my opinion on the findings of the report. Here, I will try to summarize what I consider to be the most impressive and surprising results, while also commenting on what the report's results do - and don't - show.

Summary: Statistical analysis of reported API frequency shows clear data biasing to reach Blue Sky Day targets, and, as such, I think it effectively invalidates the use of the annual number of Blue Sky Days as an air quality measure. This is an extremely impressive finding that I hope the Chinese government will respond to appropriately, both by reconsidering the use of the Blue Sky Day metric altogether and by investigating how such bias was introduced so that it can be eliminated in the future.

However, I caution against using this finding alone to make broader, sweeping assumptions about Beijing's air quality changes over the last five or ten years. Specifically, Mr. Andrews' report should not be used as proof that any and all recent improvements in air quality in Beijing have been simply the result of "gaming the numbers" as opposed to actual improvements. While Mr. Andrews' results regarding numbers of Blue Sky Days are dramatic, his analysis of pollutant concentrations in recent years focuses on only one pollutant, PM10, and shows concerning, but not dramatic, discrepancies from the officially reported data. Additionally, when discussing PM10, it should be noted that controlling concentrations of particulate matter is well known to be one of the biggest air quality challenges faced by Beijing, and that trends in PM10 concentration should not necessarily be equated with trends in other pollutants or even trends in overall air quality.

Finally, while I think Mr. Andrews' analysis normalizing air quality data across multiple years by accounting for the moving of monitoring stations is excellent, correct, and appropriate, I do not think it proves deliberate deceit about air quality (as the Blue Sky Day biasing does).


Blue Sky Day Data Biasing

First of all, to me, the most impressive graph in the report is this one, showing the dramatically higher frequency of reported PM10 concentration just below the Blue Sky Day cut-off than just above (Figure 2 from Mr. Andrews' report):

Equally impressive is this statement from the report:
While 52% of the days with a city API between 96 and 105 (PM10 = 142–160 μg m−3) were reported as ‘Blue Sky’ days in 2001, 98% of the days in this range were ‘Blue Sky’ days in 2006, and 93% of days in the range were ‘Blue Sky’ days in 2007.
This seems to show, unequivocally, that there is bias in the reported data around the Blue Sky Day cut-off point. From this data, it appears that the use of the number of Blue Sky Days metric is not reliable as an indicator of Beijing's air quality improvement. Therefore, I will stop using it as such and will edit a previous post on this blog that references it. I would hope that, in time, Beijing will recognize this clear bias and take steps towards identifying how it is introduced and preventing it in the future. At the same time, detailed investigations into potential biasing of other pollutant data should also be conducted (though it seems less likely that such biasing would have occurred, given the fact that the biasing appears to be related to meeting Blue Sky Day targets, for which PM10 is usually the limiting factor).
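To make the quoted range concrete, here is a minimal Python sketch of how an API value maps to a PM10 concentration, and how one could compute the share of Blue Sky Days within the band around the cut-off. The breakpoint table is my assumption (piecewise-linear interpolation, chosen because it reproduces the API 96–105 = PM10 142–160 μg m−3 correspondence quoted above); it is not taken from the report. The function names and the sample list in the usage note are hypothetical, and I am assuming a day counts as a Blue Sky Day when its API is at or below 100.

```python
# Assumed PM10 breakpoints as (API value, daily PM10 in ug/m3).
# These are an assumption consistent with the quoted range, not data from the report.
PM10_BREAKPOINTS = [(0, 0), (50, 50), (100, 150), (200, 350)]


def api_to_pm10(api):
    """Convert an API value to a PM10 concentration by linear interpolation."""
    segments = zip(PM10_BREAKPOINTS, PM10_BREAKPOINTS[1:])
    for (api_lo, conc_lo), (api_hi, conc_hi) in segments:
        if api_lo <= api <= api_hi:
            slope = (conc_hi - conc_lo) / (api_hi - api_lo)
            return conc_lo + (api - api_lo) * slope
    raise ValueError("API value outside the assumed breakpoint table")


def blue_sky_share(daily_api, lo=96, hi=105, cutoff=100):
    """Fraction of days in the band [lo, hi] whose API is at or below the cutoff.

    Returns None when no day falls inside the band.
    """
    in_band = [a for a in daily_api if lo <= a <= hi]
    if not in_band:
        return None
    return sum(1 for a in in_band if a <= cutoff) / len(in_band)
```

For example, `api_to_pm10(96)` and `api_to_pm10(105)` give the two ends of the quoted PM10 range, and `blue_sky_share` applied to one year of daily API values gives the kind of percentage quoted from the report (52% for 2001, 98% for 2006). If reporting were unbiased, one would not expect that share to jump so sharply from one year to another.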

Moving of Monitoring Stations

The second issue Mr. Andrews raises is the moving of monitoring stations. While I find his analysis here to be fascinating and correct, I'm not convinced that his results prove that the moving of the monitoring stations was driven by a desire to artificially lower reported air pollution levels by measuring in less polluted areas. He mentions that the new monitoring regulations put into effect in 2006 included "new specifications...regarding the minimum distance from roadways that air pollution should be monitored." I don't know enough about international monitoring practice to say whether these new specifications were designed simply to bring China's monitoring better in line with international standards. Whatever the case, Mr. Andrews' point that different measuring systems were used is valid:
It has been widely reported that the number of ‘Blue Sky’ days in Beijing increased from 100 in 1998 to 246 in 2007, but these reported trends encompass a period during which air quality was evaluated in three different ways: (1) 1998– 1999, based on the 1996 Chinese national ambient air quality standards (2) 2000–2005, based on the 2000 revisions of the Chinese national ambient air quality standards and using the 1984–2005 monitoring station locations (3) 2006–2007, based on the 2000 revisions of the Chinese national ambient air quality standard and using the 2006–2007 monitoring station locations.
Ideally, officially reported data in the future should note the change in monitoring methodology on graphs showing data from both periods.

Impacts on Pollutant Concentration

As Mr. Andrews points out in the report, the Blue Sky Day metric is a "policy-relevant metric," and, "an effective communication tool...to facilitate greater public understanding." In other words, it is not a scientific metric, insofar as the cut-off point of API = 100 is rather arbitrary. Evaluating the effect of the aforementioned bias and monitoring station location change on reported vs. actual air quality requires analyzing pollutant concentrations.

Using a methodology to eliminate the reporting bias and normalize across similar reporting stations, Mr. Andrews ran a new concentration analysis for PM10 and generated the following results:
In 2006, an annual average PM10 concentration of 161 μg m−3 was reported, however, if the monitoring station used from 1984 to 2005 continued to be used in 2006, the concentration would be ∼167 μg m−3—an average concentration ∼6 μg m−3 higher than reported. In 2007, an annual average PM10 concentration of 149 μg m−3 was reported, however, if the original monitoring stations continued to be used in 2007, the concentration would be ∼161 μg m−3—an average concentration of ∼12 μg m−3 higher than reported.
Stated differently, he concludes that Beijing's 2006 and 2007 reported values for PM10 were about 3.6% and 7.5% lower, respectively, than they would have been without data biasing or moving of monitoring stations. While this is concerning, the results are not nearly as dramatic as the difference in Blue Sky Days, as shown in Mr. Andrews' report (Figure 3 from the report):

In the above graph, note that the difference in Blue Sky Days (shown as columns) is much greater than the difference in PM10 concentration (shown as red lines). Note also that while the trending on Blue Sky Days changes dramatically based on the new analysis (increasing 2001-2005, decreasing 2005-2007), the trending on PM10 does not show a large change under the new analysis.
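The percentages quoted above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function name is mine, and I am assuming the percentages are expressed relative to the adjusted (original-station) concentration, since that is the denominator that reproduces the 3.6% and 7.5% figures.

```python
def percent_lower(reported, adjusted):
    """Percent by which the reported value falls below the adjusted value.

    Assumption: the adjusted concentration is used as the denominator.
    """
    return 100 * (adjusted - reported) / adjusted


# Annual average PM10 in ug/m3, as quoted from the report: (reported, adjusted)
pm10_by_year = {2006: (161, 167), 2007: (149, 161)}

for year, (reported, adjusted) in pm10_by_year.items():
    gap = percent_lower(reported, adjusted)
    print(f"{year}: reported value is {gap:.1f}% lower than adjusted")
```

Using the reported value as the denominator instead would give slightly larger percentages, but the conclusion is the same: the gap in concentration is a single-digit percentage, far smaller than the change in Blue Sky Day counts.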


As I mentioned earlier, Mr. Andrews' result regarding the biasing of annual numbers of Blue Sky Days is powerful and dramatic. However, I'm not sure the result regarding 2006-2007 concentrations of PM10 is as dramatic, especially given the fact that PM10 has notoriously been one of the most difficult pollutants for Beijing to control. The Beijing EPB's own data show a 2007 PM10 concentration of 149 μg m−3, higher than in 2003 and 2005. While adjusting the concentration data according to Mr. Andrews' analysis may be important, doing so does not qualitatively change the 2001-2007 PM10 trends in Beijing.

Mr. Andrews concludes his report:
Although nine continuous years of air quality improvement has been reported in Beijing between 1998 and 2007, my analysis finds that these improvements, as indicated by the annual number of ‘Blue Sky’ days, are due to irregularities in the monitoring and reporting of air quality and not to less polluted air. Reported variations in air quality that occur as a result of changes in monitoring station locations or air quality standards, should be considered as inconsistencies in the metrics and not as actual changes in air quality.
While I agree with his analysis showing data biasing in the numbers of annual Blue Sky Days over the past few years, I think it is critical to clarify that such biasing does not mean that there was no improvement whatsoever in Beijing air quality over the last decade.
