NCDC have just released version 3.1.0 of GHCN, the Global Historical
Climatology Network Monthly, their global Land Surface Air Temperature
product. The changes are detailed in a tech note, and the product itself is
documented in the dataset paper. This release does two things.
Firstly, it incorporates an array processing algorithm that significantly
speeds up processing. This will enable NCDC to process the much larger
databank holdings when the databank's first version is released in
early-to-mid 2012, and so form a yet more comprehensive estimate of the
evolution of global Land Surface Air Temperature.
Secondly, and this is the focus of this post, it incorporates a set of five
process bug fixes. Four of the bugs were discovered in the homogenization
algorithm as a result of an effort undertaken by Daniel Rothenberg, sponsored
by the Google Summer of Code and mentored by the Climate Code Foundation. The
final bug was discovered by carefully checking for similar bugs. All of them
essentially related to array compression / non-compression for missing values
on passing between routines.
That bugs exist in several thousand lines of code is hardly surprising. In
fact, it would have been far more surprising had no bugs been found. Daniel
visited NCDC as part of his project, and the bugs were discussed at length
with relevant NCDC staff. Fixes have subsequently been undertaken and
extensively validated, and their impacts on the analysis documented.
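To make this class of bug concrete, here is a minimal sketch in Python. It is not NCDC's code: the sentinel value, the function names and the "largest jump" search are all hypothetical, chosen only to show how an index computed on a compressed array (missing values removed) goes wrong when a caller treats it as an index into the uncompressed array.

```python
import numpy as np

MISSING = -9999.0  # hypothetical missing-value sentinel, for illustration only


def compress(series):
    """Drop missing values; return the valid values and their original indices."""
    valid = series != MISSING
    return series[valid], np.flatnonzero(valid)


def largest_jump_buggy(series):
    """Locate the largest step change, but return a compressed-array index."""
    values, _ = compress(series)
    k = int(np.argmax(np.abs(np.diff(values)))) + 1
    return k  # bug: the caller will read this as an index into `series`


def largest_jump_fixed(series):
    """Same search, mapped back to the uncompressed time axis."""
    values, original_index = compress(series)
    k = int(np.argmax(np.abs(np.diff(values)))) + 1
    return int(original_index[k])


# Toy monthly series with two missing values near the start.
series = np.array([10.0, MISSING, MISSING, 10.1, 10.0, 12.0, 12.1])

print(largest_jump_buggy(series))  # index in the compressed array
print(largest_jump_fixed(series))  # index in the original array
```

In this toy series the step change sits at position 5 of the original array, yet the buggy routine reports 3, because two missing values were squeezed out before the search. Carrying the original indices alongside the compressed values is one way to keep the two representations in step.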
The bottom-line impact on the global mean trend is a difference of less than
0.002 K/decade. That is below the precision of two decimal places typically
quoted for global mean estimates, and two orders of magnitude smaller than the
centennial-scale global-mean Land Surface Air Temperature warming rate
reported from this dataset. Equally, global annual means show negligible
differences. Differences at the station level are almost always below
0.2 K/decade, with effectively zero mean change. So, whilst the bug fixes were
important from both a science and a process perspective, they do not
significantly alter our current understanding of changes in climate at the
largest spatial scales and longest timescales.
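For readers who want to run this kind of comparison themselves, the sketch below shows one way to express a trend difference in K/decade. It assumes an ordinary least-squares linear fit, which may not be the method NCDC used, and the two series are synthetic stand-ins rather than GHCN data.

```python
import numpy as np


def trend_per_decade(years, temps):
    """Ordinary least-squares linear trend, in K per decade."""
    centred = years - years.mean()  # centring keeps the fit well conditioned
    slope_per_year = np.polyfit(centred, temps, 1)[0]
    return 10.0 * slope_per_year


# Synthetic annual means: NOT real data, just a made-up trend plus noise.
years = np.arange(1901, 2011, dtype=float)
rng = np.random.default_rng(0)
old = 0.008 * (years - years[0]) + rng.normal(0.0, 0.1, years.size)

# Hypothetical "new version": the old series plus a tiny extra drift.
new = old + 0.0001 * (years - years[0])

difference = trend_per_decade(years, new) - trend_per_decade(years, old)
print(f"trend difference: {difference:.4f} K/decade")
```

Because a least-squares slope is linear in the data, the difference between the two trends equals the trend of the difference series, here 0.001 K/decade by construction. The same function could be applied to annual means derived from the archived v3.0.0 and v3.1.0 files.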
What this does provide is an example of the very real potential value in
openness and transparency, in code replication, and in working in positive
partnership to resolve the issues that arise. Daniel aims to continue working
on his port of the algorithm to Python, and it will be of great interest to
see what other benefits may accrue.
NCDC have released the old (v.3.0.0) and new (v.3.1.0)
versions of the homogenized data (in frozen form) and other relevant metadata (with ongoing additions) at ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/v3/archives/.