Thursday, October 4, 2012

Beta release of first version of global land surface databank


Today marks the release of the first beta version of the global land surface databank constructed under the auspices of the International Surface Temperature Initiative’s Databank Working Group. The release is of monthly average temperatures from stations around the globe that have been made available without restriction.

The release will be in beta for a period of three months before an official first version release. We hope that during this time users will examine the release and provide feedback (preferably through the Initiative blog) and advice, so that the first version release is of the highest possible quality. Additional data submissions received prior to November 30th will be incorporated in the first version release.

The release consists of:
· Over 40 distinct source decks (compilations / holdings) submitted to the databank to date in Stage 0 (hardcopy / image, where available), Stage 1 (native digital format), and Stage 2 (converted to a common format, with provenance flags).
· A recommended merged product, and several variants thereon, all built from the Stage 2 holdings.
· All code used to process the data merge.
· Documentation necessary to understand the processing of the data at a high level.

The release is available from ftp://ftp.ncdc.noaa.gov/pub/data/globaldatabank/ . The merged product can be found at ftp://ftp.ncdc.noaa.gov/pub/data/globaldatabank/monthly/stage3/ . The recommended merge consists of over 39 thousand stations, whose records range in length from a few years to over two centuries.
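For readers who would rather browse the merged product from a script than from an FTP client, the following is a minimal sketch using Python's standard library. It assumes the server accepts anonymous logins and is still reachable at the address given above; the helper names `split_ftp_url` and `list_directory` are illustrative and are not part of the databank code.

```python
from ftplib import FTP
from urllib.parse import urlparse

# Location of the merged product, as given in this post.
MERGED_URL = "ftp://ftp.ncdc.noaa.gov/pub/data/globaldatabank/monthly/stage3/"


def split_ftp_url(url):
    """Split an ftp:// URL into (host, path). Hypothetical helper."""
    parts = urlparse(url)
    if parts.scheme != "ftp":
        raise ValueError(f"not an FTP URL: {url}")
    return parts.hostname, parts.path


def list_directory(url, timeout=30):
    """Return the entry names in the directory an ftp:// URL points to.

    Assumes anonymous access is permitted (an assumption, not confirmed
    by the post).
    """
    host, path = split_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()      # no arguments: anonymous login
        ftp.cwd(path)    # move into the requested directory
        return ftp.nlst()  # plain list of entry names


if __name__ == "__main__":
    for name in list_directory(MERGED_URL):
        print(name)
```

Running the script prints whatever the server lists in the stage3 directory. No file names are assumed here, since the post does not spell out the layout below that level.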



Most of these data have not been quality controlled or bias corrected. It is important to stress that the databank therefore does not constitute a climate data record / dataset suitable for monitoring long-term changes. Rather, it provides a basis from which research groups can create algorithms to produce climate datasets. The results from these algorithms can then be compared and benchmarked as part of the International Surface Temperature Initiative activities. We hope that many groups and individuals take up this challenge, which will lead to improved understanding of land surface air temperature changes, particularly at regional scales.

This release is the culmination of two years' effort by an international group of scientists to produce a truly comprehensive, open and transparent set of fundamental monthly data holdings. In the coming weeks a number of additional postings to the blog will attempt to explain different aspects of this databank.

More information on the Initiative and how to get involved can be found at www.surfacetemperatures.org .

4 comments:

  1. I'm wondering why homogenized Environment Canada data was used rather than the full database? Steven Mosher has made a scraper for the full daily and monthly databases (raw) that gives a lot more stations than the 300 homogenized ones. I would suggest actually using both.

    Replies
    1. Robert, we have used known data sources that are open. But 'known' here means known to us. There are almost certainly many 'unknown knowns', things we don't know about. Our preference is for raw data over homogenized so we don't nix subsequent homogenization. So for any sources such as this, it's not too late for inclusion. Of course, it is possible that the data is in one or more of the combobulated sources such as GHCND-raw or GHCNM-source etc. But we would also always prefer to use data with a better provenance chain back towards (preferably to) the original observation itself.

  2. Hi Peter,
    Here is the database I am referring to:
    http://climate.weatheroffice.gc.ca/advanceSearch/searchHistoricDataStations_e.html?searchType=stnProv&timeframe=1&lstProvince=&optLimit=yearRange&StartYear=1840&EndYear=2012&Month=11&Day=5&Year=2012&selRowPerPage=100&cmdProvSubmit=Search

    Unfortunately, Environment Canada has not made their database available as one individual file like they have with the homogenized stuff, meaning you have to download the data for each of the 8300 sites individually. However, in previous work Steven Mosher developed a scraper for me which downloads the data and collates it into a format similar to GHCN.

    http://stevemosher.wordpress.com/2011/08/05/chcn-canadian-historical-climate-network/

    If you would like access to this database and its 8300 entries then I suggest you email him and he will be able to get it to you or help you get it.

    From my experience this database includes more than the GHCN in many areas of Canada.

    Replies
    1. Thanks. We'll take a look. I'll try to remember to drop back here with an update. I know that NCDC has a bilateral with Canada so the dailies may be in the top-dog GHCND deck already. I'll pass this on to the databank team.
