Climategate battle: start sharing data
Nicholas Piël | December 11, 2009
Now that the dust has somewhat settled after climategate, the consensus seems to be that it was overblown. If you look at the timeline of events this isn’t surprising. Less than a single day passed between the public appearance of the leaked files and the first damning articles on the 20th. It is not difficult to question how thorough a review of 160 MB of data could have been in that time. It simply wasn’t thorough.
It was as if some people thought they had struck gold and were aggressively searching the leaked emails for that one quote that would make them instantly famous. All in all, it was a bit disappointing if you were hoping for exciting revelations. What could be distilled from the emails is that most researchers have strong opinions and big egos, but this shouldn’t really be a surprise.
It is naive to think that scientists are unbiased; they simply aren’t. However, they are expected to back up their views with unbiased facts. The main argument that’s left, if we ignore all the personal slander, centers on a quote in one of the emails concerning the WMO Statement on the status of the global climate in 1999. The front page of this report shows the picture below and indicates that 1990-1999 was the hottest decade on record. So yes, it is an argument about a 10-year-old report. It might be worth noting that a few days ago (8 December 2009), the World Meteorological Organization issued a new press release stating that our current decade is the warmest on record. That information probably got lost in the heated debate.
Conservative news sources claim that the following quote from the leaked emails is a clear sign of manipulation of evidence:
“I’ve just completed Mike’s Nature trick of adding in the real temps to each series for the last 20 years (ie from 1981 onwards) amd from 1961 for Keith’s to hide the decline.”
But is it? In its rebuttal, the Climatic Research Unit (CRU) states the following:
This email referred to a “trick” of adding recent instrumental data to the end of temperature reconstructions that were based on proxy data.
Phil Jones comments further: “One of the three temperature reconstructions was based entirely on a particular set of tree-ring data that shows a strong correlation with temperature from the 19th century through to the mid-20th century, but does not show a realistic trend of temperature after 1960. This is well known and is called the ‘decline’ or ‘divergence’. The use of the term ‘hiding the decline’ was in an email written in haste. CRU has not sought to hide the decline. Indeed, CRU has published a number of articles that both illustrate, and discuss the implications of, this recent tree-ring decline, including the article that is listed in the legend of the WMO Statement figure.”
They also provide an extra graph in which they show the climate reconstruction and the recent instrumental data separately:
So, as you can see there isn’t really anything shocking to report.
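To make the difference between the two presentations concrete, here is a minimal Python sketch of what "adding recent instrumental data to the end of a reconstruction" amounts to. It is not CRU's code: the `splice` function is a hypothetical helper of my own, and the numbers are made up purely for illustration. They are not real measurements.

```python
def splice(proxy, instrumental, cutoff_year):
    """Merge two {year: anomaly} series into one.

    Proxy values are kept for years before cutoff_year,
    instrumental values are used from cutoff_year onwards.
    """
    merged = {year: value for year, value in proxy.items()
              if year < cutoff_year}
    merged.update({year: value for year, value in instrumental.items()
                   if year >= cutoff_year})
    return dict(sorted(merged.items()))


# Made-up temperature anomalies in degrees C, for illustration only.
proxy = {1940: 0.00, 1950: -0.05, 1960: -0.02,
         1970: -0.10, 1980: -0.15, 1990: -0.20}
instrumental = {1960: -0.01, 1970: 0.00, 1980: 0.10, 1990: 0.30}

# Presentation 1: a single spliced curve. The proxy values after the
# cutoff are replaced, so the reader cannot see where they diverge.
spliced = splice(proxy, instrumental, cutoff_year=1961)

# Presentation 2: keep both series and report them side by side.
for year in sorted(set(proxy) | set(instrumental)):
    print(year, proxy.get(year), instrumental.get(year))
```

In the spliced series the post-1960 proxy values are gone, while the side-by-side listing keeps them visible next to the instrumental values, which is what the separate graph does.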
Our viewpoint on climate change seems closely linked to our position on the political spectrum. In the red corner, we have the conservatives, who consider any idea that might require them to change their way of living a threat. In the blue corner, we have the progressives, who feel that change is a goal and not just a method. During the first round of the climategate boxing match we mainly heard the conservative viewpoint, represented by the Telegraph, Fox News, the Washington Times and lots of infuriated bloggers. Now that the round is over, I think the focus will shift to a more progressive point of view. You see, whether or not climate change is happening, we will have to think about how we manage our environment. We are running out of resources and we are polluting our environment. If we do not act accordingly, we will end up like Easter Island.
Round 2: The need for data sharing
A positive result of this climate battle is the renewed focus on the public availability of data and methodologies. CRU claims that 95% of their data is already open to the public and that they will make the remaining 5% publicly available, which is great news. This movement of ‘data freeing’ is a great initiative, certainly in this time of collective sharing. John Wilbanks of Science Commons says the following:
“the irony that right at the historical moment when we have the technologies to permit worldwide availability and distributed process of scientific data, broadening collaboration and accelerating the pace and depth of discovery…..we are busy locking up that data and preventing the use of correspondingly advanced technologies on knowledge”.
Making our research widely available is a great way to catalyze progress in the broadest sense. This is probably better illustrated by the following video by Jesse Dylan.
The importance of data sharing is already recognized by the government of the USA, which has created data.gov with the purpose of increasing public access to high-value, machine-readable datasets generated by the Executive Branch of the Federal Government. It currently has over 118,000 different datasets, which really makes it a data miner’s wet dream.
My efforts on data sharing
In an effort to not just stand on the sidelines but participate in this ‘release your data’ party, I have decided to put my master’s thesis and its results in the public domain. For my master’s thesis I implemented a system, mostly in Python, that does person recognition on static images. You can compare it with what Google’s Picasa does. However, I was able to outperform Picasa in recognition rate on a few datasets. I have already released some of the source code on BitBucket and you can find a little more information on the Projects page.
In the next few months I am going to explain this approach in more detail and will put up my collected resources and a BibTeX file. I think this will be a great start for anyone interested in machine vision and person recognition. If you’re interested, just follow me on Twitter!