A recurrent issue in statistical climatology is how to deal with long-term trends when estimating correlations between two time series. A reader sent us the following question, posed to him by a friend and related to our test of the method applied by Mann et al. to produce the hockey-stick curve in 1998:
"Incidentally, the allegedly wrong hockey stick (keyword: Climategate) has not only [not] been refuted; the climate skeptics now have their own Climategate, since a calculation error was demonstrated in a study by, of all people, a climate skeptic (von Storch). He admitted the error, but he did not correct it in the journal in which he had published the article, only in a completely insignificant one, which can be read as an attempted cover-up. Of all studies, it is this one, now acknowledged as wrong by its own author, that climate skeptics cite so often."
I will not enter the question of why realclimate linked to the Comment published in Science by Wahl et al. (2006) and not to our response, although both were published side by side. One should sometimes, actually never, trust a blog as the sole source of information. Interestingly, the fact that 'the friend' was not aware that a response existed and had been published in the same journal led him or her immediately to assume dishonest behaviour. He or she did not bother to check whether the Comment published in Science had prompted a response. 'Confirmation bias' is present everywhere. It is probably unavoidable, but a useful Chinese proverb may offer some help: 'if you think you are 100% right, then you are wrong with a 95% probability'.
Instead of delving into Chinese philosophy formulated in terms of IPCC likelihoods, I will use this example to illustrate how the nasty trends present in many climate records challenge the design of regression models. The basic problem is that two series that display a prominent trend will always be correlated, regardless of whether or not they are physically related. In the press release on the Science Comment at that time (2006) we showed a nice example of a completely false inference based on the correlation between two trendy series: the Northern Hemisphere mean temperature and unemployment in West Germany over the last decades. Taking this correlation at face value, one could design a statistical model that predicts the Northern Hemisphere temperature from the unemployment figures, and this statistical model would even deliver a nice value of a validation statistic that is commonly used in climate reconstructions, the Reduction of Error (RE). This diagnostic weights the agreement between the mean values of the reconstruction and the target data more strongly than the correlation does; the correlation, in turn, focuses on the agreement between their short-term wiggles. Depending on how strong the interannual variability is relative to the long-term trend, either the RE or the correlation provides the more faithful measure of the skill of the estimation. In this example, the apparent agreement between both time series is obviously an artefact, since temperatures and unemployment are unrelated, but the problem illustrated here is present in many attempts to calibrate proxy records over the 20th century. As soon as a proxy record exhibits a trend, positive or negative, it will display an apparent correlation to the global mean temperature, and thus it might be taken as an adequate proxy to reconstruct the global mean during past times as well.
It may happen that this correlation is physically sound, and thus correctly interpreted, but when the series are trendy, one cannot be sure. The relationship between proxies and climate is often not physically obvious.
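The effect is easy to reproduce with made-up numbers. The sketch below is only an illustration under stated assumptions: the two series are synthetic (a linear trend plus independent random noise), not the real temperature or unemployment records, and all names and parameter values are hypothetical. The RE is computed in the form commonly used for reconstructions, one minus the ratio of the squared prediction error to the squared deviation of the observations from the calibration-period mean.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def linear_fit(x, y):
    """Ordinary least squares fit y ~ intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

def detrend(y):
    """Remove the least-squares linear trend in time from a series."""
    t = list(range(len(y)))
    intercept, slope = linear_fit(t, y)
    return [v - (intercept + slope * k) for k, v in zip(t, y)]

def reduction_of_error(obs, pred, calibration_mean):
    """RE = 1 - SSE / squared deviations from the calibration mean."""
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ref = sum((o - calibration_mean) ** 2 for o in obs)
    return 1.0 - sse / ref

# Synthetic, physically unrelated series: trend plus independent noise.
# The trend and noise amplitudes are arbitrary illustrative choices.
rng = random.Random(0)
n = 100
temperature = [0.01 * k + rng.gauss(0.0, 0.15) for k in range(n)]
unemployment = [2.0 + 0.08 * k + rng.gauss(0.0, 0.5) for k in range(n)]

# Correlation with and without the trends.
r_raw = pearson(unemployment, temperature)
r_detrended = pearson(detrend(unemployment), detrend(temperature))

# Calibrate a regression on the first half, validate on the second half.
half = n // 2
intercept, slope = linear_fit(unemployment[:half], temperature[:half])
predicted = [intercept + slope * u for u in unemployment[half:]]
re_score = reduction_of_error(
    temperature[half:], predicted, mean(temperature[:half])
)

print(f"correlation, raw series:       {r_raw:.2f}")
print(f"correlation, detrended series: {r_detrended:.2f}")
print(f"RE in the validation period:   {re_score:.2f}")
```

With these settings the raw correlation should come out high and the RE positive, while the detrended correlation stays near zero, although the two series share nothing but a trend. The positive RE arises because the regression carries the trend into the validation period and so gets its mean roughly right.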
Mann et al. (1998), in the lengthy description of their reconstruction method, mentioned at some points that they had used 'detrended variables' to calculate some diagnostics of the skill of their method. We interpreted this, wrongly as it turned out, to mean that they had detrended the proxy and temperature series to calibrate their statistical model. This was very soon taken as proof that we had made a calculation error and that the whole analysis was flawed. Our response to their Comment showed that, in essence, detrending or not detrending the data did not make a material difference, and that in both cases the method applied to produce the hockey stick would underestimate the long-term variations in most circumstances. Interestingly, our colleague and friend Gerd Bürger had also submitted a comment to Science in 2005 that raised very similar questions. A more elaborate version was eventually published in Geophysical Research Letters. But in 2005 the journal Science thought that Gerd's manuscript was not interesting enough to warrant publication. A few months later, it had changed its opinion, convincing me that for Science all authors are equal but some authors are more equal than others.
The stage was already set for prejudices to unfold and for the climate aficionados to choose their preferred sides. The paper by von Storch et al. (2004) was perceived by some as an attack on the hockey stick and, by the same token, on the larger corpus of anthropogenic warming, which it was not. The mannistas and the anti-mannistas stood poised to fend off the forays of their respective adversaries into their own territory, regardless of the contents of the Wahl et al. Comment or of our response, which quite likely very few people took the time to read.
This little episode had, however, a positive ending: later on, I had the chance to meet Eugene Wahl in person. He is one of the nicest scientists you can imagine, both personally and professionally, and, ironically, one of the most unfairly treated since Climategate.