Mesokurtosis - Taleb and Pinker squabble

Posted on 29 May 2015
Tags: history, statistics, modeling, half-baked

I’m trying to make sense of a debate between Nassim Nicholas Taleb and Steven Pinker that’s come into my awareness today. I’m aware of each writer’s schtick,¹ and their (the schticks, not necessarily the individuals) incompatibility, but had not known about their ongoing conflict. Then I saw² that Taleb was including technical points in his invectives and I could not resist getting drawn in:

Ystdy as @sapinker was uttering his journalistic BS was giving talk on pbls w/ fat tailed estimators https://t.co/Pakx9bQkWN @davidmanheim
— Nassim NicholنTaleb (@nntaleb) May 29, 2015

After having gotten a bit deeper, I think I should have just gone with a gut reading.

@othercriteria Seems to me it's a pretty classic case of “each states correct retaliatory points, then slightly overstates conclusions”
— Connor Flexman (@ConnorFlexman) May 29, 2015

Framing

Pinker’s basic claim is that “[t]oday we are probably living in the most peaceful time in our species’ existence”. In figures the claim looks like this:

Recent years have shown a dramatic decrease in violent deaths. Further, even in absolute number of deaths, WWI and WWII do not stand too far outside the historical norm. (Reproduced without permission but hopefully with forgiveness from Our World in Data.)

This assertion is presented as being counterintuitive, but only in that very special TED Talk sense, where the intuition being countered is summoned into existence for rhetorical purposes. What is being called up is not quite a straw man—the first half of the last century was filled with consequential and traumatic blood-letting and the latter half was occupied with the promise of even worse—but only because there’s no particular opponent to pin this argument on. Pinker is mostly refuting the zeitgeist. What seems to have happened is that Taleb has stepped in as the zeitgeist’s advocate.

For background on the conflict, I started with a Vox explainer written by Zach Beauchamp, which picks up the story several years after Pinker’s TED Talk (2007) and the publication of his book The Better Angels of Our Nature.

The most recent salient event is the release of a paper Taleb co-authored with Pasquale Cirillo, an applied probabilist. Beauchamp gives a broad interpretation that distills away most of the technical content:

Taleb and Cirillo’s core argument is that looking at raw numbers alone — casualty counts from different wars — is misleading. In order to understand the actual risk of large wars over time, they argue, you need some more complex statistical tools. Their paper uses a method called “extreme value theory”: a type of statistical analysis specifically designed to assess the probability of rare but extremely significant events, such as a world war.

Taleb and Cirillo conclude that there are two major flaws in Pinker’s theory. The first is that their analysis suggests huge conflicts (on the scale of 10 million casualties) only happen once a century, but Pinker’s “long peace” only covers 70 years. That could mean that what looks like a decline in violent conflict is merely a gap between major wars.

They also conclude that Pinker has underestimated the actual average casualty numbers in major wars by about three times, and that the real numbers don’t actually show a decline over time. If that’s right, his measurements of the apparent decline of war are overly rosy.

However, making the situation less clear, Andrew Gelman is quoted³ calling the problem “somewhat extrastatistical”. And Beauchamp makes the argument, or perhaps quotes Gelman making the argument, that “we can’t know whether the extraordinary peacefulness of the past 70 years is a continuing trend without really understanding why there have been so few deaths from war recently”.⁴

Taleb’s self-framing

It’s also worthwhile (or at least amusing) to look at how Taleb frames himself, since it deviates so far from Pinker’s image as a respectable Ivy League public intellectual.

A remarkable figure entitled “Genealogy of the INCERTO” on Taleb’s website is worth examining in detail and meditating on. The down-scaled version below is just a teaser to convince you to take a closer look:

(Reproduced without permission but hopefully with forgiveness from Taleb.)

Given the importance of “unknown unknowns” in Taleb’s work, I was surprised that he placed “Knightian uncertainty” completely outside the ambit of “Black Swan (Anti)Fragility”.

Skimming through Taleb’s sprawling work-in-progress Silent Risk, a “mathematical parallel version of the author’s Incerto”, I found this brief passage which seems to unify and focus much of Taleb’s work.

The difference between “models” and “the real world” ecologies lies largely in an additional layer of uncertainty that typically (because of the same asymmetric response by small probabilities to additional uncertainty) thickens the tails and invalidates all probabilistic tail risk measurements − models, by their very nature of reduction, are vulnerable to a chronic underestimation of the tails.

So tail events are not measurable; but the good news is that exposure to tail events is.

I also found this cartoon that seemed almost comically on the nose given my flip description of his schtick in an earlier endnote:

(Reproduced without permission but hopefully with forgiveness from Taleb and from George Nasr.)

Meta-framing

There’s also a higher-level framing that I feel sort of ashamed to have fallen into myself. Imagery and metaphors of war and violence pervade the secondary sources and responses to the Taleb-Pinker debate. Neither of the principals is nice to the other, but they themselves tend to use wordplay (“Fooled by Belligerence”) or unseemly ad hominem (assigning the name “The Pinker Problem” to a “class of naive empiricism”).

Volleys

Taleb: The “Long Peace” is a Statistical Illusion

Taleb’s first engagement with Pinker is in this bundle of material containing a nontechnical discussion, a published article, and an unpublished article. The middle part was later repurposed as Chapter 7 of Silent Risk.

There’s a strangeness in the rhetoric as Taleb moves from being “under the impression that [Pinker] simply misunderstood the difference between inference from symmetric, thin-tailed random variables [and] one from asymmetric, fat-tailed ones” to the fuzzier claim that:

Pinker doesn’t have a clear idea of the difference between science and journalism, or the one between rigorous empiricism and anecdotal statements. Science is not about making claims about a sample, but using a sample to make general claims and discuss properties that apply outside the sample.

If this is the problem, it’s not clear why heavy-duty technical machinery needs to be developed. Yet Taleb proceeds to spend many pages making, basically, the point that realizations of a point process don’t look like their intensity measures and that, depending on the point process, intuitive estimators of the intensity measure may show substantial bias.

Pinker: Fooled by Belligerence: Comments on Nassim Taleb’s “The Long Peace is a Statistical Illusion”

Pinker responds with less inflammatory language:

I was surprised to learn that Nassim Taleb had a problem with my book The Better Angels of Our Nature, because its analysis of war and terrorism harmonizes with Taleb’s signature themes. The chapter on major war begins with 21 pages on historians’ overinterpretation of temporal trends in war and could have been called “Fooled by Randomness.” It was followed dozen pages on the thick-tailed distribution of the magnitudes of wars which could have been subtitled “The Black Swan.” Yet rather than acknowledging our similar mindsets, Taleb has come out swinging, pummeling away at what he thinks is the message of the book, accompanied by a stream of trash-talk about my statistical competence.

Taleb shows no signs of having read Better Angels with the slightest attention to its content. Instead he has merged it in his mind with claims by various fools and knaves whom he believes he has bettered in the past. The confusion begins with his remarkable claim that the thesis in Better Angels is “identical” to Ben Bernanke’s theory of a moderation in the stock market. Identical! This alone should warn readers that for all of Taleb’s prescience about the financial crisis, accurate attribution and careful analysis of other people’s ideas are not his strong suits.

Cirillo and Taleb: On the tail risk of violent conflict and its underestimation

This paper draft is the focal point of the current skirmish. Its condensation into a presentation is the “talk on pbls w/ fat tailed estimators” that pulled me into this mess.

Its methodological novelty seems to come from this:

We apply methods from extreme value theory on log-transformed⁵ data to remove compact support, then, owing to the boundedness of maximum casualties, retransform the data and derive expected means.

The primary results are:

A new estimate of Pareto distribution parameter \(\alpha = 0.53 \pm 0.04\) for death toll from named conflicts.
A characterization of the bias of the sample mean as approximately \(\times 1 / 3\).
Memorylessness of onset of conflicts, for those with death toll above 50,000.⁶
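
To get a feel for the second result, here is a toy simulation, offered as a rough sketch under stated assumptions and not as a reproduction of Cirillo and Taleb’s procedure. It draws Pareto values with the quoted \(\alpha = 0.53\) on the unbounded scale, maps them back under a cap using the inverse of the transformation \(\phi\) given in the endnote, and compares the typical sample mean with the mean of the bounded distribution. The values of `L` and `H`, the number of events per simulated history, and all function names are my own illustrative choices.

```python
import math
import random
import statistics

ALPHA = 0.53   # tail exponent quoted above
L = 1.0e4      # ASSUMED lower threshold for conflict deaths (illustrative)
H = 7.2e9      # ASSUMED cap on casualties, roughly world population (illustrative)


def phi(x, lo=L, hi=H):
    """Map bounded casualties x in [lo, hi) onto the unbounded scale [lo, inf)."""
    return lo - hi * math.log((hi - x) / (hi - lo))


def phi_inv(y, lo=L, hi=H):
    """Inverse of phi: map an unbounded value y >= lo back into [lo, hi]."""
    return hi - (hi - lo) * math.exp((lo - y) / hi)


def draw_casualties(rng, alpha=ALPHA, lo=L, hi=H):
    """Draw Pareto(alpha) on the unbounded scale, then pull it back under the cap."""
    u = 1.0 - rng.random()            # uniform on (0, 1], so u is never zero
    y = lo * u ** (-1.0 / alpha)      # inverse-CDF sampling of a Pareto variable
    return phi_inv(y, lo, hi)


def true_mean(alpha=ALPHA, lo=L, hi=H, steps=200_000):
    """E[X] = lo + integral of P(X > x) over [lo, hi]; midpoint rule on a log grid."""
    a, b = math.log(lo), math.log(hi)
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = math.exp(a + (i + 0.5) * h)
        survival = (lo / phi(x, lo, hi)) ** alpha   # P(X > x) = P(Y > phi(x))
        total += survival * x * h                   # dx = x d(log x)
    return lo + total


def median_sample_mean(n_events=100, n_histories=2000, seed=0):
    """Median, across simulated histories, of the naive sample mean."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(draw_casualties(rng) for _ in range(n_events))
        for _ in range(n_histories)
    ]
    return statistics.median(means)


if __name__ == "__main__":
    mu = true_mean()
    typical = median_sample_mean()
    print(f"true mean: {mu:.3g}, typical sample mean: {typical:.3g}")
```

With a tail this fat, most simulated histories never see the rare enormous event that carries the bulk of the expectation, so the typical sample mean sits below the true mean. The size of the gap depends heavily on the assumed `L`, `H`, and sample size, so this should not be read as a check of the paper’s factor of three.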

Mark Buchanan, in “Violent warfare is on the wane, right?” on Medium’s Bull Market collection(?), gives a lucid and detailed summary of the paper.

Other interpretations

David Roodman

David Roodman responds to the controversy in the cutely-titled “Little Greek letters become weapons in war of words over trend in violence” and the follow-up “More violence”.

In “Little Greek letters”, he notes that the EVT models in the Cirillo and Taleb paper use a steady aggregate casualty rate of warfare, which is certainly an unrealistic assumption but a fair one to use as a null hypothesis. It is relative to this model that the current “Long Peace”, free of wars measured in megadeaths, is improbable. Taleb could explain this as us getting lucky, and Pinker as us getting wiser. Roodman accuses Taleb of “jousting with a caricature of Pinker”⁷, but his proposed fix of introducing a “post-1945 dummy” variable into the model is a poor idea for reasons I’ll describe shortly.

In “More violence”, Roodman narrows in on a more precise statement of what Cirillo and Taleb are doing wrong:

But I think if you are going use statistics to show that someone else is wrong, you should 1) state precisely what view you question, 2) provide examples of your opponent espousing this view, and 3) run statistical tests specified to test this view. Cirillo and Taleb skip the first two and hardly do the third. The “long peace” hypothesis is never precisely defined; Pinker’s work appears only in some orphan footnotes; the clear meaning of the “long peace”—a break with the past in 1945—is never directly tested for.

Roodman elaborates his proposal into a simulation study. For the test of autocorrelation that Cirillo and Taleb leave a bit vague, he uses Stata’s wntestq, which implements the Ljung–Box test. There are a few issues with this idea. First, it only addresses point “3)” from his excerpt, leaving the test unconnected from the argument. Second, picking out a particular year for the “trend break”⁸ adds a researcher degree of freedom, and each such degree of freedom makes rejections of the null easier to dismiss as false positives. Third, even rejecting the null is not a fatal blow, as it can alternatively be explained as serial correlation arising from Turchin-ish cliodynamic cycles.
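
For readers who haven’t met the Ljung–Box test, here is a minimal sketch of the statistic in plain Python. This is not Roodman’s Stata code and it uses no real conflict data: the inputs are synthetic series, the choice of 10 lags is arbitrary, and the function names are mine. The statistic is \(Q = n(n+2) \sum_{k=1}^{h} \hat\rho_k^2 / (n - k)\), referred to a chi-square distribution with \(h\) degrees of freedom. To stay within the standard library, the p-value uses a closed form that only holds for an even number of lags.

```python
import math
import random


def autocorrelation(xs, lag):
    """Sample autocorrelation of xs at the given lag."""
    n = len(xs)
    mean = sum(xs) / n
    denom = sum((x - mean) ** 2 for x in xs)
    numer = sum((xs[t] - mean) * (xs[t - lag] - mean) for t in range(lag, n))
    return numer / denom


def ljung_box_q(xs, max_lag):
    """Ljung-Box statistic Q = n (n + 2) * sum_k rho_k^2 / (n - k)."""
    n = len(xs)
    return n * (n + 2) * sum(
        autocorrelation(xs, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


def chi2_sf_even_df(x, df):
    """Chi-square upper tail P(X > x); this closed form is valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j          # builds (x/2)^j / j!
        total += term
    return math.exp(-half) * total


def ljung_box_test(xs, max_lag=10):
    """Return (Q, p-value) for the null of no autocorrelation up to max_lag."""
    q = ljung_box_q(xs, max_lag)
    return q, chi2_sf_even_df(q, max_lag)


def ar1_series(rng, n, coef):
    """Synthetic serially correlated series: x_t = coef * x_{t-1} + noise."""
    xs = [0.0]
    for _ in range(n - 1):
        xs.append(coef * xs[-1] + rng.gauss(0.0, 1.0))
    return xs


if __name__ == "__main__":
    rng = random.Random(1)
    memoryless_gaps = [rng.expovariate(1.0) for _ in range(500)]
    print("memoryless gaps:", ljung_box_test(memoryless_gaps))
    print("correlated series:", ljung_box_test(ar1_series(rng, 500, 0.8)))
```

A small p-value says only that some serial dependence exists at the tested lags. It does not say where that dependence comes from, and that gap is what the remaining objections turn on.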

I won’t summarize the blog post by Jay Ulfelder since, besides bringing in some additional political science context, it mostly colors in the lines drawn by Roodman.

Michael Spagat

In “Is the Risk of War Declining?”, Michael Spagat⁹ takes the position that the Cirillo and Taleb paper is not a fundamental challenge to Pinker’s thesis. He is skeptical that there is “a single war-generated process that has remained stable for 2,000 years”, given that the period encompasses “Roman tortoise and wedge formations, Mongolian horse warriors, Swiss pikemen, ironclad warships, trench warfare, nuclear attacks and suicide attacks”:

Second, the only channel that Cirillo and Taleb implicitly empower to potentially knock their war-generating mechanism off its pedestal is the accumulation of historical data on war sizes and timings. Since they focus on extreme wars, however, it will take a very long time before it is even possible for enough evidence to accumulate to seriously challenge their assumption of an unchanging war-generating mechanism.

In short, the authors declare that the risk of huge wars hasn’t really changed over two millennia of war and that they will stick with this belief until enough of a one particular type of slowly accumulating evidence appears to refute it. This stance may be fine for them but other people will wish to incorporate other evidence into their judgments of the risks we face.

Indeed, Pinker does offer additional evidence—including the spread of democracy, international trade, international organizations and the human rights protections for women, children, homosexuals and various minority groups—which suggests that recent downward trends in human violence will not suffer dramatic reversals. It is far-fetched in the extreme to imagine that Europe, with its centuries-long history of bloody wars, is now capable of descending into another World War I or World War II. I am happy making this judgment without a further 100 years of European history. I am similarly confident that the US has moved on to the point where another huge war over slavery is out of the question.

Whether this argument is compelling or not relies not just on historical mortality data but on qualitative historical judgments that may not convince everyone.

William M. Briggs

William M. Briggs gives a thoughtful and colorful review. Like Spagat’s response, whether it is convincing may depend on the ideological background of the reader. But, regardless, it’s an interesting read.¹⁰

Briggs makes the provocative meta-methodological point that “[t]he only reason to build a model of violence—and it’s a darn good reason—is to predict how many dead bodies we expect to create in the future (so we know where not to be).” I can’t agree with Briggs’s dismissal of modeling but I think his narrative about narratives in historical research is worth consideration:

Cirillo and Taleb looked for historian-defined “wars” and “conflicts” and not what we today call crime. These wars were classed as “events”, except when they lasted more than 25 years when the single event was cut up into multiple “events.” This makes the data more amenable to their model, but at the cost of changing reality. There are difficulties in counting the dead in named wars. Why not a year-by-year tally of violently killed regardless under what flag? Focusing on concrete historian-generated boundaries makes for better stories, but it hinders counting.

Conclusion

I hoped to dive into the Cirillo and Taleb paper and draw out some useful technical observations. Instead, I found that aspect of this debate pointless, given the almost complete lack of common ground shared by any of the participants in this conversation. It’s not clear to me how to fix this.

¹ Taleb: the right tail is dark and full of terrors; without (Chesterton-style) fences, we can wander into really bad places and not even know it until it’s too late. Pinker: the arc of history bends towards fewer people dying violently, so whatever caused this can only fairly be called justice. In spite of this, Pinker is the one who gets called a “neo-reactionary”.
² The tweet was apparently deleted.
³ Maybe this is from personal communication with Beauchamp, since I can’t find those words anywhere online? I would love to be able to see the full exchange between them.
⁴ Tangentially related to this, Andrew Boland summarizes Taleb’s prescription in The Black Swan as “avoid causal explanations, at least the ones made by people who are smarter than you” and then attempts to knock this down strawman-wise. But if one really took Taleb’s prescription to heart, one would already be immunized to Boland’s attack….
⁵ The transformation used, letting \(H\) be the maximum human population and \(L\) be the lower detectability threshold for conflict deaths, is \(\phi: [L,H] \to [L,\infty)\) given by \(\phi(x) = L - H \log((H - x) / (H - L))\).
⁶ For evidence of this claim, see Figure 8. However, although the caption asserts “no significant autocorrelation is visible”, there’s a suggestive peak at 15 years, which might line up with more-or-less numerological cycles.
⁷ This is somewhat related to the “motte-and-bailey” concept championed by Scott Alexander. The Taleb/Pinker feud has been discussed in SSC comments, but not in the context of motte-and-bailey.
⁸ Compare the analogous case of testing for a trend break in 1998 for climate data, which is typically used in AGW denial.
⁹ Spagat has also written on the (relatively) smaller-scale challenges of making sense of post-2003 invasion excess-death data in Iraq.
¹⁰ Fans of Zhang Xianzhong will be delighted to see that Briggs quotes the Seven Kill Stele!