Analysis 101: Question the Numbers

New York Times columnist Timothy Egan posted an editorial on June 19 criticizing Walmart and other corporations for alleged sins in the treatment of their employees. Walmart's brilliant response offers a great opportunity to teach a lesson in basic analysis.

Principle of Analysis #1: Always question the numbers.

You can't follow policy debates very long before someone starts throwing numbers and statistics around. Numbers sound authoritative, and most people, lacking strong math skills, tend to swallow them blindly as fact, which lets the numerically armed control the discussion. The mathematically challenged end up on the defensive because math makes the other guy sound like he's an expert (or at least knows more about the subject).

When someone offering a critical opinion starts tossing out statistics, the first thing any good analyst should do is evaluate the credibility of those numbers. You don't have to be a mathematician or statistician to do it; common sense is enough. Just ask these simple questions.


1. What's the source for the numbers? You'd be surprised how many people just make up numbers out of thin air or repeat numbers they've heard (or think they heard) without knowing their real origin. Don't be afraid to ask for a citation or Internet link. The inability to show you the primary source is a big red flag.

Take a look at the third paragraph of Egan's editorial, where he calls Walmart a "net drain on taxpayers." That's a conclusion that must be based in math (namely, "Walmart pays less in taxes than it receives in subsidies"), but Egan doesn't show his math and never offers either Walmart's tax bill or the total dollar value of the subsidies the company receives. Without those numbers, readers can't do the math for themselves.
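To make that concrete, here's a minimal Python sketch of the subtraction behind any "net drain" claim. The function name and the placeholder figures are mine, invented purely to show the arithmetic; they are not Walmart's actual taxes or subsidies, which is exactly the information the editorial leaves out.

```python
from typing import Optional


def net_drain(subsidies_received: Optional[float],
              taxes_paid: Optional[float]) -> float:
    """Return subsidies minus taxes. A positive result means a net drain."""
    # Without both inputs, the claim simply can't be checked.
    if subsidies_received is None or taxes_paid is None:
        raise ValueError("cannot check the claim: both figures are required")
    return subsidies_received - taxes_paid


# Placeholder figures (hypothetical, in any consistent unit).
print(net_drain(subsidies_received=10.0, taxes_paid=7.0))   # 3.0  -> net drain
print(net_drain(subsidies_received=10.0, taxes_paid=12.0))  # -2.0 -> net contributor
```

The point of the `ValueError` is the point of this section: if either figure is missing, the conclusion can't be verified, no matter how confident it sounds.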

So where does Egan get his numbers? He doesn't say. Later in the piece he references this April 2014 report published by Americans for Tax Fairness, so let's assume he found them there. What follows is a pretty ugly path back to the numbers' original source.

Americans for Tax Fairness says Walmart got $6.2 billion in tax subsidies in 2012. What's their source? This May 2013 report written by the Democratic Staff of the U.S. House Committee on Education and the Workforce.

Where did the 2013 Democratic Staff get their number? By updating this 2004 report from the same committee staff (which I had to dig to find), which asserted that "one 200-person Wal-Mart store may result in a cost to federal taxpayers of $420,750 per year."

Where did the 2004 Democratic Staff get their number? By asserting that Walmart's "low wages result in the following additional public costs being passed along to taxpayer..." followed by a list of expense totals from several federal programs like "free and reduced lunches" and "Section 8 housing assistance." But none of the 2004 report's 111 end notes is a citation showing where they got the raw numbers to calculate those totals.

Egan ⇒ Americans for Tax Fairness ⇒ 2013 Democratic Staff ⇒ 2004 Democratic Staff ⇒ ?

In other words, we (and Egan) don't really know where the numbers originated. So Egan's assertion that Walmart is a tax leech is likely grounded in numbers that are at least a decade old and have no cited origin. (We'll quickly see that they're also updated using a questionable methodology.)
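If it helps to see the tracing step as a procedure, here's a rough Python sketch that follows a chain of citations until it hits a source that cites nothing. The labels are my shorthand for the documents discussed above, not their formal titles, and the first link (Egan to Americans for Tax Fairness) is my assumption, since Egan never says where his numbers came from.

```python
from typing import Dict, List, Optional

# source -> the source it cites for the figure (None = no citation given)
CITES: Dict[str, Optional[str]] = {
    "Egan": "Americans for Tax Fairness (2014)",  # assumed link, not stated by Egan
    "Americans for Tax Fairness (2014)": "Democratic Staff (2013)",
    "Democratic Staff (2013)": "Democratic Staff (2004)",
    "Democratic Staff (2004)": None,  # raw numbers carry no citation
}


def trace(start: str, cites: Dict[str, Optional[str]]) -> List[str]:
    """Follow citations from `start` until a source cites nothing."""
    chain = [start]
    seen = {start}
    current = start
    while cites.get(current) is not None:
        current = cites[current]
        if current in seen:  # guard against sources that cite each other
            break
        seen.add(current)
        chain.append(current)
    return chain


chain = trace("Egan", CITES)
print(" => ".join(chain) + " => ?")
```

Wherever the chain stops is where you should be asking for the primary data.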

If Egan knows any of this, he doesn't say it. I'm sure his columns have a word-count limit and he probably doesn't want to chew up column inches with methodological matters, but the omission leaves readers with the impression that he's using current numbers drawn from primary sources, not 10-year-old numbers with unknown sources. If Egan hasn't done what I just did, he might not realize it himself. He might've just assumed that Americans for Tax Fairness know what they're talking about, a classic case of the logical fallacy known as "appeal to authority." More on that in a future post.


2. Are the numbers raw or calculated?  When numbers are derived from or adjusted by a mathematical operation or algorithm, there's always the possibility of computational error. Does the person citing the numbers understand how they were calculated? If not, you might offer them a copy of this classic tome.

In our case study, Egan's assertion is based on simple subtraction: subsidies received minus taxes paid equals a positive number, therefore Walmart is "a net drain on taxpayers." Simple enough...except that it turns out that's not the only mathematical calculation being attempted in that trail of numbers we just tracked. The 2013 Democratic Staff couldn't just use the same numbers in the 2004 report -- almost ten years had passed after all. How did they update them? They extrapolated subsidies data from a single state (Wisconsin), took the medians, and applied them to the other 49 states to come up with an adjusted total for the whole country.

That's a pretty lousy approach and they know it -- the Democratic Staff admits in its report that "because of varying program eligibility requirements across states, extrapolating taxpayer costs for Walmart stores in other states based on the Wisconsin data is difficult." Difficult isn't the word for it; "absurd" would be a better term. Add in the disparities and 10-year changes among the 50 states' demographics and any final estimate they could possibly offer would be highly questionable. Nor do they explain that some federal programs impose unfunded mandates on the states, leaving the reader to imagine that the federal government is paying the full 100%. That's not true -- for example, the US government splits Medicaid funding about 50-50 with the states. Do the 2004 or 2013 numbers adjust for that 50% split? We don't know. They don't show us the math.

So even if the 2004 total is solid (which is unconfirmable), the adjusted 2013 number from which Egan must be subtracting Walmart's tax bill is almost certainly bogus.
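Here's a small Python sketch of why single-state extrapolation is so fragile. Every figure in it is invented for illustration (three made-up states, made-up per-store costs and store counts); none of it comes from either report. It compares a total built from each state's own figure against totals built by applying one state's figure everywhere, then shows how much a federal/state cost split changes the "federal taxpayer" number.

```python
# All figures are hypothetical and exist only to show the arithmetic.
states = {
    "State A": {"cost_per_store": 900_000, "stores": 100},
    "State B": {"cost_per_store": 500_000, "stores": 300},
    "State C": {"cost_per_store": 300_000, "stores": 200},
}


def actual_total(data: dict) -> int:
    """Sum each state's own per-store cost times its own store count."""
    return sum(s["cost_per_store"] * s["stores"] for s in data.values())


def extrapolate_from(base: str, data: dict) -> int:
    """Apply one state's per-store cost to every store in every state."""
    total_stores = sum(s["stores"] for s in data.values())
    return data[base]["cost_per_store"] * total_stores


def federal_share(total: float, federal_fraction: float) -> float:
    """Portion of a program's cost actually borne by federal taxpayers."""
    return total * federal_fraction


print(f"{actual_total(states):,}")                 # 300,000,000
print(f"{extrapolate_from('State A', states):,}")  # 540,000,000
print(f"{extrapolate_from('State C', states):,}")  # 180,000,000
print(f"{federal_share(actual_total(states), 0.5):,.0f}")  # 150,000,000
```

In this toy example the "national" estimate swings from 180 million to 540 million depending solely on which state you start from, and a 50-50 cost split cuts the federal figure in half again. That's the kind of sensitivity a report needs to disclose before anyone subtracts a tax bill from its total.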


3. How are the variables defined? When arguing over assertions based in math, one way proponents and opponents shape statistics to their liking is by shaping the definitions that determine which numbers get included in the calculations. So always ask where the lines are drawn.

For example, in our Walmart case study, the 2004 and 2013 Democratic Staffs both tried to total Walmart's government subsidies, but how did they define "subsidies"? Are they just talking about federal subsidies or are state and local subsidies folded in? The 2004 report says "one 200-person Wal-Mart store may result in a cost to federal taxpayers..." (emphasis mine), implying that only federal subsidies are included in the tally, but they only list six. How can that be? There are hundreds of low-income federal subsidy programs, so did they really only use six in their total? If so, how did they choose those six? Did they use more but not list them? We don't know either way.

The 2013 report compounds the problem by trying to use one state's data (Wisconsin) to update the 2004 number, but its list of subsidy programs is slightly different from the one used to calculate the original 2004 total -- Medicaid is one example that appears in the 2013 report but not the 2004 report. So the updated number Egan relies on uses a different definition of "subsidies" than the one used to calculate the 2004 number that it "updates."


Confused yet? If not, good for you. If so, there's no shame in it -- that confusion is what some "experts" count on. I don't know whether any of these groups are trying intentionally to confuse the audience, but the end result is the same either way. They throw out a blizzard of numbers that appear to support their argument, the audience throws up its collective hands in despair, and everyone just assumes credibility.

Don't assume that. Spouting numbers doesn't make someone credible. Some of Egan's other assertions in his piece might be right (and some of Walmart's counterpoints might be wrong)...but given how many problems there are with his first conclusion in an editorial full of conclusions ostensibly drawn from quantitative data, I wouldn't count on it.

Always question the numbers.