Data :: Peter Frase

Data

Not a riot, it’s a rebellion

August 14th, 2014 | Published in Data, Politics

[Context](http://rap.genius.com/The-coup-the-coup-lyrics).

Solidarity to the people of Ferguson, Missouri, and a hearty fuck you to the cops, their bosses, and to anyone who wants to blather about "rioters" and otherwise engage in bogus "both sides" [equivalency](http://www.businessinsider.com/here-comes-obamas-statement-on-ferguson-2014-8) instead of keeping the focus on the extrajudicial executions of these state-sanctioned death squads. See also Robert Stephens II for an excellent [analysis](https://www.jacobinmag.com/2014/08/in-defense-of-the-ferguson-riots/) of the actions of the people in Ferguson as part of a process of political mobilization rather than simply undirected vandalism.

What is happening in Missouri is horrifying, yet unusual only in the attention it's receiving. I hope it at least wakes people up to the nature of our heavily militarized police forces---Ferguson is in no way unusual. The other day I sent my editors a draft manuscript for the longer-form adaptation of [Four Futures](https://www.jacobinmag.com/2011/12/four-futures/). In discussing the fourth of those futures, Exterminism, I describe the widespread militarization of the police in the United States, which has its roots in the 1960's but has intensified in the post-9/11 period.

This is a literal case of "bringing the war home." Many of the tanks and other equipment that can be found even in small towns are surplus military equipment, given away to police departments when no longer needed in Iraq or Afghanistan. And of course many cops are veterans, who had a chance to learn from the American government's callous approach to civilian life abroad. I struggled to finish that chapter, because it seemed every day brought a new and more horrifying example of what I was writing about.

It all leads here:

This isn't a movie scene #reality RT @FOX2now: Officers stand in a mist of tear gas. Protected by masks. #Ferguson pic.twitter.com/afsqru8rwg

— Trinna Leong (@trinnaleong) August 12, 2014

But I'm only repeating what many are now saying. As some kind of substantive contribution, I figured I'd refute a specific canard that arises from defenders of the [warrior cops](http://www.amazon.com/Rise-Warrior-Cop-Militarization-Americas/dp/1610392116) in situations like this. That is, that all of these trappings of military occupation are necessary because of the oh so dangerous environment the police supposedly face.

Policing is not the country's safest job, to be sure. But as the Bureau of Labor Statistics' [Census of Fatal Occupational Injuries](http://www.bls.gov/iif/oshwc/cfoi/cfoi_rates_2012hb.pdf) shows, it's far from the most dangerous. The 2012 data reports that for "police and sheriff's patrol officers," the Fatal Injury Rate---that is, the "number of fatal occupational injuries per 100,000 full-time equivalent workers"---was 15.0. And that includes all causes of death---of the 105 dead officers recorded in the 2012 data, only 51 [died](http://www.bls.gov/iif/oshwc/cfoi/cftb0272.pdf) due to "violence and other injuries by persons or animals." Nearly as many, 48, died in "transportation incidents," e.g., crashing their cars.

Here are some occupations with higher fatality rates than being a cop:

* Logging workers: 129.9
* Fishers and related fishing workers: 120.8
* Aircraft pilots and flight engineers: 54.3
* Roofers: 42.2
* Structural iron and steel workers: 37.0
* Refuse and recyclable material collectors: 32.3
* Drivers/sales workers and truck drivers: 24.3
* Electrical power-line installers and repairers: 23.9
* Farmers, ranchers and other agricultural managers: 22.8
* Construction laborers: 17.8
* Taxi drivers and chauffeurs: 16.2
* Maintenance and repair workers, general: 15.7

Of these, construction labor is the one I've done myself. [This](http://www.bgdlegal.com/clientuploads/Publications/Blog%20and%20Article%20Photos/Construction%20Helmet.png) was what our required body armor looked like.

And for good measure, some more that approach the allegedly terrifying risks of being a cop:

* First-line supervisors of landscaping, lawn service, and groundskeeping workers: 14.7
* Grounds maintenance workers: 14.2
* Athletes, coaches, umpires, and related workers: 13.0

While being a cop might not be all that dangerous, being in the presence of cops certainly is. In 2012, there were a minimum of [410 people](http://www.fbi.gov/about-us/cjis/ucr/crime-in-the-u.s/2012/crime-in-the-u.s.-2012/offenses-known-to-law-enforcement/expanded-homicide/expanded_homicide_data_table_14_justifiable_homicide_by_weapon_law_enforcement_2008-2012.xls) killed by police, and that includes only those reported to the FBI under the creepy category of "justifiable homicide." The [real number](http://www.thedenverchannel.com/news/teens-shooting-highlights-need-for-tracking-people-killed-by-police) is probably closer to 1000.

Of course, nobody who knows anything about what police actually do, and isn't pushing a reactionary political agenda, thinks cops actually need to be dressed in heavier armor than the [occupiers of Iraq and Afghanistan](https://storify.com/AthertonKD/veterans-on-ferguson). And the fact that you have a better than 1-in-1000 chance of dying in any given year in certain jobs is itself scandalous. But perhaps looking at these numbers helps put the real nature of American policing in a somewhat different perspective.
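To make the "1-in-1000" comparison concrete, here is a minimal sketch that converts a few of the rates quoted above into "one death per N workers per year." The function name is my own, and treating a rate per 100,000 full-time equivalent workers as an individual worker's annual odds is a simplification.

```python
# Fatal injury rates per 100,000 full-time equivalent workers,
# copied from the 2012 figures quoted in the post.
RATES_PER_100K = {
    "Logging workers": 129.9,
    "Fishers and related fishing workers": 120.8,
    "Aircraft pilots and flight engineers": 54.3,
    "Police and sheriff's patrol officers": 15.0,
}


def one_in_n(rate_per_100k: float) -> float:
    """Convert a rate per 100,000 workers into 'one death per N workers'."""
    return 100_000 / rate_per_100k


for job, rate in RATES_PER_100K.items():
    print(f"{job}: about 1 in {one_in_n(rate):,.0f} per year")
```

Any rate above 100 per 100,000 is worse than 1 in 1000; of the occupations listed, only logging and fishing cross that line.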

Infotainment Journalism

May 14th, 2014 | Published in Data, Statistics

We seem, mercifully, to have reached a bit of a [backlash](http://america.aljazeera.com/opinions/2014/3/nate-silver-new-mediajournalismwebstartups.html) to the [data journalism](http://www.salon.com/2014/04/21/538s_existential_problem_what_a_mystifying_failure_can_learn_from_grantland/)/[explainer](http://theplazz.com/2014/05/2190465/) hype typified by sites like Vox and Fivethirtyeight. Nevertheless, editors in search of viral content find it irresistible to crank out clever articles that purport to illuminate or explain the world with "data".

Now, I am a big partisan of using quantitative data to understand the world. And I think the hostility to quantification in some parts of the academic Left is often misplaced. But what's so unfortunate about the wave of shoddy data journalism is that it mostly doesn't use data as a real tool of empirical inquiry. Instead, data becomes something you sprinkle on top of your substanceless linkbait, giving it the added appearance of having some kind of scientific weight behind it.

Some of the crappiest pop-data science comes in the form of viral maps of various kinds. Ben Blatt [at Slate](http://www.slate.com/articles/arts/culturebox/2014/04/viral_maps_the_problem_with_all_those_fun_maps_of_the_u_s_plus_some_fun.html) goes over a few of these, pertaining to things like baby names and popular bands. He shows how easy it is to craft misleading maps, even leaving aside the inherent problems with using spatial areas to represent facts about populations that occur in wildly different densities.

Having identified the pitfalls, Blatt then decided to try his hand at making his own viral map. And judging by the number of times I've seen his maps of [the most widely spoken language](http://www.slate.com/articles/arts/culturebox/2014/05/language_map_what_s_the_most_popular_language_in_your_state.html) in each state on Facebook, he succeeded. But in what is either a sophisticated troll or an example of "knowing too little to know what you don't know", Blatt's maps themselves are pretty uninformative and misleading.

The post consists of several maps. The first simply categorizes each state according to the most commonly spoken non-English language, which is almost always Spanish. Blatt calls this map "not too interesting", but I'd say it's the best of the bunch. It's the least misleading while still containing some useful information about the French-speaking clusters in the Northeast and Louisiana, and the holdout German speakers in North Dakota.

The next map, which shows the most common non-English and non-Spanish language, is also decent. It's when he starts getting down into more and more detailed subcategories that Blatt really gets into trouble. I'll illustrate this with the most egregious example, the map of "Most Commonly Spoken Native American Language".

Part of the problem is the familiar statistician's issue of sample size. The American Community Survey data that Blatt used to make his maps is extremely large, but you can still run into trouble when you're looking at a small population and dividing it up into 50 states. Native Americans are a tiny part of the population, and those who speak an indigenous language are an even smaller fraction. The more severe issue, though, is that this map would be misleading even if it were based on a complete census of the population.

That's because the Native American population in the United States is extremely unevenly distributed, due to the way in which the American colonial project of genocide and resettlement played out historically. In some areas, like the southwest and Alaska, there are sizable populations. In much of the east of the country, there are vanishingly small populations of people who still speak Native American languages. And without even going to the original data (although I did [do that](https://www.census.gov/hhes/socdemo/language/data/other/detailed-lang-tables.xls)), you can see that there are some things majorly wrong here. But you need a passing familiarity with the indigenous language families of North America, which is basically what I have from a cursory study of them as a linguistics major over a decade ago.

We see that Navajo is the most commonly spoken native language in New Mexico. That's a fairly interesting fact, as it reflects a sizeable population of around 63,000 speakers. But then, we could have seen that already from the previous "non-English and Spanish speakers" map.

But now look at the northeast. We find that the most commonly spoken native language in New Hampshire is Hopi; in Connecticut it's Navajo; in New Jersey it's Sahaptian. What does this tell us? The answer is, approximately nothing. The Navajo and Hopi languages originate in the southwest, and the Sahaptian languages in the Pacific northwest, so these values just reflect a handful of people who moved to the east coast for whatever reason. And a handful of people it is: do we really learn anything from the fact there are 36 Hopi speakers in New Hampshire, compared to only 24 speaking Muskogee (which originates in the south)? That is, if we could even know these were the right numbers. The standard errors on these estimates are larger than the estimates themselves, meaning that there is a very good chance that Muskogee, or some other language, is actually the most common native language in New Hampshire.
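A quick sketch of why estimates like these can't be ranked. The counts (36 and 24) are the ones quoted above, but the standard errors below are hypothetical: I've picked values that are merely larger than the estimates, as described, and I'm assuming the two estimates are independent.

```python
import math


def z_for_difference(est_a, se_a, est_b, se_b):
    """z-score for the difference between two independent survey estimates."""
    return (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# Estimates come from the post; the standard errors are HYPOTHETICAL,
# chosen only to be larger than the estimates themselves.
hopi, hopi_se = 36, 40          # se is an assumption
muskogee, muskogee_se = 24, 30  # se is an assumption

z = z_for_difference(hopi, hopi_se, muskogee, muskogee_se)
print(f"z = {z:.2f}")
```

With these assumed errors the z-score is far below the roughly 1.96 conventionally required for a difference at the 95 percent level, so the ordering of the two languages is not something the data can settle.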

I suppose this could be regarded as nitpicking, as could the similar things I could say about some of the other maps. Boy, finding out about those 170 Gujarati speakers in Wyoming sure shows me what sets that state apart from its neighbors! OMG, the few hundred Norwegian speakers in Hawaii might slightly outnumber the Swedish speakers! (Or not.) Even the "non-English and Spanish" map, which I generally kind of like, doesn't quite say as much as it appears---or at least not what it appears to say. The large "German belt" in the plains and mountain west reflects low linguistic diversity more than a preponderance of Krauts. There is a small group of German speakers almost everywhere; in most of these states, the percentage of German speakers isn't much greater than the national average, which is well under 1 percent. In some, like Idaho and Tennessee, it's actually lower.

I belabor all this because I take data analysis seriously. The processing and presentation of quantitative data is a key way that facts are manufactured, a source of things people "know" about the world. So it bothers me to see the discursive pollution of things that are essentially vacuous "infotainment" dressed up in fancy terms like "data science" and "data journalism". I mean, I get it: it's fun to play with data and make maps! I just wish people would leave their experiments on their hard drives rather than setting them loose onto Facebook where they can mislead the unwary.

Trumbo’s Taxes

April 15th, 2014 | Published in Data, Statistical Graphics

Having filed my taxes in my customarily last-minute fashion, I thought I'd get in on the tax day blogging thing. Via [Sarah Jaffe](http://adifferentclass.com/), I came upon the following interesting passage from Victor Navasky's history of the Hollywood blacklist, [*Naming Names*](http://www.amazon.com/Naming-Names-Victor-S-Navasky/dp/0809001837):

> Conversely, during the blacklist years, which were also tight money years for the studios, agents often found it simpler to hint to their less talented clients that their difficulties were political rather than intrinsic. Since agents as a class follow the money, it is perhaps a clue to the environment of fear within which they operated that, for example, the Berg-Allenberg Agency was, even in late 1948, ready, eager, willing, and able to lose its most profitable client, Dalton Trumbo (at $3000 per week he was one of the highest paid writers in Hollywood)---and this even before the more general system of blacklisting had gone into effect.

The first thing that struck me about this was that, wow, that's a lot of money. It's not clear where the figure came from. But Navasky did interview Trumbo for the book, so I have to assume it came from the man himself. Now, presumably Trumbo wasn't working all the time, but rather getting picked up for various jobs with slack periods in between. But supposing for a moment that he did: $3000 a week (or $156,000 a year) would be a pretty cushy life *now*, so it would have been an astronomical amount of money in 1948. (And it's highly likely that there were people in Hollywood who were making that much. Ben Hecht is said to have gotten [$10,000 a week](http://www.imdb.com/name/nm0372942/bio).)

The second thing is to note that even being as rich and famous as Dalton Trumbo wasn't enough to protect him from the blacklist. In general, of course, the rich stick together and protect their own. But there are some lines you still can't cross, and the blacklist was one of them. In the end, ideological discipline trumped the solidarity of rich people. Which is what makes the rare radical defectors from the ruling class so significant.

But my final thought was, I wonder what Trumbo's net income would have been, had he made that much money? After all, that was the heyday of high marginal tax rates in the United States, those legendary 90 percent tax brackets that seem so unimaginable to people now. So I got to wondering how much Trumbo would have paid in taxes then, and how much he would have paid on a comparable amount of money today.

Fortunately, the Tax Foundation provides excellent data on historical tax rates. I used the spreadsheet [here](http://taxfoundation.org/sites/taxfoundation.org/files/docs/fed_individual_rate_history_nominal_adjusted-2013_0523.xls), which describes the federal income tax regimes from 1913 to 2013. Using that data, we can get a rough approximation of how much our hypothetical Dalton Trumbo would have paid in taxes, although of course it doesn't take into account any particular deductions or loopholes that may have played into an individual situation---and it's well known that few people actually paid the very high marginal rates of that time. So take this as a quick sketch, meant to demonstrate two things. First, how much our tax rates have changed, and second, how marginal tax rates really work.

Here's a table showing how Trumbo's income would have broken down in 1948. Each line shows a single tax bracket. The first three columns show the rate at which income in that bracket was taxed, and the lower and upper bounds that defined which income was taxed at that rate. The last two columns show how much income Trumbo received in each bracket, and how much tax he would have owed on it.

| Tax Rate | Over | But Not Over | Income | Taxes |
|---|---|---|---|---|
| 20.0% | $0 | $2,000 | $2,000 | $400.00 |
| 22.0% | $2,000 | $4,000 | $2,000 | $440.00 |
| 26.0% | $4,000 | $6,000 | $2,000 | $520.00 |
| 30.0% | $6,000 | $8,000 | $2,000 | $600.00 |
| 34.0% | $8,000 | $10,000 | $2,000 | $680.00 |
| 38.0% | $10,000 | $12,000 | $2,000 | $760.00 |
| 43.0% | $12,000 | $14,000 | $2,000 | $860.00 |
| 47.0% | $14,000 | $16,000 | $2,000 | $940.00 |
| 50.0% | $16,000 | $18,000 | $2,000 | $1,000.00 |
| 53.0% | $18,000 | $20,000 | $2,000 | $1,060.00 |
| 56.0% | $20,000 | $22,000 | $2,000 | $1,120.00 |
| 59.0% | $22,000 | $26,000 | $4,000 | $2,360.00 |
| 62.0% | $26,000 | $32,000 | $6,000 | $3,720.00 |
| 65.0% | $32,000 | $38,000 | $6,000 | $3,900.00 |
| 69.0% | $38,000 | $44,000 | $6,000 | $4,140.00 |
| 72.0% | $44,000 | $50,000 | $6,000 | $4,320.00 |
| 75.0% | $50,000 | $60,000 | $10,000 | $7,500.00 |
| 78.0% | $60,000 | $70,000 | $10,000 | $7,800.00 |
| 81.0% | $70,000 | $80,000 | $10,000 | $8,100.00 |
| 84.0% | $80,000 | $90,000 | $10,000 | $8,400.00 |
| 87.0% | $90,000 | $100,000 | $10,000 | $8,700.00 |
| 89.0% | $100,000 | $150,000 | $50,000 | $44,500.00 |
| 90.0% | $150,000 | $200,000 | $6,000 | $5,400.00 |
| 91.0% | $200,000 | - | $0 | $0.00 |

This is a nice illustration of how marginal tax rates work. There is still, unbelievably, widespread confusion about this. People think that if the marginal tax rate is 90 percent on income over $150,000---as it was in 1948---then that means you'll only keep 10 percent of all your income if you make that much money. But Trumbo wouldn't pay 90 percent on all of his $156,000, only on the $6000 that was over the $150,000 threshold.

So what was Trumbo's real, overall tax rate? The tax figures above sum up to a total bill of $117,220. The Tax Foundation data also describes some additional reductions that were applied that year: 17 percent on taxes up to $400, 12 percent on taxes from $400 to $100,000, and 9.75 percent on taxes above $100,000. Taking those reductions into account, the tax bill comes down to $103,521.
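For anyone who wants to check the arithmetic, here is a minimal sketch of the calculation. The bracket bounds and rates are copied from the table above; the function names are mine, and the reductions are applied slice by slice to the tax bill, which is my reading of the description and happens to reproduce the figures quoted.

```python
# 1948 brackets from the table: (lower bound, upper bound, marginal rate).
BRACKETS_1948 = [
    (0, 2_000, 0.20), (2_000, 4_000, 0.22), (4_000, 6_000, 0.26),
    (6_000, 8_000, 0.30), (8_000, 10_000, 0.34), (10_000, 12_000, 0.38),
    (12_000, 14_000, 0.43), (14_000, 16_000, 0.47), (16_000, 18_000, 0.50),
    (18_000, 20_000, 0.53), (20_000, 22_000, 0.56), (22_000, 26_000, 0.59),
    (26_000, 32_000, 0.62), (32_000, 38_000, 0.65), (38_000, 44_000, 0.69),
    (44_000, 50_000, 0.72), (50_000, 60_000, 0.75), (60_000, 70_000, 0.78),
    (70_000, 80_000, 0.81), (80_000, 90_000, 0.84), (90_000, 100_000, 0.87),
    (100_000, 150_000, 0.89), (150_000, 200_000, 0.90),
    (200_000, float("inf"), 0.91),
]


def gross_tax(income, brackets):
    """Tax each slice of income at its own bracket's rate and sum the pieces."""
    total = 0.0
    for lower, upper, rate in brackets:
        if income > lower:
            total += (min(income, upper) - lower) * rate
    return total


def reduction_1948(tax):
    """17% of the first $400 of tax, 12% of tax between $400 and $100,000,
    and 9.75% of tax above $100,000 (my reading of the reductions)."""
    return (0.17 * min(tax, 400)
            + 0.12 * max(0, min(tax, 100_000) - 400)
            + 0.0975 * max(0, tax - 100_000))


income = 156_000
tax = gross_tax(income, BRACKETS_1948)
net_tax = tax - reduction_1948(tax)
print(f"gross tax:      ${tax:,.0f}")
print(f"after cuts:     ${net_tax:,.0f}")
print(f"effective rate: {net_tax / income:.0%}")
```

Note that the 90 percent rate only ever touches the last $6,000 of income; everything below $150,000 is taxed at the lower rates of its own bracket.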

So Trumbo would have had a net income of $52,479 in 1948, for an effective tax rate of 66 percent. Now, that's not 90 percent, but some will surely say that this seems like an unreasonably high level, for reasons of fairness or work incentives or whatever. But let's keep in mind just where our Trumbo falls in the 1948 United States' distribution of income. Here's a graphical representation of the above data:

Each bar is a tax bracket. The width of the bar shows how wide the bracket is, while the height shows the income earned in that bracket. The red-shaded portion shows how much of that income was paid in tax. This is a bit visually misleading, because the amount of income in each bar corresponds only to the *height* of the box, not its area. But I'll swallow my data-visualization pride for the sake of a quick blog post.

A few things to note about this graph. You can see how much of the income in the higher brackets was taxed away, due to the extremely high rates there. You can also see that the tax system is progressive, because the height of the red bars slopes upward, even when the amount of money contained in the brackets remains the same. But the most important thing to pay attention to is that dotted line that you can barely see on the far left. That's the median personal income in the United States for 1948, which according to the Census Bureau was around $1900. In other words, almost all of this would have been irrelevant to half the population, who would have paid just the lowest rate, 20 percent, on all of their income.

If we adjust Trumbo's income for inflation with the [Consumer Price Index](https://www.census.gov/hhes/www/income/data/incpovhlth/2012/CPI-U-RS-Index-2012.pdf), his income would be equivalent to over 1.5 million dollars today. And the tax bill would have been over 1 million dollars. But how would that kind of pay be taxed now? Here's a table like the one above, except applying current tax rates to Trumbo's inflation-adjusted pay:

| Tax Rate | Over | But Not Over | Income | Taxes |
|---|---|---|---|---|
| 10.0% | $0 | $17,850 | $17,850 | $1,785.00 |
| 15.0% | $17,850 | $72,500 | $54,650 | $8,197.50 |
| 25.0% | $72,500 | $146,400 | $73,900 | $18,475.00 |
| 28.0% | $146,400 | $223,050 | $76,650 | $21,462.00 |
| 33.0% | $223,050 | $398,350 | $175,300 | $57,849.00 |
| 35.0% | $398,350 | $450,000 | $51,650 | $18,077.50 |
| 39.6% | $450,000 | - | $1,066,944 | $422,509.82 |

What a difference 65 years and two generations of neoliberalism makes! Now Trumbo's effective tax rate is only 36.15 percent, and he takes home $968,000 after a $548,000 tax bill. To finish things up, here's a graphical representation like the one above:
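The modern calculation can be sketched the same way. The brackets are copied from the table above, and the income figure is the one implied by that table ($450,000 plus the $1,066,944 falling in the top bracket); the helper name is my own, and as before no deductions or loopholes are modeled.

```python
# Brackets from the table: (lower bound, upper bound, marginal rate).
BRACKETS_2013 = [
    (0, 17_850, 0.10),
    (17_850, 72_500, 0.15),
    (72_500, 146_400, 0.25),
    (146_400, 223_050, 0.28),
    (223_050, 398_350, 0.33),
    (398_350, 450_000, 0.35),
    (450_000, float("inf"), 0.396),
]

# Inflation-adjusted income implied by the table: 450,000 + 1,066,944.
INCOME = 1_516_944


def tax_by_bracket(income, brackets):
    """Return (rate, tax owed) for each bracket's slice of the income."""
    return [
        (rate, max(0.0, min(income, upper) - lower) * rate)
        for lower, upper, rate in brackets
    ]


pieces = tax_by_bracket(INCOME, BRACKETS_2013)
total = sum(tax for _, tax in pieces)
effective = total / INCOME

for rate, tax in pieces:
    print(f"{rate:6.1%}  ${tax:>12,.2f}")
print(f"total tax ${total:,.0f}, take-home ${INCOME - total:,.0f}, "
      f"effective rate {effective:.2%}")
```

The top marginal rate is 39.6 percent, but the effective rate comes out lower because the first $450,000 is taxed at the lower rates.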

This time, most of the income falls into the top bracket. But since the rate there is only 39.6 percent, our hypothetical 2013 Trumbo still keeps most of his money. And once again, these brackets are mostly irrelevant to most of the population---note the line marking median income.

The punchline to this story, of course, is that it was things like the Hollywood blacklist that helped set the stage for the period of conservative reaction that gave us these tax rates. Check this nice [documentary](http://www.netflix.com/WiMovie/Trumbo/70081095) on Dalton Trumbo to get a sense of a Hollywood radical who puts most of our contemporary celebrity liberals to shame.

*The spreadsheet used to estimate these figures is [here](http://www.peterfrase.com/wordpress/wp-content/uploads/2014/04/TrumboTaxes.xlsx), if you care to play with it yourself.*

The State of the Unions

September 2nd, 2011 | Published in Data, Work

Here's something timely for Labor Day: a couple of my colleagues at CUNY have produced a report on the state of union membership--focused on New York State and City, but with national numbers included as well. (I did some work on the report as well, but my role was limited to designing the layout, so I can take no credit for the writing or data analysis.)

The broad findings will not be surprising to those who follow these things: the percentage of workers who are members of labor unions has fallen at a fairly rapid pace in the past ten years, and has continued to fall during the recession. This trend is driven primarily by the decline in private sector unionization--union density in the public sector is both much higher and fairly stable over the past decade.

There are lots of other interesting details in the report, which includes breakdowns by age, gender, race, education, industry, and immigration status. You should [go read the whole thing](http://www.urbanresearch.org/news/second-annual-state-of-the-unions-report-released-in-commemoration-of-2011-labor-day), but here are a few semi-randomly chosen facts that I found interesting:

- People with at least a 4-year college degree are the most likely to be union members.
- This is probably because the sector of the economy with by far the highest unionization rates is education, which is also one of the biggest sectors. It's not surprising to see teachers bearing the brunt of anti-union attacks, when you realize what a huge portion of American union members they constitute.
- In the U.S. as a whole, men are more likely to be union members than women. In New York City, though, women are actually more unionized--largely because they tend to work in the highly-unionized public sector. Women are the future of the labor movement, if it is to have one.
- Blacks and whites are unionized at roughly equal rates nationwide, but blacks are much more highly unionized in New York, again probably because blacks are more likely to work in the public sector.
- It's true, as you might expect, that immigrant workers are less likely to be unionized than native born workers. But that's really just a small subplot of the broader story of declining unionization: workers who immigrated recently are much less unionized than those who immigrated earlier, just as young workers are much less unionized than older workers; people who immigrated before 1990 are unionized at a higher rate than native-born workers.

For more analysis, and lots of graphs and tables, go [check out the report](http://www.urbanresearch.org/news/second-annual-state-of-the-unions-report-released-in-commemoration-of-2011-labor-day).

These facts about unions bear on some of the recent [discussions](http://www.peterfrase.com/2011/07/policy-politics-and-strategy/) of [theories of politics](http://crookedtimber.org/2011/07/19/20991/) and the political basis of progressive politics under neoliberalism. Leftists and liberals still don't really have a credible strategy for building a winning progressive coalition that isn't centered on the labor movement. The decline in union density, and the transformation of the labor movement from a private sector to a public sector institution, force us to ask some hard questions. Either the labor movement has to be revived, or we need a new institutional basis for the left. I tend to be pessimistic about reviving labor in anything like its traditional form, since we really only have one historical example of sustained union strength, and that was based on an industrial economy that [isn't coming back](http://www.peterfrase.com/2011/04/the-united-states-makes-things/).

But there are obviously a lot of things that would help labor to recover at least a bit (EFCA, sigh). I'll close with one thing that's based on a personal observation, from my experience as a member of a union bargaining committee that recently [negotiated a first contract](http://psc-cuny.org/new-union-contract-cuny-research-foundation-workers). I'm convinced that severing the connection between health care and employment would be really good for unions, despite the labor movement's opposition to [some of the moves](http://articles.latimes.com/2010/jan/15/nation/la-na-health-congress15-2010jan15) in this direction. A huge amount of our negotiating time was taken up with a fight over how the cost of health insurance would be divided between employer and employee, in the context of premiums that are accelerating rapidly for reasons neither workers nor bosses can control. The need to hold down our members' health care costs sucked up a huge amount of bargaining time and money that could otherwise have gone to providing raises or addressing other aspects of the work environment. If there were a real, quality public option for health care, I would have considered trying to sell my fellow members on a radical idea: let's propose phasing out employer-provided insurance, getting people onto public plans, and putting those employer savings into big wage increases. But for now, that's just a dream for the future, and instead the best I can tell those members is that we successfully fought for their health care costs to skyrocket less rapidly than their non-union counterparts.

The Recession and the Decline in Driving

August 19th, 2011 | Published in Data, Social Science, Statistical Graphics, Statistics

Jared Bernstein [recently posted](http://jaredbernsteinblog.com/miles-to-go-before-we-sleep/) the graph of U.S. Vehicle Miles Traveled released by the Federal Highway Administration. Bernstein notes that normally, recessions and unemployment don't affect our driving habits very much--until the recent recession, miles traveled just kept going up. That has changed in recent years, as VMT still hasn't gotten back to the pre-recession peak. Bernstein:

> What you see in __the current period is a quite different—a massive decline in driving over the downturn with little uptick since.__ Again, both high unemployment and high [gas] prices are in play here, so there may be a bounce back out there once the economy gets back on track. But it bears watching—__there may be a new behavioral response in play, with people's driving habits a lot more responsive to these economic changes than they used to be.__

> Ok, but what's the big deal? Well, I've generally been skeptical of arguments about "the new normal," thinking that __much of what we're going through is cyclical__, not structural, meaning things pretty much revert back to the old normal once we're growing in earnest again. __But it's worth tracking signals like this that remind one that at some point, if it goes on long enough, cyclical morphs into structural.__

Brad Plumer [elaborates](http://www.washingtonpost.com/blogs/ezra-klein/post/why-are-americans-driving-less/2011/08/18/gIQAUv7tNJ_blog.html):

> __What could explain this cultural shift? Maybe more young people are worried about the price of gas or the environment.__ But—and this is just a theory—technology could play a role, too. Once upon a time, newly licensed teens would pile all their friends into their new car and drive around aimlessly. For young suburban Americans, it was practically a rite of passage. Nowadays, however, __teens can socialize via Facebook or texting__ instead—in the Zipcar survey, more than half of all young adults said they'd rather chat online than drive to meet their friends.

> But that's all just speculation at this point. As Bernstein says, __it's still unclear whether the decline in driving is a structural change or just a cyclical shift that will disappear once (if) the U.S. economy starts growing again.__

Is it really plausible to posit this kind of cultural shift, particularly given the evidence about the [price elasticity of oil](http://motherjones.com/kevin-drum/2011/04/raw-data-everyone-loves-oil)? As it happens, I did a bit of analysis on this point a couple of years ago. Back then, Nate Silver wrote a [column](http://www.esquire.com/features/data/nate-silver-car-culture-stats-0609) in which he tried to use a regression model to address this question of whether the decline in driving was a response to economic factors or an indication of a cultural trend. Silver argued that economic factors--in his model, unemployment and gas prices--couldn't completely explain the decline in driving. If true, that result would support the "cultural shift" argument against the "cyclical downturn" argument.

I wrote a [series](http://www.peterfrase.com/2009/05/attempt-to-regress/) [of](http://www.peterfrase.com/2009/05/predictin/) [posts](http://www.peterfrase.com/2009/05/one-last-time/) in which I argued that with a more complete model--including wealth and the lagged effect of gas prices--the discrepancies in Silver's model seemed to disappear. That suggests that we don't need to hypothesize any cultural change to explain the decline in driving. You can go to those older posts for the gory methodological details; in this post, I'm just going to post an updated version of one of my old graphs:

The blue line is the 12-month moving average of Vehicle Miles Travelled--the same thing Bernstein posted. The green and red lines are 12-month moving averages of *predicted* VMT from two different regression models--the Nate Silver model and my expanded model, as described in the earlier post I linked. The underlying models haven't changed since my earlier version of this graph, except that I updated the data to include the most recent information, and switched to the 10-city Case Shiller average for my house price measure, rather than the OFHEO House Price Index that I was using before, but which seems to be an [inferior measure](http://www.calculatedriskblog.com/2008/01/house-prices-comparing-ofheo-vs-case.html).
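For readers unfamiliar with the smoothing used here, this is a minimal sketch of a 12-month moving average. I'm assuming a trailing window (each point averages the current month and the eleven before it); the linked code may do it differently, and the series below is made up for illustration, not real VMT data.

```python
def moving_average(values, window=12):
    """Trailing moving average: each output is the mean of the most
    recent `window` values, starting once a full window is available."""
    if window <= 0:
        raise ValueError("window must be positive")
    out = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        if i >= window - 1:
            out.append(running / window)
    return out


# Made-up monthly series: a flat level of 200 plus a repeating seasonal swing.
seasonal = [-23, -32, 0, 0, 12, 11, 18, 18, -3, 6, -10, -9]
monthly = [200 + s for s in seasonal] * 2  # two identical "years"

smoothed = moving_average(monthly)
print(len(monthly), len(smoothed))
print(smoothed[:3])
```

Because every 12-month window contains each calendar month exactly once, the seasonal swing cancels out and the smoothed series is flat, which is why a 12-month average is a convenient way to look past the summer driving peak and see the underlying trend.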

The basic conclusion I draw here is the same as it was before: a complete set of economic covariates does a pretty good job of predicting miles traveled. In fact, even Nate Silver's simple "gas prices and unemployment" model does fine for recent months, although it greatly overpredicts during the depths of the recession.\* So I don't see any cultural shift away from driving here--much as I would like to, since I personally hate to drive and I wish America wasn't built around car ownership. Instead, the story seems to be that Americans, collectively, have experienced an unprecedented combination of lost wealth, lost income, and high gas prices. That's consistent with graphs like [these](http://thinkprogress.org/yglesias/2011/07/18/271412/the-consumer-bust-and-the-inevitability-of-politics/), which look a lot like the VMT graph.

The larger point here is that we can't count on shifts in individual preferences to get us away from car culture. The entire built environment of the United States is designed around the car--sprawling suburbs, massive highways, meager public transit, and so on. A lot of people can't afford to live in walkable, bikeable, or transit-accessible places even if they want to. Changing that is going to require a long-term change in government priorities, not just a cultural shift.

Below are the coefficients for my model. The data is [here](http://www.peterfrase.com/wordpress/wp-content/uploads/2011/08/silver_driving_2011.csv), and the code to generate the models and graph is [here](http://www.peterfrase.com/wordpress/wp-content/uploads/2011/08/silver_driving_2011.R.txt).

| Term | Coef. | s.e. |
|---|---|---|
| (Intercept) | 111.55 | 2.09 |
| unemp | -1.57 | 0.27 |
| gasprice | -0.08 | 0.01 |
| gasprice_lag12 | -0.03 | 0.01 |
| date | 0.01 | 0.00 |
| stocks | 0.58 | 0.23 |
| housing | 0.10 | 0.01 |
| monthAugust | 17.52 | 1.01 |
| monthDecember | -9.21 | 1.02 |
| monthFebruary | -31.83 | 1.03 |
| monthJanuary | -22.90 | 1.02 |
| monthJuly | 17.84 | 1.02 |
| monthJune | 11.31 | 1.03 |
| monthMarch | -0.09 | 1.03 |
| monthMay | 12.08 | 1.02 |
| monthNovember | -10.46 | 1.01 |
| monthOctober | 5.82 | 1.01 |
| monthSeptember | -2.73 | 1.01 |

---

n = 234, k = 18

residual sd = 3.16, R-Squared = 0.99
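As a rough guide to how a model with these terms could be specified, here is a minimal R sketch. It runs on simulated data rather than the linked CSV, so the fitted numbers mean nothing. The response name `vmt`, the start date, and every simulated value are assumptions; only the predictor names come from the coefficient table. April does not appear among the month terms, so it is presumably the reference category, which is what R's default alphabetical factor ordering produces.

```r
# Synthetic stand-in for the linked CSV: every value below is simulated.
set.seed(42)
n_total <- 246  # chosen so that 234 rows remain after the 12-month lag
dates <- seq(as.Date("2000-01-01"), by = "month", length.out = n_total)

d <- data.frame(
  date     = as.numeric(dates),  # linear time trend
  month    = factor(month.name[as.integer(format(dates, "%m"))]),
  unemp    = runif(n_total, 4, 10),
  gasprice = runif(n_total, 100, 400),
  stocks   = runif(n_total, 5, 15),
  housing  = runif(n_total, 80, 220)
)

# Gas price twelve months earlier; the first 12 rows have no lagged value.
d$gasprice_lag12 <- c(rep(NA, 12), head(d$gasprice, -12))

# Hypothetical response; the name `vmt` is an assumption.
d$vmt <- 110 - 1.5 * d$unemp - 0.08 * d$gasprice + rnorm(n_total, sd = 3)

# lm() drops the rows with a missing lag and expands `month` into
# eleven dummy variables, with April as the omitted reference level.
fit <- lm(vmt ~ unemp + gasprice + gasprice_lag12 + date + stocks +
            housing + month, data = d)
print(names(coef(fit)))
```

With one intercept, six numeric predictors, and eleven month dummies, the fit has 18 coefficients, matching `k = 18` in the table.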

\* *That's important, since you could otherwise argue that the housing variable in my model--which has seen an unprecedented drop in recent years--is actually proxying a cultural change. I doubt that for other reasons, though. If housing is removed from the model, it underpredicts VMT during the runup of the bubble, just as Silver's model does. That suggests that there is some real wealth effect of house prices on driving.*

Redistribution Under Neoliberalism

August 8th, 2011 | Published in Data, Political Economy, Politics, Social Science, Statistical Graphics, xkcd.com/386

Last week, Seth Ackerman wrote a *Jacobin* [blog post](http://jacobinmag.com/blog/?p=891) in which he gave us a snarky attack on the record of "left neo-liberalism" in the United Kingdom. Basically, he showed that while New Labour managed to reduce poverty somewhat with cash transfer programs, the progress was meager and could not be sustained. Since the programs were financed out of a series of asset bubbles, the UK has seen poverty go back up again with the recent crisis.

I don't have much quarrel with this account, but I'm not sure it can bear the weight of the argument that Seth wants to put on it. He suggests that the UK experience is a refutation of the general strategy of progressive neoliberalism, which Freddie DeBoer felicitously dubbed ["globalize-grow-give"](http://lhote.blogspot.com/2011/01/globalize-grow-give-progressivism-and.html):

> First, you embrace the standard globalization model of reduced or eliminated tariff walls, large free trade agreements such as NAFTA or CAFTA, deregulation, and general trade liberalization. This encourages international trade and the exporting of jobs from highly-regulated, fairly well compensated, high worker standard of living places like the United States to the cheap labor, low regulation, low worker standard of living places like China or Indonesia. This spurs international economic growth in both the exporting and importing countries. Here at home, higher growth results in higher tax revenues which can then be redistributed from those at the top of the income distribution (who have benefited from the globalized trade regime) to those at the bottom of the income distribution (who have been hurt by the globalized trade regime that undercuts their wages and exports their jobs).

I think that if you want to really criticize this view, you need to look beyond the UK, which is neither a very generous nor a particularly well-designed welfare state. As it happens, my day job involves analyzing cross-national income data, so I'm going to perpetrate some social science on y'all.

The way I read the "globalize-grow-give" critique, you can extract an empirical claim about how the income distribution should look in a G-G-G economy. The distribution of income *before* taxes and transfers will become increasingly unequal due to deregulation and globalization, but the distribution *after* taxes and transfers are accounted for will not become vastly more unequal because government is compensating for the inequality in the private market.

To test this, I did some simple calculations, following other researchers who have done [similar](http://www.lisproject.org/publications/liswps/392.pdf) [things](http://www.lisproject.org/publications/liswps/458.pdf). Using data from the [Luxembourg Income Study](http://www.lisdatacenter.org/), I calculated the [Gini coefficient](http://en.wikipedia.org/wiki/Gini_coefficient), a standard measure of inequality, for several different countries. I calculated two different Ginis:

- The Gini of *market income*. Market income is defined here as income from wages, pensions, self-employment and property. This is income *before* any taxes or transfers are accounted for.
- The Gini of *disposable income*. This is the income that people actually have to spend, after taxes are deducted and any transfers are added in. (For more details about the variables, see the postscript).

Unfortunately, the difficulty of harmonizing cross-national data means that the numbers I have access to are a bit out of date--specifically, they end before the current crisis period. I still think we can learn something useful from them, however. The way G-G-G neoliberalism is supposed to work, the Gini of market income should go up but the Gini of disposable income should not--or at least should rise more slowly. We can think of the difference between market income inequality and disposable income inequality as a rough measure of the amount of redistribution done by the state.
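To make the two Ginis and the redistribution measure concrete, here is a minimal R sketch. It uses an unweighted Gini formula and five made-up households; a real calculation on Luxembourg Income Study microdata would also apply household weights. The `gini` function and all income figures are hypothetical illustrations, not the code or data behind the figures in this post.

```r
# Unweighted Gini coefficient of a non-negative income vector:
# 0 means perfect equality; values near 1 mean extreme concentration.
gini <- function(x) {
  x <- sort(x)
  n <- length(x)
  2 * sum(seq_len(n) * x) / (n * sum(x)) - (n + 1) / n
}

# Hypothetical incomes for five households (not LIS data).
market    <- c(0, 10, 30, 60, 150)  # wages, pensions, self-employment, property
taxes     <- c(0,  1,  5, 15,  55)
transfers <- c(20, 10,  5,  0,   0)
disposable <- market - taxes + transfers

gini_market     <- gini(market)
gini_disposable <- gini(disposable)

# Rough measure of redistribution: how much taxes and transfers
# lower the Gini relative to market income.
redistribution <- gini_market - gini_disposable
print(c(market = gini_market, disposable = gini_disposable,
        redistribution = redistribution))
```

Tracking `redistribution` over time for one country corresponds to the gap between the two lines in each figure.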

So here's what things look like in the UK:

This figure basically supports Seth's argument. Market income inequality has gone way up in the last few decades, but disposable income inequality has gone up by a lot as well. The state is doing a bit more redistribution than it used to, but not enough to make up for the rise in private-market inequality. If you look at the United States, the situation is even worse, as the state has done essentially nothing to counter rising inequality in market income:

The question, though, is whether it has to be like this. Let's put the UK alongside another rich European economy, Germany:

Here we see something very interesting. Before you take taxes and transfers into account, the rise in inequality in Germany looks very similar to what happened in the UK--indeed, the two countries converge to almost the same value by 2005. But disposable income inequality has stayed flat in Germany, because the German state has used taxes and transfers to counteract rising inequality.

Every good social democrat loves the Nordic model, so let's finish off with a look at Sweden:

Here the story is a bit different--both market income and disposable income inequality have remained pretty flat, although both have risen a bit. The important thing to note here is that even in the most socialist of welfare states, market income inequality is very high, nearly as high as it is in the UK or US. The fact that Sweden is one of the least unequal countries on earth has to do almost entirely with taxes and transfers.

So what can we conclude from all this? Let me be clear that I don't think this is a knock-down argument in favor of "globalize-grow-give" as a political model. But I think the best argument against the G-G-G model is not that it's economically impossible or dependent on asset bubbles. Rather, I'd point us back to the political arguments enumerated by [me](http://www.peterfrase.com/2011/07/policy-politics-and-strategy/), [Henry Farrell](http://crookedtimber.org/2011/07/25/neo-liberalism-the-submerged-state-and-the-politics-of-nudge/), and [Cosma Shalizi](http://cscs.umich.edu/~crshalizi/weblog/778.html) among others. What makes Sweden and Germany different is not that their economies are different from those in the US and UK (although they are), but that they have different political environments, featuring things like a hegemonic Social Democratic party in Sweden and a strong labor movement in Germany.

So if left-neoliberalism is to be a workable political agenda rather than the motto of useful idiots for the "globalize-grow-keep" agenda of the right-wing neoliberals, it has to either make its peace with the sources of working-class power that currently exist, or else come up with workable models of what might replace them.

*[Postscript for income inequality nerds only: the income variables are equivalized for household size using the square root of the number of persons in the household as the equivalence scale. The variables are then topcoded at ten times the equivalized mean and bottom-coded at 1 percent of the equivalized mean.*

*Note that the transfers included in disposable income are only cash transfers and "near-cash" benefits (like food stamps), not in-kind services like health care. So you could argue that this data actually understates the extent of redistribution.*

*If you'd like to look at the data, including a bunch of countries I didn't include in the post, it's [here](http://www.peterfrase.com/wordpress/wp-content/uploads/2011/08/mi_dpi_gini1.csv). For help interpreting the country codes, go [here](http://www.lisdatacenter.org/our-data/lis-database/documentation/list-of-datasets/)]*
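The adjustments described in the postscript can be sketched in a few lines of R. This is an illustration under simplifying assumptions: the mean is unweighted, and the function name `equivalize_and_code` and the example households are hypothetical.

```r
# Equivalize household income by the square root of household size, then
# bottom-code at 1% and top-code at 10 times the mean equivalized income.
# Simplifying assumption: the mean is unweighted.
equivalize_and_code <- function(income, hh_size) {
  eq <- income / sqrt(hh_size)
  m  <- mean(eq)
  pmin(pmax(eq, 0.01 * m), 10 * m)
}

# Hypothetical households: twenty modest incomes and one extreme outlier.
income  <- c(rep(c(20000, 40000), 10), 5e6)
hh_size <- c(rep(c(1, 4), 10), 2)

coded <- equivalize_and_code(income, hh_size)
print(head(coded, 2))  # a single person and a family of four
print(coded[21])       # the outlier after top-coding
```

Under the square-root scale, a four-person household needs twice the income of a single person to count as equally well off, so the first two households end up with the same equivalized income. Top-coding keeps a handful of extreme values from dominating the Gini.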

Obligatory Google Ngram Post

December 20th, 2010 | Published in Data, Social Science, Statistical Graphics, Time

It appears that everyone with a presence on the Internet is obligated to post some kind of riff on the [amazing Google Ngram Viewer](http://ngrams.googlelabs.com/info). Via Henry Farrell, I see that Daniel Little has attempted to [perpetrate some social science](http://understandingsociety.blogspot.com/2010/12/new-tool-for-intellectual-history.html), which made me think that perhaps while I'm at it, I can post something that actually relates to my dissertation research for a change. Hence, this:

Click for a bigger version, but the gist is that the red line indicates the phrase "higher wages", and the blue line is "shorter hours". Higher wages have a head start, with hours not really appearing on the agenda until the late 19th century. That's a bit later than I expected, but it's generally consistent with what I know about hours-related labor struggle in the 19th century.

The 20th century is the more interesting part of the graph in any case. For a while, it seems that discussion of wages and hours moves together. They rise in the period of ferment after World War I, and again during the depression. Both decline during World War II, which is unsurprising--both wage and hour demands were subordinated to the mobilization for war. But then after the war, the spike in mentions of "higher wages" greatly outpaces mentions of "shorter hours"--the latter has only a small spike, and thereafter the phrase enters a secular decline right through to the present.

Interest in higher wages appears to experience a modest revival in the 1970's, corresponding to the beginnings of the era of wage stagnation that we are still living in. But for the first time, there is no corresponding increase in discussion of shorter hours. This is again not really surprising, since the disappearance of work-time reduction from labor's agenda has been widely remarked upon. But it's still pretty interesting to see such evidence of it in the written corpus.

Making things, marking time

January 27th, 2010 | Published in Data, Political Economy, R, Work

Today Matt Yglesias revisits a favorite topic of mine, the distinction between U.S. manufacturing employment and manufacturing production. It has become increasingly common to hear liberals complain about the "decline" in American manufacturing, and lament that America doesn't "make things" anymore.

Harold Meyerson had a typical riff on this recently:

> Reviving American manufacturing may be an economic and strategic necessity, without which our trade deficit will continue to climb, our credit-based economy will produce and consume even more debt, and our already-rickety ladders of economic mobility, up which generations of immigrants have climbed, may splinter altogether.
>
> . . .
>
> The epochal shift that's overtaken the American economy over the past 30 years . . . finance, which has compelled manufacturers to move offshore in search of higher profit margins . . . retailers, who have compelled manufacturers to move offshore in search of lower prices for consumers and higher profits for themselves
>
> . . .
>
> Creating the better paid, less debt-ridden work force that would emerge from a shift to an economy with more manufacturing and a higher rate of unionization would reduce the huge revenue streams flowing to the Bentonvilles (Wal-Mart's home town) and the banks . . . . The campaign contributions from the financial sector to Democrats and Republicans alike now dwarf those from manufacturing -- a major reason why our government's adherence to free-trade orthodoxy in what is otherwise a mercantilist world is likely to persist.
>
> . . .
>
> [Sen. Sherrod] Brown . . . acknowledges that as manufacturing employs a steadily smaller share of the American work force, "younger people probably don't think about it as much" as their elders . . . . Politically, American manufacturing is in a race against time: As manufacturing becomes more alien to a growing number of Americans, its support may dwindle, even as the social, economic, and strategic need to bolster it becomes more acute. That makes push for a national industrial policy -- to become again a nation that makes things instead of debt, to build again our house upon a rock -- even more urgent.

I don't dispute that manufacturing has become "more alien" to the bulk of American working people. But I question Meyerson's explanation for why this has happened, and I wonder whether we should really be so horrified by it. The evidence suggests that the decline in manufacturing employment in this country has been driven not primarily by offshoring (as Meyerson would have it), but by a dramatic increase in productivity. Yglesias provides one graphical illustration of this; here is my home-brewed alternative, going back to World War II:

This picture leaves some unanswered questions, to be sure. First, one would want to know what kind of manufacturing has grown in the U.S.; however, my cursory examination of the data suggests that U.S. output is still oriented more heavily toward consumer goods than toward defense and aerospace production, despite what one might think. Second, it's possible that the globally integrated system of production is "hiding" labor in other parts of the supply chain, in China and other countries with low labor costs.

But I don't think the general story of rapidly increasing productivity can be easily ignored. To really reverse the decline in manufacturing employment, we would need to have something like a ban on labor-saving technologies, in order to return the U.S. economy to the low-productivity equilibrium of forty or fifty years ago. Of course, that would also require either reducing American wages to Chinese levels or imposing a level of autarchy in trade policy beyond what any left-protectionist advocates.

Needless to say, I think this modest proposal is totally undesirable, and I raise it only to suggest the folly of "rebuilding manufacturing" as a slogan for the left. As Yglesias observes in the linked post, manufacturing now seems to be going through a transition like the one that agriculture experienced in the last century: farming went from being the major activity of most people to being a niche of the economy that employs very few people. Yet of course food hasn't ceased to be one of the fundamental necessities of human life, and we produce more of it than ever.

And yet I understand the real problem that motivates the pro-manufacturing instinct among liberals. The decline in manufacturing has coincided with a massive increase in income inequality and a decline in the prospects for low-skill workers. Moreover, the decline of manufacturing has coincided with the decline of organized labor, and it is unclear whether traditional workplace-based labor union organizing can ever really succeed in a post-industrial economy. But the nostalgia for a manufacturing-centered economy is an attempt to universalize a very specific period in the history of capitalism, one which is unlikely to recur.

The obsession with manufacturing jobs is, I think, a symptom of a larger weakness of liberal thought: the preoccupation with a certain kind of full-employment Keynesianism, predicated on the assumption that a good society is one in which everyone is engaged in full-time waged employment. But this sells short the real potential of higher productivity: less work for all. As Keynes himself observed:

> For the moment the very rapidity of these changes is hurting us and bringing difficult problems to solve. Those countries are suffering relatively which are not in the vanguard of progress. We are being afflicted with a new disease of which some readers may not yet have heard the name, but of which they will hear a great deal in the years to come--namely, technological unemployment. This means unemployment due to our discovery of means of economising the use of labour outrunning the pace at which we can find new uses for labour.
>
> But this is only a temporary phase of maladjustment. All this means in the long run that mankind is solving its economic problem. I would predict that the standard of life in progressive countries one hundred years hence will be between four and eight times as high as it is to-day. There would be nothing surprising in this even in the light of our present knowledge. It would not be foolish to contemplate the possibility of a far greater progress still.
>
> . . .
>
> Thus for the first time since his creation man will be faced with his real, his permanent problem--how to use his freedom from pressing economic cares, how to occupy the leisure, which science and compound interest will have won for him, to live wisely and agreeably and well.

Productivity has continued to increase, just as Keynes predicted. Yet the long weekend of permanent leisure never arrives. This--and not deindustrialization--is the cruel joke played on the working class. The answer is not to force people into deadening make-work jobs, but rather to acknowledge our tremendous social wealth and ensure that those who do not have access to paid work still have access to at least the basic necessities of life--through something like a guaranteed minimum income.

Geeky addendum: I thought the plot I made for this post was kind of nice, and it took some figuring out to make, so below is the R code required to reproduce it. It queries the data sources (a couple of Federal Reserve sites) directly, so no saving of files is required, and it should automatically use the most recent available data.

    # Manufacturing employment (FRED series MANEMP, in thousands of persons)
    manemp <- read.table("http://research.stlouisfed.org/fred2/data/MANEMP.txt",
       skip=19,header=TRUE)
    names(manemp) <- tolower(names(manemp))
    manemp$date <- as.Date(manemp$date, format="%Y-%m-%d")

    # Today's date in the mm/dd/yyyy form used by the Federal Reserve download URL
    curdate <- format(Sys.Date(),"%m/%d/%Y")

    # Manufacturing output (Federal Reserve G.17 release), January 1939 to today
    outputurl <- url(paste(
       'http://www.federalreserve.gov/datadownload/Output.aspx?rel=G17&series=063c8e96205b9dd107f74061a32d9dd9&lastObs=&from=01/01/1939&to=',
       curdate,
       '&filetype=csv&label=omit&layout=seriescolumn',sep=''))

    manout <- read.csv(outputurl,
       as.is=TRUE,skip=1,col.names=c("date","value"))
    # Append a day of the month so that as.Date can parse the dates
    manout$date <- as.Date(paste(manout$date,"01",sep="-"), format="%Y-%m-%d")

    # Employment since 1939 in blue, with its scale on the left axis
    par(mar=c(2,2,2,2))
    keep <- manemp$date>=as.Date("1939-01-01")
    plot(manemp$date[keep],
       manemp$value[keep],
       type="l", col="blue", lwd=2,
       xlab="",ylab="",axes=FALSE, xaxs="i")
    axis(side=1,
       at=as.Date(paste(seq(1940,2015,10),"01","01",sep="-")),
       labels=seq(1940,2015,10))
    text(as.Date("1955-01-01"),17500,
       "Manufacturing employment (thousands)",col="blue")
    axis(side=2,col="blue")

    # Overlay output in red on the same panel, with its scale on the right axis
    par(new=TRUE)
    plot(manout$date,manout$value,
       type="l", col="red",axes=FALSE,xlab="",ylab="",lwd=2,xaxs="i")
    text(as.Date("1975-01-01"),20,
       "Manufacturing output (% of output in 2002)", col="red")
    axis(side=4,col="red")

Measuring globalization

January 25th, 2010 | Published in Data, Social Science, Statistical Graphics

Via the Monkey Cage, an interesting and comprehensive new database, the "KOF Index of Globalization". I'm generally a bit leery of attempts to boil down complex configurations of political economy into a pat "index", but this one is reasonably straightforward, measuring both "economic globalization" (economic flows and trade restrictions) and "political globalization" (participation in international institutions and diplomatic relations.) The example graph at the Monkey Cage is interesting, but I immediately thought it would be better represented like this:

This could be cleaned up to deal with the overlapping names, and additional information might be useful (such as the average globalization score of all countries in each year, and the maximum and minimum scores), but I think this is pretty informative. You can see the overall political and economic integration of these countries into the capitalist world, for example. There's also the increasing distance between the main cluster of countries on the one hand, and the insular autocracies of Belarus and Uzbekistan on the other.

The data seem so much less real once you ask the same person the same question twice

October 12th, 2009 | Published in Data, Social Science, Statistics

I identify with Jeremy Freese to an unhealthy degree. When the other options are to a) have a life; or b) do something that advances his career, he chooses to concoct a home-brewed match between GSS respondents in 2006 and their 2008 re-interviews. I would totally do this. I still might do this.

And then he drops the brutal insight that provides my title. Context.

UPDATE: And then Kieran Healy drops this:

> The real distinction between qualitative and quantitative is not widely appreciated. People think it has something to do with counting versus not counting, but this is a mistake. If the interpretive work necessary to make sense of things is immediately obvious to everyone, it’s qualitative data. If the interpretative work you need to do is immediately obvious only to experts, it’s quantitative data.
