Today I came across the article: “High Home Ownership Is Strongly Linked To High Unemployment [STUDY]” from the Business Insider – link.
Seems fascinating, right? Perhaps we should be changing public policies that encourage home ownership?
Unfortunately, there seems to be quite a leap from the original research saying “We should be cautious before imputing meaning into such patterns” to the claims in the article. Moreover, the original research article has basic flaws like conflicting data, sources without citation, and inaccurate descriptions of how the data was collected. Beyond those basic errors, I’m unconvinced by the study’s central point. And, even if we take the association to be valid, I am unconvinced that there is an argument for causation, or that the hypothesis can explain the data they present. All in all, this shows how easy it is to make a very quick leap from a potential association in data, to assuming causation, to questioning policy.
Since the BI article doesn’t cite the original paper, I searched for it and believe the original article is this one as it’s on topic and by the mentioned author.
Although the research is focused on statistical associations and does not (and cannot) make any claims of causation, the researcher Oswald clearly believes that there is causation, as he is reported in the BI article to have said “I have become convinced that by boosting home ownership we have ruined our labor market.”
Next, the report suggests that it’s the homeowners who are disproportionately unemployed, when the abstract for the research says “Our argument is not that owners themselves are disproportionately unemployed.” The research article later reports “nor does [the conclusions] rely on the idea that home owners are themselves disproportionately unemployed (there is a considerable literature that suggests such a claim is false, or, at best, weak).” This is arguably confusing in the original research, though, since 2 of the 3 suggested mechanisms for how home ownership rates may increase unemployment rates a few years later are reasons why a homeowner might be less employed (lack of mobility and long commutes).
Looking to the original research, even upon a relatively quick read of the paper I found several things that made me question their results. The three most basic are about the data itself:
Critical data sources are not cited, or the citations are incorrect.
For example, Tables 1 and 2 contain the raw data used in the study. Yet, the citations for these are not included in the text and the source for the tables says only: “Source: US Census Bureau” and “Source(here and in the next table): Current Population Survey.” None of the cited references include the US Census Bureau.
Searching for any links to the US Census Bureau, I found one in footnote #5. Unfortunately, the posted link http://www.census.gov/prod/www/abs/decennial/1950cenpopv2.html is broken and leads to a page saying “We are really sorry but the page you requested cannot be found.”
Data in two different tables, Table 1 and Table 2a, is contradictory.
For example, both tables include data on the 2000 and 2010 home ownership rates in the United States. In one table these are given as 66.2% and 65.1% while in the other they are 67.4% and 66.9%.
There is no discussion of why these tables disagree. Nor did they clearly say which data source was used in their analysis, why the other was included if it was not used, how they merged the two sources if both were used, or why one source was more appropriate for their study than the other.
Sampling description is inaccurate, and sample size seems misleading.
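To put a number on the disagreement, here is a minimal sketch that computes the gap between the two sets of figures quoted above. The labels `table_a` and `table_b` are my own placeholders, since I am only comparing the two pairs of values and not asserting which table holds which.

```python
# Home ownership rates (%) as quoted above; the dictionary names are
# placeholders, not the paper's own table labels.
table_a = {2000: 66.2, 2010: 65.1}
table_b = {2000: 67.4, 2010: 66.9}

# Gap in percentage points for each year, rounded to one decimal place
# to avoid floating-point noise.
gaps = {year: round(table_b[year] - table_a[year], 1) for year in table_a}

for year, gap in gaps.items():
    print(f"{year}: the two tables differ by {gap} percentage points")
```

The gap is not the same in both years, so the disagreement does not look like a simple constant offset between two sources.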
The introduction claims that “Using data on two million randomly sampled Americans, we also estimate equations for the number of weeks worked, the extent of labor mobility, the length of commuting times, and the number of businesses.” Later, they explain in more detail that “Table 7 … estimates a weeks-worked equation using data from the March Current Population Surveys between 1992 and 2011. The sample size is approximately 2 million individuals.”
However, according to the US Census Bureau “the CPS [Current Population Survey] is administered by the Census Bureau using a probability selected sample of about 60,000 occupied households.” Each of these households in the study is sampled 8 times: “in the survey for 4 consecutive months, out for 8, and then return for another 4 months before leaving the sample permanently.”
In short, the survey:
* samples households, not Americans as claimed
* samples only 60,000 households per monthly data point, not 2 million
* includes households that are probability selected, not random as described. The method, described here, seems much better than random. But the point is that it is different from what Oswald’s paper reported.
Granted, if there are 60,000 households sampled monthly for 20 years and each household is sampled 8 times, then there are 60,000 households * 12 months * 20 years / 8 repetitions = 1.8M households sampled total. 1.8M is within the ballpark of the 2 million reported, so perhaps this is how they came up with the 2 million value to report. But, I would imagine that many people reading the statement “using data on two million randomly sampled Americans,” would think that these 2 million were tracked throughout the data and not just for 8 months within one year in a 20 year study.
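The back-of-the-envelope estimate above can be written out as a short sketch. The inputs are the figures already quoted (60,000 households per month, 20 years, 8 interviews per household); the variable names are mine, and this is only my guess at how the 2 million figure might have been reached, not the paper's stated method.

```python
# Rough estimate of how many distinct households pass through the CPS
# over the study period, using the figures quoted above.
households_per_month = 60_000      # approximate monthly sample size
months_per_year = 12
years = 20                         # roughly 1992 to 2011
interviews_per_household = 8       # 4 months in, 8 out, 4 months in

# Total household interviews across the whole period.
household_months = households_per_month * months_per_year * years

# Each household accounts for 8 of those interviews.
distinct_households = household_months // interviews_per_household

print(f"Household interviews: {household_months:,}")
print(f"Distinct households:  {distinct_households:,}")
```

This gives 14,400,000 household interviews and 1,800,000 distinct households, which is the 1.8M figure above.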
Citing sources, reporting internally consistent data (or discussing why it’s not consistent and why a certain data set was used), and accurately representing how the data was collected are all absolutely fundamental to research. Yet, unless I am missing something, this research fails to do these.
More generally, there are a lot of other things that might be correlated with both home ownership and unemployment rates that were not considered. What are the demographics of the state? How much does housing cost and how does this compare to salaries? Are twenty-somethings living with their parents or starting households? When were most of the houses purchased, and what were the unemployment rates at that time? In general, sometimes the economy is doing better and other times worse. If houses are purchased when the economy is stronger, so that more people are able to afford a home, then at some point later the economy (and with it unemployment) is likely to be worse. Could that explain the lag? And many more…
Similarly, data on housing and data on unemployment are counting different groups of people. Unemployment data includes only those people who are part of the labor force, excluding retirees for example. At the same time, home ownership is counted for heads of household over the age of 25. And, in 2010, people over the age of 60 had higher rates of home ownership than those under the age of 60 – link. Furthermore, home ownership is measured by household while unemployment is measured by person in the labor force. So, when comparing these two metrics, we’re not comparing the same populations. This, coupled with the fact that there is no evidence that the homeowners themselves are less employed, means that an association between home ownership rates and unemployment rates isn’t about homeowners being less employed but about some larger systemic relationship between some people being more likely to own homes and others being more likely to be unemployed.
I remain unconvinced by both the original research and its translation into the article written in Business Insider. And, even if the fundamental association suggested between home ownership and, a few years later, the unemployment rate is robust, there is no reason to conclude that higher rates of home ownership cause increased unemployment. Yet, someone reading the Business Insider article could easily come away wondering if (or believing that) governments should change policies around encouraging home ownership in order to protect against increased unemployment.