{"id":576,"date":"2018-11-12T21:38:36","date_gmt":"2018-11-13T04:38:36","guid":{"rendered":"https:\/\/blogs.ubc.ca\/datawithstata\/?page_id=576"},"modified":"2019-05-07T11:54:04","modified_gmt":"2019-05-07T18:54:04","slug":"difference-in-difference-regression","status":"publish","type":"page","link":"https:\/\/blogs.ubc.ca\/datawithstata\/home-page\/regression\/difference-in-difference-regression\/","title":{"rendered":"Difference in Difference"},"content":{"rendered":"<p>Difference in Difference (DID) is a version of fixed effects estimation with panel data that can be used to estimate causal effects under the common trend assumption: absent the policy, outcomes in the treated and control groups would have moved in parallel. This assumption cannot be tested directly, although comparing pre-policy trends offers a plausibility check. A DID estimate captures the causal impact of a policy change by comparing the differences between the treated and control groups before and after the policy was implemented: the first difference is between before and after the policy intervention, and the second difference is between the treatment and control groups. Now let\u2019s look at two concrete examples where DID was used in economics research.<\/p>\n<div id=\"random-accordion-id-633\" class=\"accordion-shortcode  \"> <h3 ><a href=\"#example-1-gulati-and-malhotra--0\" >Example 1: Gulati and Malhotra (2006) <\/a><\/h3><div id=\"example-1-gulati-and-malhotra--0\" class=\"accordian-shortcode-content \" ><\/p>\n<p>On April 1, 1996, Canada and the U.S. entered into a five-year contract called the Softwood Lumber Agreement (<em>SLA<\/em> henceforth), whereby a tariff rate quota was imposed on US-bound lumber exports from four Canadian provinces: Alberta, British Columbia, Ontario and Quebec. The first 14.7 billion board feet of softwood lumber from these provinces was exported duty free, but further exports were subject to tariffs. A tariff of $50 per thousand board feet was imposed on the next 650 million board feet exported, and exports beyond that were taxed at $100 per thousand board feet. 
The <em>SLA<\/em> was novel since exports from only four provinces were limited and other provinces were not subject to any restriction.<\/p>\n<p><a href=\"https:\/\/blogs.ubc.ca\/datawithstata\/files\/2019\/05\/Gulati-Malhotra-2006-Estimating-Export-Response-in-Canadian-Provinces-to-the-Canada-US-Softwood-Lumber.pdf\">Gulati and Malhotra<\/a> (2006) investigated two questions: first, whether the <em>SLA<\/em> caused a reduction in softwood exports to the U.S. from the four provinces and, if so, how large the decline was; and second, how large the increase, if any, was in the softwood lumber exports of the other provinces to the U.S.<\/p>\n<p>The regression equation they run is the following:<\/p>\n<p>X<sub>it<\/sub> = \u03b1<sub>0<\/sub> + \u03b1<sub>1<\/sub>*Y<sub>it<\/sub> + \u03b1<sub>2<\/sub>*Y<sub>US,t<\/sub> + \u03b1<sub>3<\/sub>*Dist<sub>i<\/sub> + \u03b1<sub>4<\/sub>*Ex<sub>t<\/sub> + \u03b1<sub>5<\/sub>*R<sub>US,t<\/sub> + \u03b1<sub>6<\/sub>*<em>SLA<\/em><sub>i<\/sub> + \u03b1<sub>7<\/sub>*Rest<sub>t<\/sub> + \u03b1<sub>8<\/sub>*<em>SLA<\/em><sub>i<\/sub>*Rest<sub>t<\/sub> + u<sub>it<\/sub><\/p>\n<p>and the <abbr class='c2c-text-hover' title='Statistical Software for Data Analysis'>STATA<\/abbr> command will be:<\/p>\n<p><strong>reg<\/strong> X Y<sub>i<\/sub> Y<sub>US<\/sub> Dist Ex R <em>SLA<\/em> Rest <em>SLA<strong>#<\/strong><\/em>Rest<\/p>\n<p>where X<sub>it<\/sub> is the log value or log quantity of annual exports from province i to the US. Y<sub>it<\/sub> and Y<sub>US,t<\/sub> are the log GDP of province i and of the US at time t, respectively, to control for demand for lumber in Canada and the US. Dist<sub>i<\/sub> is the log of the distance from province i to the US border. R<sub>US,t<\/sub> is the US interest rate, to control for the influence of interest rates on the demand for new homes (a major source of demand for softwood lumber). Ex<sub>t<\/sub> is the US-Canada exchange rate, to control for price effects, since exchange rates affect the relative price of Canadian lumber. 
<em>SLA<\/em> is a dummy variable which takes the value 1 for provinces on which tariffs were applied under the <em>SLA<\/em> and 0 otherwise. Rest is a dummy variable which takes the value 1 for the years the <em>SLA<\/em> was in effect and 0 otherwise. <em>SLA<\/em>*Rest is an interaction term that takes the value 1 for provinces under the <em>SLA<\/em> in the years the <em>SLA<\/em> was in place, and 0 otherwise.<\/p>\n<p>The key coefficient of interest is \u03b1<sub>8<\/sub>, as it is interpreted as the causal effect of the <em>SLA<\/em> on the change in exports of the <em>SLA<\/em> provinces compared to the non-<em>SLA<\/em> provinces. To understand why \u03b1<sub>8<\/sub> is referred to as the \u201cDifference in Difference\u201d estimate, take a look at the following table, which is Table 4 in the paper.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-1222\" src=\"https:\/\/blogs.ubc.ca\/datawithstata\/files\/2019\/05\/did_table_gulati_malhotra-300x174.png\" alt=\"\" width=\"431\" height=\"250\" srcset=\"https:\/\/blogs.ubc.ca\/datawithstata\/files\/2019\/05\/did_table_gulati_malhotra-300x174.png 300w, https:\/\/blogs.ubc.ca\/datawithstata\/files\/2019\/05\/did_table_gulati_malhotra-400x232.png 400w, https:\/\/blogs.ubc.ca\/datawithstata\/files\/2019\/05\/did_table_gulati_malhotra.png 517w\" sizes=\"auto, (max-width: 431px) 100vw, 431px\" \/><\/p>\n<p>\u03b1<sub>6<\/sub> captures the difference in the mean of log exports between <em>SLA<\/em> and non-<em>SLA<\/em> provinces before the <em>SLA<\/em> restrictions came into effect. \u03b1<sub>7<\/sub> captures the effect of the <em>SLA<\/em> on the non-<em>SLA<\/em> provinces. \u03b1<sub>7<\/sub> + \u03b1<sub>8<\/sub> reflects the effect of the <em>SLA<\/em> on the provinces named in the contract. Thus, \u03b1<sub>8<\/sub> shows the difference in export performance of the provinces named in the <em>SLA<\/em> vs. 
those not included in the <em>SLA<\/em> that is attributable to the <em>SLA<\/em> restriction.<\/p>\n<p>Their results indicate that the <em>SLA<\/em> had a significant impact on the exports of non-<em>SLA<\/em> provinces: the <em>SLA<\/em> by itself increased exports from these provinces fourfold relative to the <em>SLA<\/em> provinces. However, the <em>SLA<\/em> led to a decrease in exports of the <em>SLA<\/em> provinces of only 5 percent, which was statistically insignificant. Thus, the <em>SLA<\/em> caused an increase in the exports of the non-<em>SLA<\/em> provinces rather than a decrease in the exports of the <em>SLA<\/em> provinces.<\/p>\n<p>\n<\/div><\/div><!-- #random-accordion-id-633end of accordion shortcode -->\n<div id=\"random-accordion-id-475\" class=\"accordion-shortcode  \"> <h3 ><a href=\"#example-2-card-and-krueger-199-1\" >Example 2: Card and Krueger (1994)<\/a><\/h3><div id=\"example-2-card-and-krueger-199-1\" class=\"accordian-shortcode-content \" ><\/p>\n<p>On April 1, 1992, New Jersey raised the state minimum wage from $4.25 to $5.05. However, right across the Delaware River, in Pennsylvania, the minimum wage was kept unchanged at $4.25. Thus, if a researcher wanted to study the causal impact of the minimum wage on employment, then comparing the change in employment in New Jersey with the change in Pennsylvania before and after April 1992 would be an ideal setting for a DID. <a href=\"http:\/\/davidcard.berkeley.edu\/papers\/njmin-aer.pdf\">Card and Krueger (1994)<\/a> did precisely that by studying the impact of the statutory minimum wage increase on employment at fast food restaurants, a leading employer of low-wage workers. 
In practice, this is implemented in a simple panel regression with state and time fixed effects along with an interaction term:<\/p>\n<p>Y<sub>ist<\/sub> = \u03b1 + \u00b5D<sub>t<\/sub> + \u03c6d<sub>s<\/sub> + \u03b2D<sub>t<\/sub>d<sub>s<\/sub> + u<sub>ist<\/sub>,<\/p>\n<p>where Y<sub>ist<\/sub> is employment at restaurant <em>i<\/em> in state <em>s<\/em> at time <em>t<\/em>; D<sub>t<\/sub> is a time-dummy variable, which takes the value 1 for periods after the minimum wage increase and 0 otherwise; and d<sub>s<\/sub> is a state-dummy variable, which takes the value 1 for the treated state (New Jersey) and 0 for the control state (Pennsylvania). The key coefficient of interest is \u03b2, the coefficient of the interaction between the state and time dummies, as it is interpreted as the causal effect of a minimum wage increase on employment. To understand why \u03b2 is the so-called \u201cDifference in Difference\u201d estimate, a quick look at the following table of the conditional expectation of Y<sub>ist<\/sub> will suffice:<\/p>\n<table>\n<tbody>\n<tr>\n<td width=\"189\"><\/td>\n<td width=\"104\"><strong>Before (D<sub>t<\/sub>=0)<\/strong><\/td>\n<td width=\"94\"><strong>After (D<sub>t<\/sub>=1)<\/strong><\/td>\n<td width=\"236\"><strong><em>Difference in D<\/em><sub>t<\/sub><\/strong><\/td>\n<\/tr>\n<tr>\n<td width=\"189\"><strong>Pennsylvania (d<sub>s<\/sub>=0)<\/strong><\/td>\n<td width=\"104\">\u03b1<\/td>\n<td width=\"94\">\u03b1+\u00b5<\/td>\n<td width=\"236\">\u00b5<\/td>\n<\/tr>\n<tr>\n<td 
width=\"189\"><strong>New Jersey (d<sub>s<\/sub>=1)<\/strong><\/td>\n<td width=\"104\">\u03b1+\u03c6<\/td>\n<td width=\"94\">\u03b1+\u00b5+\u03c6+\u03b2<\/td>\n<td width=\"236\">\u00b5+\u03b2<\/td>\n<\/tr>\n<tr>\n<td width=\"189\"><strong><em>Difference in d<sub>s<\/sub><\/em><\/strong><\/td>\n<td width=\"104\">\u03c6<\/td>\n<td width=\"94\">\u03c6+\u03b2<\/td>\n<td width=\"236\"><strong>Difference in Difference: \u03b2<\/strong><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>In short, DID estimate = (<em>Difference<\/em> in pre- and post-treatment outcomes for treated group) <em>minus<\/em> (<em>Difference<\/em> in pre- and post-treatment outcomes for control group).<\/p>\n<p>Note that the panel regression set-up above can be reduced to a cross-sectional regression in first-differences by first averaging employment across all restaurants in a state, and then taking the difference between pre- and post-treatment periods:<\/p>\n<p>\u0394Y<sub>s<\/sub> = \u00b5* + \u03b2d<sub>s<\/sub> + \u0394u<sub>s<\/sub><\/p>\n<p>Here, \u0394Y<sub>s<\/sub> is the change in average employment in state <em>s<\/em> between the pre- and post-treatment periods, and d<sub>s<\/sub> is the same state-dummy variable as above. This reformulation of the panel regression as a cross-sectional regression substantially reduces data requirements, since only state-level averages for the two periods are needed.<\/p>\n<p>One useful feature of implementing DID through regressions like those above is that one is not limited to studying policy interventions that are binary in nature. For example, <a href=\"http:\/\/davidcard.berkeley.edu\/papers\/fed-min-wage-var.pdf\">Card (1992)<\/a> studies the impact of a federal minimum wage rise from $3.35 to $3.80 that has a differential impact across 51 U.S. jurisdictions (the 50 states and the District of Columbia). Some U.S. states have minimum wages a little higher than the federal minimum, some a lot higher, and some the same as the federal minimum. The federal increase is, therefore, a treatment whose intensity differs across states. In such a scenario, where the treatment intensity varies across treatment groups, instead of using the dummy variable d<sub>s<\/sub>, one can use the <em>pre-treatment level of intensity of the treatment<\/em>, X<sub>s<\/sub><em>. 
<\/em>In the minimum wage study, <a href=\"http:\/\/davidcard.berkeley.edu\/papers\/fed-min-wage-var.pdf\">Card (1992)<\/a> uses the baseline (pre-increase) proportion of each state\u2019s teenage labour force earning less than $3.80 as the measure of X<sub>s<\/sub>. Card\u2019s specification was simply:<\/p>\n<p>\u0394Y<sub>s<\/sub> = \u00b5* + \u03b2X<sub>s<\/sub> + \u0394u<sub>s<\/sub>.<\/p>\n<p>\n<\/div><\/div><!-- #random-accordion-id-475end of accordion shortcode -->\n","protected":false},"excerpt":{"rendered":"<p>DID is a version of fixed effects estimation with panel data that can be used to estimate causal effects under the common trend assumption. A DID estimate captures the causal impact of a policy change by comparing the &hellip; <a href=\"https:\/\/blogs.ubc.ca\/datawithstata\/home-page\/regression\/difference-in-difference-regression\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":62204,"featured_media":0,"parent":170,"menu_order":8,"comment_status":"closed","ping_status":"closed","template":"full-width-page.php","meta":{"footnotes":""},"class_list":["post-576","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/blogs.ubc.ca\/datawithstata\/wp-json\/wp\/v2\/pages\/576","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.ubc.ca\/datawithstata\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/blogs.ubc.ca\/datawithstata\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.ubc.ca\/datawithstata\/wp-json\/wp\/v2\/users\/62204"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.ubc.ca\/datawithstata\/wp-json\/wp\/v2\/comments?post=576"}],"version-history":[{"count":16,"href":"https:\/\/blogs.ubc.ca\/datawithstata\/wp-json\/wp\/v2\/pages\/576\/revisions"}],"predecessor-version":[{"id":1224,"href":"https:\/\/blogs.ubc.ca\/datawithstata\/wp-json\/wp\/v2\/pages\/576\/revisions\/1224"}],"up":[{"embeddable":true,"href":"https:\/\/blogs.ubc.ca\/datawithstata\/wp-json\/wp\/v2\/pages\/170"}],"wp:attachment":[{"href":"https:\/\/blogs.ubc.ca\/datawithstata\/wp-json\/wp\/v2\/media?parent=576"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org
\/{rel}","templated":true}]}}