# Panel Data and Diff-in-Diff

Chris Hansman
Empirical Finance: Methods and Applications
January 25-26, 2020

Some Details
- First assignment released this week
  - Posted on January 26th
  - Due on February 9th

Overview
- Last class: an introduction to causality
- This class: estimating causal effects with panel data
  1. An introduction to panel data
     - Multiple observations of the same unit over time
  2. First difference and fixed effects estimators
     - Estimating causal effects with fixed omitted variables
  3. Difference-in-difference estimators
     - A more robust method for estimating causal effects

Part 1: Introducing Panel Data
- Three common types of data
  1. Cross-sectional
  2. Time-series
  3. Panel
- Estimating unit and time specific averages

Three common types of data:
(1) Cross-Sectional
- A single observation for each unit $i \in \{1, 2, \dots, N\}$
- e.g. test scores and study times for each individual in the class
(2) Time Series
- Repeated observations from time $t = 1, \dots, T$ for a single unit
- e.g. yearly GDP and unemployment in the UK
(3) Panel
- Repeated observations over time for multiple units
- e.g. monthly market cap and leverage for every firm in the S&P

[Figure: Cross-sectional data: one observation per unit]

[Figure: Time series data: one unit over time]

[Figure: Panel data: multiple units followed over time]

Panel data: Notation
- Panel data consists of observations of the same $N$ units in $T$ different periods
- If the data contains variables $x$ and $y$, we write them $(x_{it}, y_{it})$
  - for $i = 1, \dots, N$
    - $i$ denotes the unit, e.g. Microsoft or Apple
  - and $t = 1, \dots, T$
    - $t$ denotes the time period, e.g. September or October

Panel data: Multiple units followed over time
[Figure: several units plotted over time]

Panel data: Allows Averaging Within Units
- Because we see every unit multiple times:
  - Can take unit specific averages:
$$\overline{\text{price}}_i = \frac{\sum_{t=1}^{T} \text{price}_{it}}{T}$$
- Because we see many units in the same time period:
  - Can take time specific averages:
$$\overline{\text{price}}_t = \frac{\sum_{i=1}^{N} \text{price}_{it}}{N}$$
- The overall average is (of course):
$$\overline{\text{price}} = \frac{\sum_{t=1}^{T} \sum_{i=1}^{N} \text{price}_{it}}{N \times T}$$
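A minimal sketch of the three averages, assuming a small balanced panel stored as an array with one row per unit and one column per period. The `price` numbers are made up for illustration and are not from the lecture data.

```python
import numpy as np

# Hypothetical prices (made-up numbers): rows are units i, columns are periods t.
price = np.array([
    [10.0, 11.0, 12.0, 13.0],
    [20.0, 19.0, 21.0, 20.0],
    [5.0, 6.0, 5.0, 8.0],
])
N, T = price.shape

unit_avg = price.sum(axis=1) / T          # one average per unit i
time_avg = price.sum(axis=0) / N          # one average per period t
overall_avg = price.sum() / (N * T)       # average over all N x T observations

print("unit averages:", unit_avg)
print("time averages:", time_avg)
print("overall average:", overall_avg)
```

In a balanced panel like this one, the overall average is also the average of the unit averages (and of the time averages).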

[Figure: Panel data: unit specific averages]

[Figure: Panel data: time specific averages]

Calculating Unit Specific Averages With Regression
- Recall that dummy variables let you calculate these means
- Create dummy variables for each $i$ (e.g. company), omitting one
  - Let's call them $D_1, D_2, \dots, D_{N-1}$
- And consider the following regression:
$$y_{it} = \beta_0 + \sum_{i=1}^{N-1} \delta_i D_i + v_{it}$$
- Recall that we can then estimate
  - Average for the omitted unit: $\hat{\beta}_0$
  - Average for any other $i$: $\hat{\beta}_0 + \hat{\delta}_i$

Residualizing to Remove Differences in Means
$$y_{it} = \beta_0 + \sum_{i=1}^{N-1} \delta_i D_i + v_{it}$$
- After estimating this regression, we can also compute the residuals:
$$\hat{v}_{it} = y_{it} - \hat{\beta}_0 - \sum_{i=1}^{N-1} \hat{\delta}_i D_i$$
- For any given $i$, this translates to:
$$\hat{v}_{it} = y_{it} - \hat{\beta}_0 - \hat{\delta}_i$$
- This is just $y_{it} - \bar{y}_i$
  - The price minus the unit specific average
- Lets us compare changes over time
  - Putting aside level differences
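The link between the dummy regression and demeaning can be checked numerically. The sketch below uses made-up numbers and hypothetical variable names. It regresses $y_{it}$ on an intercept and $N-1$ unit dummies, then compares the fitted averages and residuals with plain unit-by-unit demeaning.

```python
import numpy as np

# Hypothetical prices (made-up numbers): rows are units i, columns are periods t.
y = np.array([
    [10.0, 11.0, 12.0, 13.0],
    [20.0, 19.0, 21.0, 20.0],
    [5.0, 6.0, 5.0, 8.0],
])
N, T = y.shape
y_long = y.reshape(-1)                     # stack observations unit by unit
unit = np.repeat(np.arange(N), T)          # unit index of each stacked observation

# Intercept plus dummies for the first N-1 units; the last unit is omitted.
dummies = (unit[:, None] == np.arange(N - 1)[None, :]).astype(float)
X = np.column_stack([np.ones(N * T), dummies])
coef, *_ = np.linalg.lstsq(X, y_long, rcond=None)
beta0_hat, delta_hat = coef[0], coef[1:]

# Fitted unit averages: beta0_hat for the omitted unit, beta0_hat + delta_hat otherwise.
fitted_unit_avg = np.append(beta0_hat + delta_hat, beta0_hat)

# Residuals from the dummy regression versus simple demeaning by unit.
residuals = (y_long - X @ coef).reshape(N, T)
demeaned = y - y.mean(axis=1, keepdims=True)

print("fitted unit averages:", fitted_unit_avg)
print("residuals equal demeaned values:", np.allclose(residuals, demeaned))
```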

Residualizing Removes Group Specific Means
[Figure: the panel series before and after removing group specific means]

Calculating Time Specific Averages With Regression
- Can similarly calculate the average for each time period with regression
- Create dummy variables for each $t$ (e.g. Dec. 15), omitting one
  - Let's call them $D_1, D_2, \dots, D_{T-1}$
- And consider the following regression:
$$y_{it} = \beta_0 + \sum_{t=1}^{T-1} \tau_t D_t + v_{it}$$
- Recall that we can then estimate
  - Average for the omitted period: $\hat{\beta}_0$
  - Average for any other $t$: $\hat{\beta}_0 + \hat{\tau}_t$

Part 2: Advantages of Panel Data for Causal Effects
- A simple approach using a panel: event study
- Two approaches to dealing with fixed omitted variables
  - First differences
  - Fixed effects

A simple panel approach: Before vs. after
- Suppose we are interested in the causal effect of a particular event or policy
$$y_{it} = \beta_0 + \beta_1 \text{AfterEvent}_{it} + v_{it}$$
- Example: Impact of Brexit on UK firms
- Can we simply compare?
$$E[y_{it} \mid \text{AfterEvent}_{it} = 1] - E[y_{it} \mid \text{AfterEvent}_{it} = 0]$$

A simple panel approach: Before vs. after
[Figure: Y by month (t), 2016m1 to 2017m1, with E[Y|Before] and E[Y|After] marked]

Before vs. after an event used frequently
- This tactic underlies an approach called event study
  - Lots of different techniques/bells and whistles
  - Chapter 4 of The Econometrics of Financial Markets (Campbell, Lo and MacKinlay) if you want more detail

Entrance into the S&P (Shleifer, 1986; Harris and Gurel, 1986)
[Figure. Source: Gompers, Greenwood, and Lerner's Lecture Notes]

When is an event study ineffective?
[Figure: Y by month (t), 2016m1 to 2017m1, with E[Y|Before] and E[Y|After] marked]

Panel Data and Omitted Variables
- We will come back to this before vs. after strategy in a bit
- Let's reconsider our omitted variables problem:
$$y_{it} = \beta_0 + \beta_1 x_{it} + \gamma a_i + e_{it}$$
- Suppose we see $x_{it}$ and $y_{it}$ but not $a_i$
- Suppose $Corr(x_{it}, e_{it}) = 0$ but $Corr(a_i, x_{it}) \neq 0$
- Note that we are assuming $a_i$ doesn't depend on $t$

Panel Data and Omitted Variables
- An example:
$$\text{Leverage}_{it} = \beta_0 + \beta_1 \text{Profit}_{it} + \gamma a_i + e_{it}$$
- Some potential (fixed) omitted variables
  - Manager skill or risk aversion
  - Cost of capital

Panel Data and Omitted Variables
- Suppose we are unable to observe $a_i$
$$y_{it} = \beta_0 + \beta_1 x_{it} + \underbrace{v_{it}}_{\gamma a_i + e_{it}}$$
- If we estimate this regression, will we recover $\beta_1^{OLS} = \beta_1$?
- No! Because
$$Corr(x_{it}, a_i) \neq 0 \Rightarrow Corr(x_{it}, v_{it}) \neq 0$$
- Aside: Regressions of this form are often called "pooled"
  - Because they "pool" data across individuals and time periods

Panel Data and Omitted Variables
[Figure: Y against X, showing the pooled OLS line $\beta_0^{OLS} + \beta_1^{OLS} X$ and the line $\beta_0 + \beta_1 X$]
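A simulated sketch of why the pooled regression misses $\beta_1$. All parameter values and variable names here are hypothetical. In this design $Cov(x_{it}, a_i) = 1$ and $Var(x_{it}) = 2$, so in large samples the pooled slope tends toward $\beta_1 + \gamma \cdot Cov(x_{it}, a_i)/Var(x_{it}) = 2 + 3 \cdot 0.5 = 3.5$ rather than 2.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 2000, 2
beta0, beta1, gamma = 1.0, 2.0, 3.0       # hypothetical true parameters

a = rng.normal(size=N)                     # fixed unit characteristic a_i
x = a[:, None] + rng.normal(size=(N, T))   # x_it is correlated with a_i
e = rng.normal(size=(N, T))                # e_it is unrelated to x_it
y = beta0 + beta1 * x + gamma * a[:, None] + e

def ols(X, y):
    """Least squares coefficients for y = X b + error."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

ones = np.ones(N * T)
a_long = np.repeat(a, T)                   # a_i repeated for each period
pooled = ols(np.column_stack([ones, x.ravel()]), y.ravel())
with_a = ols(np.column_stack([ones, x.ravel(), a_long]), y.ravel())

print("pooled slope (a_i omitted):", pooled[1])
print("slope with a_i included:   ", with_a[1])
```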

Our first Mentis…
- Load the data panel example.csv
- What is the coefficient $\hat{\beta}_1^{OLS}$ if we treat $a_i$ as unobserved?
$$y_{it} = \beta_0 + \beta_1 x_{it} + v_{it}$$
- What is the coefficient $\hat{\beta}_1^{OLS}$ if we observe and include $a_i$ in the regression?
$$y_{it} = \beta_0 + \beta_1 x_{it} + \gamma a_i + e_{it}$$

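First Difference Regression
$$y_{it} = \beta_0 + \beta_1 x_{it} + \underbrace{v_{it}}_{\gamma a_i + e_{it}}$$
- Suppose we see exactly two time periods $t \in \{1, 2\}$ for each $i$
- We can write our two time periods as:
$$y_{i,1} = \beta_0 + \beta_1 x_{i,1} + \gamma a_i + e_{i,1}$$
$$y_{i,2} = \beta_0 + \beta_1 x_{i,2} + \gamma a_i + e_{i,2}$$
- Then take the difference:
$$y_{i,2} - y_{i,1} = \beta_1 (x_{i,2} - x_{i,1}) + (e_{i,2} - e_{i,1})$$
- Or
$$\Delta y_{i,2-1} = \beta_1 \Delta x_{i,2-1} + \Delta e_{i,2-1}$$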

First Difference Regression
- Instead of regressing $y_{it}$ on $x_{it}$, regress the change in $y_{it}$ on the change in $x_{it}$
- Taking changes (differences) gets rid of fixed omitted variables
$$\Delta y_{i,2-1} = \beta_1 \Delta x_{i,2-1} + \Delta e_{i,2-1}$$
- This recovers $\beta_1$ as long as $\Delta e_{i,2-1}$ is mean independent of $\Delta x_{i,2-1}$:
$$E[\Delta e_{i,2-1} \mid \Delta x_{i,2-1}] = E[\Delta e_{i,2-1}]$$
- Note that this is not the same as:
$$E[e_{it} \mid x_{it}] = E[e_{it}]$$
- Menti: What is the coefficient $\hat{\beta}_1^{FD}$ from a first difference regression?
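A small simulated sketch of the first difference estimator, assuming two periods and a fixed $a_i$ that is correlated with $x_{it}$. The parameter values are hypothetical. Differencing removes both $\beta_0$ and $\gamma a_i$, so the regression of $\Delta y$ on $\Delta x$ has no intercept.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 2000
beta0, beta1, gamma = 1.0, 2.0, 3.0       # hypothetical true parameters

a = rng.normal(size=N)                     # unobserved fixed characteristic a_i
x = a[:, None] + rng.normal(size=(N, 2))   # two periods, x_it correlated with a_i
e = rng.normal(size=(N, 2))
y = beta0 + beta1 * x + gamma * a[:, None] + e

# First differences: a_i and beta0 drop out because they do not vary over time.
dy = y[:, 1] - y[:, 0]
dx = x[:, 1] - x[:, 0]

# Regression of dy on dx with no intercept, matching the differenced equation.
beta1_fd = (dx @ dy) / (dx @ dx)
print("first difference estimate of beta1:", beta1_fd)
```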

Fixed Effects Regression
$$y_{it} = \beta_0 + \beta_1 x_{it} + \gamma a_i + e_{it}$$
- An alternative approach:
  - Let's define $\delta_i = \gamma a_i$ and rewrite:
$$y_{it} = \beta_0 + \beta_1 x_{it} + \delta_i + e_{it}$$
- So $y_{it}$ is determined by
  (i) The baseline intercept $\beta_0$
  (ii) The effect of $x_{it}$
  (iii) An individual specific change in the intercept: $\delta_i$
- Intuition behind fixed effects: Let's just estimate $\delta_i$

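What is $\delta_i$?
$$y_{it} = \beta_0 + \beta_1 x_{it} + \delta_i + e_{it}$$
- $\delta_i$ is often referred to as $i$'s "fixed effect"
$$E[y_{it} \mid x_{it} = 0] = \beta_0 + E[\beta_1 \cdot 0] + \delta_i + E[e_{it} \mid x_{it} = 0]$$
- So, when $E[e_{it} \mid x_{it} = 0] = 0$, $\delta_i$ is just the change in individual $i$'s intercept:
$$\delta_i = E[y_{it} \mid x_{it} = 0] - \beta_0$$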

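Fixed Effects Regression: Estimating $\delta_i$
$$y_{1t} = \beta_0 + \beta_1 x_{1t} + \delta_1 + e_{1t}$$
$$y_{2t} = \beta_0 + \beta_1 x_{2t} + \delta_2 + e_{2t}$$
$$\vdots$$
$$y_{Nt} = \beta_0 + \beta_1 x_{Nt} + \delta_N + e_{Nt}$$
- How do we estimate $\delta_1, \delta_2, \dots, \delta_N$?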

Fixed Effects Regression: Estimating $\delta_i$
$$y_{it} = \beta_0 + \beta_1 x_{it} + \delta_i + e_{it}$$
- Simplest approach (to me): Dummy variables
- Construct $N-1$ dummy variables $D_1, D_2, \dots, D_{N-1}$
  - $D_1 = 1$ when $i = 1$ and 0 otherwise
  - $D_2 = 1$ when $i = 2$ and 0 otherwise
  - $D_3 = 1$ when $i = 3$ and 0 otherwise
  - And so on…
  - $D_{N-1} = 1$ when $i = N-1$ and 0 otherwise

Fixed Effects Regression: Implementation
$$y_{it} = \beta_0 + \beta_1 x_{it} + \sum_{i=1}^{N-1} \delta_i D_i + e_{it}$$
- Note that we've left out $D_N$
- $\beta_0^{OLS}$ is interpreted as the intercept for individual $N$:
$$\beta_0^{OLS} = E[y_{it} \mid x_{it} = 0, i = N]$$
- and for all other $i$ (e.g. $i = 2$)
$$\delta_2 = E[y_{it} \mid x_{it} = 0, i = 2] - \beta_0$$
- Menti: What is the coefficient $\hat{\beta}_1^{FE}$ from a fixed effects regression?
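One way the dummy variable implementation might look in code, on simulated data with hypothetical parameter values. The design matrix holds a constant, $x_{it}$, and the dummies $D_1, \dots, D_{N-1}$, with $D_N$ left out as on the slide.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 100, 10
beta0, beta1 = 1.0, 2.0                        # hypothetical true parameters

delta = rng.normal(scale=3.0, size=N)          # unit fixed effects delta_i
x = delta[:, None] + rng.normal(size=(N, T))   # x_it correlated with delta_i
e = rng.normal(size=(N, T))
y = beta0 + beta1 * x + delta[:, None] + e

unit = np.repeat(np.arange(N), T)              # unit index of each stacked observation
# Dummies D_1, ..., D_{N-1}; the last unit (D_N) is left out.
D = (unit[:, None] == np.arange(N - 1)[None, :]).astype(float)
X = np.column_stack([np.ones(N * T), x.ravel(), D])
coef, *_ = np.linalg.lstsq(X, y.ravel(), rcond=None)

# coef[0] is the intercept of the omitted unit N; coef[2:] are shifts relative to it.
beta0_fe, beta1_fe, delta_fe = coef[0], coef[1], coef[2:]
print("fixed effects estimate of beta1:", beta1_fe)
```

With many units the dummy matrix gets large, which is why software usually demeans within unit instead of building the dummies explicitly. The slope estimate is the same either way.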

Fixed Effects Regression: Intuition
- Any fixed characteristic of $i$ is captured by the average of $y_{it}$ (for $i$)
- By using dummy variables for $i$, we can just estimate (and hence account for) those averages
- No longer have to worry about $x_{it}$ being correlated with a fixed component of the error term

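Why is This? Recall Regression Anatomy
$$\beta_1^{OLS} = \frac{Cov(y_{it}, \tilde{x}_{it})}{Var(\tilde{x}_{it})}$$
- Where $\tilde{x}_{it}$ is the residual from a regression of $x_{it}$ on the dummies $D_j$:
$$x_{it} = \alpha_0 + \sum_{j=1}^{N-1} \alpha_j D_j + \varepsilon_{it}$$
$$\tilde{x}_{it} = x_{it} - (\alpha_0^{OLS} + \alpha_i^{OLS})$$
- Subtracting (partialling out) the average $x_{it}$ for each $i$
- $\tilde{x}_{it}$ is no longer correlated with anything that is fixed within $i$, such as $\gamma a_i$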
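The regression anatomy formula can be sketched directly, again on simulated data with hypothetical parameters. Because the residual from regressing $x_{it}$ on unit dummies is just $x_{it}$ minus unit $i$'s average, the code demeans $x$ within unit rather than running that first-stage regression.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T = 100, 10
beta1, gamma = 2.0, 3.0                    # hypothetical true parameters

a = rng.normal(size=N)                     # fixed omitted variable a_i
x = a[:, None] + rng.normal(size=(N, T))   # x_it correlated with a_i
e = rng.normal(size=(N, T))
y = 1.0 + beta1 * x + gamma * a[:, None] + e

# Residual from regressing x_it on unit dummies = x_it minus unit i's average.
x_tilde = x - x.mean(axis=1, keepdims=True)

# Regression anatomy: slope = Cov(y, x_tilde) / Var(x_tilde).
# x_tilde has mean zero, so sums of products are enough.
beta1_anatomy = np.sum(y * x_tilde) / np.sum(x_tilde ** 2)

# x_tilde sums to zero within each unit, so it is orthogonal to a_i in the sample.
cross_moment = np.sum(x_tilde * a[:, None])

print("beta1 from regression anatomy:", beta1_anatomy)
print("sample cross moment of x_tilde and a_i:", cross_moment)
```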

Fixed Effects Regression: Assumptions
- There is one important difference in the assumptions necessary for OLS to capture the causal effect:
  - Before, we needed:
$$E[e_{it} \mid x_{it}] = E[e_{it}]$$
  - Now, we need:
$$E[e_{it} \mid x_{i1}, x_{i2}, \dots, x_{iT}] = E[e_{it}]$$

When Will Fixed Effects Not Be Enough?
- We need
$$E[e_{it} \mid x_{i1}, x_{i2}, \dots, x_{iT}] = E[e_{it}]$$
- But what if $e_{it}$ is growing over time?
  - E.g. interest rates rising each quarter, influencing profits and leverage

Time Fixed Effects
- So far we have focused on controlling for entity $i$ fixed effects
- What if $x_{it}$ is correlated with something that changes over time but is fixed across individual units?
$$\text{Leverage}_{it} = \beta_0 + \beta_1 \text{Profits}_{it} + \tau_t + v_{it}$$
- For example, many time-varying macro variables (e.g. monetary policy) might affect profits and leverage
- If these are constant across firms, then they will be captured by $\tau_t$

Time Fixed Effects
$$y_{it} = \beta_0 + \beta_1 x_{it} + \tau_t + e_{it}$$
- Exact same approach as with entity fixed effects
- Construct $T-1$ dummy variables $D_1, D_2, \dots, D_{T-1}$
  - $D_1 = 1$ when $t = 1$ and 0 otherwise
  - $D_2 = 1$ when $t = 2$ and 0 otherwise
  - And so on…
- And then, omitting one time period, we can estimate
$$y_{it} = \beta_0 + \beta_1 x_{it} + \sum_{t=1}^{T-1} \tau_t D_t + e_{it}$$
- What is $\beta_0$? $\tau_t$?
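A sketch of time fixed effects on simulated data, assuming a common shock $\tau_t$ that trends upward and moves both $x_{it}$ and $y_{it}$ (loosely in the spirit of the rising interest rate example). The numbers and names are hypothetical. The pooled slope picks up the common trend, while the regression with $T-1$ period dummies does not.

```python
import numpy as np

rng = np.random.default_rng(4)
N, T = 200, 8
beta1 = 2.0                                    # hypothetical true parameter

tau = np.linspace(-2.0, 2.0, T)                # common time effect tau_t (a steady trend)
x = tau[None, :] + rng.normal(size=(N, T))     # x_it moves with the common shock
e = rng.normal(size=(N, T))
y = 1.0 + beta1 * x + tau[None, :] + e

period = np.tile(np.arange(T), N)              # period index of each stacked observation
# Dummies for the first T-1 periods; the last period is omitted.
D = (period[:, None] == np.arange(T - 1)[None, :]).astype(float)

def ols(X, y):
    """Least squares coefficients for y = X b + error."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

ones = np.ones(N * T)
pooled = ols(np.column_stack([ones, x.ravel()]), y.ravel())
time_fe = ols(np.column_stack([ones, x.ravel(), D]), y.ravel())

print("pooled slope:            ", pooled[1])
print("slope with time dummies: ", time_fe[1])
```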

Time Fixed Effects
- Time fixed effects do not deal with fixed individual characteristics
- What about combining both approaches?

Part 3: Difference-in-Difference
- An example: Bankruptcy Costs and Leverage
- The difference-in-difference framework
- Key assumption: Parallel Trends

Example: Bankruptcy Costs and Leverage
- What is the effect of a decline in bankruptcy costs on leverage?
  - Theory: Lower expected bankruptcy costs should increase leverage
- Ideal (impossible to conduct) experiment:
  - Randomly select a subset of firms
  - Reduce bankruptcy costs for these firms (e.g. streamline bankruptcy procedures)
  - Compare leverage between this subset and the remaining firms

Example: Bankruptcy Costs and Leverage
- At the end of 1991 the state of Delaware passed a new law ("the reform")
  - Significantly streamlined bankruptcy proceedings
  - Reduced costs and time of litigation
- Can we use this to learn something about our question?
  - Suppose we call the causal effect of the reform: $\beta_1$
  - How do we recover this parameter?

Approach 1: Before vs. After
- Compare the average leverage of Delaware firms in 1991 vs. 1992
- Let $\text{After}_t$ be a dummy equal to 1 after the reform
- We would like to describe the relationship between the reform and leverage as:
$$\text{Leverage}_{it} = \beta_0 + \beta_1 \text{After}_t + v_{it}$$
- Where $v_{it}$ contains all other time and firm specific factors that influence leverage

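Approach 1: Before vs. After
- Suppose we regress $\text{Leverage}_{it}$ on our $\text{After}_t$ dummy:
- What is $\beta_1^{OLS}$?
$$\begin{aligned}
\beta_1^{OLS} &= E[\text{Leverage}_{it} \mid \text{After}_t = 1] - E[\text{Leverage}_{it} \mid \text{After}_t = 0] \\
&= \beta_1 + E[v_{it} \mid \text{After}_t = 1] - E[v_{it} \mid \text{After}_t = 0]
\end{aligned}$$
- So $\beta_1^{OLS} = \beta_1$ (the causal effect of treatment) if
$$E[v_{it} \mid \text{After}_t] = E[v_{it}]$$
- Why might that fail?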

Before vs. After
[Figure: leverage by month (t), 1991m7 to 1992m7, with E[Y|After=0] and E[Y|After=1] marked]

When is Before vs. After Ineffective?
[Figure: leverage by month (t), 1991m7 to 1992m7, with E[Y|After=0] and E[Y|After=1] marked]

Approach 1: Before vs. After
- $\beta_1^{OLS}$ is just the difference in leverage for 1992 Delaware firms ("Treatment") relative to 1991 Delaware firms ("Control")
- We require $E[v_{it} \mid \text{After}_t = 1] = E[v_{it} \mid \text{After}_t = 0]$ for this to identify the causal effect of the reform
- Any time trend or other event in 1992 will cause $v_{it}$ for later observations to be different from $v_{it}$ for earlier observations
  - e.g. tight credit in 1992 may have reduced debt (and hence leverage)

Approach 2: Cross Sectional
- Compare Delaware firms ("Treatment") vs. non-Delaware firms ("Control") in 1992
  - Don't need to worry about time trends
  - Requires data from firms in surrounding states
- Let $D_i$ be a dummy equal to 1 if firm $i$ is registered in Delaware
- We would like to describe the relationship between the reform and leverage as:
$$\text{Leverage}_i = \beta_0 + \beta_1 D_i + v_i$$
- Where $v_i$ contains all other factors that influence leverage

Approach 2: Cross Sectional
- Suppose we regress $\text{Leverage}_i$ on our $D_i$ dummy:
$$\begin{aligned}
\beta_1^{OLS} &= E[\text{Leverage}_i \mid D_i = 1] - E[\text{Leverage}_i \mid D_i = 0] \\
&= \beta_1 + E[v_i \mid D_i = 1] - E[v_i \mid D_i = 0]
\end{aligned}$$
- So $\beta_1^{OLS} = \beta_1$ (the causal effect of treatment) if
$$E[v_i \mid D_i] = E[v_i]$$
- Do we expect everything else that impacts leverage to be the same in Delaware and other states?

When is Cross Sectional Approach Ineffective?
- Do we expect everything else that impacts leverage to be the same in Delaware and other states?
  - What if firms in Delaware are more capital-intensive?
  - Typically capital intensity ⇒ more leverage
- This is just an omitted variable:
$$\text{Leverage}_i = \beta_0 + \beta_1 D_i + \beta_2 CI_i + e_i$$
- So if we omit $CI_i$ and estimate
$$\text{Leverage}_i = \beta_0 + \beta_1 D_i + v_i$$
- Will $\beta_1^{OLS}$ be larger or smaller than $\beta_1$?

When is Cross Sectional Approach Ineffective?
- Of course, we could measure and control for capital intensity
$$\text{Leverage}_i = \beta_0 + \beta_1 D_i + \beta_2 CI_i + e_i$$
- Then the assumption for $\beta_1^{OLS} = \beta_1$ becomes:
$$E[e_i \mid D_i, CI_i] = E[e_i \mid CI_i]$$
- Beyond capital intensity, do we expect everything else that impacts leverage to be the same in Delaware and other states?
  - Hard to control for everything

Difference-in-Difference Approach
- Let's combine the positive features of the cross-sectional and before/after approaches
  - Cross sectional avoided omitted trends
  - Before/after avoided omitted (fixed) characteristics
- The difference-in-difference estimator does exactly this
$$\text{Leverage}_{it} = \beta_0 + \beta_1 D_i \times \text{After}_t + \beta_2 D_i + \beta_3 \text{After}_t + v_{it}$$
- Here $\beta_1$ is the causal effect of the reform in Delaware
- Requires data on firms in/out of Delaware before/after the reform

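What Does Data Look Like for Difference-in-Difference

| State    | Year | Leverage_it (D/E) | D_i | After_t | D_i × After_t |
|----------|------|-------------------|-----|---------|---------------|
| Delaware | 1991 | 1.2               | 1   | 0       | 0             |
| Maryland | 1991 | 3.1               | 0   | 0       | 0             |
| Virginia | 1991 | 1.9               | 0   | 0       | 0             |
| Delaware | 1991 | 0.9               | 1   | 0       | 0             |
| Virginia | 1991 | 1.5               | 0   | 0       | 0             |
| Virginia | 1991 | 1.1               | 0   | 0       | 0             |
| Delaware | 1991 | 1.2               | 1   | 0       | 0             |
| Maryland | 1991 | 1.6               | 0   | 0       | 0             |
| Virginia | 1991 | 0.5               | 0   | 0       | 0             |
| ...      | ...  | ...               | ... | ...     | ...           |
| Maryland | 1992 | 0.8               | 0   | 1       | 0             |
| Delaware | 1992 | 0.9               | 1   | 1       | 1             |
| Virginia | 1992 | 1.6               | 0   | 1       | 0             |
| Delaware | 1992 | 2.2               | 1   | 1       | 1             |
| Maryland | 1992 | 1.4               | 0   | 1       | 0             |
| Delaware | 1992 | 1.9               | 1   | 1       | 1             |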
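As an illustration only, the difference-in-difference regression can be run on the fifteen rows that are visible in the table. The slide elides the remaining rows, so the resulting numbers say nothing about the full dataset. The variable names below are hypothetical.

```python
import numpy as np

# Rows shown in the table above: (leverage, D_i, After_t). The slide elides
# the remaining rows, so these numbers are for illustration only.
rows = [
    (1.2, 1, 0), (3.1, 0, 0), (1.9, 0, 0), (0.9, 1, 0), (1.5, 0, 0),
    (1.1, 0, 0), (1.2, 1, 0), (1.6, 0, 0), (0.5, 0, 0),
    (0.8, 0, 1), (0.9, 1, 1), (1.6, 0, 1), (2.2, 1, 1), (1.4, 0, 1), (1.9, 1, 1),
]
leverage, d, after = (np.array(col, dtype=float) for col in zip(*rows))

# Columns: constant, D_i x After_t, D_i, After_t (same order as beta_0..beta_3).
X = np.column_stack([np.ones_like(d), d * after, d, after])
beta, *_ = np.linalg.lstsq(X, leverage, rcond=None)

for name, value in zip(["beta0", "beta1 (DiD)", "beta2", "beta3"], beta):
    print(f"{name}: {value:.4f}")
```

Because the regressors are a full set of group indicators, the coefficient on the interaction equals the difference-in-difference of the four group averages.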

What Do the Difference-in-Difference Estimates Capture?
- Recall that when right-hand side variables take discrete values, OLS perfectly captures the conditional expectation function:
$$E[\text{Leverage}_{it} \mid D_i, \text{After}_t] = \beta_0^{OLS} + \beta_1^{OLS} D_i \times \text{After}_t + \beta_2^{OLS} D_i + \beta_3^{OLS} \text{After}_t$$
- There are four groups:
  1. Non-Delaware Before: $\{D_i = 0, \text{After}_t = 0\}$
  2. Delaware Before: $\{D_i = 1, \text{After}_t = 0\}$
  3. Non-Delaware After: $\{D_i = 0, \text{After}_t = 1\}$
  4. Delaware After: $\{D_i = 1, \text{After}_t = 1\}$

What Do the Difference-in-Difference Estimates Capture?
- Let's calculate conditional expectations for these four groups:
  1. $E[\text{Leverage}_{it} \mid D_i = 0, \text{After}_t = 0] = \beta_0^{OLS}$
  2. $E[\text{Leverage}_{it} \mid D_i = 1, \text{After}_t = 0] = \beta_0^{OLS} + \beta_2^{OLS}$
  3. $E[\text{Leverage}_{it} \mid D_i = 0, \text{After}_t = 1] = \beta_0^{OLS} + \beta_3^{OLS}$
  4. $E[\text{Leverage}_{it} \mid D_i = 1, \text{After}_t = 1] = \beta_0^{OLS} + \beta_1^{OLS} + \beta_2^{OLS} + \beta_3^{OLS}$

Diff-in-Diff Solves Issues with Cross-Sectional Approach
- Cross Sectional: Compare averages in Delaware vs. outside, after the reform
$$\underbrace{\underbrace{E[\text{Leverage}_{it} \mid D_i = 1, \text{After}_t = 1]}_{\beta_0^{OLS} + \beta_1^{OLS} + \beta_2^{OLS} + \beta_3^{OLS}} - \underbrace{E[\text{Leverage}_{it} \mid D_i = 0, \text{After}_t = 1]}_{\beta_0^{OLS} + \beta_3^{OLS}}}_{\text{Cross-sectional Difference After}} = \beta_1^{OLS} + \beta_2^{OLS}$$
- We worried about the possibility of some omitted difference between Delaware and other states ($\beta_2^{OLS} \neq 0$)
- Solution: Use the pre-reform difference to account for any fixed differences
$$\underbrace{\underbrace{E[\text{Leverage}_{it} \mid D_i = 1, \text{After}_t = 0]}_{\beta_0^{OLS} + \beta_2^{OLS}} - \underbrace{E[\text{Leverage}_{it} \mid D_i = 0, \text{After}_t = 0]}_{\beta_0^{OLS}}}_{\text{Cross-sectional Difference Before}} = \beta_2^{OLS}$$

Diff-in-Diff Solves Issues with Cross Sectional Approach
- Difference in Difference =
$$\underbrace{\text{Difference After}}_{\beta_1^{OLS} + \beta_2^{OLS}} - \underbrace{\text{Difference Before}}_{\beta_2^{OLS}} = \beta_1^{OLS}$$

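Diff-in-Diff Solves Issues with Before vs. After
- Before vs. After: Compare averages before vs. after within Delaware:
$$\underbrace{\underbrace{E[\text{Leverage}_{it} \mid D_i = 1, \text{After}_t = 1]}_{\beta_0^{OLS} + \beta_1^{OLS} + \beta_2^{OLS} + \beta_3^{OLS}} - \underbrace{E[\text{Leverage}_{it} \mid D_i = 1, \text{After}_t = 0]}_{\beta_0^{OLS} + \beta_2^{OLS}}}_{\text{Difference In Delaware}} = \beta_1^{OLS} + \beta_3^{OLS}$$
- We worried about the possibility of some time trend
- Solution: Use other states to account for time trends
$$\underbrace{\underbrace{E[\text{Leverage}_{it} \mid D_i = 0, \text{After}_t = 1]}_{\beta_0^{OLS} + \beta_3^{OLS}} - \underbrace{E[\text{Leverage}_{it} \mid D_i = 0, \text{After}_t = 0]}_{\beta_0^{OLS}}}_{\text{Difference Out of Delaware}} = \beta_3^{OLS}$$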

Diff-in-Diff Solves Issues with Before vs. After
- Difference in Difference =
$$\underbrace{\text{Difference In Delaware}}_{\beta_1^{OLS} + \beta_3^{OLS}} - \underbrace{\text{Difference Out of Delaware}}_{\beta_3^{OLS}} = \beta_1^{OLS}$$

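Difference in Difference Matrix
- Two ways to interpret the same estimator $\beta_1^{OLS}$:

|                        | Before                            | After                                                             | Difference                        |
|------------------------|-----------------------------------|-------------------------------------------------------------------|-----------------------------------|
| Delaware (Treatment)   | $\beta_0^{OLS} + \beta_2^{OLS}$   | $\beta_0^{OLS} + \beta_1^{OLS} + \beta_2^{OLS} + \beta_3^{OLS}$   | $= \beta_1^{OLS} + \beta_3^{OLS}$ |
| Other States (Control) | $\beta_0^{OLS}$                   | $\beta_0^{OLS} + \beta_3^{OLS}$                                   | $= \beta_3^{OLS}$                 |
| Difference             | $= \beta_2^{OLS}$                 | $= \beta_1^{OLS} + \beta_2^{OLS}$                                 | $= \beta_1^{OLS}$                 |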
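The two readings of the matrix can be checked with simple arithmetic. The sketch below uses made-up cell averages (not estimates from any dataset) and differences them in both orders.

```python
# Hypothetical average leverage in each of the four cells (made-up numbers).
mean = {
    ("delaware", "before"): 1.10,
    ("delaware", "after"): 1.70,
    ("other", "before"): 1.60,
    ("other", "after"): 1.30,
}

# Route 1: difference over time within each group, then across groups.
change_delaware = mean[("delaware", "after")] - mean[("delaware", "before")]
change_other = mean[("other", "after")] - mean[("other", "before")]
did_by_time = change_delaware - change_other

# Route 2: difference across groups within each period, then over time.
gap_before = mean[("delaware", "before")] - mean[("other", "before")]
gap_after = mean[("delaware", "after")] - mean[("other", "after")]
did_by_group = gap_after - gap_before

print("difference in differences (by time): ", did_by_time)
print("difference in differences (by group):", did_by_group)
```

Both routes subtract and add the same four cell averages, so they always agree.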

Diff-in-Diff Graphically
[Figure: leverage by month (t), before and after the reform, for Treatment (Delaware) and Control (Non-Delaware), with $\beta_0^{OLS}$, $\beta_1^{OLS}$, $\beta_2^{OLS}$ and $\beta_3^{OLS}$ marked]

When Does Diff-in-Diff Identify A Causal Effect
- As usual, we need
$$E[v_{it} \mid D_i, \text{After}_t] = E[v_{it}]$$
- What does this mean intuitively?
- Parallel trends assumption: In the absence of any reform, the average change in leverage would have been the same in the treatment and control groups
- In other words: trends in both groups are similar

Parallel Trends
[Figure: leverage by month (t), before and after the reform, for Treatment (Delaware) and Control (Non-Delaware), with $\beta_0^{OLS}$, $\beta_1^{OLS}$, $\beta_2^{OLS}$ and $\beta_3^{OLS}$ marked]

Parallel Trends
- Parallel trends does not require that there is no trend in leverage
  - Just that it is the same between groups
- Does not require that the levels be the same in the two groups
- What does it look like when the parallel trends assumption fails?

When Parallel Trends Fails
[Figure: leverage by month (t), before and after the reform, for Treatment (Delaware) and Control (Non-Delaware), with $\beta_0^{OLS}$, $\beta_1^{OLS}$, $\beta_2^{OLS}$ and $\beta_3^{OLS}$ marked]

Testing the Parallel Trends Assumption?
- It is impossible to truly test
  - It is an assumption about what patterns would have been without treatment
- However, with data for several periods before the reform, we can provide convincing evidence
  - Intuition: show that the two groups have been parallel for a long time
- Typically, plot the difference in means between treated and control groups
  - If the difference in means is flat ⇒ parallel trends more likely to hold
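A rough sketch of the pre-period check described above, using made-up monthly group means rather than real data. It computes the treated-minus-control gap by month and fits a line through the pre-reform gaps. A slope near zero is informal evidence in favour of parallel trends, not a proof.

```python
import numpy as np

# Hypothetical monthly group means (made-up numbers); month 0 is the reform.
months = np.arange(-6, 6)
control = 1.0 + 0.02 * months                      # common trend
treated = control + 0.5 + 0.3 * (months >= 0)      # level gap plus a jump at the reform

gap = treated - control                            # difference in means by month
pre = months < 0

# Slope of the gap over the pre-reform months; near zero suggests parallel pre-trends.
pre_slope = np.polyfit(months[pre], gap[pre], 1)[0]
post_jump = gap[~pre].mean() - gap[pre].mean()

print("pre-reform slope of the gap:", pre_slope)
print("change in the gap after the reform:", post_jump)
```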

General Form of Diff-in-Diff
- We are interested in the impact of some treatment on outcome $Y_{it}$
- Suppose we have a treated group and a control group
  - Let $D_i$ be a dummy equal to 1 if $i$ belongs to the treatment group
- And suppose we see both groups before and after the treatment occurs
  - Let $\text{After}_t$ be a dummy equal to 1 if time $t$ is after the treatment date
$$Y_{it} = \beta_0 + \beta_1 D_i \times \text{After}_t + \beta_2 D_i + \beta_3 \text{After}_t + v_{it}$$
- For more precision:
$$Y_{it} = \beta_0 + \beta_1 D_i \times \text{After}_t + \delta_i + \tau_t + v_{it}$$
- Where $\tau_t$ and $\delta_i$ are fixed effects for each time period and individual
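A minimal sketch of the version with unit and time fixed effects, on a simulated balanced panel with hypothetical parameters. Instead of building every dummy, it removes unit and period means from both the outcome and the interaction, which gives the same slope as the dummy regression when the panel is balanced.

```python
import numpy as np

rng = np.random.default_rng(5)
N, T = 200, 6
beta1 = 0.5                                        # hypothetical true treatment effect

treated = (np.arange(N) < N // 2).astype(float)    # D_i: first half of units treated
after = (np.arange(T) >= T // 2).astype(float)     # After_t: last half of periods
interaction = treated[:, None] * after[None, :]    # D_i x After_t

delta = rng.normal(size=N)                         # unit fixed effects delta_i
tau = rng.normal(size=T)                           # time fixed effects tau_t
noise = rng.normal(scale=0.2, size=(N, T))
y = 1.0 + beta1 * interaction + delta[:, None] + tau[None, :] + noise

def two_way_demean(z):
    """Remove unit and period means (valid for a balanced panel)."""
    return z - z.mean(axis=1, keepdims=True) - z.mean(axis=0, keepdims=True) + z.mean()

y_dm = two_way_demean(y)
w_dm = two_way_demean(interaction)
beta1_hat = np.sum(w_dm * y_dm) / np.sum(w_dm ** 2)
print("two-way fixed effects DiD estimate:", beta1_hat)
```

The separate $D_i$ and $\text{After}_t$ terms do not appear because they are absorbed by $\delta_i$ and $\tau_t$.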

Data Exercise
- Load the d in d dataset
- Perform the following regression
$$\text{Leverage}_{it} = \beta_0 + \beta_1 D_i \times \text{After}_t + \beta_2 D_i + \beta_3 \text{After}_t + v_{it}$$
  - Where $D_i = 1$ in Delaware and 0 otherwise
  - and $\text{After}_t = 1$ in 1992
- Menti: what is $\hat{\beta}_1^{OLS}$?
- If you complete this, estimate:
$$Y_{it} = \beta_0 + \beta_1 D_i \times \text{After}_t + \delta_i + \tau_t + v_{it}$$

Overview
- This class: estimating causal effects with panel data
  1. An introduction to panel data
     - Multiple observations of the same unit over time
  2. First difference and fixed effects estimators
     - Estimating causal effects with fixed omitted variables
  3. Difference-in-difference estimators
     - A more robust method for estimating causal effects