# Panel Data and Diff-in-Diff


Chris Hansman

Empirical Finance: Methods and Applications

January 25-26, 2020


Some Details

First assignment released this week

Posted on January 26th

Due on February 9th

Overview

Last class: an introduction to causality

This class: estimating causal effects with panel data

1. An introduction to panel data

Multiple observations of the same unit over time

2. First difference and fixed effects estimators

Estimating causal effects with fixed omitted variables

3. Difference-in-difference estimators

A more robust method for estimating causal effects

Part 1: Introducing Panel Data

Three common types of data

1. Cross-sectional

2. Time-series

3. Panel

Estimating unit and time specific averages

Three common types of data:

(1) Cross-Sectional

A single observation for each unit i in {1,2,··· ,N}

e.g. test scores and study times for each individual in the class

(2) Time Series

Repeated observations from time t = 1, ···, T for a single unit

e.g. yearly GDP and unemployment in the UK

(3) Panel

Repeated observations over time for multiple units

e.g. monthly market cap and leverage for every firm in the S&P

[Figure: Cross-sectional data: one observation per unit]

[Figure: Time series data: one unit over time]

[Figure: Panel data: multiple units followed over time]

Panel data: Notation

Panel data consists of observations of the same N units in T different periods

If the data contains variables x and y, we write them $(x_{it}, y_{it})$

for $i = 1, \dots, N$

i denotes the unit, e.g. Microsoft or Apple

and $t = 1, \dots, T$

t denotes the time period, e.g. September or October

Panel data: Allows Averaging Within Units

Because we see every unit multiple times, we can take unit specific averages:

$$\overline{price}_i = \frac{1}{T} \sum_{t=1}^{T} price_{it}$$

Because we see many units in the same time period, we can take time specific averages:

$$\overline{price}_t = \frac{1}{N} \sum_{i=1}^{N} price_{it}$$

The overall average is (of course):

$$\overline{price} = \frac{1}{N \times T} \sum_{t=1}^{T} \sum_{i=1}^{N} price_{it}$$
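The three averages can be sketched in a few lines of NumPy. The prices below are made up for illustration, and the layout (rows are units, columns are periods) is an assumption of this sketch rather than anything fixed by the slides.

```python
import numpy as np

# Hypothetical panel: rows are units i (N = 3 firms), columns are periods t (T = 4).
price = np.array([
    [10.0, 11.0, 12.0, 13.0],
    [20.0, 19.0, 21.0, 20.0],
    [5.0, 6.0, 7.0, 6.0],
])
N, T = price.shape

unit_avg = price.sum(axis=1) / T      # one average per unit i (sum over t)
time_avg = price.sum(axis=0) / N      # one average per period t (sum over i)
overall_avg = price.sum() / (N * T)   # single overall average

print("unit averages:", unit_avg)
print("time averages:", time_avg)
print("overall average:", overall_avg)
```

In a balanced panel like this one, the overall average is also the mean of the unit averages and the mean of the time averages.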

[Figure: Panel data: unit specific averages]

[Figure: Panel data: time specific averages]

Calculating Unit Specific Averages With Regression

Recall that dummy variables let you calculate these means

Create dummy variables for each i (e.g. company), omitting one

Let's call them $D_1, D_2, \dots, D_N$

And consider the following regression:

$$y_{it} = \beta_0 + \sum_{i=1}^{N-1} \delta_i D_i + v_{it}$$

Recall that we can then estimate

Average for the omitted unit: $\hat{\beta}_0$

Average for any other i: $\hat{\beta}_0 + \hat{\delta}_i$
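A minimal sketch of this dummy-variable regression, using made-up data and plain least squares from NumPy. The variable names and the choice of which unit to omit are assumptions of the sketch.

```python
import numpy as np

# Hypothetical balanced panel in long form: 3 units, each observed in 4 periods.
unit = np.repeat(np.arange(3), 4)   # unit index i for each row
y = np.array([10.0, 11.0, 12.0, 13.0,
              20.0, 19.0, 21.0, 20.0,
              5.0, 6.0, 7.0, 6.0])

# Intercept plus dummies for units 0 and 1; unit 2 is the omitted unit.
X = np.column_stack([np.ones_like(y), unit == 0, unit == 1]).astype(float)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
beta0, delta = coef[0], coef[1:]

unit_means = np.array([y[unit == i].mean() for i in range(3)])
print("intercept vs. mean of omitted unit:", beta0, unit_means[2])
print("intercept + delta_i vs. other unit means:", beta0 + delta, unit_means[:2])
```

The intercept matches the mean of the omitted unit, and adding each estimated dummy coefficient gives the mean of the corresponding unit.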

Residualizing to Remove Differences in Means

$$y_{it} = \beta_0 + \sum_{i=1}^{N-1} \delta_i D_i + v_{it}$$

After estimating this regression, we can also compute the residuals:

$$\hat{v}_{it} = y_{it} - \hat{\beta}_0 - \sum_{i=1}^{N-1} \hat{\delta}_i D_i$$

For any given i, this translates to:

$$\hat{v}_{it} = y_{it} - \hat{\beta}_0 - \hat{\delta}_i$$

This is just $y_{it} - \bar{y}_i$

The price minus the unit specific average

Lets us compare changes over time, putting aside level differences

[Figure: Residualizing removes group specific means]

Calculating Time Specific Averages With Regression

Can similarly calculate the average for each time period with regression

Create dummy variables for each t (e.g. Dec. 15), omitting one

Let's call them $D_1, D_2, \dots, D_T$

And consider the following regression:

$$y_{it} = \beta_0 + \sum_{t=1}^{T-1} \tau_t D_t + v_{it}$$

Recall that we can then estimate

Average for the omitted period: $\hat{\beta}_0$

Average for any other t: $\hat{\beta}_0 + \hat{\tau}_t$

Part 2: Advantages of Panel Data for Causal Effects

A simple approach using a panel: event study

Two approaches to dealing with fixed omitted variables

First differences

Fixed effects

A simple panel approach: Before vs. after

Suppose we are interested in the causal effect of a particular event or policy

$$y_{it} = \beta_0 + \beta_1 AfterEvent_{it} + v_{it}$$

Example: Impact of Brexit on UK firms

Can we simply compare?

$$E[y_{it} \mid AfterEvent_{it} = 1] - E[y_{it} \mid AfterEvent_{it} = 0]$$

[Figure: A simple panel approach: before vs. after. Y plotted against Month (t), 2016m1 to 2017m1, with E[Y|Before] and E[Y|After] marked]

Before vs. after an event used frequently

This tactic underlies an approach called event study

Lots of different techniques/bells and whistles

Chapter 4 of The Econometrics of Financial Markets (Campbell, Lo and MacKinlay) if you want more detail

Entrance into the S&P (Shleifer, 1986; Harris and Gurel, 1986)

Source: Gompers, Greenwood, and Lerner's Lecture Notes

When is an event study ineffective?

[Figure: Y plotted against Month (t), 2016m1 to 2017m1, with E[Y|Before] and E[Y|After] marked]

Panel Data and Omitted Variables

We will come back to this before vs. after strategy in a bit

Let's reconsider our omitted variables problem:

$$y_{it} = \beta_0 + \beta_1 x_{it} + \gamma a_i + e_{it}$$

Suppose we see $x_{it}$ and $y_{it}$ but not $a_i$

Suppose $Corr(x_{it}, e_{it}) = 0$ but $Corr(a_i, x_{it}) \neq 0$

Note that we are assuming $a_i$ doesn't depend on t

Panel Data and Omitted Variables

An example:

$$Leverage_{it} = \beta_0 + \beta_1 Profit_{it} + \gamma a_i + e_{it}$$

Some potential (fixed) omitted variables:

Manager skill or risk aversion

Cost of capital

Panel Data and Omitted Variables

Suppose we are unable to observe $a_i$:

$$y_{it} = \beta_0 + \beta_1 x_{it} + \underbrace{\gamma a_i + e_{it}}_{v_{it}}$$

If we estimate this regression, will we recover $\beta_1^{OLS} = \beta_1$?

No! Because

$$corr(x_{it}, a_i) \neq 0 \Rightarrow corr(x_{it}, v_{it}) \neq 0$$

Aside: Regressions of this form are often called "pooled"

Because they "pool" data across individuals and time periods
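To see the size of the problem, here is a small simulation sketch. All numbers (true $\beta_1 = 2$, $\gamma = 3$, sample sizes, the seed) are assumptions chosen for illustration, and the helper `ols` is hypothetical, not part of any course material.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 500, 5

a = np.repeat(rng.normal(size=N), T)     # fixed characteristic a_i, repeated over t
x = a + rng.normal(size=N * T)           # x_it is correlated with a_i
e = rng.normal(scale=0.5, size=N * T)    # idiosyncratic error, unrelated to x
y = 1.0 + 2.0 * x + 3.0 * a + e          # true beta_1 = 2, gamma = 3

def ols(columns, y):
    """Least-squares coefficients with an intercept in position 0."""
    X = np.column_stack([np.ones_like(y)] + columns)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

b_pooled = ols([x], y)[1]      # a_i left in the error term
b_full = ols([x, a], y)[1]     # a_i observed and included

print("pooled OLS slope:", b_pooled)
print("slope controlling for a_i:", b_full)
```

Under these assumptions the pooled slope lands well above 2, because it also picks up the effect of $a_i$, while the regression that includes $a_i$ stays close to 2.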

Panel Data and Omitted Variables

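[Figure: Y plotted against X, showing the pooled OLS line $\beta_0^{OLS} + \beta_1^{OLS} X$ alongside the true line $\beta_0 + \beta_1 X$]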

Our first Mentis…

Load the data panel example.csv

What is the coefficient $\hat{\beta}_1^{OLS}$ if we treat $a_i$ as unobserved?

$$y_{it} = \beta_0 + \beta_1 x_{it} + v_{it}$$

What is the coefficient $\hat{\beta}_1^{OLS}$ if we observe and include $a_i$ in the regression?

$$y_{it} = \beta_0 + \beta_1 x_{it} + \gamma a_i + e_{it}$$

First Difference Regression

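$$y_{it} = \beta_0 + \beta_1 x_{it} + \underbrace{\gamma a_i + e_{it}}_{v_{it}}$$

Suppose we see exactly two time periods $t \in \{1, 2\}$ for each i

We can write our two time periods as:

$$y_{i,1} = \beta_0 + \beta_1 x_{i,1} + \gamma a_i + e_{i,1}$$

$$y_{i,2} = \beta_0 + \beta_1 x_{i,2} + \gamma a_i + e_{i,2}$$

Then take the difference:

$$y_{i,2} - y_{i,1} = \beta_1 (x_{i,2} - x_{i,1}) + (e_{i,2} - e_{i,1})$$

Or

$$\Delta y_{i,2-1} = \beta_1 \Delta x_{i,2-1} + \Delta e_{i,2-1}$$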

First Difference Regression

Instead of regressing $y_{it}$ on $x_{it}$, regress the change in $y_{it}$ on the change in $x_{it}$

Taking changes (differences) gets rid of fixed omitted variables:

$$\Delta y_{i,2-1} = \beta_1 \Delta x_{i,2-1} + \Delta e_{i,2-1}$$

This works as long as $\Delta e_{i,2-1}$ is mean independent of $\Delta x_{i,2-1}$:

$$E[\Delta e_{i,2-1} \mid \Delta x_{i,2-1}] = E[\Delta e_{i,2-1}]$$

Note that this is not the same as:

$$E[e_{it} \mid x_{it}] = E[e_{it}]$$

Menti: What is the coefficient $\hat{\beta}_1^{FD}$ from a first difference regression?
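A simulation sketch of the first difference estimator with two periods. The data-generating numbers (true $\beta_1 = 2$, $\gamma = 3$, the seed) are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 1000

a = rng.normal(size=N)                    # fixed, unobserved a_i
x1 = a + rng.normal(size=N)               # x in period 1, correlated with a_i
x2 = a + rng.normal(size=N)               # x in period 2, correlated with a_i
y1 = 1.0 + 2.0 * x1 + 3.0 * a + rng.normal(scale=0.5, size=N)
y2 = 1.0 + 2.0 * x2 + 3.0 * a + rng.normal(scale=0.5, size=N)

# First differences: the intercept and gamma * a_i cancel.
dx, dy = x2 - x1, y2 - y1
b_fd = (dx @ dy) / (dx @ dx)              # OLS slope of dy on dx, no intercept

# Pooled OLS on the stacked levels, for comparison.
b_pooled = np.polyfit(np.concatenate([x1, x2]), np.concatenate([y1, y2]), 1)[0]

print("first difference slope:", b_fd)
print("pooled OLS slope:", b_pooled)
```

In this setup the differenced regression comes out close to 2, while the pooled regression in levels is pushed upward by the omitted $a_i$.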

Fixed Effects Regression

$$y_{it} = \beta_0 + \beta_1 x_{it} + \gamma a_i + e_{it}$$

An alternative approach:

Let's define $\delta_i = \gamma a_i$ and rewrite:

$$y_{it} = \beta_0 + \beta_1 x_{it} + \delta_i + e_{it}$$

So $y_{it}$ is determined by

(i) The baseline intercept $\beta_0$

(ii) The effect of $x_{it}$

(iii) An individual specific change in the intercept: $\delta_i$

Intuition behind fixed effects: Let's just estimate $\delta_i$

What is δi

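$$y_{it} = \beta_0 + \beta_1 x_{it} + \delta_i + e_{it}$$

$\delta_i$ is often referred to as i's "fixed effect"

$$E[y_{it} \mid x_{it} = 0] = \beta_0 + E[\beta_1 \cdot 0] + \delta_i + E[e_{it} \mid x_{it} = 0]$$

So (taking $E[e_{it} \mid x_{it} = 0] = 0$) $\delta_i$ is just the change in individual i's intercept:

$$\delta_i = E[y_{it} \mid x_{it} = 0] - \beta_0$$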

Fixed Effects Regression: Estimating δi

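$$y_{1t} = \beta_0 + \beta_1 x_{1t} + \delta_1 + e_{1t}$$

$$y_{2t} = \beta_0 + \beta_1 x_{2t} + \delta_2 + e_{2t}$$

$$\vdots$$

$$y_{Nt} = \beta_0 + \beta_1 x_{Nt} + \delta_N + e_{Nt}$$

How do we estimate $\delta_1, \delta_2, \dots, \delta_N$?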

Fixed Effects Regression: Estimating δi

$$y_{it} = \beta_0 + \beta_1 x_{it} + \delta_i + e_{it}$$

Simplest approach (to me): Dummy variables

Construct N−1 dummy variables $D_1, D_2, \dots, D_{N-1}$

$D_1 = 1$ when i = 1 and 0 otherwise

$D_2 = 1$ when i = 2 and 0 otherwise

$D_3 = 1$ when i = 3 and 0 otherwise

And so on…

$D_{N-1} = 1$ when i = N−1 and 0 otherwise

Fixed Effects Regression: Implementation

$$y_{it} = \beta_0 + \beta_1 x_{it} + \sum_{i=1}^{N-1} \delta_i D_i + e_{it}$$

Note that we've left out $D_N$

$\beta_0^{OLS}$ is interpreted as the intercept for individual N:

$$\beta_0^{OLS} = E[y_{it} \mid x_{it} = 0, i = N]$$

and for all other i (e.g. i = 2)

$$\delta_2 = E[y_{it} \mid x_{it} = 0, i = 2] - \beta_0$$

Menti: What is the coefficient $\hat{\beta}_1^{FE}$ from a fixed effects regression?
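The dummy-variable implementation can be sketched as follows. The simulated data (true $\beta_1 = 2$, fixed effects correlated with $x_{it}$, the seed) are assumptions for illustration. Units are indexed from 0 in the code, so the omitted unit is the last one.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 100, 5
unit = np.repeat(np.arange(N), T)            # unit index for each row (0, ..., N-1)

delta = rng.normal(size=N)                   # fixed effect delta_i for each unit
x = delta[unit] + rng.normal(size=N * T)     # x_it correlated with delta_i
y = 1.0 + 2.0 * x + delta[unit] + rng.normal(scale=0.2, size=N * T)

# N-1 unit dummies: column j is 1 when the row belongs to unit j; the last unit is omitted.
dummies = (unit[:, None] == np.arange(N - 1)).astype(float)
X = np.column_stack([np.ones(N * T), x, dummies])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
b_fe = coef[1]                               # coefficient on x_it

b_pooled = np.polyfit(x, y, 1)[0]            # pooled OLS slope, no dummies

print("fixed effects slope:", b_fe)
print("pooled OLS slope:", b_pooled)
```

With many units, dedicated panel routines avoid building the dummy matrix explicitly, but the explicit version mirrors the regression on the slide most directly.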

Fixed Effects Regression: Intuition

Any fixed characteristic of i is captured by the average of yit (for i)

By using dummy variables for i, we can just estimate (and hence account for) those averages.

No longer have to worry about $x_{it}$ being correlated with a fixed component of $e_{it}$

Why is This? Recall Regression Anatomy

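$$\beta_1^{OLS} = \frac{Cov(y_{it}, \tilde{x}_{it})}{Var(\tilde{x}_{it})}$$

Where $\tilde{x}_{it}$ is the residual from a regression of $x_{it}$ on the dummies $D_j$:

$$x_{it} = \alpha_0 + \sum_{j=1}^{N-1} \alpha_j D_j + \varepsilon_{it}$$

$$\tilde{x}_{it} = x_{it} - (\alpha_0^{OLS} + \alpha_i^{OLS})$$

This subtracts (partials out) the average $x_{it}$ for each i

$\tilde{x}_{it}$ is no longer correlated with any fixed component of the error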
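The regression anatomy formula can be checked numerically: subtracting each unit's average from $x_{it}$ and applying the Cov/Var formula should give the same slope as the regression with unit dummies. The sketch below uses simulated data with assumed parameters.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T = 20, 4
unit = np.repeat(np.arange(N), T)

delta = rng.normal(size=N)
x = delta[unit] + rng.normal(size=N * T)
y = 1.0 + 2.0 * x + delta[unit] + rng.normal(scale=0.2, size=N * T)

# Residual of x on the unit dummies = x minus its unit specific average.
x_bar_i = np.array([x[unit == i].mean() for i in range(N)])
x_tilde = x - x_bar_i[unit]

# Regression anatomy: Cov(y, x_tilde) / Var(x_tilde).
b_anatomy = np.cov(y, x_tilde)[0, 1] / np.var(x_tilde, ddof=1)

# The same slope from the regression with N-1 unit dummies.
dummies = (unit[:, None] == np.arange(N - 1)).astype(float)
X = np.column_stack([np.ones(N * T), x, dummies])
b_dummies = np.linalg.lstsq(X, y, rcond=None)[0][1]

print("anatomy slope:", b_anatomy)
print("dummy regression slope:", b_dummies)
```

The two slopes agree up to floating point error, which is why fixed effects regression is often described as regression on demeaned data.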

Fixed Effects Regression: Assumptions

There is one important difference in the assumptions necessary for OLS to capture the causal effect:

Before, we needed:

$$E[e_{it} \mid x_{it}] = E[e_{it}]$$

Now, we need:

$$E[e_{it} \mid x_{i1}, x_{i2}, \dots, x_{iT}] = E[e_{it}]$$

When Will Fixed Effects Not Be Enough?

We need

$$E[e_{it} \mid x_{i1}, x_{i2}, \dots, x_{iT}] = E[e_{it}]$$

But what if $e_{it}$ is growing over time?

E.g. interest rates rising each quarter, influencing profits and leverage

Time Fixed Effects

So far we have focused on controlling for entity i fixed effects

What if $x_{it}$ is correlated with something that changes over time but is fixed across individual units?

$$Leverage_{it} = \beta_0 + \beta_1 Profit_{it} + \tau_t + v_{it}$$

For example, many time-varying macro variables (e.g. monetary policy) might affect profits and leverage

If these are constant for all firms then they will be captured by $\tau_t$

Time Fixed Effects

$$y_{it} = \beta_0 + \beta_1 x_{it} + \tau_t + e_{it}$$

Exact same approach as with entity fixed effects

Construct T−1 dummy variables $D_1, D_2, \dots, D_{T-1}$

$D_1 = 1$ when t = 1 and 0 otherwise

$D_2 = 1$ when t = 2 and 0 otherwise

And so on…

And then, omitting one time period, we can estimate

$$y_{it} = \beta_0 + \beta_1 x_{it} + \sum_{t=1}^{T-1} \tau_t D_t + e_{it}$$

What is $\beta_0$? $\tau_t$?

Time Fixed Effects

Time fixed effects do not deal with fixed individual characteristics

What about combining both approaches?

Part 3: Difference-in-Difference

An example: Bankruptcy Costs and Leverage

The difference-in-difference framework

Key assumption: Parallel Trends

Example: Bankruptcy Costs and Leverage

What is the effect of a decline in bankruptcy costs on leverage?

Theory: Lower expected bankruptcy costs should increase leverage

Ideal (impossible to conduct) experiment:

Randomly select a subset of firms

Reduce bankruptcy costs for these firms (e.g. streamline bankruptcy procedures)

Compare leverage between this subset and the remaining firms

Example: Bankruptcy Costs and Leverage

At the end of 1991 the state of Delaware passed a new law (“the reform”)

Significantly streamlined bankruptcy proceedings

Reduced costs and time of litigation

Can we use this to learn something about our question?

Suppose we call the causal effect of the reform $\beta_1$

How do we recover this parameter?

Approach 1: Before vs. After

Compare the average leverage of Delaware firms in 1991 vs. 1992

Let $After_t$ be a dummy equal to 1 after the reform

We would like to describe the relationship between the reform and leverage as:

$$Leverage_{it} = \beta_0 + \beta_1 After_t + v_{it}$$

Where $v_{it}$ contains all other time and firm specific factors that influence leverage

Approach 1: Before vs. After

Suppose we regress $Leverage_{it}$ on our $After_t$ dummy: What is $\beta_1^{OLS}$?

$$\beta_1^{OLS} = E[Leverage_{it} \mid After_t = 1] - E[Leverage_{it} \mid After_t = 0]$$

$$= \beta_1 + E[v_{it} \mid After_t = 1] - E[v_{it} \mid After_t = 0]$$

So $\beta_1^{OLS} = \beta_1$ (the causal effect of treatment) if

$$E[v_{it} \mid After_t] = E[v_{it}]$$

Why might that fail?

[Figure: Before vs. After. Leverage plotted against Month (t), 1991m7 to 1992m7, with E[Y|After=0] and E[Y|After=1] marked]

[Figure: When is Before vs. After Ineffective? Leverage plotted against Month (t), 1991m7 to 1992m7, with E[Y|After=0] and E[Y|After=1] marked]

Approach 1: Before vs. After

$\beta_1^{OLS}$ is just the difference in leverage for 1992 Delaware firms ("Treatment") relative to 1991 Delaware firms ("Control")

We require $E[v_{it} \mid After_t = 1] = E[v_{it} \mid After_t = 0]$ for this to identify the causal effect of the reform

Any time trend or other event in 1992 will cause $v_{it}$ for later observations to differ from $v_{it}$ for earlier observations

e.g. tight credit in 1992 may have reduced debt (and hence leverage)
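A tiny worked example of this failure, with made-up numbers: a reform effect of 0.5 and a credit shock of −0.3 that hits every firm in 1992. Both values, and the three hypothetical firms, are assumptions of the sketch.

```python
import numpy as np

true_effect = 0.5     # hypothetical causal effect of the reform on leverage
credit_shock = -0.3   # hypothetical 1992 shock that is unrelated to the reform

# Three hypothetical Delaware firms, each observed in 1991 (after = 0) and 1992 (after = 1).
after = np.array([0, 0, 0, 1, 1, 1])
baseline = np.array([1.0, 1.2, 0.8, 1.0, 1.2, 0.8])
leverage = baseline + true_effect * after + credit_shock * after

# With a single dummy regressor, the OLS slope is the difference in group means.
beta1_ols = leverage[after == 1].mean() - leverage[after == 0].mean()

print("before vs. after estimate:", beta1_ols)
print("true effect:", true_effect)
```

The before vs. after comparison returns the sum of the reform effect and the shock (about 0.2 here), not the reform effect alone.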

Approach 2: Cross Sectional

Compare Delaware Firms (“Treatment”) vs. Non-Delaware firms (Control) in 1992

Don’t need to worry about time trends

Requires data from firms in surrounding states

Let Di be a dummy equal to 1 if firm i is registered in Delaware

We would like to describe the relationship between the reform and leverage as:

$$Leverage_i = \beta_0 + \beta_1 D_i + v_i$$

Where $v_i$ contains all other time and firm specific factors that influence leverage

Approach 2: Cross Sectional

Suppose we regress $Leverage_i$ on our $D_i$ dummy:

$$\beta_1^{OLS} = E[Leverage_i \mid D_i = 1] - E[Leverage_i \mid D_i = 0]$$

$$= \beta_1 + E[v_i \mid D_i = 1] - E[v_i \mid D_i = 0]$$

So $\beta_1^{OLS} = \beta_1$ (the causal effect of treatment) if

$$E[v_i \mid D_i] = E[v_i]$$

Do we expect everything else that impacts leverage to be the same in Delaware and other states?

When is Cross Sectional Approach Ineffective?

Do we expect everything else that impacts leverage to be the same in Delaware and other states?

What if firms in Delaware are more capital-intensive?

Typically capital intensity ⇒ more leverage

This is just an omitted variable:

$$Leverage_i = \beta_0 + \beta_1 D_i + \beta_2 CI_i + e_i$$

So if we omit $CI_i$ and estimate

$$Leverage_i = \beta_0 + \beta_1 D_i + v_i$$

Will $\beta_1^{OLS}$ be larger or smaller than $\beta_1$?

When is Cross Sectional Approach Ineffective?

Of course, we could measure and control for capital intensity:

$$Leverage_i = \beta_0 + \beta_1 D_i + \beta_2 CI_i + e_i$$

Then our assumption for $\beta_1^{OLS} = \beta_1$ becomes:

$$E[e_i \mid D_i, CI_i] = E[e_i \mid CI_i]$$

Beyond capital intensity, do we expect everything else that impacts leverage to be the same in Delaware and other states?

Hard to control for everything

Difference-in-Difference Approach

Let’s combine the positive features of the cross-sectional and before/after approaches

Cross sectional avoided omitted trends

Before/after avoided omitted (fixed) characteristics

The difference-in-difference estimator does exactly this:

$$Leverage_{it} = \beta_0 + \beta_1 D_i \times After_t + \beta_2 D_i + \beta_3 After_t + v_{it}$$

Here $\beta_1$ is the causal effect of the reform in Delaware

Requires data on firms in/out of Delaware before/after the reform

What Does Data Look Like for Difference-in-Difference

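| State | Year | $Leverage_{it}$ (D/E) | $D_i$ | $After_t$ | $D_i \times After_t$ |
|---|---|---|---|---|---|
| Delaware | 1991 | 1.2 | 1 | 0 | 0 |
| Maryland | 1991 | 3.1 | 0 | 0 | 0 |
| Virginia | 1991 | 1.9 | 0 | 0 | 0 |
| Delaware | 1991 | 0.9 | 1 | 0 | 0 |
| Virginia | 1991 | 1.5 | 0 | 0 | 0 |
| Virginia | 1991 | 1.1 | 0 | 0 | 0 |
| Delaware | 1991 | 1.2 | 1 | 0 | 0 |
| Maryland | 1991 | 1.6 | 0 | 0 | 0 |
| Virginia | 1991 | 0.5 | 0 | 0 | 0 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| Maryland | 1992 | 0.8 | 0 | 1 | 0 |
| Delaware | 1992 | 0.9 | 1 | 1 | 1 |
| Virginia | 1992 | 1.6 | 0 | 1 | 0 |
| Delaware | 1992 | 2.2 | 1 | 1 | 1 |
| Maryland | 1992 | 1.4 | 0 | 1 | 0 |
| Delaware | 1992 | 1.9 | 1 | 1 | 1 |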
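As a sketch, the difference-in-difference regression can be run on just the fifteen rows visible in the table. The table is truncated, so the resulting numbers are illustrative only and are not the estimates from the full dataset.

```python
import numpy as np

# Visible rows of the table: (leverage, D_i, After_t).
rows = [
    (1.2, 1, 0), (3.1, 0, 0), (1.9, 0, 0), (0.9, 1, 0), (1.5, 0, 0),
    (1.1, 0, 0), (1.2, 1, 0), (1.6, 0, 0), (0.5, 0, 0),
    (0.8, 0, 1), (0.9, 1, 1), (1.6, 0, 1), (2.2, 1, 1), (1.4, 0, 1), (1.9, 1, 1),
]
leverage, d, after = np.array(rows, dtype=float).T

# Columns: intercept, D_i x After_t, D_i, After_t (same order as beta_0, ..., beta_3).
X = np.column_stack([np.ones_like(leverage), d * after, d, after])
beta, *_ = np.linalg.lstsq(X, leverage, rcond=None)

for name, value in zip(["beta0", "beta1", "beta2", "beta3"], beta):
    print(name, round(value, 4))
```

On these rows the coefficient on the interaction term equals the change in average leverage among the Delaware rows minus the change among the non-Delaware rows.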

What Do the Difference-in-Difference Estimates Capture?

Recall that when right-hand side variables take discrete values, OLS perfectly captures the conditional expectation function:

$$E[Leverage_{it} \mid D_i, After_t] = E[\beta_0^{OLS} + \beta_1^{OLS} D_i \times After_t + \beta_2^{OLS} D_i + \beta_3^{OLS} After_t \mid D_i, After_t]$$

There are four groups:

1. Non-Delaware Before: $\{D_i = 0, After_t = 0\}$

2. Delaware Before: $\{D_i = 1, After_t = 0\}$

3. Non-Delaware After: $\{D_i = 0, After_t = 1\}$

4. Delaware After: $\{D_i = 1, After_t = 1\}$

What Do the Difference-in-Difference estimates Capture?

Let's calculate conditional expectations for these four groups:

1. $E[Leverage_{it} \mid D_i = 0, After_t = 0] = \beta_0^{OLS}$

2. $E[Leverage_{it} \mid D_i = 1, After_t = 0] = \beta_0^{OLS} + \beta_2^{OLS}$

3. $E[Leverage_{it} \mid D_i = 0, After_t = 1] = \beta_0^{OLS} + \beta_3^{OLS}$

4. $E[Leverage_{it} \mid D_i = 1, After_t = 1] = \beta_0^{OLS} + \beta_1^{OLS} + \beta_2^{OLS} + \beta_3^{OLS}$


Diff-in-Diff Solves Issues with Cross-Sectional Approach

Cross Sectional: Compare averages in Delaware vs. outside, after the reform

$$\underbrace{E[Leverage_{it} \mid D_i = 1, After_t = 1]}_{\beta_0^{OLS} + \beta_1^{OLS} + \beta_2^{OLS} + \beta_3^{OLS}} - \underbrace{E[Leverage_{it} \mid D_i = 0, After_t = 1]}_{\beta_0^{OLS} + \beta_3^{OLS}} = \beta_1^{OLS} + \beta_2^{OLS}$$

(Cross-sectional difference after)

We worried about the possibility of some omitted difference between Delaware and other states ($\beta_2^{OLS} \neq 0$)

Solution: Use the pre-reform difference to account for any fixed differences

$$\underbrace{E[Leverage_{it} \mid D_i = 1, After_t = 0]}_{\beta_0^{OLS} + \beta_2^{OLS}} - \underbrace{E[Leverage_{it} \mid D_i = 0, After_t = 0]}_{\beta_0^{OLS}} = \beta_2^{OLS}$$

(Cross-sectional difference before)

Diff-in-Diff Solves Issues with Cross Sectional Approach

$$\text{Difference in Difference} = \underbrace{\text{Difference After}}_{\beta_1^{OLS} + \beta_2^{OLS}} - \underbrace{\text{Difference Before}}_{\beta_2^{OLS}} = \beta_1^{OLS}$$

Diff-in-Diff Solves Issues with Before vs. After

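Before vs. After: Compare averages before vs. after within Delaware:

$$\underbrace{E[Leverage_{it} \mid D_i = 1, After_t = 1]}_{\beta_0^{OLS} + \beta_1^{OLS} + \beta_2^{OLS} + \beta_3^{OLS}} - \underbrace{E[Leverage_{it} \mid D_i = 1, After_t = 0]}_{\beta_0^{OLS} + \beta_2^{OLS}} = \beta_1^{OLS} + \beta_3^{OLS}$$

(Difference in Delaware)

We worried about the possibility of some time trend

Solution: Use other states to account for time trends

$$\underbrace{E[Leverage_{it} \mid D_i = 0, After_t = 1]}_{\beta_0^{OLS} + \beta_3^{OLS}} - \underbrace{E[Leverage_{it} \mid D_i = 0, After_t = 0]}_{\beta_0^{OLS}} = \beta_3^{OLS}$$

(Difference out of Delaware)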

Diff-in-Diff Solves Issues with Before vs. After

$$\text{Difference in Difference} = \underbrace{\text{Difference In Delaware}}_{\beta_1^{OLS} + \beta_3^{OLS}} - \underbrace{\text{Difference Out of Delaware}}_{\beta_3^{OLS}} = \beta_1^{OLS}$$

Difference in Difference Matrix

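Two ways to interpret the same estimator $\beta_1^{OLS}$:

| | Before | After | Difference |
|---|---|---|---|
| Delaware (Treatment) | $\beta_0^{OLS} + \beta_2^{OLS}$ | $\beta_0^{OLS} + \beta_1^{OLS} + \beta_2^{OLS} + \beta_3^{OLS}$ | $\beta_1^{OLS} + \beta_3^{OLS}$ |
| Other States (Control) | $\beta_0^{OLS}$ | $\beta_0^{OLS} + \beta_3^{OLS}$ | $\beta_3^{OLS}$ |
| Difference | $\beta_2^{OLS}$ | $\beta_1^{OLS} + \beta_2^{OLS}$ | $\beta_1^{OLS}$ |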
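The matrix can be filled in by hand from four group means. The means below are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import numpy as np

# Hypothetical average leverage. Rows: (treatment, control). Columns: (before, after).
means = np.array([[1.1, 1.7],
                  [1.6, 1.3]])

row_diff = means[:, 1] - means[:, 0]   # after minus before, within each group
col_diff = means[0, :] - means[1, :]   # treatment minus control, within each period

did_from_rows = row_diff[0] - row_diff[1]   # change in treatment minus change in control
did_from_cols = col_diff[1] - col_diff[0]   # gap after minus gap before

# Map the cells back to the regression coefficients.
beta0 = means[1, 0]        # control, before
beta2 = col_diff[0]        # treatment minus control, before
beta3 = row_diff[1]        # change in control
beta1 = did_from_rows      # difference in difference

print("DiD via rows:", did_from_rows)
print("DiD via columns:", did_from_cols)
```

Both routes give the same number (0.9 with these means), which is the sense in which the two interpretations describe one estimator.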

Diff-in-Diff Graphically

[Figure, built up over six slides: Leverage plotted against Month (t), Before and After, for Treatment (Delaware) and Control (Non-Delaware), with $\beta_0^{OLS}$, $\beta_2^{OLS}$, $\beta_3^{OLS}$ and $\beta_1^{OLS}$ marked in turn]

When Does Diff-in-Diff Identify A Causal Effect

As usual, we need

$$E[v_{it} \mid D_i, After_t] = E[v_{it}]$$

What does this mean intuitively?

Parallel trends assumption: In the absence of any reform, the average change in leverage would have been the same in the treatment and control groups

In other words: trends in both groups are similar

Parallel Trends

[Figure: Leverage plotted against Month (t), Before and After, for Treatment (Delaware) and Control (Non-Delaware), with $\beta_0^{OLS}$, $\beta_1^{OLS}$, $\beta_2^{OLS}$ and $\beta_3^{OLS}$ marked]

Parallel Trends

Parallel trends does not require that there is no trend in leverage

Just that it is the same between groups

Does not require that the levels be the same in the two groups

What does it look like when the parallel trends assumption fails?

When Parallel Trends Fails

[Figure, built up over three slides: Leverage plotted against Month (t), Before and After, for Treatment (Delaware) and Control (Non-Delaware) with non-parallel trends, with $\beta_0^{OLS}$, $\beta_1^{OLS}$, $\beta_2^{OLS}$ and $\beta_3^{OLS}$ marked]
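A small numeric sketch of what goes wrong. The group means and drifts are made up, the true treatment effect is set to zero, and the helper `did` is a hypothetical name used only here.

```python
def did(treat_before, treat_after, ctrl_before, ctrl_after):
    """Difference-in-difference of four group means."""
    return (treat_after - treat_before) - (ctrl_after - ctrl_before)

true_effect = 0.0  # assume the reform does nothing

# Parallel trends: both groups drift up by 0.1, at different levels.
parallel = did(1.0, 1.0 + 0.1 + true_effect, 2.0, 2.0 + 0.1)

# Parallel trends fails: the treated group drifts by 0.3, the control group by 0.1.
diverging = did(1.0, 1.0 + 0.3 + true_effect, 2.0, 2.0 + 0.1)

print("DiD with parallel trends:", parallel)
print("DiD with diverging trends:", diverging)
```

With parallel trends the estimate matches the true effect of zero even though the levels differ. With diverging trends the estimate is about 0.2, which is just the difference in drifts.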

Testing the Parallel Trends Assumption?

It is impossible to truly test

It is an assumption about what patterns would have been without treatment

However, with data for several periods before the reform, we can provide convincing evidence

Intuition: show that the two groups have been parallel for a long time

Typically, plot the difference in means between treated and control groups

If the pre-reform difference in means is flat ⇒ parallel trends more likely to hold
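One way to sketch this check numerically, rather than with a plot, is to compute the treated-minus-control gap in each period and fit a line through the pre-reform gaps. The group means, the number of periods, and the reform date below are all assumptions for illustration.

```python
import numpy as np

periods = np.arange(6)                  # hypothetical: periods 0-3 pre-reform, 4-5 post
treat_mean = np.array([1.0, 1.1, 1.2, 1.3, 1.9, 2.0])
ctrl_mean = np.array([1.5, 1.6, 1.7, 1.8, 1.9, 2.0])

gap = treat_mean - ctrl_mean            # difference in means, period by period
pre = periods < 4

# Slope of the gap over the pre-reform periods: near zero suggests parallel pre-trends.
pre_slope = np.polyfit(periods[pre], gap[pre], 1)[0]

# Change in the gap after the reform, relative to the average pre-reform gap.
post_jump = gap[~pre] - gap[pre].mean()

print("gap by period:", gap)
print("pre-reform slope of the gap:", pre_slope)
print("post-reform change in the gap:", post_jump)
```

A flat pre-reform gap does not prove the assumption, since it says nothing about what would have happened after the reform, but it makes the assumption easier to defend.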

General Form of Diff-in-Diff

We are interested in the impact of some treatment on outcome $Y_{it}$

Suppose we have a treated group and a control group

Let $D_i$ be a dummy equal to 1 if i belongs to the treatment group

And suppose we see both groups before and after the treatment occurs

Let $After_t$ be equal to 1 if time t is after the treatment date

$$Y_{it} = \beta_0 + \beta_1 D_i \times After_t + \beta_2 D_i + \beta_3 After_t + v_{it}$$

For more precision:

$$Y_{it} = \beta_0 + \beta_1 D_i \times After_t + \delta_i + \tau_t + v_{it}$$

Where $\tau_t$ and $\delta_i$ are fixed effects for each time period and individual
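A simulation sketch of the version with unit and time fixed effects, estimated with explicit dummies. The treatment effect of 0.5, the panel dimensions, and the seed are assumptions. Note that $D_i$ and $After_t$ do not appear separately, since they are absorbed by the unit and time dummies.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 40, 6
unit = np.repeat(np.arange(N), T)        # unit index for each row
time = np.tile(np.arange(T), N)          # period index for each row

d = (unit < N // 2).astype(float)        # first half of the units are treated
after = (time >= T // 2).astype(float)   # treatment date is the midpoint of the sample

delta = rng.normal(size=N)               # unit fixed effects
tau = rng.normal(size=T)                 # time fixed effects
beta1 = 0.5                              # assumed true treatment effect
y = delta[unit] + tau[time] + beta1 * d * after + rng.normal(scale=0.05, size=N * T)

# Intercept, interaction, N-1 unit dummies, T-1 time dummies.
unit_dummies = (unit[:, None] == np.arange(N - 1)).astype(float)
time_dummies = (time[:, None] == np.arange(T - 1)).astype(float)
X = np.column_stack([np.ones(N * T), d * after, unit_dummies, time_dummies])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
beta1_hat = coef[1]

print("estimated treatment effect:", beta1_hat)
```

The fixed effects soak up variation in the outcome that is unrelated to treatment, which is where the gain in precision comes from.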

Data Exercise

Load the d in d dataset

Perform the following regression

$$Leverage_{it} = \beta_0 + \beta_1 D_i \times After_t + \beta_2 D_i + \beta_3 After_t + v_{it}$$

Where $D_i = 1$ in Delaware and 0 otherwise

and $After_t = 1$ in 1992

Menti: what is $\hat{\beta}_1^{OLS}$?

If you complete this, estimate:

$$Y_{it} = \beta_0 + \beta_1 D_i \times After_t + \delta_i + \tau_t + v_{it}$$

Overview

This class: estimating causal effects with panel data

1. An introduction to panel data

Multiple observations of the same unit over time

2. First difference and fixed effects estimators

Estimating causal effects with fixed omitted variables

3. Difference-in-difference estimators

A more robust method for estimating causal effects