Statistics for International Relations Research II

class: center, middle, inverse, title-slide

# Statistics for International Relations Research II
## Panel Models
### <large>James Hollway and Juliette Ganne</large>

---

class: center, middle

.pull-1[.circleon[![](https://www.club.cc.cmu.edu/~cmccabe/image/crazy_clock.jpg)]]
.pull-1[.circleon[![](https://www.fabrikat.ch/media/catalog/product/cache/4/thumbnail/447x/9df78eab33525d08d6e5fb8d27136e95/r/o/rollins_leatherrip_hammer_9_1.jpg)]]
.pull-1[.circleon[![](https://www.random.org/analysis/randbitmap-wamp.png)]]

???

Today's menu

- What is a panel structure?
- Why not use simple OLS?
- Different types of variables and potential effects.
- Issues of serial correlation.

- Motivating and using fixed and random effects models.
- When to use which model?

---
class: center, middle

# Data and error structures

.pull-1[.circleon[![](https://www.club.cc.cmu.edu/~cmccabe/image/crazy_clock.jpg)]]
.pull-1[.circleoff[![](https://www.fabrikat.ch/media/catalog/product/cache/4/thumbnail/447x/9df78eab33525d08d6e5fb8d27136e95/r/o/rollins_leatherrip_hammer_9_1.jpg)]]
.pull-1[.circleoff[![](https://www.random.org/analysis/randbitmap-wamp.png)]]

---
## Notation

The following terminology and notation will be useful:
- “units” are the individual things on which we have data (countries, clinics, individuals)
  - `$i \in \mathcal{N}$`
- “observations” are the measurements (for each variable) on each unit at a given point in time (GDP, number of successful surgeries, wage)
  - `$t \in {T}$`

With `$N$` units and `$T$` time points, the total number of observations is `$N \times T$`.

Cross-sectional data is just at one point in time, `$Nt$`,
whereas time-series data is often just one unit over time, `$iT$`.

To talk about panel data, we need variation in both.
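
As a minimal sketch of this notation, the toy panel below (hypothetical units `A` to `C` and years 2000 to 2003, not the course data) shows how `$N$`, `$T$`, and the `$N \times T$` row count relate, and how a cross-section and a time series are just slices of the same structure.

```r
# Hypothetical toy panel: 3 units observed in 4 years
units <- c("A", "B", "C")
years <- 2000:2003
panel <- expand.grid(id = units, year = years, stringsAsFactors = FALSE)
panel <- panel[order(panel$id, panel$year), ]

N <- length(unique(panel$id))        # number of units
T_len <- length(unique(panel$year))  # number of time points
c(N = N, T = T_len, rows = nrow(panel))

# A cross-section keeps one time point; a time series keeps one unit
cross_section <- panel[panel$year == 2000, ]
time_series <- panel[panel$id == "A", ]

# A balanced panel has exactly one row per unit-year combination
is_balanced <- all(table(panel$id, panel$year) == 1)
```

Real panels are often unbalanced (see the missing values in today's data), in which case the row count falls below `$N \times T$`.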

---
.pull-left[

## Types of data structures 1

.red[Cross-sectional data] consists of observations on different individuals or groups at a single point in time.
- Examples are R&D spending by firms by industry, social spending policy across countries, or level of dissident repression among authoritarian regimes.
- Endogeneity is _very_ difficult to rule out, unless one uses causal estimation models (e.g. matching or instrumental variables).

]

.pull-right[

.panelset[
.panel[
.panel-name[Cross code]

```r
library(dplyr)    # filter()
library(ggplot2)  # ggplot()
crosss <- filter(data1, year == 2000)  # keep a single cross-section

ggplot(crosss, 
       aes(x = id, y = sstaxes)) + 
  geom_bar(stat="identity", 
           width=0.4, fill="darkred") +
  theme_classic() +
  labs(title = "Social security taxes collected in 2000",
       x = "Countries", 
       y = "Social security taxes as a percentage of GDP")
```
]

.panel[
.panel-name[Cross Plot]
![](STAT_PanelOutcomes_files/figure-html/crosssectional-1.png)
]

]
]

.red[Pooled cross-sectional data] describes randomly sampled cross-sections of individuals at different points in time.
- Example: ESS or ISSP (surveys that come in ‘waves’ and ask the same questions, but of _different individuals_).
- Pooling makes sense if cross-sections are randomly sampled (like one big sample), and the units are interchangeable.
- Time dummy variables can be used to capture structural change over time.
- Often used to see the impact of policy or programs.
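
To illustrate the time-dummy idea, here is a minimal sketch on simulated data (the waves, sample sizes, and coefficients are all hypothetical). Each wave draws new respondents, and the wave dummies pick up the period-specific shift in the outcome.

```r
set.seed(1)
# Hypothetical pooled cross-sections: 3 survey waves, 200 new respondents each
wave <- rep(c(2002, 2004, 2006), each = 200)
x <- rnorm(600)
# The outcome shifts upward by 0.5 per wave on top of a common slope of 1
shift <- unname(c("2002" = 0, "2004" = 0.5, "2006" = 1)[as.character(wave)])
y <- 2 + 1 * x + shift + rnorm(600, sd = 0.5)
pooled <- data.frame(y, x, wave = factor(wave))

# Wave dummies absorb the period-specific shifts (2002 is the baseline)
fit <- lm(y ~ x + wave, data = pooled)
round(coef(fit), 2)
```

The coefficients on `wave2004` and `wave2006` estimate how far each later wave sits from the 2002 baseline, holding `x` constant.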

---
## Types of data structures 2

.red[Panel data] generally refers to data which are cross-sectionally dominated; that is, where *N* is significantly larger than *T*. 
- Examples are the ANES panel studies (_N = 2000; T = 6_) or the Panel Study of Income Dynamics (_N_ = large, _T_ = 12 or so).
- Such data usually have a fixed _T_, so that these data's asymptotics are in _N_, which is important (we'll come back to this).
- These are longitudinal data: the same units (e.g. the same households or individuals) are observed repeatedly over time.
- Panel data structure makes it possible to deal with certain types of endogeneity without the use of exogenous instruments.

.red[Time series cross-sectional data] (TSCS) usually refers to data in which either _T_ is dominant, or `$N \approx T$`.
- Common in comparative politics. 
- It can also refer to data where _N_ is dominant, but _T_ is larger than in panel data 
(e.g. all-dyads all-years IR data, with _N_ = several thousand and _T_ = 50 or more).
- Here, _N_ is usually fixed, and the asymptotics are in _T_; 
moreover, if we have enough data, we can say something about the time-series properties of the data as well as the cross-sectional part.

---
## Data for today

.red[Comparative Welfare States] data comprises 22 states over 40 years and covers income distribution, social spending and welfare and market institutions, demographic and macroeconomic data, and political variables (turnout, distribution of votes, seats, and cabinet share).

.pull-left[

<div class="Rtable1"><table class="Rtable1">
<thead>
<tr>
<th class='rowlabel firstrow lastrow'></th>
<th class='firstrow lastrow'>Overall (N=859)</th>
</tr>
</thead>
<tbody>
<tr>
<td class='rowlabel firstrow'>Social expenditure per capita</td>
<td class='firstrow'></td>
</tr>
<tr>
<td class='rowlabel'>Mean (SD)</td>
<td>14.0 (3.44)</td>
</tr>
<tr>
<td class='rowlabel'>Median [Min, Max]</td>
<td>14.3 [6.17, 23.1]</td>
</tr>
<tr>
<td class='rowlabel lastrow'>Missing</td>
<td class='lastrow'>62 (7.2%)</td>
</tr>
<tr>
<td class='rowlabel firstrow'>Imports</td>
<td class='firstrow'></td>
</tr>
<tr>
<td class='rowlabel'>Mean (SD)</td>
<td>-0.411 (0.249)</td>
</tr>
<tr>
<td class='rowlabel'>Median [Min, Max]</td>
<td>-0.360 [-1.47, -0.0829]</td>
</tr>
<tr>
<td class='rowlabel lastrow'>Missing</td>
<td class='lastrow'>89 (10.4%)</td>
</tr>
<tr>
<td class='rowlabel firstrow'>Votes for leftist parties</td>
<td class='firstrow'></td>
</tr>
<tr>
<td class='rowlabel'>Mean (SD)</td>
<td>37.0 (13.2)</td>
</tr>
<tr>
<td class='rowlabel'>Median [Min, Max]</td>
<td>39.7 [0, 60.7]</td>
</tr>
<tr>
<td class='rowlabel lastrow'>Missing</td>
<td class='lastrow'>23 (2.7%)</td>
</tr>
</tbody>
</table>
</div>
]

.pull-right[

.panelset[
.panel[
.panel-name[Data code]

```r
data1 %>% 
    ggplot(aes(x=year, y=sstran, group=idn, color=id)) + 
    geom_line(linewidth=1) + theme_classic()
```
]

.panel[
.panel-name[Data Plot]
![](STAT_PanelOutcomes_files/figure-html/dataPlot-1.png)

]
]
]

???

Some variables vary over time, while others only over the units,
and others over both units and time.

Absence of variation on one dimension means there is nothing to say about that phenomenon there, and little variation in one dimension means there is little to say about the phenomenon: *variation is information*!

This means one needs to consider carefully “where” the variation in one's data is, 
and (more important) where one's theories suggest we should see variation as well.

---
## Long vs Wide

Panel/TSCS data come in one of two formats: long or wide.
Long format is often preferred because it is more efficient and flexible.

.pull-left[

.red[Long data] has `$NT$` rows, with columns for each variable.
.tiny[
.panelset[
.panel[
.panel-name[Long Data Table]

```
## # A tibble: 859 x 5
## idn year sstran rgdpecap leftvot 
## <fct> <dbl> <labelled> <dbl> <labelled>
## 1 1 1980 6.364682 21955. 40.5 
## 2 1 1981 6.353022 22753. 45.1 
## 3 1 1982 7.232827 21889. 45.1 
## 4 1 1983 7.439353 22836. 48.8 
## 5 1 1984 7.268586 23339. 49.3 
## # … with 854 more rows
```

]

.panel[
.panel-name[Long Data Code]

```r
data1 %>% as_tibble() %>% select(idn, year, sstran, rgdpecap, leftvot) %>% 
  print(n=5)
```

]
]
]
]

.pull-right[

.red[Wide data] has `$N$` rows and additional columns for each variable multiplied by `$T$`.
.tiny[
.panelset[
.panel[
.panel-name[Wide Data Table]

```
## # A tibble: 22 x 121
## idn sstran_1980 sstran_1981 sstran_1982 sstran_1983 sstran_1984 sstran_1985
## <fct> <labelled> <labelled> <labelled> <labelled> <labelled> <labelled> 
## 1 1 6.364682 6.353022 7.232827 7.439353 7.268586 7.041956 
## 2 2 16.370188 16.818114 17.164325 17.293337 17.458069 17.804682 
## # … with 20 more rows, and 114 more variables: sstran_1986 <labelled>,
## # sstran_1987 <labelled>, sstran_1988 <labelled>, sstran_1989 <labelled>,
## # sstran_1990 <labelled>, sstran_1991 <labelled>, sstran_1992 <labelled>,
## # sstran_1993 <labelled>, sstran_1994 <labelled>, sstran_1995 <labelled>,
## # sstran_1996 <labelled>, sstran_1997 <labelled>, sstran_1998 <labelled>,
## # sstran_1999 <labelled>, sstran_2000 <labelled>, sstran_2001 <labelled>,
## # sstran_2002 <labelled>, sstran_2003 <labelled>, sstran_2004 <labelled>,
## # sstran_2005 <labelled>, sstran_2006 <labelled>, sstran_2007 <labelled>,
## # sstran_2008 <labelled>, sstran_2009 <labelled>, sstran_2010 <labelled>,
## # sstran_2011 <labelled>, sstran_2012 <labelled>, sstran_2013 <labelled>,
## # sstran_2014 <labelled>, sstran_2015 <labelled>, sstran_2016 <labelled>,
## # sstran_2017 <labelled>, sstran_2018 <labelled>, sstran_NA <labelled>,
## # rgdpecap_1980 <dbl>, rgdpecap_1981 <dbl>, rgdpecap_1982 <dbl>,
## # rgdpecap_1983 <dbl>, rgdpecap_1984 <dbl>, rgdpecap_1985 <dbl>,
## # rgdpecap_1986 <dbl>, rgdpecap_1987 <dbl>, rgdpecap_1988 <dbl>,
## # rgdpecap_1989 <dbl>, rgdpecap_1990 <dbl>, rgdpecap_1991 <dbl>,
## # rgdpecap_1992 <dbl>, rgdpecap_1993 <dbl>, rgdpecap_1994 <dbl>,
## # rgdpecap_1995 <dbl>, rgdpecap_1996 <dbl>, rgdpecap_1997 <dbl>,
## # rgdpecap_1998 <dbl>, rgdpecap_1999 <dbl>, rgdpecap_2000 <dbl>,
## # rgdpecap_2001 <dbl>, rgdpecap_2002 <dbl>, rgdpecap_2003 <dbl>,
## # rgdpecap_2004 <dbl>, rgdpecap_2005 <dbl>, rgdpecap_2006 <dbl>,
## # rgdpecap_2007 <dbl>, rgdpecap_2008 <dbl>, rgdpecap_2009 <dbl>,
## # rgdpecap_2010 <dbl>, rgdpecap_2011 <dbl>, rgdpecap_2012 <dbl>,
## # rgdpecap_2013 <dbl>, rgdpecap_2014 <dbl>, rgdpecap_2015 <dbl>,
## # rgdpecap_2016 <dbl>, rgdpecap_2017 <dbl>, rgdpecap_2018 <dbl>,
## # rgdpecap_NA <dbl>, leftvot_1980 <labelled>, leftvot_1981 <labelled>,
## # leftvot_1982 <labelled>, leftvot_1983 <labelled>, leftvot_1984 <labelled>,
## # leftvot_1985 <labelled>, leftvot_1986 <labelled>, leftvot_1987 <labelled>,
## # leftvot_1988 <labelled>, leftvot_1989 <labelled>, leftvot_1990 <labelled>,
## # leftvot_1991 <labelled>, leftvot_1992 <labelled>, leftvot_1993 <labelled>,
## # leftvot_1994 <labelled>, leftvot_1995 <labelled>, leftvot_1996 <labelled>,
## # leftvot_1997 <labelled>, leftvot_1998 <labelled>, leftvot_1999 <labelled>,
## # leftvot_2000 <labelled>, leftvot_2001 <labelled>, leftvot_2002 <labelled>,
## # leftvot_2003 <labelled>, leftvot_2004 <labelled>, leftvot_2005 <labelled>,
## # …
```
]

.panel[
.panel-name[Wide Data Code]

```r
data1 %>% as_tibble() %>% select(idn, year, sstran, rgdpecap, leftvot) %>%
* pivot_wider(names_from = year,
*             values_from = c(sstran, rgdpecap, leftvot)) %>%
  print(n=2)
```

]
]
]
]

???

You can pivot or *reshape* between these two forms using various functions in R. 
They were presented in Stats I.
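
As a reminder, a minimal round trip between the two formats might look like the sketch below. It uses a small hypothetical data frame that borrows the `idn`, `year`, and `sstran` names from today's data, and assumes a reasonably recent version of `tidyr`.

```r
library(tidyr)

# Hypothetical long panel: 2 units, 3 years, one variable
long <- data.frame(
  idn = rep(c(1, 2), each = 3),
  year = rep(1980:1982, times = 2),
  sstran = c(6.4, 6.4, 7.2, 16.4, 16.8, 17.2)
)

# Long to wide: one row per unit, one column per year
wide <- pivot_wider(long, names_from = year, values_from = sstran,
                    names_prefix = "sstran_")

# Wide back to long: strip the prefix and restore the integer year
long_again <- pivot_longer(wide, cols = starts_with("sstran_"),
                           names_to = "year", names_prefix = "sstran_",
                           names_transform = list(year = as.integer),
                           values_to = "sstran")
```

Most modelling functions in R expect the long format, so the wide format is mainly useful for inspection or for export.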

---
## Time-Constant and Time-Varying Variables

There are three types of explanatory variables, which can be located either at the level of units or at the level of contexts (aka time/group).

.red[Time-constant variables]: 
- e.g. ethnicity or gender (individuals), geographical location or type of government (context).
- Some are treated as time-constant because change is rare or a variable is more or less a stable characteristic.
- They do not vary over time (obviously) but can vary across units.

.red[Time-varying variables]: 
- e.g. labor force experience and on-the-job training (individual), or economic growth and public spending (context).
- Can characterize the unit or the context.

.red[Time]: 
- It is debatable whether time itself is really an explanatory variable or rather an indicator for other unobserved characteristics that change over time.
- But time may capture possible time trends in the data.

---
## Simultaneity Bias

Simultaneity bias is introduced if we do not account for the fact that some effects unfold not immediately but slowly over time.
- E.g., an increase in spending on active labor market programs does not lead in the same year to a reduction in unemployment. 
Rather, it may take up to two or more years.

To that end it is a common practice to lag the dependent variable, 
e.g. in cases of GDP measures, expenditure, unemployment, etc.

One key theoretical challenge is to justify the lag structure. 
Is it 1 year, 2 years, or even 5 years? 
It depends on your variable of interest.

You can get some information about this by inspecting the correlation 
between your outcome variable and the predictor of interest at different lags 
(e.g. the correlation between `$Y_t$` and `$X_{t-1}$`, `$Y_t$` and `$X_{t-2}$`, etc.).
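
The lag comparison just described can be sketched as follows. The data are simulated so that `y` responds to `x` with a two-year delay. The helper `lag_by_unit()` is a hypothetical function written for this example; it shifts values within each unit so that one unit's observations never leak into the next.

```r
set.seed(42)
# Hypothetical panel: 20 units, 30 years; y responds to x with a 2-year delay
n_units <- 20; n_years <- 30
d <- data.frame(id = rep(seq_len(n_units), each = n_years),
                year = rep(seq_len(n_years), times = n_units))
d$x <- rnorm(nrow(d))

# Shift a vector k steps back in time within each unit, padding with NA
# (assumes rows are sorted by unit and then by year)
lag_by_unit <- function(v, id, k) {
  if (k == 0) return(v)
  ave(v, id, FUN = function(z) c(rep(NA, k), head(z, -k)))
}

d$y <- 0.8 * lag_by_unit(d$x, d$id, 2) + rnorm(nrow(d), sd = 0.5)

# Correlation between y and x lagged by 0 to 4 years
lag_cors <- sapply(0:4, function(k) {
  cor(d$y, lag_by_unit(d$x, d$id, k), use = "complete.obs")
})
names(lag_cors) <- paste0("lag", 0:4)
round(lag_cors, 2)
```

In this simulation the correlation should peak at the two-year lag. With real data the pattern is rarely this clean, so such correlations can inform a lag choice but do not replace a theoretical justification.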

---
### Lagging the DV

.small[

.pull-left[
Pros:
- Including the lagged DV will help you overcome omitted variable bias. 
- You can account for autocorrelation.
- Parsimonious.

Cons:
- Including the lagged DV absorbs much of the variance in the outcome and is likely to make your effects less significant (it tends both to shrink the `$\beta$`s and to inflate the standard errors). 
- In other words, we underestimate the true relationships at play.

When we lag, we make sure that we are examining the association between social spending and votes recorded at an earlier time.

Here we give the leftist vote two years to have an effect on social spending.

]

.pull-right[

.panelset[
.panel[
.panel-name[Lag code]

```r
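library(Hmisc)
# A negative shift in Hmisc::Lag() moves values up, so each row holds sstran from two years later.
# Note: Lag() shifts the whole column and does not respect country boundaries.
data1$sstran_lag <- Lag(data1$sstran, -2)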

data_w_lag <- dplyr::select(data1, year, leftvot, sstran, sstran_lag)

head(data_w_lag, addrownums = FALSE)
```
]

.panel[
.panel-name[Lag table]

```
## # Panel data: 6 x 5
## # entities: idn [1]
## # wave variable: year [1980, 1981, 1982, ... (6 waves)]
## idn year leftvot sstran sstran_lag
## <fct> <dbl> <labelled> <labelled> <labelled>
## 1 1 1980 40.5 6.364682 7.232827 
## 2 1 1981 45.1 6.353022 7.439353 
## 3 1 1982 45.1 7.232827 7.268586 
## 4 1 1983 48.8 7.439353 7.041956 
## 5 1 1984 49.3 7.268586 6.882550 
## 6 1 1985 47.5 7.041956 6.674260
```

]
]
]
]
---

## Pooled OLS

.pull-left[

Now, we could just model this using OLS: `$Y_i = \alpha + \beta X_i + u_i$`

Since each observation involves both a unit and a timepoint, it is really:

`$$Y_{it} = \alpha + \beta X_{it} + u_{it}$$`

This basically ‘pools’ all this information and just concentrates on the relationship
between explanatory variables and the response variable (here social expenditure).

]
.pull-right[

```r
library(sjPlot)  # provides tab_model()
ols <- lm(sstran ~ csh_m + leftvot, data = data1)
tab_model(ols, digits = 3)
```


]

---
## So why not OLS?

Recall that, in addition to all the usual assumptions, OLS is also assuming that
- the constant term is constant across different *i*s
- the effect of any given variable `$X$` on `$Y$` is constant across observations 
(at least to the extent that non-constancy isn’t specified in the model, e.g., through interaction terms).

But this is usually problematic in a panel/TSCS context,
because we usually have some reason to believe that there may be differences
in either `$\alpha$` or `$\beta$` across `$i$` or over `$t$`, which leads to a form of specification bias.

While aggregating variation across a dimension can be useful, 
it can also tempt one to commit the .red[ecological fallacy]:
inferring individual-level relationships on the basis of aggregate data.

For example, at the individual level, in the US, being wealthier is positively correlated with voting for the Republican party, but at the state level, the poorest states are the ones voting for Republican Presidential candidates.

---
## Simpson’s Paradox

.pull-left[

]

.pull-right[

]
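
A small simulation can make Simpson's paradox concrete. In the sketch below (all values hypothetical), the relationship between `x` and `y` is negative inside every unit, but units with a higher baseline also have higher `x`, so the pooled slope comes out positive.

```r
set.seed(7)
# Hypothetical data: 4 units with 50 observations each
id <- rep(1:4, each = 50)
# Units with a higher baseline outcome also have higher average x
x <- rnorm(200, mean = 2 * id)
# Within every unit the true slope is -1; the baseline rises by 5 per unit
y <- 5 * id - 1 * x + rnorm(200)

pooled <- lm(y ~ x)               # ignores the grouping
within <- lm(y ~ x + factor(id))  # compares observations inside each unit

c(pooled = unname(coef(pooled)["x"]), within = unname(coef(within)["x"]))
```

The sign flips because the pooled regression mixes differences between units with changes within units.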

---
### Another example

---
## Varying intercepts and slopes

.small[
.pull-left[
*Intercepts may vary*
- e.g. different units have different starting points for the (same) slopes

If we estimate this model as OLS, we can get biased coefficients.

*Slopes may vary*
- e.g. different units respond to covariates differently and so the effect of _X_ on _Y_ differs

If we estimate this model as OLS, we’ll only get an ‘average’ of the different slopes,
and if there are, say, two groups that have radically different responses to a covariate,
then these can cancel out and we could get a Type II error (false negative).

*Both intercepts and slopes may vary*
- e.g. different units start in different places *and* respond to covariates differently

]
]

.pull-right[

]
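
The cancelling-out case can be sketched with simulated data (two hypothetical groups, `a` and `b`, with slopes of 1 and -1). A single pooled slope averages the two responses, while an interaction term recovers each group's slope.

```r
set.seed(3)
# Hypothetical data: two groups with opposite responses to x
g <- rep(c("a", "b"), each = 100)
x <- rnorm(200)
slope <- ifelse(g == "a", 1, -1)
y <- slope * x + rnorm(200, sd = 0.5)

pooled <- lm(y ~ x)       # one slope for everyone
varying <- lm(y ~ x * g)  # separate intercept and slope per group

pooled_slope <- unname(coef(pooled)["x"])                             # average of the two
slope_a <- unname(coef(varying)["x"])                                 # slope in group a
slope_b <- unname(coef(varying)["x"] + coef(varying)["x:gb"])         # slope in group b
c(pooled = pooled_slope, group_a = slope_a, group_b = slope_b)
```

Here the pooled estimate is expected to sit near zero even though `x` matters a great deal in both groups, which is the false negative described on this slide.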

???

Though varying intercepts and slopes are most commonly specified by unit,
we could also have variation in `$\alpha$` or `$\beta$` over time.

---
## Error term and serial correlation

Now we could have different `$\alpha$`s and `$\beta$`s for each unit, each time point, 
or every combination of unit and time point:

`$$Y_{it} = \alpha_{it} + \beta_{it} X_{it} + u_{it}$$`

But we’ve been assuming throughout that `$u_{it}$` is homoskedastic and uncorrelated, 
both within and across `$i$` and `$t$`, i.e.:
- no cross-unit heteroskedasticity
- no temporal heteroskedasticity
- no autocorrelation

.pull-left[

<img src="STAT_PanelOutcomes_files/figure-html/residuals heteroscedasticity-1.png" width="1080" />
]

.pull-right[

<img src="STAT_PanelOutcomes_files/figure-html/autocorrelation-1.png" width="1080" />
]

???

Heterogeneity across units and over time

```r
library(gplots)
plotmeans(sstran ~ id, data = data1)
```

```r
plotmeans(sstran ~ year, data = data1)
```

That’s a pretty tall order.

Remember, the error term is supposed to be a *stochastic* element to the model and should not incorporate any *systematic* differences
(those should be specified in the model). But:
- Cross-unit differences mean that the model does a better job of explaining some units than others
- Time effects (such as socialization, institutionalization, learning, or other such dynamics) cause the model to do a better or worse job of explaining _Y_ over time,
- Omitted variables lead to correlation in the residuals, either across units (because time matters) or (more commonly) over time (because units matter).
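
One quick way to see serial correlation is to look at the autocorrelation of the OLS residuals. The sketch below simulates a single hypothetical unit with AR(1) errors (the coefficient of 0.7 is an arbitrary choice) and uses only base R.

```r
set.seed(11)
# Hypothetical single unit observed for 200 periods with AR(1) errors
n <- 200
x <- rnorm(n)
e <- as.numeric(arima.sim(model = list(ar = 0.7), n = n))
y <- 1 + 0.5 * x + e

fit <- lm(y ~ x)
res <- resid(fit)

# Lag-1 autocorrelation of the residuals; near 0 under independent errors
rho1 <- acf(res, lag.max = 1, plot = FALSE)$acf[2]
rho1
```

A lag-1 autocorrelation well away from zero suggests the "no autocorrelation" assumption does not hold, so the usual OLS standard errors should not be trusted.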

---
### Two types of errors

We can actually have two different error terms that capture different types of unobserved heterogeneity.
1. `$u_i$` the unit-specific error to account for between unit variation
  - unobserved predictors of `$Y$` that are specific to the unit and therefore time-constant.
2. `$e_{it}$` the time-varying error to account for within unit variation
  - unobserved predictors of `$Y$` that are specific to the time point and the unit.
  
--

Now, this wouldn't be a problem (OLS would not be biased) if this unobserved heterogeneity were independent of the explanatory variables in the model, but:
- there is most likely still serial (i.e. auto-)correlation
- residuals may still correlate on the basis of unobserved unit-specific heterogeneity, even if it is uncorrelated with the variables in the model.
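
The second point can be illustrated with a simulation (all values hypothetical). The unit effect `$u_i$` is drawn independently of `x`, so the pooled slope is not biased, yet residuals from the same unit remain correlated because they share the same `$u_i$`.

```r
set.seed(5)
# Hypothetical panel: 30 units, 10 periods, unit effect u_i and noise e_it
id <- rep(1:30, each = 10)
u <- rnorm(30, sd = 2)[id]  # time-constant, unit-specific component
e <- rnorm(300, sd = 1)     # time-varying component
x <- rnorm(300)             # independent of u by construction
y <- 1 + 0.5 * x + u + e

res <- resid(lm(y ~ x))          # pooled OLS residuals contain u_i + e_it
res_between <- ave(res, id)      # unit means of the residuals
res_within <- res - res_between  # what is left varies within units

# Share of residual variance that sits between units
between_share <- var(res_between) / var(res)
# Correlation between first- and second-period residuals across units
same_unit_cor <- cor(res[seq(1, 300, by = 10)], res[seq(2, 300, by = 10)])
c(between_share = between_share, same_unit_cor = same_unit_cor)
```

A large between share means most of what OLS treats as noise is really a stable unit characteristic.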

???

Key point: measurements over time are almost never independent so OLS assumptions violated; i.e. don't use OLS...

---
class: center, middle

# Fixed Effects

.pull-1[.circleoff[![](https://www.club.cc.cmu.edu/~cmccabe/image/crazy_clock.jpg)]]
.pull-1[.circleon[![](https://www.fabrikat.ch/media/catalog/product/cache/4/thumbnail/447x/9df78eab33525d08d6e5fb8d27136e95/r/o/rollins_leatherrip_hammer_9_1.jpg)]]
.pull-1[.circleoff[![](https://www.random.org/analysis/randbitmap-wamp.png)]]

---
## Modeling this

Panel data models differ depending on where we think errors might correlate.

We’ll concentrate on .red[fixed effects] and .red[random effects] today, 
since they are the building blocks to understanding the rest.

When random effects are used together with fixed effects, we call this a .red[mixed effects model].

There are other names and similar models out there too that you may encounter,
such as .red[hierarchical models] (a multilevel model with a single nested hierarchy) 
and .red[multilevel models] (a hierarchical model with multiple non-nested hierarchies).

---
## Introduction to Fixed Effects

Fixed effects (FE) explore the relationship between predictor and outcome variables within an entity (country, person, company, etc.).

Each entity has its own individual characteristics that may or may not influence the predictor variables.
- For example, identifying as male or female could influence the opinion towards a certain issue;
- The political system of a particular country could have some effect on trade or GDP;
- Or the business practices of a company may influence its stock price.

This can be a source of unit-level unobserved heterogeneity/idiosyncratic error, `$u_i$`
- Easy to fix if we have information about them. Simply put them as another independent variable into our regression model.
- But what about those factors that are hard to measure or those which we have not yet considered?

???

This unobserved heterogeneity consists of the unobserved factors that influence the DV and that vary across units and time.

We do not want to multiply the number of variables and make the model too complex when this could be dealt with all at once.

Note the separate intercept term.
i.e., some clusters tend to have higher values of Y than others.
This is known as the .red[variable intercept model].
You can think of it as a model of individual-level heterogeneity (which matters if we have omitted variable bias).

---

.pull-left[
## The LSDV Method

Treating the unit effects `$\alpha_i$` as fixed values is, in many respects, the simplest thing we can do.

Consider first a general model in which we replace the general intercept
with individual (unit-) level effects,
i.e. that some units/clusters have higher or lower levels of the outcome variable than others:

`$$Y_{it} = \alpha_i + \beta X_{it} + u_{it}$$`

This is also called the .red[least-squares dummy variables] (LSDV) method.
That is, we simply estimate the equation from earlier by including separate dummy variables for each unit in the model along with the covariates.

This dummy works like a sponge – it “soaks up” all potential error that is due to unobserved country-specific characteristics.

]
.pull-right[
.small[

```r
*lsdv <- lm(sstran ~ 0 + factor(id) + csh_m + leftvot, data1)
tab_model(lsdv, rm.terms = "factor(id) [BEL,CAN,FIN,FRG,IRE,ITA,LUX,JPN,NOR,SPA,SWE,UKM,FRA,GRE,POR]")
```


]
]

???

Basically, we create a dummy variable for each country: for the Belgium dummy, all observations pertaining to Belgium get a 1 and all observations from other countries get a 0.

The 0 at the front of the formula (it could also be a -1 at the end) removes the common intercept (as shown in the table) and so creates one time-invariant intercept for each unit through the dummies.
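
To see what the `0 +` does, the sketch below fits both parameterizations to simulated data. The three country codes and their intercepts are hypothetical and are not estimates from the course data.

```r
set.seed(9)
# Hypothetical panel: 3 countries, 20 years each, true slope 0.5
id <- factor(rep(c("AUL", "BEL", "CAN"), each = 20))
x <- rnorm(60)
base <- unname(c(AUL = 10, BEL = 15, CAN = 20)[as.character(id)])
y <- base + 0.5 * x + rnorm(60, sd = 0.3)

lsdv <- lm(y ~ 0 + id + x)  # no common intercept: one intercept per country
ref  <- lm(y ~ id + x)      # common intercept: AUL is the baseline

coef(lsdv)
coef(ref)
# BEL's own intercept equals the baseline plus BEL's deviation from it
bel_from_ref <- unname(coef(ref)["(Intercept)"] + coef(ref)["idBEL"])
bel_from_ref
```

The slope on `x` is the same in both fits. Only the way the country intercepts are reported changes.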

---
### Comparing Pooled OLS and LSDV

.pull-left-1[

A pooled OLS model ignores the panel structure of the data. 
- but observations belonging to the same country are not independent!
- two variables probably do not capture all country-specific heterogeneity.

An LSDV model includes `$u_i$` (error to account for the specificities of each country) to adjust for that.
- Once we add the country-dummies, the effect of imports is smaller and not statistically significant.

]

.pull-right-2[
.tiny[
<table style="border-collapse:collapse; border:none;">
<tr>
<th style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; text-align:left; ">&nbsp;</th>
<th colspan="2" style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; ">OLS</th>
<th colspan="2" style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; ">LSDV</th>
</tr>
<tr>
<td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; text-align:left; ">Predictors</td>
<td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">Estimates</td>
<td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">p</td>
<td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">Estimates</td>
<td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">p</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">(Intercept)</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">10.016</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">Imports</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&#45;3.586</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&#45;0.614</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">0.392</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">Votes for leftist parties</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">0.066</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&#45;0.064</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">factor(id)AUL</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">10.377</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">factor(id)AUS</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">20.890</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">factor(id)DEN</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">19.192</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">factor(id)NET</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">16.281</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">factor(id)NZL</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">14.085</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">factor(id)SWZ</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">12.739</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">factor(id)USA</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">11.283</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm; border-top:1px solid;">Observations</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left; border-top:1px solid;" colspan="2">738</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left; border-top:1px solid;" colspan="2">738</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">R2 / R2 adjusted</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.154 / 0.151</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.983 / 0.982</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">AIC</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">3789.171</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">3069.280</td>
</tr>

</table>

]
]

???

Pooled OLS estimation is simply an OLS technique run on panel data.

Comparing POLS and LSDV, we see that the latter has no intercept 
and that some of our variables of interest are no longer statistically significant.

---

### Comparing Pooled OLS and LSDV

.pull-left[

]

.pull-left[

<img src="STAT_PanelOutcomes_files/figure-html/lsdvPlot1-1.png" width="504" />
]

.pull-left[

<img src="STAT_PanelOutcomes_files/figure-html/olsPlot2-1.png" width="619.2" />
]

.pull-right[

<img src="STAT_PanelOutcomes_files/figure-html/lsdvPlot2-1.png" width="619.2" />
]

???

To come back to Simpson's paradox, 
OLS told us that votes for leftist parties were positively (and significantly) correlated with social expenditure per capita, 
but when we replace the common intercept with unit-specific intercepts, and so control for unit-specific characteristics, the effect is actually negative. 
We might want to rethink our hypothesis/variables.

---
## Within Estimator Method

.pull-left[

For large datasets, an alternative is to simply leave the intercept in 
and then the first country (i.e. Australia) is the baseline.

It is called the .red[within estimator].

Each remaining country dummy then measures how far, on average, the observations from that country deviate from this baseline country.

We can write this model as:

`$$Y_{it} = \alpha_i + \beta_B \bar{X}_i + \beta_W (X_{it} - \bar{X}_i) + u_{it}$$`

Why are the `$R^2$` statistics different?
- Since LSDV uses the original data, `$R^2$` measures the explained proportion of the overall variance.
- Since the within estimator model used time-demeaned data, `$R^2$` measures the explained portion of the within variance.
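
As a sketch of the time-demeaning idea, subtracting each unit's mean from the fixed effects model gives `$Y_{it} - \bar{Y}_i = \beta (X_{it} - \bar{X}_i) + (u_{it} - \bar{u}_i)$`, which removes `$\alpha_i$`. The simulation below (hypothetical data, base R only) suggests that this gives the same slope as the dummy-variable regression but a different `$R^2$`.

```r
set.seed(21)
# Hypothetical panel: 10 units, 15 periods, unit effects correlated with x
id <- rep(1:10, each = 15)
alpha <- rnorm(10, sd = 3)[id]
x <- 0.5 * alpha + rnorm(150)
y <- alpha + 2 * x + rnorm(150)

# LSDV: one dummy per unit
lsdv <- lm(y ~ x + factor(id))

# Within transformation: subtract each unit's own mean from y and x
y_dm <- y - ave(y, id)
x_dm <- x - ave(x, id)
within <- lm(y_dm ~ 0 + x_dm)

# The slope estimates coincide
c(lsdv = unname(coef(lsdv)["x"]), within = unname(coef(within)["x_dm"]))

# R-squared differs: overall variance (LSDV) versus within-unit variance
c(lsdv = summary(lsdv)$r.squared, within = summary(within)$r.squared)
```

Note that the standard errors from this hand-demeaned `lm()` are not adjusted for the estimated unit means, so dedicated panel software is the safer choice for inference.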

]

.pull-right[
.tiny[

```r
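library(lme4) # attached for the random effects and Hausman code later on
# Keeping the intercept makes the first level of id the baseline category
fe <- lm(sstran ~ csh_m + leftvot + factor(id), data = data1)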
```

<table style="border-collapse:collapse; border:none;">
<tr>
<th style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; text-align:left; ">&nbsp;</th>
<th colspan="3" style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; ">LSDV</th>
<th colspan="3" style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; ">Within</th>
</tr>
<tr>
<td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; text-align:left; ">Predictors</td>
<td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">Estimates</td>
<td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">CI</td>
<td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">p</td>
<td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">Estimates</td>
<td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">CI</td>
<td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; col7">p</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">factor(id)AUL</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">10.377</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">8.975&nbsp;&ndash;&nbsp;11.778</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col7"></td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">factor(id)AUS</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">20.890</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">19.334&nbsp;&ndash;&nbsp;22.445</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">10.513</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">9.521&nbsp;&ndash;&nbsp;11.505</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col7">&lt;0.001</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">factor(id)DEN</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">19.192</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">17.691&nbsp;&ndash;&nbsp;20.693</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">8.815</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">7.837&nbsp;&ndash;&nbsp;9.793</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col7">&lt;0.001</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">factor(id)NET</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">16.281</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">15.074&nbsp;&ndash;&nbsp;17.487</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">5.904</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">4.973&nbsp;&ndash;&nbsp;6.835</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col7">&lt;0.001</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">factor(id)NZL</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">14.085</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">12.716&nbsp;&ndash;&nbsp;15.454</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">3.709</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">2.812&nbsp;&ndash;&nbsp;4.605</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col7">&lt;0.001</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">factor(id)SWZ</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">12.739</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">11.642&nbsp;&ndash;&nbsp;13.836</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">2.362</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">1.303&nbsp;&ndash;&nbsp;3.421</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col7">&lt;0.001</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">factor(id)USA</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">11.283</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">10.627&nbsp;&ndash;&nbsp;11.940</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">0.907</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&#45;0.603&nbsp;&ndash;&nbsp;2.417</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col7">0.239</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">Imports</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&#45;0.614</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&#45;2.020&nbsp;&ndash;&nbsp;0.792</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">0.392</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&#45;0.614</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&#45;2.020&nbsp;&ndash;&nbsp;0.792</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col7">0.392</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">Votes for leftist parties</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&#45;0.064</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&#45;0.091&nbsp;&ndash;&nbsp;-0.037</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&#45;0.064</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&#45;0.091&nbsp;&ndash;&nbsp;-0.037</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col7">&lt;0.001</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">(Intercept)</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">10.377</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">8.975&nbsp;&ndash;&nbsp;11.778</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col7">&lt;0.001</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm; border-top:1px solid;">Observations</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left; border-top:1px solid;" colspan="3">738</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left; border-top:1px solid;" colspan="3">738</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">R2 / R2 adjusted</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="3">0.983 / 0.982</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="3">0.699 / 0.689</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">AIC</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="3">3069.280</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="3">3069.280</td>
</tr>

</table>

]
]

???

Now, the regression output actually does report a constant here, which is simply the baseline category Australia.

What we see is that our adjusted `$R^2$` is much higher for the LSDV, 
because with LSDV we are interested in the variance overall, 
while our within-estimation model explains only the variance within units/countries.

Another difference is in the estimates for the dummy variables, 
but if we look closely, each LSDV estimate is just the within estimate plus the intercept (e.g. 10.513 + 10.377 = 20.890 for AUS).

Our estimates for our other variables are the same.

```r
plot_model(lsdv, type = "pred", terms = c("leftvot","id"))
```

```r
plot_model(fe, type = "pred", terms = c("leftvot","id"))
```

They are the same.
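
The equivalence can also be checked on simulated data. The following is a minimal sketch in base R under made-up parameter values: the data frame `sim` and the model names are hypothetical and not part of the lecture data. It fits LSDV without an intercept, the same model with a baseline unit, and a regression on explicitly time-demeaned variables.

```r
set.seed(1)
n_units <- 4
n_years <- 25
sim <- data.frame(id = factor(rep(LETTERS[1:n_units], each = n_years)))
sim$x <- rnorm(nrow(sim)) + as.integer(sim$id)
sim$y <- 2 * as.integer(sim$id) - 0.5 * sim$x + rnorm(nrow(sim))

m_lsdv <- lm(y ~ x + id - 1, data = sim)  # a dummy for every unit, no intercept
m_base <- lm(y ~ x + id, data = sim)      # intercept is the baseline unit "A"

# Explicit within transformation: subtract each unit's own mean
sim$x_dm <- sim$x - ave(sim$x, sim$id)
sim$y_dm <- sim$y - ave(sim$y, sim$id)
m_within <- lm(y_dm ~ x_dm - 1, data = sim)

# The slope on x is identical across all three
c(coef(m_lsdv)["x"], coef(m_base)["x"], coef(m_within)["x_dm"])

# An LSDV dummy equals the baseline intercept plus that unit's offset
coef(m_lsdv)["idB"]
coef(m_base)["(Intercept)"] + coef(m_base)["idB"]

# The R-squared values differ although the residuals are the same
c(summary(m_lsdv)$r.squared, summary(m_base)$r.squared, summary(m_within)$r.squared)
```

The three models share the same residuals, so only the benchmark used for the total variation changes the reported `$R^2$`.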

---
## Advantages and disadvantages

With FE, we assume some unit characteristics may impact or bias IVs or DVs and we need to control for this.
FE removes the effect of time-invariant unit characteristics to assess predictors’ net effect on the outcome.
They are therefore used to study the causes of changes within each unit (and are thus broadly used in e.g. economics).

**Advantages**
1. FE estimates are unbiased (and .blue[BLUE] under the usual OLS assumptions), even if unit effects correlate with a predictor.
1. They’re generally widely used, well established, and (almost always) non-controversial.
1. Use fixed-effects (FE) whenever you are only interested in analyzing the impact of variables that vary over time.

**Disadvantages**
1. FE models are inefficient, using up many degrees of freedom and inflating standard errors.
1. FE will not work well with data for which within-cluster variation is minimal or for investigating variables that do not change over time or only change slowly (e.g. race or regime type) because they will be highly collinear with the fixed effects.
1. No out-of-sample predictions are possible, because your estimates are tied to these particular units.

---
class: center, middle

# Random Effects

.pull-1[.circleoff[![](https://www.club.cc.cmu.edu/~cmccabe/image/crazy_clock.jpg)]]
.pull-1[.circleoff[![](https://www.fabrikat.ch/media/catalog/product/cache/4/thumbnail/447x/9df78eab33525d08d6e5fb8d27136e95/r/o/rollins_leatherrip_hammer_9_1.jpg)]]
.pull-1[.circleon[![](https://www.random.org/analysis/randbitmap-wamp.png)]]

---
## Introduction to Random Effects

An alternative to treating units with fixed effects is to treat them with .red[random effects].
- Fixed effects essentially zoom in on the relation between the outcome and your predictor variables by accounting for the idiosyncrasies of our units. 
- Random effects are best thought of as noise in your data (arising from uncontrollable variability within the sample).

With random effects, 
across-unit variation is assumed to be random and uncorrelated with the predictor or independent variables included in the model.
That is, heterogeneity in our unit-specific error `$u_i$` is a sort of random disturbance.

In the RE model, the `$u_i$`s are now seen as one component of the stochastic part of the model (as the realization of a random variable).
Unlike in the fixed effects model, 
we want to estimate error terms `$u_i$` and `$e_{it}$` which together constitute `$\epsilon_{it}$`:

`$$Y_{it} - \lambda \bar{Y}_{i} = \beta (X_{it} - \lambda \bar{X}_{i}) + \dots + (\epsilon_{it} - \lambda \bar{\epsilon}_{i})$$`

???

So in the example for today, one could think that the political system and unionization covary with our composite error term, which includes both the specificities of the countries themselves and the action of time.

The random effects will create more bias but less variance, which could lead to the estimate being closer to the real parameter value.

A way to think about RE is to see it as in between FE and pooled OLS. 
The lambda partly, but not completely, demeans the data: 
we subtract only a fraction of each unit's mean over time.
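
This in-between character can be sketched directly. The example below is a minimal illustration on simulated data, assuming a balanced panel and a single predictor; the helper `quasi_demean_slope()` and all parameter values are hypothetical. It subtracts a fraction `lambda` of each unit's mean and refits OLS.

```r
set.seed(7)
n_units <- 6
n_years <- 15
id <- rep(seq_len(n_units), each = n_years)
u  <- rnorm(n_units, sd = 2)[id]             # unit-specific error, constant over time
x  <- rnorm(n_units * n_years) + 0.5 * u     # predictor correlated with the unit error
y  <- 1 + 0.8 * x + u + rnorm(n_units * n_years)

# Hypothetical helper: subtract a fraction lambda of the unit means, then run OLS
quasi_demean_slope <- function(lambda) {
  y_q <- y - lambda * ave(y, id)
  x_q <- x - lambda * ave(x, id)
  unname(coef(lm(y_q ~ x_q))["x_q"])
}

slope_pooled <- unname(coef(lm(y ~ x))["x"])
slope_within <- unname(coef(lm(y ~ x + factor(id)))["x"])

quasi_demean_slope(0)    # no demeaning: same as pooled OLS
quasi_demean_slope(1)    # full demeaning: same as the within/FE slope
quasi_demean_slope(0.6)  # partial demeaning: lies between the two
```

In an actual RE model the value of `lambda` is not chosen by hand but estimated from the variance components.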

---
## RE assumptions

RE assumes that:
- `$u_i$` and `$e_{it}$` are uncorrelated with the explanatory variables, 
- have constant variances `$\sigma^2_{u}$` and `$\sigma^2_{e}$`,
- and are independent of each other and across units.

Given this we arrive at `$\epsilon_{it} = u_i + e_{it}$`.

Because the overall error term `$\epsilon_{it}$` is split into two components, `$u_i$` (unit-specific error) and `$e_{it}$` (time-varying error),
this model is also called the error or .red[variance component model].

Then we can calculate how strongly to correct for the serial correlation that the shared `$u_i$` induces within each unit:

`$$\lambda = 1-\sqrt{\frac{\sigma^2_{e}}{\sigma^2_{e} + T\sigma^2_{u}}}$$`

???

If the `$\lambda$` of this equation is 0, 
then we should consider pooled OLS rather than RE, 
and if it is close to 1 then we should go with FE.

Everything is concentrated in this `$T\sigma^2_{u}$`:
- if it is close to infinity, then `$\lambda$` approaches 1 and we need the fixed effects to “soak up” all those specificities linked to the countries. 
- if it is 0, then `$\sigma^2_{u}$` is also 0, which means that the unit-specific error is unimportant and we can use pooled OLS.

If `$\lambda$` is in between 0 and 1 (usually the case), we should consider using RE. With RE, we subtract a part of the unit mean of each variable but not all of it (a quasi time-demeaned system).
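
The limiting cases can be checked numerically. The following is a minimal sketch using the standard textbook form of the weight, where `sigma2_u` is assumed to be the variance of the unit-specific error `$u_i$`, `sigma2_e` the variance of the time-varying error `$e_{it}$`, and `n_periods` the number of time points `$T$`. The function name `re_lambda()` and the input values are hypothetical.

```r
# Quasi-demeaning weight from the two variance components (hypothetical helper)
re_lambda <- function(sigma2_u, sigma2_e, n_periods) {
  1 - sqrt(sigma2_e / (sigma2_e + n_periods * sigma2_u))
}

re_lambda(sigma2_u = 0,   sigma2_e = 1, n_periods = 30)  # 0: no unit-level variance, pooled OLS
re_lambda(sigma2_u = 1,   sigma2_e = 1, n_periods = 30)  # strictly between 0 and 1: RE
re_lambda(sigma2_u = 1e6, sigma2_e = 1, n_periods = 30)  # close to 1: behaves like FE
```

Note that a longer panel (larger `n_periods`) also pushes the weight towards 1, so RE and FE estimates tend to converge as `$T$` grows.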

---
.small[
.pull-left-2[

.panelset[
.panel[
.panel-name[RE Code]

```r
rei <- lme4::lmer(sstran ~ csh_m + leftvot + (1|id), data1)
res <- lme4::lmer(sstran ~ csh_m + leftvot + (1 + leftvot|id), data1)
rep <- lme4::lmer(sstran ~ csh_m + leftvot + (1 + leftvot + year|id), data1)
tab_model(rei, res, rep, dv.labels = c("1","2","3"),
          rm.terms = "id [BEL,CAN,FIN,ITA,LUX,JPN,NOR,NET,SPA,SWE,SWZ,NZL,UKM,FRA,GRE,POR]",
          digits = 3, show.aic = TRUE, show.ci = FALSE)
```

]

.panel[
.panel-name[RE Table]

</table>

]
]
]
]
.pull-right-1[

`$\sigma^2$` is the residual variance

`$\tau_{00}$` is the variance of the random intercepts across our units

`$\tau_{11}$` is the variance of the random slopes across our units, by variable

`$\rho_{01}$` is the correlation between the random intercepts and the random slopes

`$ICC$` is the *Intraclass Correlation Coefficient*, the ratio of the between-cluster variance to the total variance

`$N$` is our number of units (22 countries)

Marginal `$R^2$` only considers the variance of the fixed effects

Conditional `$R^2$` takes both fixed and random effects into account

`$AIC$` is as we’ve already discussed

]

???

We are going to use the `lme4` package here to fit linear models with random effects.
Please note though that there are a host of other packages out there,
some of which are more flexible or specific.
`lme4` is a pretty standard package though, so that’s what we’ll be using here.

Note that in the formulas we have some new additions to our typical syntax.
In the first model, we’re adding a random intercept effect for countries.
The 1 before the bar/pipe/mid indicates that we want an intercept for our random effect.
In the second model, we’re adding random slopes for the effect of leftist parties by each country.
And in the third model, we’re adding time as an extra random slope.
To suppress the random intercept, you would write `(year - 1|...)`.

If `$\rho_{01}$` is positive, it suggests that those groups with larger intercepts will have steeper slopes,
whereas if `$\rho_{01}$` is negative, it suggests those groups with larger intercepts will have flatter slopes.

The ICC can help determine whether a linear mixed model is even necessary.
If the correlation were 0, then observations within clusters are no more similar than observations from different clusters,
and we may as well use a pooled OLS.
However here it is high (>.75), which suggests there is good reason to be using this.
There is consistency in grouped observations.

Note that we are treating year as a continuous variable here, 
because treating it as a factor would use up too many degrees of freedom.
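
To see where an ICC comes from, it can be computed by hand on simulated data. The sketch below is an assumption-laden illustration: it uses a simple ANOVA-style estimator for a balanced panel in base R, not the estimates that `lme4` produces, and the variable names and variances (unit-level standard deviation 3, residual standard deviation 1) are made up.

```r
set.seed(3)
n_units <- 20
n_years <- 10
id <- rep(seq_len(n_units), each = n_years)
# Hypothetical outcome: a unit-level component plus residual noise
y <- rnorm(n_units, sd = 3)[id] + rnorm(n_units * n_years, sd = 1)

aov_tab    <- anova(lm(y ~ factor(id)))
ms_between <- aov_tab["factor(id)", "Mean Sq"]
ms_within  <- aov_tab["Residuals", "Mean Sq"]

tau00  <- (ms_between - ms_within) / n_years  # between-unit (random intercept) variance
sigma2 <- ms_within                           # residual variance
icc    <- tau00 / (tau00 + sigma2)            # share of variance lying between units
icc
```

With these simulated variances most of the variation lies between units, so the ICC is high and pooling the observations would ignore a lot of structure.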

---
### Varying intercepts and slopes

.pull-left[

```r
plot_model(res, type = "pred", 
           terms = c("leftvot","id"), 
           pred.type="re")
```

<img src="STAT_PanelOutcomes_files/figure-html/plotRES-1.png" width="504" />
]

.pull-right[

```r
plot_model(rep, type = "pred", 
           terms = c("leftvot","id"), 
           pred.type="re")
```

<img src="STAT_PanelOutcomes_files/figure-html/plotREP-1.png" width="504" />
]

???

---
### Why might we want to do this?

**Advantages**
1. REs provide gains in efficiency (lower error variance), as fewer parameters need to be estimated.
1. REs allow time-invariant variables to play a role as explanatory variables (e.g. gender).
  - In an FE model, these variables would be absorbed by the dummy variables.
1. REs allow us to make generalizations to (some) other contexts.
  - Under the RE model `$u_i$` is assumed to be a random draw 
  from the universe of all possible values of a random variable having a certain distribution 
  (e.g., the normal distribution).
  - Under the FE model `$u_i$` is assumed to be a parameter 
  that is to be estimated from the data of the sampled unit 
  (and hence, may be different in another sample).

**Disadvantages**

1. REs require that all unmeasured factors that go into `$\alpha_i$` are uncorrelated with all of the `$X$`s that are in the model.
  - Outside of experiments, some variables may be unavailable or unknown, leading to omitted variable bias (OVB).
1. You need to specify those individual characteristics that may or may not influence the predictor variables. 
There should be no omitted variables.

---
## Fixed or Random?

There is a lot of discussion about which one to use.
(And a lot of [confusion about the terms](https://statmodeling.stat.columbia.edu/2005/01/25/why_i_dont_use/) themselves).

One option is to use a Hausman test to test how much coefficients under each model differ.

The idea is the following: 
If two estimators are consistent under a given set of assumptions, 
their estimates should not differ significantly. 
But if only one of the two estimators provides consistent estimates, 
then the estimates from both estimators should differ significantly.

The Hausman test scales the difference between the FE and RE coefficients by the variance of that difference, giving a statistic that is compared with a `$\chi^2$` distribution (with a single coefficient, this is equivalent to a _t_-test).
The null hypothesis is that the preferred model is random effects; the alternative is that it is fixed effects. 
It basically tests whether the unit-specific errors `$u_i$` are correlated with the independent variables; 
the null hypothesis is that they are not.

---
.panelset[
.panel[
.panel-name[Hausman Test]

```
## 
## 	Hausman Test
## 
## data:  data1
## chisq = 60.97, df = 3, p-value = 3.647e-13
## alternative hypothesis: one model is inconsistent
```
]

.panel[
.panel-name[Hausman Code]

```r
hausman_test <- function (lmerMod, lmMod, ...) { ## changed function call
 coef.wi <- coef(lmMod)
 coef.re <- fixef(lmerMod) ## changed coef() to fixef() for glmer
 vcov.wi <- vcov(lmMod)
 vcov.re <- vcov(lmerMod)
 names.wi <- names(coef.wi)
 names.re <- names(coef.re)
 coef.h <- names.re[names.re %in% names.wi]
 dbeta <- coef.wi[coef.h] - coef.re[coef.h]
 dvcov <- vcov.re[coef.h, coef.h] - vcov.wi[coef.h, coef.h]
 stat <- abs(t(dbeta) %*% as.matrix(solve(dvcov)) %*% dbeta) ## added as.matrix()
 pval <- pchisq(stat, df = length(dbeta), lower.tail = FALSE)
 names(stat) <- "chisq"
 parameter <- length(dbeta)
 names(parameter) <- "df"
 res <- list(statistic = stat, p.value = pval, parameter = parameter, 
 method = "Hausman Test", alternative = "one model is inconsistent",
 data.name=deparse(getCall(lmerMod)$data)) ## changed
 class(res) <- "htest"
 return(res)
}
hausman_test(res, fe)
```
]
]

???

As we have seen, the coefficients differ between the RE and FE models; 
the question is how much they differ.

If the p-value is significant (i.e. <0.05) then use fixed effects, if not use random effects.
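
For a single coefficient the calculation can be done by hand. The numbers below are purely hypothetical (they are not taken from the models above) and the sketch uses the textbook orientation, in which the variance of the less efficient FE estimate minus that of the RE estimate gives the variance of the difference.

```r
# Hypothetical estimates and standard errors for one coefficient
b_fe <- 0.50; se_fe <- 0.10   # fixed effects: consistent but less efficient
b_re <- 0.30; se_re <- 0.06   # random effects: efficient under the null

d      <- b_fe - b_re            # difference between the two estimates
var_d  <- se_fe^2 - se_re^2      # variance of that difference under the null
ratio  <- d / sqrt(var_d)        # a t-type ratio
chisq  <- d^2 / var_d            # the Hausman statistic with 1 degree of freedom
p_chi  <- pchisq(chisq, df = 1, lower.tail = FALSE)
p_norm <- 2 * pnorm(-abs(ratio)) # the same p-value from the ratio

c(chisq = chisq, p_chi = p_chi, p_norm = p_norm)
```

With several coefficients the scalar difference and variance become a vector and a matrix, which is what the quadratic form in the function on the slide computes.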

---
### Slow-Moving Variables

Now, the Hausman test can only tell you if there is a difference in the coefficients, 
but that does not mean that the consequence of a significant result is that you **HAVE** to use the FE specification.

One case in which we want to weigh the trade-offs between FE and RE more closely is that of slow-moving/.red[sluggish variables], 
i.e. where there is little within-unit variation over time.
Remember, sluggish variables will be highly collinear with the fixed effects.

The inclusion of fixed effects would be problematic, 
as it potentially discards much of the information 
and leads to imprecise estimates and large standard errors (Barro, 2012).

Clark and Linzer (2015) suggest that in deciding between FE and RE, 
one should also consider the sample size and the correlation between the covariate and unit effects.
- In the presence of sluggish variables, 
the random-effects model will tend to produce superior estimates of `$\beta$` when there are few units or observations per unit (i.e. in small datasets), 
and when the correlation between the independent variable and unit effects is relatively low.
- Otherwise, the fixed-effects model may be preferable, 
as the random-effects model does not induce sufficiently high variance reduction to offset its increase in bias.

---
.tiny[

<table style="border-collapse:collapse; border:none;">
<tr>
<th style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; text-align:left; ">&nbsp;</th>
<th colspan="2" style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; ">Pooled OLS</th>
<th colspan="2" style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; ">LSDV</th>
<th colspan="2" style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; ">Within</th>
<th colspan="2" style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; ">1</th>
<th colspan="2" style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; ">2</th>
<th colspan="2" style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; ">3</th>
</tr>
<tr>
<td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; text-align:left; ">Predictors</td>
<td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">Estimates</td>
<td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">p</td>
<td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">Estimates</td>
<td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">p</td>
<td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">Estimates</td>
<td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; col7">p</td>
<td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; col8">Estimates</td>
<td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; col9">p</td>
<td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; 0">Estimates</td>
<td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; 1">p</td>
<td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; 2">Estimates</td>
<td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; 3">p</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">(Intercept)</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">10.016</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">10.377</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col7">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col8">15.531</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col9">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 0">15.413</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 1">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 2">15.501</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 3">&lt;0.001</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">Imports</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&#45;3.586</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&#45;0.614</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">0.392</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&#45;0.614</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col7">0.392</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col8">&#45;0.876</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col9">0.207</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 0">&#45;1.101</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 1">0.126</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 2">&#45;0.811</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 3">0.250</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">Votes for leftist parties</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">0.066</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&#45;0.064</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&#45;0.064</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col7">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col8">&#45;0.054</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col9">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 0">&#45;0.057</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 1">0.054</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 2">&#45;0.053</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 3">0.001</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">factor(id)AUL</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">10.377</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col7"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col8"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col9"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 0"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 1"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 2"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 3"></td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">factor(id)AUS</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">20.890</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">10.513</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col7">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col8"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col9"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 0"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 1"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 2"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 3"></td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">factor(id)DEN</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">19.192</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">8.815</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col7">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col8"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col9"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 0"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 1"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 2"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 3"></td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">factor(id)FRG</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">19.588</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">9.211</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col7">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col8"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col9"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 0"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 1"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 2"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 3"></td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">factor(id)IRE</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">13.209</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">2.832</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col7">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col8"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col9"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 0"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 1"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 2"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 3"></td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">factor(id)USA</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">11.283</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">&lt;0.001</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">0.907</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col7">0.239</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col8"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; col9"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 0"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 1"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 2"></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; 3"></td>
</tr>
<tr>
<td colspan="13" style="font-weight:bold; text-align:left; padding-top:.8em;">Random Effects</td>
</tr>

<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">&sigma;<sup>2</sup></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">&nbsp;</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">&nbsp;</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">&nbsp;</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">3.62</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">3.41</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">3.59</td>
</tr>

<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">&tau;<sub>11</sub></td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">&nbsp;</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">&nbsp;</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">&nbsp;</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">&nbsp;</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.01 id.leftvot</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.00 id.leftvot</td>
</tr>

<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">N</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">&nbsp;</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">&nbsp;</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">&nbsp;</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">22 id</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">22 id</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">22 id</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm; border-top:1px solid;">Observations</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left; border-top:1px solid;" colspan="2">738</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left; border-top:1px solid;" colspan="2">738</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left; border-top:1px solid;" colspan="2">738</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left; border-top:1px solid;" colspan="2">738</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left; border-top:1px solid;" colspan="2">738</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left; border-top:1px solid;" colspan="2">738</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">R<sup>2</sup> / R<sup>2</sup> adjusted</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.154 / 0.151</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.983 / 0.982</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.699 / 0.689</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.037 / 0.740</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.038 / 0.779</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.040 / 0.708</td>
</tr>
<tr>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">AIC</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">3789.171</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">3069.280</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">3069.280</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">3155.161</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">3141.780</td>
<td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">3162.989</td>
</tr>

</table>

]

???

Pooled OLS is not the best option, since observations within the same country are not independent.

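LSDV: we removed the intercept, and we see that some of the IVs are no longer significant. The R-squared is very high (0.983), but without an intercept it is computed around zero rather than around the mean, so it is not comparable with the other models.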

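Within: Australia is the baseline for the dummy variables (it is absorbed into the intercept). The estimates for our IVs are the same as in LSDV, but the dummy coefficients are now differences from the baseline, so each is shifted by the same constant (for example, 20.890 versus 10.513 for AUS, a shift of about 10.38). The identical AIC (3069.280) confirms that these are the same model, parameterised differently.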

Mixed effects/RET: both time and countries are treated with random effects. The coefficients are not the same as in the FE models; the question is how different they are. See the Hausman test.
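
As a reminder of what the Hausman test compares, here is a sketch in generic notation. The symbols are assumptions for illustration and do not come from the table: `$\hat\beta_{FE}$` and `$\hat\beta_{RE}$` are the fixed and random effects estimates of the time-varying coefficients, and `$k$` is the number of coefficients compared.

`$$H = (\hat\beta_{FE} - \hat\beta_{RE})' \left[\widehat{\mathrm{Var}}(\hat\beta_{FE}) - \widehat{\mathrm{Var}}(\hat\beta_{RE})\right]^{-1} (\hat\beta_{FE} - \hat\beta_{RE})$$`

Under the null hypothesis that the unit effects are uncorrelated with the regressors, both estimators are consistent and `$H$` is asymptotically `$\chi^2_k$`, so it should be small. A large `$H$` suggests that RE is inconsistent and favours FE.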

---
### Which model to choose?

In many cases, you also have to decide what is of particular theoretical interest to you.
- If you are theoretically more interested in cross-national variation than in variation within states over time, 
then RE is more appropriate.
- If you are theoretically interested in variation across time and want to make causal inferences, 
then FE is more appropriate.
- If your key IV is time-constant, 
FE will drop this variable from the estimation and you cannot say anything about it; if it is sluggish (rarely changing), FE can only estimate it very imprecisely.
- In RE models, we may still have unit-specific autocorrelation and heteroskedasticity.
These need to be corrected as well.

For non-linear models, RE is the predominant approach: the RE model is more parsimonious, and FE is less efficient if it does not capture the true model.
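
???

Why FE cannot say anything about a time-constant IV: a sketch of the within transformation, assuming a generic model with a unit effect `$\alpha_i$`, a time-varying regressor `$x_{it}$` and a time-constant regressor `$z_i$` (symbols chosen here for illustration only).

`$$y_{it} = \alpha_i + \beta x_{it} + \gamma z_i + \varepsilon_{it}$$`

Subtracting the unit means from each observation gives

`$$y_{it} - \bar{y}_i = \beta (x_{it} - \bar{x}_i) + \gamma (z_i - \bar{z}_i) + (\varepsilon_{it} - \bar{\varepsilon}_i) = \beta (x_{it} - \bar{x}_i) + (\varepsilon_{it} - \bar{\varepsilon}_i)$$`

because `$z_i - \bar{z}_i = 0$` for every unit. Both `$\alpha_i$` and `$\gamma$` drop out, so `$\gamma$` is not identified. A sluggish regressor retains only a little within-unit variation after demeaning, which is why its coefficient is estimated imprecisely.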

---
## Extensions

.pull-left[
Here we have used fixed and random effects in a panel modelling context.

But there are various extensions to...
- non-continuous dependent variables (generalized linear mixed effect models)
- non-linear models
- hierarchical models
- nested models
- multilevel models
- etc.

Mixed effects are the basis for a lot of much broader and more flexible models,
and really deserve their own course...

]

.pull-right[

![](https://www.stats.ox.ac.uk/~snijders/mlbook2.png)

]

???

For more, please see Snijders and Bosker (1999).

---
class: center, middle

# Summary