Statistics for International Relations Research II

class: center, middle, inverse, title-slide

# Statistics for International Relations Research II
## Network Models
### <large>James Hollway</large>

---

class: center, middle

.pull-1[.circleon[![](https://upload.wikimedia.org/wikipedia/commons/thumb/6/60/Graph_betweenness.svg/220px-Graph_betweenness.svg.png)]]
.pull-1[.circleon[![](https://static01.nyt.com/images/2020/08/29/world/29boseman-react-sub/29boseman-react-sub-mediumSquareAt3X-v2.jpg)]]
.pull-1[.circleon[![](https://www.guidesiena.it/wp-content/uploads/2020/07/piazza-del-campo-siena-610x610.jpg)]]

---
class: center, middle
# Social Networks

.pull-1[.circleon[![](https://upload.wikimedia.org/wikipedia/commons/thumb/6/60/Graph_betweenness.svg/220px-Graph_betweenness.svg.png)]]
.pull-1[.circleoff[![](https://static01.nyt.com/images/2020/08/29/world/29boseman-react-sub/29boseman-react-sub-mediumSquareAt3X-v2.jpg)]]
.pull-1[.circleoff[![](https://www.guidesiena.it/wp-content/uploads/2020/07/piazza-del-campo-siena-610x610.jpg)]]

---
## What is/are “social networks”?

.pull-left[
.center[
![](https://upload.wikimedia.org/wikipedia/commons/thumb/6/60/Graph_betweenness.svg/220px-Graph_betweenness.svg.png)
]]

--
- Not **a** social network
  - not a type of actor, but structures

--
- Not a social **network**
  - not a type of structure, but any structure

--
- Not a **social** network
  - not any structure of a specific type of relation, but structures of any relations

Social networks concerns the abstraction, theorising, and analysis of relations into structures.

These relations can be key to understanding (changes in) macro-structures or micro-structures,
the distribution of attributes within those structures, or even what those structures consist of.

---
## Terminology

.pull-left[

- .red[Graph] or .red[network], `$G = (V, E)$`
- .red[Vertices] or .red[nodes] from a .red[node set] or .red[mode], `$V = \{a,b,c,...\}$`
- .red[Edges] or .red[ties] (also links, lines, connections, arcs), `$E = \{\{a,b\}, \{b,c\}, ...\}$`

![:scale 75%](STAT_L9_Networks_files/figure-html/example-1.png)
]

.pull-right[
- undirected
- weighted
- signed
- labelled
- directed
- multiplex
- complex
- two-mode
- multilevel
]

---
.pull-left-3[
- .red[graphs] pretty and easy to interpret
- quickly difficult to discern though and results may vary...
]

.center[![:scale 55%](GraphMat.png)]

.pull-left[
- .red[edgelists] 2(+) (ordered) columns of numbered/labelled pairs
- easy in Excel, incl. attributes, and memory efficient
- more complicated data, statistics, and analysis difficult though
]

.pull-right[
- .red[matrices]' rows are senders and cols recipients
- memory inefficient for sparse networks and somewhat incomprehensible
- encodes all relational info and quick, flexible analysis
]

???

- dimensions
- density
- outdegrees
- indegrees
- isolates
- reciprocity
- transitivity
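
The descriptives listed above can be read straight off a matrix. A minimal base R sketch, assuming a made-up five-node directed network (the edgelist, node labels, and object names are illustrative only, not course data):

```r
# Hypothetical edgelist: one row per directed tie
edges <- data.frame(from = c("a", "a", "b", "b", "c", "c"),
                    to   = c("b", "c", "a", "c", "a", "d"),
                    stringsAsFactors = FALSE)
nodes <- c("a", "b", "c", "d", "e")

# Edgelist -> adjacency matrix (rows are senders, columns are recipients)
adj <- matrix(0, length(nodes), length(nodes), dimnames = list(nodes, nodes))
adj[cbind(edges$from, edges$to)] <- 1

n <- nrow(adj)
density     <- sum(adj) / (n * (n - 1))          # share of possible ties present
outdegrees  <- rowSums(adj)                      # ties sent
indegrees   <- colSums(adj)                      # ties received
isolates    <- nodes[outdegrees + indegrees == 0]
reciprocity <- sum(adj * t(adj)) / sum(adj)      # share of ties that are returned
trans_trip  <- sum((adj %*% adj) * adj)          # two-paths i -> j -> k closed by i -> k

dim(adj)   # 5 5
density    # 0.3
isolates   # "e"
```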

---
## Dependency

.pull-left[![:scale 75%](GraphDep.png)]

A lot of emphasis has been put on networks' pretty graphs
and identifying nodes that are central and/or in groups together.

These aspects of .red[social network analysis] are important and powerful tools...

...but the real opportunity in social networks is the ability to identify, theorise, and estimate structures of social .red[dependence].

.pull-right[.pull-down[
Some examples:
- outdegrees
- indegrees
- reciprocity
- transitivity
- homophily
- etc.
]]

---
## Network modelling

We want to know the degree to which such mechanisms are responsible for the structures we see.

### Why network _dynamics_ then?

Because we want to know _why_ there are associations
- say, why are depressed people more likely to have depressed friends (Schaefer et al 2012) or why democratic countries are unlikely to be at war with one another...

Competing explanations tend to involve dynamic mechanisms:
- because depressed adolescents prefer depressed friends
- because they are avoided by non-depressed people
- because they withdraw from friendly interactions which destroys all other friendships
- because depression is contagious along friendships

---
.panelset[
.panel[
.panel-name[Reading in data]

```r
# Read each wave as an adjacency matrix and dichotomise any valued ties
friendship.t1 <- as.matrix(read.table(file = "friendship.network.t1.dat"))
friendship.t1[friendship.t1 >= 1] <- 1
friendship.t2 <- as.matrix(read.table(file = "friendship.network.t2.dat"))
friendship.t2[friendship.t2 >= 1] <- 1
nActors <- nrow(friendship.t1)

# Convert to igraph objects for plotting
library(igraph)
graph1 <- graph_from_adjacency_matrix(friendship.t1)
graph2 <- graph_from_adjacency_matrix(friendship.t2)
```
]

.panel[
.panel-name[Visualising networks]

```r
# Lay out the union of both waves so nodes keep the same position in each plot
graph12 <- graph_from_adjacency_matrix(friendship.t1 + friendship.t2)
myLayout <- layout_with_fr(graph12)

par(mfrow = c(1, 2))
igraph::plot.igraph(graph1,
     vertex.color = "seagreen",
     edge.color = "black",
     edge.width = 1.5,
     edge.arrow.size = 0.6,
     vertex.size = 10,
     vertex.label = "",
     layout = myLayout,
     main = "Network wave 1")
igraph::plot.igraph(graph2,
     vertex.color = "seagreen",
     edge.color = "black",
     edge.width = 1.5,
     edge.arrow.size = 0.3,
     vertex.size = 10,
     vertex.label = "",
     layout = myLayout,
     main = "Network wave 2")
```
]
.panel[
.panel-name[Plot]
![](STAT_L9_Networks_files/figure-html/plotting-1.png)
]

.panel[
.panel-name[Now what?]
.center[.red[Which forces shape this social network's evolution?]]
]
]

???

- same group of actors (some composition change allowed)
- same relational variable (states, not events)
- some, but not too much change...

The data used are from the Children of Immigrants Study,
(c) MZES Mannheim, Manfred Kalter.

- social network ties are costly? (low outdegree)
- individuals form and maintain reciprocal ties?
- transitivity leads to clustering
- status hierarchy shapes friendship networks (ties to popular actors)
- gender/ethnic homophily?

---
.panelset[
.panel[
.panel-name[Gender]

```r
gender <- unlist(read.table(file = "gender.dat"))
par(mfrow = c(1, 2))
igraph::plot.igraph(graph1,
     vertex.color = ifelse(gender == 1, "pink", "blue"),
     vertex.shape = ifelse(gender == 1, "square", "circle"),
     edge.color = "black",
     edge.width = 2,
     edge.arrow.size = 0.6,
     vertex.size = 10,
     vertex.label = NA,
     layout = myLayout,
     main = "Network wave 1")
igraph::plot.igraph(graph2,
     vertex.color = ifelse(gender == 1, "pink", "blue"),
     vertex.shape = ifelse(gender == 1, "square", "circle"),
     edge.color = "black",
     edge.width = 2,
     edge.arrow.size = 0.6,
     vertex.size = 10,
     vertex.label = "",
     layout = myLayout,
     main = "Network wave 2")
```
]

.panel[
.panel-name[Plot]
![](STAT_L9_Networks_files/figure-html/gender-1.png)

]

.panel[
.panel-name[Country of Origin]

```r
coo <- unlist(read.table(file = "coo.dat"))
par(mfrow = c(1, 2))
igraph::plot.igraph(graph1,
     vertex.color = coo,
     vertex.shape = ifelse(gender == 1, "square", "circle"),
     edge.color = "black",
     edge.width = 2,
     edge.arrow.size = 0.6,
     vertex.size = 10,
     vertex.label = "",
     layout = myLayout,
     main = "Network wave 1")
igraph::plot.igraph(graph2,
     vertex.color = coo,
     vertex.shape = ifelse(gender == 1, "square", "circle"),
     edge.width = 2,
     edge.color = "black",
     edge.arrow.size = 0.6,
     vertex.size = 10,
     vertex.label = "",
     layout = myLayout,
     main = "Network wave 2")
```
]

.panel[
.panel-name[Plot]
![](STAT_L9_Networks_files/figure-html/ethnic-1.png)
]

]

---
## Modelling thoughts

A _statistical approach_ is necessary here to control for alternative explanations

A _complete network approach_ is necessary because selection can only be studied when the complete pool of candidates is known

A _longitudinal approach_ is necessary to link antecedents with consequences

A (weak: see Udehn 2002) _methodologically individualist approach_ is useful for bringing the model close to theory

---
class: center, middle
# Stochastic Actor Oriented Models

.pull-1[.circleoff[![](https://upload.wikimedia.org/wikipedia/commons/thumb/6/60/Graph_betweenness.svg/220px-Graph_betweenness.svg.png)]]
.pull-1[.circleon[![](https://static01.nyt.com/images/2020/08/29/world/29boseman-react-sub/29boseman-react-sub-mediumSquareAt3X-v2.jpg)]]
.pull-1[.circleoff[![](https://www.guidesiena.it/wp-content/uploads/2020/07/piazza-del-campo-siena-610x610.jpg)]]

---
## Intuition

![](SAOMIntuition.png)

---
## SAOMs

- SAOMs (stochastic actor-oriented models) are a .red[continuous-time] network model
  - They model change in social networks in continuous time using empirical panel data with SIENA (Simulation Investigation for Empirical Network Analysis)
  - See [Block et al 2018](https://www.sciencedirect.com/science/article/abs/pii/S0378873317300035)
  
- SAOMs are an .red[actor-oriented] network model
  - They model change as a function of individuals’ choices about whom they want to relate to and how they want to behave
  - See [Block et al 2019](https://journals.sagepub.com/doi/full/10.1177/0049124116672680)
  
---
.center[
![:scale 75%](ContinTime.png)
]

.pull-left[
## Why continuous-time?

Complex patterns emerge from simple(r) mechanisms

New ties may be realisation-contingent on other ties.

Compound emergence cannot easily be modelled in discrete time.
]

--
.pull-right[
## Why actor-oriented?

All social network change is brought about by individual or collective agents that decide to send or drop a tie (homophily, withdrawal, avoidance, etc)

As the actor is the locus of control, we should model the tie changes from its perspective.
]

---
class: center, middle

![](SequenceIntuition.png)

???

This is one potential path along which the network could develop from `$t_1$` to `$t_2$`

Each of these represents a mini-step

At each mini-step an actor receives an opportunity to toggle a tie

They decide what action is most appealing and act accordingly

---
## Two processes in each ministep

.pull-left[
.red[Rate Function]
  - essentially _who gets how many opportunities to make changes_ between waves (periods)

`$$\lambda_i(x) = \exp\bigl(\sum_k \rho_k r_{ik} (x) \bigr)$$`
  - `$\rho_k$` weights statistics `$r_{ik}(x)$` that express local configurations 
  that may correlate with more, `$\rho_k>0$`, or less, `$\rho_k<0$`, change
  - technically, `$\lambda_i(x)$` is the rate of a (non-homogeneous) .red[Poisson process]
    - models how much change between `$t_1$` and `$t_2$`
    - more change requires more ministeps
    - can mean more ministeps than changes
  - the actor offered an opportunity is called .red[ego] or the focal actor
    - studies typically assume a period-wise constant rate (though see [Hollway 2020](https://link.springer.com/chapter/10.1007/978-3-030-46769-2_4)...)
]

.pull-right[
.red[Evaluation Function]
  - essentially, once ego chosen, _who/what_ -- alters could be people or something else -- _do they choose_?

`$$f_i(x) = \sum_k \beta_k s_{ik} (x)$$`
  
  - `$\beta_k$` weights statistics `$s_{ik}$` that express local configurations 
  that may be desired, `$\beta_k>0$`, or avoided, `$\beta_k<0$`
  - technically, `$f_i(x)$` is part of a .red[multinomial logit] model for discrete,
  probabilistic choice
    - models attractiveness of different network states `$x$` to actor `$i$` 
    reachable within one step of the current network
    - this is where the action typically is. Helps us answer whether we prefer happy friends or avoid depressed people...
]

???

SAOMs' secret sauce is really in joining these two functions in a simulation framework to enable estimation.
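
A toy sketch of how the two functions combine in a single ministep. It assumes a constant rate and only two evaluation effects (outdegree and reciprocity); `evaluate()`, `ministep()`, `lambda`, and the weights are hypothetical names and values for illustration, not RSiena internals.

```r
# Evaluation function f_i(x) with two effects only
evaluate <- function(x, i, beta) {
  outdegree   <- sum(x[i, ])
  reciprocity <- sum(x[i, ] * x[, i])
  beta[["outdegree"]] * outdegree + beta[["reciprocity"]] * reciprocity
}

ministep <- function(x, beta, lambda) {
  n <- nrow(x)
  # Rate function: with a constant rate, the wait until anyone gets an
  # opportunity is exponential and every actor is equally likely to be ego
  waiting_time <- rexp(1, rate = n * lambda)
  ego <- sample(n, 1)
  # Option j == ego means "keep the status quo"; any other j toggles x[ego, j]
  scores <- vapply(seq_len(n), function(j) {
    candidate <- x
    if (j != ego) candidate[ego, j] <- 1 - candidate[ego, j]
    evaluate(candidate, ego, beta)
  }, numeric(1))
  # Multinomial logit choice among the n options
  probs <- exp(scores - max(scores))
  probs <- probs / sum(probs)
  choice <- sample(n, 1, prob = probs)
  if (choice != ego) x[ego, choice] <- 1 - x[ego, choice]
  list(network = x, waiting_time = waiting_time)
}

set.seed(42)
x0 <- matrix(rbinom(25, 1, 0.3), 5, 5)
diag(x0) <- 0
step <- ministep(x0, beta = c(outdegree = -1.5, reciprocity = 2), lambda = 3)
sum(x0 != step$network)  # 0 or 1: a ministep changes at most one tie
```

Repeating `ministep()` until the accumulated waiting time exceeds the period length would give one simulated path between two waves.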

---
## Example of an actor's decision

.pull-left[
<img src="STAT_L9_Networks_files/figure-html/example-1.png" width="504" />
]

.pull-right[

|           | outdg| recip| trans| cycle| color|
|:----------|-----:|-----:|-----:|-----:|-----:|
|Status quo |     3|     2|     2|     2|     0|
|Drop A     |     2|     1|     0|     1|     0|
|Drop B     |     2|     1|     0|     1|     0|
|Drop C     |     2|     2|     2|     2|     0|
|Add D      |     4|     3|     2|     2|     1|
|Add E      |     4|     2|     2|     3|     0|
|Add F      |     4|     2|     2|     3|     1|
|Add G      |     4|     2|     2|     3|     1|
]

---
## Evaluating the options

.pull-left[

Now, let's say that we know that the average focal actor weights the options as follows:

`$$\beta_{\text{outdg}} = -2.6$$`

`$$\beta_{\text{recip}} = 1.8$$`
`$$\beta_{\text{trans}} = 0.4$$`
`$$\beta_{\text{cycle}} = -0.7$$`
`$$\beta_{\text{color}} = 0.8$$`

Using the underlying multinomial, we can ultimately calculate the probability our focal actor makes each choice...

`$$p_{i\leadsto j}(x, \beta) = \frac{\exp(f(x^{i\leadsto j}, \beta))}{\sum_{k=1}^n \exp(f(x^{i\leadsto k}, \beta))}$$`

]

.pull-right[

|           | Evaluation|   Exp|Prob |
|:----------|----------:|-----:|:----|
|Status quo |       -4.8| 0.008|5%   |
|Drop A     |       -4.1| 0.017|10%  |
|Drop B     |       -4.1| 0.017|10%  |
|Drop C     |       -2.2| 0.111|68%  |
|Add D      |       -4.8| 0.008|5%   |
|Add E      |       -8.1| 0.000|0%   |
|Add F      |       -7.3| 0.001|1%   |
|Add G      |       -7.3| 0.001|1%   |
]

???

Therefore dropping the tie to alter C is the most likely choice for ego in this context.
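
The evaluation column can be recomputed from the statistics table and the weights on the slide. A short sketch (object names are illustrative; probabilities are computed from unrounded values, so the smallest ones may round differently from the table):

```r
# Statistics for each option, copied from the decision table
option_stats <- rbind(
  "Status quo" = c(3, 2, 2, 2, 0),
  "Drop A"     = c(2, 1, 0, 1, 0),
  "Drop B"     = c(2, 1, 0, 1, 0),
  "Drop C"     = c(2, 2, 2, 2, 0),
  "Add D"      = c(4, 3, 2, 2, 1),
  "Add E"      = c(4, 2, 2, 3, 0),
  "Add F"      = c(4, 2, 2, 3, 1),
  "Add G"      = c(4, 2, 2, 3, 1)
)
colnames(option_stats) <- c("outdg", "recip", "trans", "cycle", "color")

beta <- c(outdg = -2.6, recip = 1.8, trans = 0.4, cycle = -0.7, color = 0.8)

# Evaluation function: weighted sum of statistics for each option
evaluation <- drop(option_stats %*% beta)

# Multinomial logit: exponentiate and normalise over all options
prob <- exp(evaluation) / sum(exp(evaluation))

round(data.frame(evaluation, exp = exp(evaluation), prob), 3)
names(which.max(prob))  # "Drop C"
```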

---
class: center, middle
# RSiena

.pull-1[.circleoff[![](https://upload.wikimedia.org/wikipedia/commons/thumb/6/60/Graph_betweenness.svg/220px-Graph_betweenness.svg.png)]]
.pull-1[.circleoff[![](https://static01.nyt.com/images/2020/08/29/world/29boseman-react-sub/29boseman-react-sub-mediumSquareAt3X-v2.jpg)]]
.pull-1[.circleon[![](https://www.guidesiena.it/wp-content/uploads/2020/07/piazza-del-campo-siena-610x610.jpg)]]

---
## Estimation

So now we have a well-defined probability model, from which we can simulate networks using defined parameters, `$\beta$`.
But what we usually want to do is estimate parameters from observed data!
We do this using .blue[RSiena]
 - Simulation
 - Investigation for
 - Empirical
 - Network
 - Analysis
 
.center[![:scale 60%](https://lp-cms-production.imgix.net/2019-06/d048d91e99c8fa099c5e3fb544fd237d-piazza-del-campo.jpg?auto=format&fit=crop&ixlib=react-8.6.4&h=520&w=1312)]

???

1. Start with wave 1

2. Simulate lots of ministeps/chains with current parameters

3. Check whether the simulated networks are similar to wave 2 in terms of the statistics associated with the specified effects
  1. If no, adjust parameters and start again at step 2
  1. If yes, calculate various metrics to establish

---
## Installing RSiena

.pull-left[![:scale 80%](https://raw.githubusercontent.com/snlab-nl/rsiena/main/inst/rsienalogo.png)]

.pull-right[
First we load the necessary packages.
To get the most recent version of RSiena, 
you will need to download it from GitHub:

```r
# library("remotes")
# remotes::install_github("snlab-nl/rsiena")
library("RSiena")
```
]

---
## Create internal SIENA objects

```r
# Create dependent network variable
friendship.dependent <- sienaDependent(
  array(c(friendship.t1, friendship.t2), 
         dim = c(nActors, nActors, 2) ) )

# Create constant actor covariates
coo.coCovar <- coCovar(coo)
gender.coCovar <- coCovar(gender)

# Put all the data together so that SIENA can call it as necessary:
mySienaData <- sienaDataCreate(friendship.dependent,
                               coo.coCovar,
                               gender.coCovar)

# Now we print a report to the hard drive to keep track of how the data
# are constructed and, once we begin running the model, the output
print01Report( mySienaData, modelname = 'mannheim_network_1' )
```

---
## Statistics and effects

By finding out how effects are weighted (the parameters), 
we can answer our research questions.

Each effect (“IV”) has an effect statistic which defines it
- Are smokers popular? Alter attribute effect: `$s_i(x) = \sum_j x_{ij}v_j$`
- Do students ethnically segregate? Homophily effect: `$s_i(x) = \sum_j x_{ij} I\{v_i = v_j\}$`

They can depend on network configurations (i.e. the position of `$j$` in the network), or attributes (i.e. a characteristic of `$j$` or whether it is the same as `$i$`), or both

Some effects rely on exogenous covariates, of which there are four types:
- constant, monadic: `coCovar`
- constant, dyadic: `coDyadCovar`
- changing, monadic: `varCovar`
- changing, dyadic: `varDyadCovar`

There are **_heaps_** of different structural and covariate-based effects available...
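
To make the two effect statistics above concrete, here is a small base R sketch on a made-up four-actor network. The matrix `x` and the attribute `v` are hypothetical, and the raw attribute is used as is, whereas RSiena centres covariates by default, so its internal values will differ.

```r
# Hypothetical directed network: x[i, j] = 1 if i sends a tie to j
x <- matrix(c(0, 1, 1, 0,
              0, 0, 1, 0,
              1, 0, 0, 1,
              0, 0, 0, 0), nrow = 4, byrow = TRUE)
v <- c(1, 0, 1, 1)  # hypothetical binary attribute, e.g. smoker = 1

# Alter attribute effect: s_i(x) = sum_j x_ij * v_j
alter_stat <- as.vector(x %*% v)

# Homophily ("same") effect: s_i(x) = sum_j x_ij * I{v_i = v_j}
same_stat <- rowSums(x * outer(v, v, "=="))

alter_stat  # 1 1 2 0
same_stat   # 1 0 2 0
```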

---
## Specify the SIENA model

.panelset[
.panel[
.panel-name[Structural Effects]

```r
mySienaEffects <- getEffects(mySienaData)
mySienaEffects
```

```r
mySienaEffects <- includeEffects(mySienaEffects, inPop, transTrip, cycle3)
```

```
##   effectName            include fix   test  initialValue parm
## 1 transitive triplets   TRUE    FALSE FALSE          0   0   
## 2 3-cycles              TRUE    FALSE FALSE          0   0   
## 3 indegree - popularity TRUE    FALSE FALSE          0   0
```

]

.panel[
.panel-name[Covariate Effects]

```r
mySienaEffects <- includeEffects(mySienaEffects, egoX, altX, sameX, interaction1 = "gender.coCovar")
```

```
##   effectName           include fix   test  initialValue parm
## 1 gender.coCovar alter TRUE    FALSE FALSE          0   0   
## 2 gender.coCovar ego   TRUE    FALSE FALSE          0   0   
## 3 same gender.coCovar  TRUE    FALSE FALSE          0   0
```

```r
mySienaEffects <- includeEffects(mySienaEffects, egoX, altX, sameX, interaction1 = "coo.coCovar")
```

```
##   effectName        include fix   test  initialValue parm
## 1 coo.coCovar alter TRUE    FALSE FALSE          0   0   
## 2 coo.coCovar ego   TRUE    FALSE FALSE          0   0   
## 3 same coo.coCovar  TRUE    FALSE FALSE          0   0
```

]

.panel[
.panel-name[Final Specification]

```r
mySienaEffects # Check parameters before estimation
```

```
##    effectName                                include fix   test  initialValue parm
## 1  basic rate parameter friendship.dependent TRUE    FALSE FALSE    6.14244   0   
## 2  outdegree (density)                       TRUE    FALSE FALSE   -1.06194   0   
## 3  reciprocity                               TRUE    FALSE FALSE    0.00000   0   
## 4  transitive triplets                       TRUE    FALSE FALSE    0.00000   0   
## 5  3-cycles                                  TRUE    FALSE FALSE    0.00000   0   
## 6  indegree - popularity                     TRUE    FALSE FALSE    0.00000   0   
## 7  coo.coCovar alter                         TRUE    FALSE FALSE    0.00000   0   
## 8  coo.coCovar ego                           TRUE    FALSE FALSE    0.00000   0   
## 9  same coo.coCovar                          TRUE    FALSE FALSE    0.00000   0   
## 10 gender.coCovar alter                      TRUE    FALSE FALSE    0.00000   0   
## 11 gender.coCovar ego                        TRUE    FALSE FALSE    0.00000   0   
## 12 same gender.coCovar                       TRUE    FALSE FALSE    0.00000   0
```

]

???

Next we need to make an object that keeps track of which effects
we are including in the model and which, if any, parameters are set

As you can see, RSiena includes some basic terms by default -
a rate parameter that models the rate of network change in each period 
as actors' (homogeneous) opportunities to change their ties, 
an outdegree parameter that operates as the unconditional probability of a tie,
and (because this is a directed, one-mode network) reciprocity.

We also want to include an effect for preferential attachment/cumulative advantage:
We include a new effect by specifying it after the current effects object 
in the function includeEffects()

If they are defined the same way, we can even include multiple effects at once.
Let's include effects for transitivity and 3-cycles.

If you've done something wrong, you can restore the defaults using getEffects()

Now we also want to include some effects for homophily, 
controlling for general tendency of these variables to define
particularly active or particularly popular students.

interaction1="" helps the function link the egoX, altX, and sameX terms 
to variables in the SIENA data object.

Researchers usually come with theory or at least hypotheses to specify a model 
(SAOMs are not really for exploration...)

Beware spuriousness...
- Attribute vs centrality (popularity)
- Homophily vs cohesion (reciprocity, transitivity)

---
## Choose SIENA algorithm

We need to set a few settings before we begin.
The following function creates a link to the .out(put) file
specified in the `print01Report()` function above.
There are other, more complex options here too.
Here we will constrain the simulations so that each actor is limited to 5 ties.

```r
mySienaAlgorithm <- sienaAlgorithmCreate(projname = "mannheim_network_1",
                                         MaxDegree = c(friendship.dependent = 5) )
```

Here is also where we would establish the method of estimation.

- .red[Method of Moments] (MoM)
  - Take the network at the first time point and simulate a certain number of mini-steps with some initial `$\beta$` values
  - Compare the simulated networks to the observed network at the second time point
  - According to the differences between observed and simulated networks, we update the `$\beta$` values
  - Rinse and repeat until the simulated networks “closely” resemble the observed one
- .red[Maximum Likelihood] (ML)
  - Actually connects two observations by chains of ministeps and estimates parameters from these chains
- .red[Bayesian] (Bayes)
  - For multilevel analysis of networks and _enthusiasts_

---
## Now we are ready to estimate a SAOM

```r
result <- siena07(mySienaAlgorithm, data = mySienaData, effects = mySienaEffects, 
                  batch = TRUE, # if batch=F then you will get a GUI
                  returnDeps = TRUE) # we need this for the GOF
```

```
## 
## Start phase 0 
## theta: -1.06  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00 
## 
## Start phase 1 
## Phase 1 Iteration 1 Progress: 0%
## Phase 1 Iteration 2 Progress: 0%
## Phase 1 Iteration 3 Progress: 0%
## Phase 1 Iteration 4 Progress: 0%
## Phase 1 Iteration 5 Progress: 0%
## Phase 1 Iteration 10 Progress: 0%
## Phase 1 Iteration 15 Progress: 0%
## Phase 1 Iteration 20 Progress: 1%
## Phase 1 Iteration 25 Progress: 1%
## Phase 1 Iteration 30 Progress: 1%
## Phase 1 Iteration 35 Progress: 1%
## Phase 1 Iteration 40 Progress: 1%
## Phase 1 Iteration 45 Progress: 1%
## Phase 1 Iteration 50 Progress: 1%
## theta: -1.44902  0.22459  0.23419  0.00606 -0.01107  0.00838 -0.04087 -0.03972  0.07336 -0.00623  0.21524 
## 
## Start phase 2.1
## Phase 2 Subphase 1 Iteration 1 Progress: 17%
## Phase 2 Subphase 1 Iteration 2 Progress: 17%
## theta -1.6851  0.5533  0.5086  0.0600 -0.0421  0.0144 -0.0692  0.0115  0.2344  0.0338  0.3127 
## ac -0.9461 12.5161 -1.8518 -1.6151 -0.5092 -0.7331 -1.1782  1.7979 -0.3040  0.0775 -5.1757 
## Phase 2 Subphase 1 Iteration 3 Progress: 17%
## Phase 2 Subphase 1 Iteration 4 Progress: 17%
## theta -1.9877  1.4511  0.6138 -0.0865 -0.1031  0.0635  0.0213  0.6829 -0.0600  0.0559  0.6216 
## ac  0.22331 -0.19047 -0.23059 -0.14280  0.19312  0.00511 -0.19098  1.01493  0.56834 -0.79737 -0.11610 
## Phase 2 Subphase 1 Iteration 5 Progress: 17%
## Phase 2 Subphase 1 Iteration 6 Progress: 17%
## theta -2.0665  1.7324  0.4903 -0.3169 -0.1462  0.0908  0.0346  1.3235 -0.2417  0.0805  0.9098 
## ac  0.3243 -0.1501 -0.2170 -0.1388  0.2575 -0.0237 -0.1261  0.8974  0.1863 -0.7475 -0.1104 
## Phase 2 Subphase 1 Iteration 7 Progress: 17%
## Phase 2 Subphase 1 Iteration 8 Progress: 17%
## theta -2.1303  1.7744  0.4836 -0.4595 -0.1997  0.1204  0.0167  1.4037 -0.4795 -0.1446  0.9960 
## ac  0.2194 -0.2286 -0.2783 -0.2065  0.1675  0.0205 -0.0609  0.9054  0.2014 -0.3382 -0.2258 
## Phase 2 Subphase 1 Iteration 9 Progress: 17%
## Phase 2 Subphase 1 Iteration 10 Progress: 17%
## theta -2.2732  1.7322  0.7429 -0.4953 -0.2480  0.1504 -0.0243  1.2310 -0.4020 -0.2773  1.0413 
## ac  0.1843 -0.2201 -0.2804 -0.2084  0.1401  0.0196 -0.2448  0.9202  0.2117  0.3028 -0.2364 
## theta -2.7459  2.1677  0.9334 -0.9149 -0.2724  0.1370  0.0452  1.4319 -0.3061  0.0282  1.2445 
## ac -0.1889 -0.5318 -0.6323 -0.6107 -0.2336 -0.0834 -0.3263 -0.0222 -0.2837 -0.2354 -0.2805 
## theta: -2.7459  2.1677  0.9334 -0.9149 -0.2724  0.1370  0.0452  1.4319 -0.3061  0.0282  1.2445 
## 
## Start phase 2.2
## Phase 2 Subphase 2 Iteration 1 Progress: 24%
## Phase 2 Subphase 2 Iteration 2 Progress: 24%
## Phase 2 Subphase 2 Iteration 3 Progress: 24%
## Phase 2 Subphase 2 Iteration 4 Progress: 24%
## Phase 2 Subphase 2 Iteration 5 Progress: 24%
## Phase 2 Subphase 2 Iteration 6 Progress: 24%
## Phase 2 Subphase 2 Iteration 7 Progress: 24%
## Phase 2 Subphase 2 Iteration 8 Progress: 24%
## Phase 2 Subphase 2 Iteration 9 Progress: 24%
## Phase 2 Subphase 2 Iteration 10 Progress: 24%
## theta -2.62488  2.06073  0.91354 -0.84295 -0.27479  0.13052  0.02661  1.38371 -0.22033 -0.00908  1.17493 
## ac -0.2523 -0.3005 -0.4343 -0.4183 -0.2559  0.0291 -0.0918 -0.1074 -0.3283 -0.2342 -0.1976 
## theta: -2.62488  2.06073  0.91354 -0.84295 -0.27479  0.13052  0.02661  1.38371 -0.22033 -0.00908  1.17493 
## 
## Start phase 2.3
## Phase 2 Subphase 3 Iteration 1 Progress: 33%
## Phase 2 Subphase 3 Iteration 2 Progress: 33%
## Phase 2 Subphase 3 Iteration 3 Progress: 33%
## Phase 2 Subphase 3 Iteration 4 Progress: 33%
## Phase 2 Subphase 3 Iteration 5 Progress: 33%
## Phase 2 Subphase 3 Iteration 6 Progress: 33%
## Phase 2 Subphase 3 Iteration 7 Progress: 33%
## Phase 2 Subphase 3 Iteration 8 Progress: 33%
## Phase 2 Subphase 3 Iteration 9 Progress: 33%
## Phase 2 Subphase 3 Iteration 10 Progress: 33%
## theta -2.58829  2.02304  0.91220 -0.84023 -0.26844  0.13497  0.02821  1.36294 -0.23143  0.00815  1.14965 
## ac -0.18734 -0.02152 -0.26650 -0.21436 -0.24888  0.03771 -0.09630 -0.04013  0.00247  0.02107 -0.19437 
## theta: -2.58829  2.02304  0.91220 -0.84023 -0.26844  0.13497  0.02821  1.36294 -0.23143  0.00815  1.14965 
## 
## Start phase 2.4
## Phase 2 Subphase 4 Iteration 1 Progress: 46%
## Phase 2 Subphase 4 Iteration 2 Progress: 46%
## Phase 2 Subphase 4 Iteration 3 Progress: 46%
## Phase 2 Subphase 4 Iteration 4 Progress: 46%
## Phase 2 Subphase 4 Iteration 5 Progress: 46%
## Phase 2 Subphase 4 Iteration 6 Progress: 46%
## Phase 2 Subphase 4 Iteration 7 Progress: 46%
## Phase 2 Subphase 4 Iteration 8 Progress: 46%
## Phase 2 Subphase 4 Iteration 9 Progress: 46%
## Phase 2 Subphase 4 Iteration 10 Progress: 46%
## theta -2.6179  2.0398  0.9034 -0.8324 -0.2559  0.1281  0.0283  1.3424 -0.2368  0.0114  1.1407 
## ac  0.02749 -0.05547 -0.14471 -0.11912  0.01372 -0.05645 -0.02204 -0.06930 -0.00564 -0.02594 -0.05044 
## theta: -2.6179  2.0398  0.9034 -0.8324 -0.2559  0.1281  0.0283  1.3424 -0.2368  0.0114  1.1407 
## 
## Start phase 3 
## Phase 3 Iteration 500 Progress 86%
## Phase 3 Iteration 1000 Progress 100%
```

???

If the model does not "converge" after the default/however many iterations set,
then you may wish to either revisit your model specification or
restart estimation with more sensible starting values (i.e. pick up where you left off).

This latter option can be achieved by adding `prevAns = result` to the `siena07()` call.

---
## Reading results

```r
result
```

```
## Estimates, standard errors and convergence t-ratios
## 
##                                  Estimate   Standard   Convergence 
##                                               Error      t-ratio   
## 
## Rate parameters: 
##   0        Rate parameter         9.0172  ( 1.9905   )             
## 
## Other parameters: 
##    1. eval outdegree (density)   -2.6179  ( 0.6183   )    0.0044   
##    2. eval reciprocity            2.0398  ( 0.4376   )    0.0005   
##    3. eval transitive triplets    0.9034  ( 0.1732   )    0.0254   
##    4. eval 3-cycles              -0.8324  ( 0.3113   )    0.0181   
##    5. eval indegree - popularity -0.2559  ( 0.1236   )    0.0333   
##    6. eval coo.coCovar alter      0.1281  ( 0.1021   )    0.0423   
##    7. eval coo.coCovar ego        0.0283  ( 0.1050   )   -0.0025   
##    8. eval same coo.coCovar       1.3424  ( 0.3032   )   -0.0708   
##    9. eval gender.coCovar alter  -0.2368  ( 0.3660   )   -0.0480   
##   10. eval gender.coCovar ego     0.0114  ( 0.3467   )   -0.0250   
##   11. eval same gender.coCovar    1.1407  ( 0.3471   )    0.0041   
## 
## Overall maximum convergence ratio:    0.1323 
## 
## 
## Degrees constrained to maximum values:
## friendship.dependent : 5 
## 
## 
## Total of 2926 iteration steps.
```

```r
# You can export and view the results in html using the following function.
# xtable(result, type = "html", file = "results.html")
```

???

While the model is more complicated, RSiena spits out a table at the end, 
the second part of which can be interpreted like that of a multinomial regression

- Each parameter estimate has a standard error

- If the absolute t-ratio, `$|\beta/\text{se}|$`, is at least 2, then we can reject the null hypothesis of there being no effect (at roughly the 5% level)

First we check whether the model has converged. This is most important.
Do not bother interpreting a model that has not converged yet. Rerun it.
The first convergence test is to see whether every convergence t-ratio
is under 0.1 in absolute value. Current best practice is then to also check whether the 
overall maximum convergence ratio (for linear combinations of effects)
is less than 0.25.
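
Both checks can be scripted. The sketch below types in the numbers from the results table by hand so that it runs on its own; with a fitted object, these values are assumed to be available as `result$tconv`, `result$tconv.max`, `result$theta`, and `result$se`.

```r
# Values copied from the printed results table
estimate <- c(-2.6179, 2.0398, 0.9034, -0.8324, -0.2559, 0.1281,
              0.0283, 1.3424, -0.2368, 0.0114, 1.1407)
std_err  <- c(0.6183, 0.4376, 0.1732, 0.3113, 0.1236, 0.1021,
              0.1050, 0.3032, 0.3660, 0.3467, 0.3471)
tconv    <- c(0.0044, 0.0005, 0.0254, 0.0181, 0.0333, 0.0423,
              -0.0025, -0.0708, -0.0480, -0.0250, 0.0041)
tconv_max <- 0.1323

# 1. Convergence: every t-ratio under 0.1 and overall maximum under 0.25
converged <- all(abs(tconv) < 0.1) && tconv_max < 0.25

# 2. Only then look at significance: |estimate / se| of at least 2
t_ratio <- estimate / std_err
significant <- abs(t_ratio) >= 2

converged         # TRUE
sum(significant)  # 7
```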

---
## Parameter interpretation

- Estimated parameters need to be interpreted as within ministeps
- So we interpret the parameters as: 
  - when a chosen ego `$i$` is faced with a decision to form a tie to either of two alters, 
  `$j_1$` or `$j_2$`, that differ only on one statistic value, then the odds ratio is as follows:

`$$\frac{p_{i\leadsto j_1}}{p_{i\leadsto j_2}} = \frac{\exp(f(x^{i\leadsto j_1}, \beta))}{\exp(f(x^{i\leadsto j_2}, \beta))} = \frac{\exp(\beta s_{j_1})}{\exp(\beta s_{j_2})}$$`

- So, say `$i$` can send a tie to `$j_1$` or `$j_2$`, which only differ in that `$j_1$` sends a tie to `$i$` and `$j_2$` does not, 
then given a reciprocity parameter of 2,

`$$\frac{\exp(2 \times 1)}{\exp(2 \times 0)} = 7.39$$`

- `$i$` is 7.39 times more likely to send a tie to `$j_1$` than `$j_2$`
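
The same arithmetic in R, as a quick check (variable names are illustrative):

```r
beta_recip <- 2   # reciprocity parameter from the example
s_j1 <- 1         # j1 already sends a tie to i
s_j2 <- 0         # j2 does not

odds_ratio <- exp(beta_recip * s_j1) / exp(beta_recip * s_j2)
round(odds_ratio, 2)  # 7.39
```

Swapping in an estimated parameter, such as the reciprocity estimate from the results table, gives the corresponding odds ratio for the fitted model.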

---
.pull-left[
## Diagnostics and Goodness of Fit

What does good mean? Can't we just use `$R^2$`?

Could simulate individual networks and compare their graphs to that of the observed network,
but perhaps just _that particular_ simulated network looks similar/different in that way?

Alternative is to simulate _lots_ of networks and test how macro features of the simulated networks, 
e.g. degree distribution, compare to empirical network(s).

There are some important differences between .red[goodness of fit] (GOF) and `$R^2$`:
- must be some unmodelled (macro) feature
- no single GOF score, but there are different features you could be interested in
- one semi-useful summary is the `$p$`-value, but .red[which way do you want it here]?
]

```r
indegree_gof <- sienaGOF(result, 
                   IndegreeDistribution, 
                   varName = "friendship.dependent")
```

```
##   > Completed  1000  calculations
```

```r
plot(indegree_gof)
```

???

While MoM aims at creating networks that have statistics close to those of the target network...
- more formally, estimation has converged on parameters `$\theta = \{\rho, \beta\}$` when the networks they generate satisfy `$E_\theta\{Z\} = z$`, where `$z$` holds the observed target statistics, and the estimates are stable

```r
RSiena:::getTargets(data = mySienaData,
                  effects = mySienaEffects)
```

```
##             [,1]
##  [1,]  86.000000
##  [2,]  99.000000
##  [3,]  72.000000
##  [4,] 164.000000
##  [5,]  47.000000
##  [6,] 403.000000
##  [7,]   1.620690
##  [8,]  -6.379310
##  [9,]  47.000000
## [10,]  -5.034483
## [11,]  -4.034483
## [12,]  90.000000
```

But it is important to recognise that, with a converged model, we approximate these targets very closely _by design_ (of the estimation procedure).
So we don't actually know anything about what kind of residual variance is left...
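
The logic of simulating lots of networks and comparing an unmodelled macro feature can be sketched as follows. This is a toy under loud assumptions: the "fitted model" is replaced by Bernoulli random digraphs, the "observed" network is itself simulated, and the Mahalanobis-distance p-value is only similar in spirit to what `sienaGOF()` reports. All function and object names are hypothetical.

```r
set.seed(123)
n <- 20

# Stand-in model: Bernoulli random digraph with tie probability p
simulate_network <- function(n, p) {
  x <- matrix(rbinom(n * n, 1, p), n, n)
  diag(x) <- 0
  x
}

# Auxiliary statistic: number of nodes with indegree <= 0, 1, 2, 3
indegree_profile <- function(x, cutoffs = 0:3) {
  indeg <- colSums(x)
  vapply(cutoffs, function(k) sum(indeg <= k), numeric(1))
}

observed  <- indegree_profile(simulate_network(n, 0.10))  # pretend data
simulated <- t(replicate(1000, indegree_profile(simulate_network(n, 0.10))))

# How unusual is the observed profile among the simulated ones?
centre <- colMeans(simulated)
spread <- cov(simulated)
d_obs  <- mahalanobis(observed, centre, spread)
d_sim  <- mahalanobis(simulated, centre, spread)
p_value <- mean(d_sim >= d_obs)
p_value
```

Here a large `p_value` is what we want: it says the observed feature is not unusual among networks the model generates.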

---
## Extensions

SAOMs have been around for a good decade or two by now, 
and are very popular in sociology and increasingly used in political science.

> E.g. Snijders et al (2010) has been cited nearly 2000 times

Extensions have been developed for all sorts of special cases...

.pull-right[
- single -> multiplex
- binary -> ordered
- directed -> undirected
- one-mode -> two-mode
- change -> creation/deletion
- single network -> multiple networks
- network -> behaviour
]

---
## Network modelling

Where to go from here?

Well, first stop is **Social Networks Theories and Methods** next semester.

There we go into _much_ more detail about social networks theories and methods.
This includes more detail about how to create, describe, and visualise networks in different ways,
as well as all sorts of different models that involve networks:
- networks as dependent variable (as here)
- networks as independent variable (e.g. diffusion)
- networks and behaviour as coevolving dependent variables...

|                | Panels    | Events |
|:---------------|:----------|:-------|
| Tie-based      | (ST)ERGMs | REMs   |
| Actor-oriented | SAOMs     | DyNAMs |

---
class: center, middle
## Some helpful packages (I think...)

![:scale 25%](https://raw.githubusercontent.com/snlab-nl/rsiena/main/inst/rsienalogo.png)
![:scale 25%](https://raw.githubusercontent.com/snlab-ch/migraph/main/inst/migraph.png)
![:scale 25%](https://raw.githubusercontent.com/snlab-ch/goldfish/main/inst/hexlogo_goldfish.png)

---
## Social Networks Recap

Social networks includes .red[network theory], .red[network analysis] and .red[network modelling]

**Assumption**: social life is associative and relations are meaningful

**Premise**: how social entities are connected matters

**Argument**: more interdependent and contextual than traditional quantitative or qualitative work

**Promise**: to help understand social, relational life

???

Evolution has gone hand-in-hand.

---
class: center, middle