Overview of Social Network Models

George G. Vega Yon
ggvy.cl

Center for Applied Network Analysis (CANA)
Department of Preventive Medicine

September 12, 2018

Context: Identifying Contagion in Networks

Ideally we would like to estimate a model in the form of \[ Y_i = f\left(\mathbf{X}, Y_{-i}, \mathbf{G}; \theta\right) \]

where \(Y_{-i}\) is the behavior of individuals other than \(i\) and \(\mathbf{G}\) is a graph.
The problem: So far, traditional statistical models won’t work, why? because most of them rely on having IID observations.
On top of that, when it comes to explain behavior, “Homophily and Contagion Are Generically Confounded” (Shalizi and Thomas 2011).
This has lead to the development of an important collection of statistical models for social networks.

Spatial Autocorrelation Models (SAR a.k.a. Network Auto-correlation Models)

Spatial Auto-correlation Models are mostly applied in the context of spatial statistics and econometrics.
A wide family of models, you can find SA equivalents to Probit, Logit, MLogit, etc.
The SAR model has interdependence built-in using a Multivariate Normal Distribution:

\[ \begin{align} Y = & \alpha + \rho W Y + X\beta + \varepsilon,\quad\varepsilon\sim MVN(0,\sigma^2I_n) \\ \implies & Y = \left(I_n -\rho W\right)^{-1}(\alpha + X\beta + \varepsilon) \end{align} \]

Where \(\rho\in[-1,1]\) and \(W=\{w_{ij}\}\), with \(\sum_j w_{ij} = 1\)

Spatial Autocorrelation Models (SAR) (cont.)

This is more close than we might think, since the \(i\)-th element of \(Wy\) can be expressed as \(\frac{\sum_j a_{ij} y_j}{\sum_j a_{ij}}\), what we usually define as exposure in networks, where \(a_{ij}\) is an element of the \(\{0,1\}\)-adjacency matrix .
Notice that \((I_n - \rho W)^{-1} = I_n + \rho W + \rho^2 W^2 + \dots\), hence there autocorrelation does consider effects over neighbors farther than 1 step away, which makes the specification of \(W\) no critical. (see LeSage 2008)
These models assume that \(W\) is exogenous, in other words, if there’s homophily you won’t be able to use it!
But there are solutions to this problem (using instrumental variables).

Exponential Random Graph Models (ERGMs)

The distribution of \(\mathbf{Y}\) can be parameterized in the form

\[ \Pr\left(\mathbf{Y}=\mathbf{y}|\theta, \mathcal{Y}\right) = \frac{\exp{\theta^{\mbox{T}}\mathbf{g}(\mathbf{y})}}{\kappa\left(\theta, \mathcal{Y}\right)},\quad\mathbf{y}\in\mathcal{Y} \tag{1} \]

Where \(\theta\in\Omega\subset\mathbb{R}^q\) is the vector of model coefficients and \(\mathbf{g}(\mathbf{y})\) is a q-vector of statistics based on the adjacency matrix \(\mathbf{y}\).

Model (1) may be expanded by replacing \(\mathbf{g}(\mathbf{y})\) with \(\mathbf{g}(\mathbf{y}, \mathbf{X})\) to allow for additional covariate information \(\mathbf{X}\) about the network. The denominator,
\[ \kappa\left(\theta,\mathcal{Y}\right) = \sum_{\mathbf{z}\in\mathcal{Y}}\exp{\theta^{\mbox{T}}\mathbf{g}(\mathbf{z})} \] 0
Is the normalizing factor that ensures that equation (1) is a legitimate probability distribution.
Even after fixing \(\mathcal{Y}\) to be all the networks that have size \(n\), the size of \(\mathcal{Y}\) makes this type of models hard to estimate as there are \(N = 2^{n(n-1)}\) possible networks! (Hunter et al. 2008)

How does ERGMs look like (in R at least)

network ~ edges + nodematch("hispanic") + nodematch("female") +
  mutual +  esp(0:3) +  idegree(0:10)

Here we are controlling for:

edges: Edge count,
nodematch(hispanic): number of homophilic edges on race,
nodematch(female): number of homophilic edges on gender,
mutual: number of reciprocal edges,
esp(0:3): number of shared parterns (0 to 3), and
indegree(0:10): indegree distribution (fixed effects for values 0 to 10)

(See Hunter et al. 2008).

Separable Exponential Random Graph Models (a.k.a. TERGMs)

A discrete time model.
Estimates a set of parameters \(\theta = \{\theta^-, \theta^+\}\) that capture the transition dynamics from \(\mathbf{Y}^{t-1}\) to \(\mathbf{Y}^{t}\).
Assuming that \((\mathbf{Y}^+\perp\mathbf{Y}^-) | \mathbf{Y}^{t-1}\) (the model dynamic model is separable), we estimate two models: \[ \begin{align} \Pr\left(\mathbf{Y}^+ = \mathbf{y}^+|\mathbf{Y}^{t-1} = \mathbf{y}^{t-1};\theta^+\right),\quad \mathbf{y}^+\in\mathcal{Y}^+(\mathbf{y}^{t-1})\\ \Pr\left(\mathbf{Y}^- = \mathbf{y}^-|\mathbf{Y}^{t-1} = \mathbf{y}^{t-1};\theta^-\right),\quad \mathbf{y}^-\in\mathcal{Y}^-(\mathbf{y}^{t-1}) \end{align} \]
So we end up estimating two ERGMs.

Latent Network Models

Social networks are a function of a latent space (literally, xyz for example) \(\mathbf{Z}\).
Individuals who are closer to each other within \(\mathbf{Z}\) have a higher chance of been connected.
Besides of estimating the typical set of parameters \(\theta\), a key part of this model is find \(\mathbf{Z}\).
Similar to TERGMs, under the conditional independence assumption we can estimate:

\[ \Pr\left(\mathbf{Y} =\mathbf{y}|\mathbf{X} = \mathbf{x}, \mathbf{Z}, \theta\right) = \prod_{i\neq j}\Pr\left(y_{ij}|z_i, z_j, x_{ij},\theta\right) \]

See Hoff, Raftery, and Handcock (2002)

Estimation of ERGMs

In statnet, the default estimation method is based on a method proposed by Geyer and Thompson (1992), Markov-Chain MLE, which uses Markov-Chain Monte Carlo for simulating networks and a modified version of the Newton-Raphson algorithm to do the parameter estimation part.

Estimation of ERGMs (cont’ d)

In general terms, the MC-MLE algorithm can be described as follows:

Initialize the algorithm with an initial guess of \(\theta\), call it \(\theta^{(t)}\)
While (no convergence) do:
1. Using \(\theta^{(t)}\), simulate \(M\) networks by means of small changes in the \(\mathbf{Y}_{obs}\) (the observed network). This part is done by using an importance-sampling method which weights each proposed network by it’s likelihood conditional on \(\theta^{(t)}\)
2. With the networks simulated, we can do the Newton step to update the parameter \(\theta^{(t)}\) (this is the iteration part in the ergm package): \(\theta^{(t)}\to\theta^{(t+1)}\)
3. If convergence has been reach (which usually means that \(\theta^{(t)}\) and \(\theta^{(t + 1)}\) are not very different), then stop, otherwise, go to step a.

For more details see Lusher, Koskinen, and Robins (2012);Admiraal and Handcock (2006);T. A. Snijders (2002);Wang et al. (2009) provides details on the algorithm used by PNet (which is the same as the one used in RSiena). Lusher, Koskinen, and Robins (2012) provides a short discussion on differences between ergm and PNet.

The main problems with ERGMs are:

Computational Time: As the complexity of the model increases, it gets harder to achieve convergence, thus, more time is needed.
Model degeneracy: Even if convergence is achieved, model fitness can be very bad

Example of problems encountered in the estimation process of ERGMs: No convergence (left), and model degeneracy (right).

Stochastic Actor Oriented Models (SOAMs)

Also known as Siena: Simulation Investigation for Empirical Network Analysis.
Models both, structure and behavior as a time-continuous Markov process where changes happen one at a time (as a poisson process).
In other words, individuals choose between states \(x\) and \(x'\) in which either a tie changes, or their behavior changes.
Ultimately, we maximize the following function:

\[ \frac{\exp{f_i^Z(\beta^z,x, z)}}{\sum_{Z'\in\mathcal{C}}\exp{f_i^{Z}(\beta, x, z')}} \]

Like ERGMs, the denominator is what makes estimating this models hard.

See Tom A B Snijders, Bunt, and Steglich (2010);Lazega and Snijders (2015);Ripley et al. (2011).

Network Matching: Aral et al. (2009)

Built on top of the Rubin Causal Model (RCM). Uses matching (non-parametric method) to estimate the effect that changes on exposure has over behavior
As a difference from RCM, we don’t have one but multiple treatments
In the dynamic case, for each time \(t\), we can build multiple levels of treatments, in particular, given that individual \(i\) had an exposure \(E_{t-1}=j-1\) at \(t-1\), we write:

\[ T_{itj} = \left\{\begin{array}{ll} 1 &\mbox{if }E_t = j \\ 0 &\mbox{otherwise.} \end{array}\right. \]
Based on the previous equation, we can use some matching algorithm to build counter factuals and estimate a simil to Average Treatment Effect on the Treated (ATT).
For more on matching methods see Imbens and Wooldridge (2009);Sekhon (2008);King and Nielsen (2016) (special attention to the last one).

Other Models

GERGM: Generalized Exponential Random Graph Models (using weighted graphs, see Desmarais (2012)).
SERGMs: Statistical Exponential Random Graph Models, suitable for large graphs, uses sufficient statistics. (see Chandrasekhar and Jackson 2012)
DyNAM: dynamic network actor models (see Stadtfeld, Hollway, and Block 2017).
REM: Relational Event Models (see Butts 2008), which are very similar to DyNAMs.
ALAAM: Autologistic actor attribute models (see Daraganova and Robins 2013; Kashima et al. 2013)

Some other models can be found in Tom A. B. Snijders (2011).

Summary

Model	Focus	Temporal	Scalable
SAR	Behavior	Can be (discrete)	Yes
ERGM	Network	Can be (discrete)	No (but SERGM can)
SOAM (Siena)	Behavior+Network	Yes (continuous)	No
Net Match	Behavior	Can be (discrete)	Yes
Latent Net	Network	?	No

References

Admiraal, Ryan, and Mark S Handcock. 2006. “Sequential Importance Sampling for Bipartite Graphs with Applications to Likelihood-Based Inference.” Department of Statistics, University of Washington.

Butts, Carter T. 2008. “4. a Relational Event Framework for Social Action.” Sociological Methodology 38 (1): 155–200. doi:10.1111/j.1467-9531.2008.00203.x.

Chandrasekhar, A. G., and M. O. Jackson. 2012. “Tractable and Consistent Random Graph Models.” ArXiv E-Prints, October.

Daraganova, G., and G. Robins. 2013. “Autologistic Actor Attribute Models.” Exponential Random Graph Models for Social Networks: Theory, Methods and Applications, 102–14.

Desmarais, Skyler J., Bruce A. AND Cranmer. 2012. “Statistical Inference for Valued-Edge Networks: The Generalized Exponential Random Graph Model.” PLOS ONE 7 (1). Public Library of Science: 1–12. doi:10.1371/journal.pone.0030136.

Geyer, Charles J., and Elizabeth A. Thompson. 1992. “Constrained Monte Carlo Maximum Likelihood for Dependent Data.” Journal of the Royal Statistical Society. Series B (Methodological) 54 (3). [Royal Statistical Society, Wiley]: 657–99. http://www.jstor.org/stable/2345852.

Hoff, Peter D, Adrian E Raftery, and Mark S Handcock. 2002. “Latent Space Approaches to Social Network Analysis.” Journal of the American Statistical Association 97 (460). Taylor & Francis: 1090–8. doi:10.1198/016214502388618906.

Hunter, David R., Mark S. Handcock, Carter T. Butts, Steven M. Goodreau, and Martina Morris. 2008. “ergm : A Package to Fit, Simulate and Diagnose Exponential-Family Models for Networks.” Journal of Statistical Software 24 (3). doi:10.18637/jss.v024.i03.

Imbens, Guido W., and Jeffrey M. Wooldridge. 2009. “Recent Developments in the Econometrics of Program Evaluation.” Journal of Economic Literature 47 (1): 5–86. doi:10.1257/jel.47.1.5.

Kashima, Yoshihisa, Samuel Wilson, Dean Lusher, Leonie J. Pearson, and Craig Pearson. 2013. “The Acquisition of Perceived Descriptive Norms as Social Category Learning in Social Networks.” Social Networks 35 (4): 711–19. doi:https://doi.org/10.1016/j.socnet.2013.06.002.

King, Gary, and Richard Nielsen. 2016. “Why Propensity Scores Should Not Be Used for Matching.”

Lazega, Emmanuel, and Tom AB Snijders. 2015. Multilevel Network Analysis for the Social Sciences: Theory, Methods and Applications. Vol. 12. Springer.

LeSage, James P. 2008. “An Introduction to Spatial Econometrics.” Revue d’économie Industrielle 123 (123): 19–44. doi:10.4000/rei.3887.

Lusher, Dean, Johan Koskinen, and Garry Robins. 2012. Exponential Random Graph Models for Social Networks: Theory, Methods, and Applications. Cambridge University Press.

Ripley, Ruth M., Tom AB Snijders, Paulina Preciado, and Others. 2011. “Manual for RSIENA.” University of Oxford: Department of Statistics, Nuffield College, no. 2007. https://www.uni-due.de/hummell/sna/R/RSiena{\_}Manual.pdf.

Sekhon, Jasjeet S. 2008. “The Neyman-Rubin Model of Causal Inference and Estimation via Matching Methods.” The Oxford Handbook of Political Methodology 2. Oxford University Press Oxford.

Shalizi, Cosma Rohilla, and Andrew C Thomas. 2011. “Homophily and Contagion Are Generically Confounded in Observational Social Network Studies.” Sociological Methods & Research 40 (2): 211–39. doi:10.1177/0049124111404820.

Snijders, Tom A B, Gerhard G. van de Bunt, and Christian E G Steglich. 2010. “Introduction to stochastic actor-based models for network dynamics.” Social Networks 32 (1): 44–60. doi:10.1016/j.socnet.2009.02.004.

Snijders, Tom A. B. 2011. “Statistical Models for Social Networks.” Annual Review of Sociology 37 (1): 131–53. doi:10.1146/annurev.soc.012809.102709.

Snijders, Tom AB. 2002. “Markov Chain Monte Carlo Estimation of Exponential Random Graph Models.” Journal of Social Structure 3.

Stadtfeld, Christoph, James Hollway, and Per Block. 2017. “Dynamic Network Actor Models: Investigating Coordination Ties Through Time.” Sociological Methodology 47 (1): 1–40. doi:10.1177/0081175017709295.

Wang, Peng, Ken Sharpe, Garry L. Robins, and Philippa E. Pattison. 2009. “Exponential Random Graph (P*) Models for Affiliation Networks.” Social Networks 31 (1): 12–25. doi:https://doi.org/10.1016/j.socnet.2008.08.002.