Before we start, a review of the concepts we will be using here
Exposure: Proportion/number of neighbors that has adopted an innovation at each point in time.
Threshold: The proportion/number of your neighbors who had adopted at or one time period before ego (the focal individual) adopted.
Infectiousness: How much \(i\)’s adoption affects her alters.
Susceptibility: How much \(i\)’s alters’ adoption affects her.
Structural equivalence: How similar are \(i\) and \(j\) in terms of position in the network.
Simulating diffusion networks
We will simulate a diffusion network with the following parameters:
Will have 1,000 vertices,
Will span 20 time periods,
The initial adopters (seeds) will be selected random,
Seeds will be a 10% of the network,
The graph (network) will be small-world,
Will use the WS algorithmwith \(p=.2\) (probability of rewire).
Threshold levels will be uniformly distributed between [0.3, 0.7]
To generate this diffusion network, we can use the rdiffnet function included in the package:
# Setting the seed for the RNGset.seed(1213)# Generating a random diffusion networknet <-rdiffnet(n =1e3, # 1.t =20, # 2.seed.nodes ="random", # 3.seed.p.adopt = .1, # 4.seed.graph ="small-world", # 5.rgraph.args =list(p=.2), # 6.threshold.dist =function(x) runif(1, .3, .7) # 7. )
# Warning in (function (graph, p, algorithm = "endpoints", both.ends = FALSE, :
# The option -copy.first- is set to TRUE. In this case, the first graph will be
# treated as a baseline, and thus, networks after T=1 will be replaced with T-1.
The function rdiffnet generates random diffusion networks. Main features:
Simulating random graph or using your own,
Setting threshold levels per node,
Network rewiring throughout the simulation, and
Setting the seed nodes.
The simulation algorithm is as follows:
If required, a baseline graph is created,
Set of initial adopters and threshold distribution are established,
The set of t networks is created (if required), and
Simulation starts at t=2, assigning adopters based on exposures and thresholds:
For each \(i \in N\), if its exposure at \(t-1\) is greater than its threshold, then adopts, otherwise continue without change.
# Warning in (function (graph, p, algorithm = "endpoints", both.ends = FALSE, :
# The option -copy.first- is set to TRUE. In this case, the first graph will be
# treated as a baseline, and thus, networks after T=1 will be replaced with T-1.
summary(diffnet_rumor)
# Diffusion network summary statistics
# Name : A diffusion network
# Behavior : Random contagion
# -----------------------------------------------------------------------------
# Period Adopters Cum Adopt. (%) Hazard Rate Density Moran's I (sd)
# -------- ---------- ---------------- ------------- --------- ----------------
# 1 25 25 (0.05) - 0.01 -0.00 (0.00)
# 2 78 103 (0.21) 0.16 0.01 0.01 (0.00) ***
# 3 187 290 (0.58) 0.47 0.01 0.01 (0.00) ***
# 4 183 473 (0.95) 0.87 0.01 0.01 (0.00) ***
# 5 27 500 (1.00) 1.00 0.01 -
# -----------------------------------------------------------------------------
# Left censoring : 0.05 (25)
# Right centoring : 0.00 (0)
# # of nodes : 500
#
# Moran's I was computed on contemporaneous autocorrelation using 1/geodesic
# values. Significane levels *** <= .01, ** <= .05, * <= .1.
plot_diffnet(diffnet_rumor, slices =c(1, 3, 5))
# We want to use igraph to compute layoutigdf <-diffnet_to_igraph(diffnet_rumor, slices=c(1,2))[[1]]pos <- igraph::layout_with_drl(igdf)plot_diffnet2(diffnet_rumor, vertex.size =dgr(diffnet_rumor)[,1], layout=pos)
# Simulating a scale-free homophilic networkset.seed(1231)X <-rep(c(1,1,1,1,1,0,0,0,0,0), 50)net <-rgraph_ba(t =499, m=4, eta = X)# Taking a look in igraphig <- igraph::graph_from_adjacency_matrix(net)plot(ig, vertex.color =c("azure", "tomato")[X+1], vertex.label =NA,vertex.size =sqrt(dgr(net)))
# Now, simulating a bunch of diffusion processesnsim <-500Lans_1and2 <-vector("list", nsim)set.seed(223)for (i in1:nsim) {# We just want the cum adopt count ans_1and2[[i]] <-cumulative_adopt_count(rdiffnet(seed.graph = net,t =10,threshold.dist =sample(1:2, 500L, TRUE),seed.nodes ="random",seed.p.adopt = .10,exposure.args =list(outgoing =FALSE, normalized =FALSE),rewire =FALSE ) )# Are we there yet?if (!(i %%50))message("Simulation ", i," of ", nsim, " done.")}# Simulation 50 of 500 done.# Simulation 100 of 500 done.# Simulation 150 of 500 done.# Simulation 200 of 500 done.# Simulation 250 of 500 done.# Simulation 300 of 500 done.# Simulation 350 of 500 done.# Simulation 400 of 500 done.# Simulation 450 of 500 done.# Simulation 500 of 500 done.# Extracting propans_1and2 <-do.call(rbind, lapply(ans_1and2, "[", i="prop", j=))ans_2and3 <-vector("list", nsim)set.seed(223)for (i in1:nsim) {# We just want the cum adopt count ans_2and3[[i]] <-cumulative_adopt_count(rdiffnet(seed.graph = net,t =10,threshold.dist =sample(2:3, 500L, TRUE),seed.nodes ="random",seed.p.adopt = .10,exposure.args =list(outgoing =FALSE, normalized =FALSE),rewire =FALSE ) )# Are we there yet?if (!(i %%50))message("Simulation ", i," of ", nsim, " done.")}# Simulation 50 of 500 done.# Simulation 100 of 500 done.# Simulation 150 of 500 done.# Simulation 200 of 500 done.# Simulation 250 of 500 done.# Simulation 300 of 500 done.# Simulation 350 of 500 done.# Simulation 400 of 500 done.# Simulation 450 of 500 done.# Simulation 500 of 500 done.ans_2and3 <-do.call(rbind, lapply(ans_2and3, "[", i="prop", j=))
This can actually be simplified by using the function rdiffnet_multiple. The following lines of code accomplish the same as the previous code avoiding the for-loop (from the user’s perspective). Besides of the usual parameters passed to rdiffnet, the rdiffnet_multiple function requires R (number of repetitions/simulations), and statistic (a function that returns the statistic of insterst). Optionally, the user may choose to specify the number of clusters to run it in parallel (multiple CPUs):
ans_1and3 <-rdiffnet_multiple(# Num of simR = nsim,# Statisticstatistic =function(d) cumulative_adopt_count(d)["prop",], seed.graph = net,t =10,threshold.dist =sample(1:3, 500, TRUE),seed.nodes ="random",seed.p.adopt = .1,rewire =FALSE,exposure.args =list(outgoing=FALSE, normalized=FALSE),# Running on 4 coresncpus =4L )
Example simulating a thousand networks by changing threshold levels. The final prevalence, or hazard as a function of threshold levels.
Problems
Given the following types of networks: Small-world, Scale-free, Bernoulli, what set of \(n\) initiators maximizes diffusion? (solution script and solution plot)