Given an initial graph G₀ and 50 sequences G₁ᵢ, …, G₂₀ᵢ (i = 1, …, 50):

We embed all of the graphs into ℝ²⁵⁰ using NetLSD.
For each generation t ∈ {1, …, 20}, we compute a 1-dimensional PCA over all of the embedded points for that generation.
We fit a Gaussian distribution to the resulting 1-dimensional projections for each generation.
- The idea is that a "good" model would produce points that are tightly clustered around the mean, while a "bad" model would produce a widely dispersed point cloud or several clusters.
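The per-generation projection and fit can be sketched as follows. This is a minimal illustration, not the original implementation: it assumes the NetLSD embeddings for one generation are already available as a NumPy array with one row per graph, and the function name `project_and_fit` is hypothetical. The PCA is done with a plain SVD of the centered data.

```python
import numpy as np


def project_and_fit(embeddings):
    """Project one generation's embeddings onto their first principal
    component and fit a Gaussian to the resulting 1-D scores.

    embeddings: array-like of shape (n_graphs, n_features), e.g. (50, 250).
    Returns (scores, mu, sigma, axis).
    Hypothetical helper, written for illustration only.
    """
    X = np.asarray(embeddings, dtype=float)
    # PCA centers the data, so subtract the per-feature mean first.
    Xc = X - X.mean(axis=0)
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    axis = vt[0]
    # 1-D coordinates of every graph along the first principal axis.
    scores = Xc @ axis
    # Gaussian fit: sample mean and sample standard deviation.
    # Because the data were centered, mu is zero up to rounding,
    # so the spread sigma carries the information about dispersion.
    mu = scores.mean()
    sigma = scores.std(ddof=1)
    return scores, mu, sigma, axis
```

Under these assumptions a tightly clustered generation yields a small `sigma` and a dispersed one a large `sigma`. A single Gaussian cannot represent several clusters, so a multi-modal generation would show up only as inflated spread.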
We can then summarize these distributions by computing a z-score for each generation.
Finally, we can summarize the sequence of z-scores for a given graph-model pair by computing a weighted sum or weighted average of the z-scores across the generations.
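The last two steps might look like the sketch below. The text does not say which quantity is standardized or how the weights are chosen, so both are assumptions here: the z-score is taken for a hypothetical reference value `x` (for example, the projection of some graph of interest) under each generation's fitted Gaussian, and the weights are supplied by the caller. The function names are illustrative.

```python
def z_score(x, mu, sigma):
    """Standardize x under a Gaussian with mean mu and std sigma.

    Assumption: x is a reference value chosen by the caller; the
    source text does not specify what is being standardized.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


def weighted_average(z_scores, weights):
    """Weighted average of per-generation z-scores.

    Assumption: the weights are caller-supplied; the source text
    does not prescribe a weighting scheme.
    """
    if len(z_scores) != len(weights):
        raise ValueError("z_scores and weights must have equal length")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * z for w, z in zip(weights, z_scores)) / total


# Two generations with made-up numbers, purely for illustration.
zs = [z_score(1.0, 0.0, 0.5), z_score(3.0, 1.0, 2.0)]  # [2.0, 1.0]
summary = weighted_average(zs, [1.0, 3.0])              # 1.25
```

Dropping the division by `total` gives the weighted sum instead. The average is easier to compare across graph-model pairs when the number of generations or the weights differ.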
