Given a network G, a subset of nodes S, and a particular node N, the diffusion reaching N through G from the nodes in S depends strongly on the degree sequence of S. Simply sampling nodes uniformly from G, or randomizing the edges of G, does not appear to provide meaningful p-values for the diffusion score reaching N. We could investigate methods of randomly sampling node sets from G that are intended to mimic the degree sequence of S, and then evaluate each method by its ability to generate meaningful p-values. One way to do this is to study various criteria for binning the degree sequence of G, followed by stratified sampling of the bins driven by the degree sequence of S. This appears to be an unexplored problem, and it seems to have practical value when studying node sets in gene/protein networks.
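As a starting point for discussion, here is a minimal sketch of the binning plus stratified-sampling idea. Everything in it is an assumption rather than a settled design: the network is represented only by a hypothetical `degree` dict (node to degree), the binning criterion is "merge consecutive degrees until a bin holds at least `min_bin_size` nodes", and the p-value uses the common add-one empirical estimate. The diffusion score itself is not computed here; `null_scores` stands for whatever scores the diffusion code would return for each random set.

```python
import random
from collections import defaultdict


def build_degree_bins(degree, min_bin_size=5):
    """Group nodes into bins of consecutive degrees.

    Each bin holds at least `min_bin_size` nodes, except when the whole
    graph is smaller than that. Returns (bins, bin_of).
    """
    by_degree = defaultdict(list)
    for node, d in degree.items():
        by_degree[d].append(node)

    bins, current = [], []
    for d in sorted(by_degree):
        current.extend(by_degree[d])
        if len(current) >= min_bin_size:
            bins.append(current)
            current = []
    if current:
        # Leftover high-degree nodes: fold them into the last bin.
        if bins:
            bins[-1].extend(current)
        else:
            bins.append(current)

    bin_of = {node: i for i, members in enumerate(bins) for node in members}
    return bins, bin_of


def sample_degree_matched(S, bins, bin_of, rng=random):
    """Draw one random node set with the same number of nodes per bin as S.

    Assumes the nodes in S are distinct and all belong to the graph.
    """
    counts = defaultdict(int)
    for node in S:
        counts[bin_of[node]] += 1
    sample = []
    for i, k in counts.items():
        sample.extend(rng.sample(bins[i], k))  # without replacement
    return sample


def empirical_p_value(observed, null_scores):
    """Add-one empirical p-value: fraction of null scores >= observed."""
    hits = sum(s >= observed for s in null_scores)
    return (1 + hits) / (1 + len(null_scores))
```

Comparing binning criteria would then amount to swapping out `build_degree_bins` and checking how the resulting p-values behave, for example on node sets where no real signal is expected.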
Another approach could be to start from the set S and grow a "window" around each node in the set, such that each window contains at least some minimum number of nodes. Windows for low-degree nodes would be narrow; windows for high-degree nodes would naturally be larger. Then sample one node from each window.
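One possible reading of the window idea, sketched below, is that the window lives in degree space: for a node of degree d, widen the range [d - w, d + w] one step at a time until it contains at least `min_size` nodes. That interpretation, the symmetric widening, the `degree` dict input, and the attempt to avoid drawing the same node twice are all assumptions made for illustration.

```python
import random
from bisect import bisect_left, bisect_right


def sort_by_degree(degree):
    """Return (nodes, degrees) as parallel lists sorted by degree."""
    items = sorted(degree.items(), key=lambda kv: kv[1])
    return [n for n, _ in items], [d for _, d in items]


def degree_window(sorted_degrees, sorted_nodes, d, min_size):
    """Smallest symmetric degree range [d - w, d + w] with >= min_size nodes.

    Returns (nodes_in_window, w). If the graph has fewer than `min_size`
    nodes, the window grows until it covers the whole graph and stops.
    """
    max_w = max(d - sorted_degrees[0], sorted_degrees[-1] - d)
    w = 0
    while True:
        lo = bisect_left(sorted_degrees, d - w)
        hi = bisect_right(sorted_degrees, d + w)
        if hi - lo >= min_size or w >= max_w:
            return sorted_nodes[lo:hi], w
        w += 1


def sample_windowed(degree, S, min_size=10, rng=random):
    """Draw one node from the degree window of each node in S."""
    sorted_nodes, sorted_degrees = sort_by_degree(degree)
    chosen, used = [], set()
    for s in S:
        window, _ = degree_window(sorted_degrees, sorted_nodes,
                                  degree[s], min_size)
        # Prefer nodes not drawn yet; fall back to the full window
        # (allowing a repeat) if every node in it was already used.
        candidates = [n for n in window if n not in used] or window
        pick = rng.choice(candidates)
        chosen.append(pick)
        used.add(pick)
    return chosen
```

With this reading, a degree-1 node in a typical sparse network gets a window of width 0 (many nodes share that degree), while a hub may need a wide degree range before enough candidates fall inside.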
A third option is a rejection-sampling approach: sample nodes at random from G, but accept a node only if its degree is within some distance of the degree of a node in S. (This needs more thought.)
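To make the rejection-sampling idea concrete enough to critique, here is a rough sketch. The tolerance `tol`, the `max_tries` cap, the `degree` dict input, and the choice to reject repeated nodes are all assumptions, not part of the proposal above.

```python
import random


def rejection_sample(degree, S, tol=1, max_tries=100_000, rng=random):
    """Sample len(S) distinct nodes uniformly from the graph, keeping a
    candidate only if its degree is within `tol` of some node in S."""
    nodes = list(degree)
    target_degrees = sorted({degree[s] for s in S})
    accepted, seen = [], set()
    for _ in range(max_tries):
        if len(accepted) == len(S):
            break
        cand = rng.choice(nodes)
        if cand in seen:
            continue  # reject repeats so the sample has distinct nodes
        if any(abs(degree[cand] - t) <= tol for t in target_degrees):
            accepted.append(cand)
            seen.add(cand)
    if len(accepted) < len(S):
        raise RuntimeError("not enough candidates accepted; raise tol or max_tries")
    return accepted
```

One thing this sketch makes visible: accepting a node that is close to any node in S does not guarantee that the sample reproduces the degree distribution of S, since most accepted nodes will tend to match the low-degree members. A variant that consumes one target degree per acceptance might address that, which is probably part of what still needs more thought.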