Social Graph Paper: social networking theory


Tuesday, October 11, 2011

Basic Social Network Analysis Criteria

Just finished two interesting papers that analyze social networks: Planetary-Scale Views on a Large Instant-Messaging Network (Leskovec, Horvitz) and Statistical Analysis of Real Large-Scale Mobile Social Network (Zhengbin Dong, Guojie Song, Kunqing Xie, Ke Tang, Jingyao Wang).

The former was an analysis of a month's worth of MSN Messenger traffic and network structure. The latter was an analysis of Chinese phone logs and the corresponding network structure.

Though the results were interesting (I won't share them here), I was actually looking for the criteria they analyzed:

Degree: simply put, the number of connections a user (node) has.
Shortest Path: the fewest hops (connections) needed to get from one user to another.
Diameter: the longest shortest path in a network.
Clustering Coefficient: the ratio of actual connections among a user's friends to the potential connections among them. Measures the transitivity of a network (i.e. the propensity for your friends to also be friends themselves).
Betweenness Centrality: for a user C, the fraction of shortest paths between two other users (A and B) that pass through C, accumulated over all pairs A and B.
K-Core Distribution of Component Size: gives us an idea of how quickly the network shrinks as we move towards the core. In other words, how large (in number of users/nodes) is the core component once a minimum-degree constraint (k) is applied? (e.g. for a network where every node must have degree k > 20, how many nodes remain in the component?)
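
To make these definitions concrete, here is a minimal sketch in plain Python. The toy graph and the function names are my own assumptions for illustration, not anything taken from the two papers. Distances are counted in hops, the graph is assumed to be connected, and betweenness is the brute-force version (summing the per-pair fractions over all pairs), which is only sensible for tiny graphs.

```python
from collections import deque
from itertools import combinations

# Hypothetical toy friendship graph: user -> set of friends (undirected).
GRAPH = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"A", "C", "E"},
    "E": {"D"},
}


def degree(graph, node):
    """Number of connections a user has."""
    return len(graph[node])


def bfs(graph, source):
    """Return (dist, sigma): hop distance and count of shortest paths from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:  # first time we reach v
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:  # u lies on a shortest path to v
                sigma[v] += sigma[u]
    return dist, sigma


def diameter(graph):
    """Longest shortest path (assumes a connected graph)."""
    return max(max(bfs(graph, s)[0].values()) for s in graph)


def clustering(graph, node):
    """Fraction of pairs of a user's friends that are friends with each other."""
    friends = graph[node]
    k = len(friends)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(friends, 2) if v in graph[u])
    return links / (k * (k - 1) / 2)


def betweenness(graph, node):
    """Sum over pairs (s, t) of the share of s-t shortest paths passing through node."""
    info = {s: bfs(graph, s) for s in graph}
    dist_v, sigma_v = info[node]
    total = 0.0
    for s, t in combinations([n for n in graph if n != node], 2):
        dist_s, sigma_s = info[s]
        if t not in dist_s or node not in dist_s or t not in dist_v:
            continue  # unreachable pair contributes nothing
        if dist_s[node] + dist_v[t] == dist_s[t]:
            total += sigma_s[node] * sigma_v[t] / sigma_s[t]
    return total


if __name__ == "__main__":
    print("degree of A:", degree(GRAPH, "A"))
    print("diameter:", diameter(GRAPH))
    print("clustering of A:", clustering(GRAPH, "A"))
    print("betweenness of D:", betweenness(GRAPH, "D"))
```

In this toy graph, D is the only route to E, which is why it picks up the most betweenness even though A and C have the same degree.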

Most of these characteristics are represented as a distribution (e.g. what is the degree distribution of all nodes in a network?) and tend to provide insight into the stability and density of a network. For example, a network whose degree distribution skews high (i.e. people have a lot of friends) will tend to be more stable (i.e. more resilient to the k-core test), have shorter paths on average and therefore a smaller diameter, be more clustered, and have higher betweenness centrality.
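
As a rough illustration of the "distribution" view and of the k-core test, here is a small Python sketch. The graph and function names are hypothetical. I'm using the usual k-core definition (repeatedly drop nodes with fewer than k remaining neighbors) and simply counting the surviving nodes, which is a simplification of the component-size analysis in the papers.

```python
from collections import Counter

# Hypothetical toy friendship graph: user -> set of friends (undirected).
GRAPH = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"A", "C", "E"},
    "E": {"D"},
}


def degree_distribution(graph):
    """Map each degree value to the number of nodes that have it."""
    return Counter(len(friends) for friends in graph.values())


def k_core(graph, k):
    """Nodes left after repeatedly removing nodes with fewer than k surviving neighbors."""
    alive = set(graph)
    changed = True
    while changed:
        changed = False
        for node in list(alive):
            if len(graph[node] & alive) < k:
                alive.discard(node)
                changed = True
    return alive


def core_sizes(graph, max_k):
    """Size of the k-core for k = 1..max_k: how fast the network shrinks toward its core."""
    return {k: len(k_core(graph, k)) for k in range(1, max_k + 1)}


if __name__ == "__main__":
    print(dict(degree_distribution(GRAPH)))
    print(core_sizes(GRAPH, 3))
```

A network that is "resilient to the k-core test" would show these sizes falling off slowly as k grows; this tiny example collapses entirely at k = 3.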

This is really nerdy stuff...

Saturday, April 16, 2011

Balanced Signed Networks in Social Media

In previous posts, something that sat uncomfortably with me in social networking theory is the lack of description and weight of edges (ties). I should have kept reading because, of course, the related concept exists in the research; ties between individuals can be positive (friends) or negative (enemies).

An interesting analysis is the paper "The slashdot zoo: mining a social network with negative edges" [Kunegis, Lommatzsch, Bauckhage]. Slashdot, a popular 'geek culture' site, allows members to tag each other as friends or foes.

Further research discusses the concept of balance in these graphs. For example, a network of 3 people (A, B, C) is balanced only if all three are friends, or if only A-B are friends (C is a common enemy). Imbalance occurs when A-B and A-C are friends but B-C are enemies; this creates a sort of structural instability.
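
The triad rule above can be checked mechanically: encode friends as +1 and enemies as -1, and a triad is balanced when the product of its three signs is positive. Here is a minimal Python sketch; the names and the sign encoding are my own assumptions. Note that under this product rule three mutual enemies also count as unbalanced, which matches the "only balanced if" wording above.

```python
def sign(signs, u, v):
    """Look up the sign of the undirected tie between u and v (+1 friend, -1 enemy)."""
    return signs[frozenset((u, v))]


def triad_is_balanced(signs, a, b, c):
    """A triad is balanced when the product of its three edge signs is positive."""
    return sign(signs, a, b) * sign(signs, a, c) * sign(signs, b, c) > 0


def make_triad(ab, ac, bc):
    """Build a hypothetical signed triad on users A, B, C."""
    return {
        frozenset(("A", "B")): ab,
        frozenset(("A", "C")): ac,
        frozenset(("B", "C")): bc,
    }


if __name__ == "__main__":
    cases = {
        "all friends": make_triad(+1, +1, +1),
        "A-B friends, C common enemy": make_triad(+1, -1, -1),
        "A-B and A-C friends, B-C enemies": make_triad(+1, +1, -1),
        "all enemies": make_triad(-1, -1, -1),
    }
    for label, triad in cases.items():
        print(label, "->", triad_is_balanced(triad, "A", "B", "C"))
```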

What continues to sit uncomfortably is that the reading seems to oversimplify the nuances in real social dynamics and the way that these dynamics are represented online:

This representation of friend/foe ignores the context of the measurement. For example, the signs of a graph may reverse if the context is "I agree with what you have to say" vs. "I respect what you have to say".
The ties themselves are really an aggregate of 'types' and 'weights' of ties. Consider a political corporate environment where allegiances (friend/foe lines) are formed on power dynamics and corporate structure as well as on personal similarity/friendships. The model doesn't take the mix into account.
In online social networks, there seems to be little explicit definition of 'foe'. For example, you 'friend' people on Facebook, but there is no concept of 'foe'. An interesting research area might be to determine implicit foes based on friend data (i.e. if A-B are friends on Facebook and A-C are friends, then you'd think B-C should be friends. If they're not, does that mean they're real-life enemies?).
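
As a first step toward that last idea, one could list the "open triads" in friend data: cases where A is friends with both B and C, but B and C are not connected. The sketch below is a hypothetical illustration in Python (toy data, made-up names). A missing tie is only a candidate for a closer look, not evidence that two people are foes.

```python
from itertools import combinations

# Hypothetical friend lists: user -> set of friends (undirected, friends only).
FRIENDS = {
    "A": {"B", "C"},
    "B": {"A"},
    "C": {"A", "D"},
    "D": {"C"},
}


def open_triads(friends):
    """Return (a, b, c) triples where a is friends with b and c, but b and c are not friends."""
    result = []
    for a in sorted(friends):
        for b, c in combinations(sorted(friends[a]), 2):
            if c not in friends[b]:
                result.append((a, b, c))
    return result


if __name__ == "__main__":
    for a, b, c in open_triads(FRIENDS):
        print(f"{b} and {c} share friend {a} but are not friends")
```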

A related talk on the topic by Jure Leskovec at Microsoft Research (video).

Saturday, February 19, 2011

Social Influence vs. Selection

First, a couple of definitions:
* Selection is the propensity of a person's characteristics (mutable or immutable) to drive link (friendship) creation.
* Social Influence is the propensity of a person's friendships to drive their characteristics.

The first, for example, would be an ethnic group finding a neighborhood where members of the same group live. The second would be how the discovery of a new music act by one friend drives the adoption of the same act by their friends.

I came across some research that looks at selection vs. social influence in the context of Wikipedia page editors. The question posed is: how does friendship influence the pages editors work on? Crandall et al. approached this by looking at the similarity (i.e. the number of common articles edited) between two editors before and after they meet.
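
To show the shape of that comparison, here is a minimal Python sketch that counts articles two editors have in common in a window before and a window after a given meeting time. The edit log, the editor names, the day-based timestamps and the window sizes are all invented for illustration, and the paper's actual similarity measure and definition of "meeting" may well differ from this simple count.

```python
# Hypothetical edit log: (editor, article, day).
EDITS = [
    ("alice", "Graph theory", 2),
    ("alice", "Jazz", 3),
    ("bob", "Cycling", 4),
    ("bob", "Graph theory", 5),
    ("alice", "Cycling", 12),
    ("bob", "Cycling", 13),
    ("alice", "Graph theory", 14),
    ("bob", "Graph theory", 15),
    ("bob", "Jazz", 16),
    ("alice", "Jazz", 17),
]


def articles_edited(edits, editor, start, end):
    """Set of articles the editor touched on days in [start, end)."""
    return {article for who, article, day in edits if who == editor and start <= day < end}


def common_articles(edits, editor_a, editor_b, start, end):
    """Number of articles both editors touched on days in [start, end)."""
    a = articles_edited(edits, editor_a, start, end)
    b = articles_edited(edits, editor_b, start, end)
    return len(a & b)


def similarity_around_meeting(edits, editor_a, editor_b, meeting_day, window):
    """Return (pre, post) common-article counts in equal windows around the meeting."""
    pre = common_articles(edits, editor_a, editor_b, meeting_day - window, meeting_day)
    post = common_articles(edits, editor_a, editor_b, meeting_day, meeting_day + window)
    return pre, post


if __name__ == "__main__":
    pre, post = similarity_around_meeting(EDITS, "alice", "bob", meeting_day=10, window=10)
    print("common articles before meeting:", pre)
    print("common articles after meeting:", post)
```

Averaging these pre/post numbers over many pairs, and over several window offsets rather than just two, would give a curve of similarity against time relative to the meeting.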

The following graph is an aggregate/average of many pairs of editors, but the surprising conclusion is the positive-linear nature of similarity and the ramp up/down surrounding the point when they meet. Of course, there's a level of historical retrospective going on here (looking at behaviors of people that met in the past), but it's interesting to see the build-up pre-meeting (selection) and the continued ramp post-meeting (social influence).

Here's the full presentation from "Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining":

Feedback Effects Between Similarity And Social Influence In Online Communities


Representations of Social Networks

In my study of social networks, I keep asking myself why they are commonly represented so simply. The concept of a "graph" is simple enough, and many of the natural extensions I'd like to see never seem to come up.

An artificial graph, below, contains 3 nodes (people), 2 edges (friendships), and 1 "non-friendship".

Let's assume you're trying to assess triadic closure (the propensity for B-C to become friends if A-B and A-C are friends). What would be helpful for this graph would be:
1. The nature of the edge between A-B and A-C.
- Are the edges representing true friendships?
- Are the edges actually a blend of two types of edges (professional/ affiliate and personal)? "A" may be great personal friends with "B", but belong to the same club as "C". This is unlikely to drive triadic closure.
2. The weight of the edge between A-B and A-C. Something that has always made me uncomfortable is the qualitative nature of how edges are described, perhaps because this is, historically, difficult to quantify. Even so, a scale such as 1 = acquaintance, 2 = best friend would add a more comfortable quantitative layer.
3. The direction of the edges. Edges between pairs are undirected, yet the relationship as stated or perceived by each individual in the pair may not be reciprocal. I'd like to see every edge actually be made up of 2 edges: one for the nature perceived by each node.

Lastly, flying in the face of the triadic closure concept, I'd like to see ties with a weight ranging from -1 (avoidance) to +1 (closeness); 0 would represent non-friendship/non-tie. If A-B are friends and B-C are friends, and A is a drug dealer, and C is a recovering addict, triadic closure likely won't result here.
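
Here is a rough sketch of what such a richer representation might look like in Python: each tie is directed (one per perceiving node), carries a type, and has a weight between -1 and +1. The class, the field names, the "personal" label and the 0.5 threshold are all assumptions of mine for illustration, and the closure check is a toy heuristic rather than an established method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tie:
    """One person's perception of another; a mutual friendship is two Tie objects."""
    source: str
    target: str
    kind: str      # hypothetical labels, e.g. "personal" or "professional"
    weight: float  # -1.0 (avoidance) .. +1.0 (closeness); 0.0 means no tie

    def __post_init__(self):
        if not -1.0 <= self.weight <= 1.0:
            raise ValueError("weight must be between -1 and +1")


def build(*ties):
    """Index ties by (source, target) so each direction is stored separately."""
    return {(t.source, t.target): t for t in ties}


def weight(ties, source, target):
    """Perceived weight from source to target; 0.0 when no tie is recorded."""
    tie = ties.get((source, target))
    return tie.weight if tie else 0.0


def closure_candidate(ties, x, y, via, threshold=0.5):
    """Toy heuristic: x and y may close the triad through `via` only if both are
    strongly and mutually tied to `via`, and neither actively avoids the other."""
    strong = all(
        weight(ties, via, n) >= threshold and weight(ties, n, via) >= threshold
        for n in (x, y)
    )
    avoided = weight(ties, x, y) < 0 or weight(ties, y, x) < 0
    return strong and not avoided


if __name__ == "__main__":
    mutual = [
        Tie("A", "B", "personal", 0.8), Tie("B", "A", "personal", 0.8),
        Tie("B", "C", "personal", 0.9), Tie("C", "B", "personal", 0.9),
    ]
    print(closure_candidate(build(*mutual), "A", "C", via="B"))
    # C (the recovering addict) avoids A (the dealer): closure is ruled out.
    avoiding = build(*mutual, Tie("C", "A", "personal", -1.0))
    print(closure_candidate(avoiding, "A", "C", via="B"))
```

Storing each direction separately means a lopsided relationship (A thinks B is a close friend, B thinks A is an acquaintance) shows up as two different weights instead of being averaged away.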

I'd think adding these details would provide a more nuanced analysis. Perhaps as I dig a bit deeper, these practices will surface.
