Sunday, October 9, 2011

Facebook's Research: Leveraging Friendship for Determining Location

A chapter on the small-world phenomenon in Networks, Crowds, and Markets describes a general relationship between rank-based distance and friendship probability. In research using data from the site LiveJournal, the relationship is approximately: probability of friendship = 1/r, where r is the other person's rank when everyone is ordered by closeness to you. So co-present people are basically 100% likely to have a tie, and for the 100th-closest person to you the probability of friendship is 1/100. A quick note on using rank instead of geographic distance: rank is numerically more meaningful because people are not distributed uniformly over geography (e.g. in the US, the majority of people are on the coasts, not spread uniformly over the area of the country).
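To make the 1/r relationship concrete, here is a minimal Python sketch. It is only an illustration: the function names (rank_of, friendship_probability), the toy coordinates, and the use of flat 2-D distance are my own assumptions, not anything taken from the book or the LiveJournal study.

```python
import math

def rank_of(person, other, locations):
    """Rank of `other` as seen from `person`.

    Rank 1 is the closest person; in general the rank is one plus the
    number of people strictly closer to `person` than `other` is.
    `locations` maps a name to a hypothetical (x, y) position.
    """
    origin = locations[person]
    target_distance = math.dist(origin, locations[other])
    closer = sum(
        1
        for name, position in locations.items()
        if name not in (person, other)
        and math.dist(origin, position) < target_distance
    )
    return closer + 1

def friendship_probability(rank):
    """Approximate rank-based relationship: P(friendship) = 1 / rank."""
    return 1.0 / rank

# Toy data (assumed for illustration only).
locations = {"a": (0, 0), "b": (1, 0), "c": (5, 0), "d": (2, 0)}
rank = rank_of("a", "c", locations)
print(rank, friendship_probability(rank))
```

Note that the rank depends only on how many people are closer, not on how far away they are, which is why it behaves the same in a dense city and in a sparse rural area.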

Researchers at Facebook (Lars Backstrom, Eric Sun, and Cameron Marlow) built on this research to investigate the relationship between friendship and 'distance' further, and to develop reasonable algorithms that use it to predict a person's location from the locations of their friends.

As an aside, the typical approach used by smartphone application developers to determine geographic location is to leverage the handset's IP address. For example, Skyhook allows developers to submit a handset's IP address, and Skyhook will return a geographic lat-long. This approach provides a reasonable estimate of location; however, because IP addresses are routinely reassigned, it is error-prone and, according to Backstrom, Sun, and Marlow, is accurate only 57.2% of the time.

Amazingly, in "Find me if you can: Improving geographical prediction with social and spatial proximity", Backstrom, Sun, and Marlow used the locations of a person's friends to determine that person's location with accuracy greater than that of IP geolocation. With 16 friends sharing their location, they were able to determine your location to within 25 miles ~67.5% of the time!
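The idea of guessing a location from friends' locations can be sketched roughly as follows. This is a simplified illustration and not the algorithm from the paper: the decay curve in tie_probability is a made-up placeholder, the candidate set is limited to the friends' own locations, and only known friends are scored. All names and coordinates are assumptions for the example.

```python
import math

EARTH_RADIUS_MILES = 3958.8

def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))

def tie_probability(miles):
    """Hypothetical decay of friendship probability with distance.

    Placeholder curve chosen for illustration; it is not the curve
    fitted by Backstrom, Sun, and Marlow.
    """
    return 1.0 / (1.0 + miles)

def guess_location(friend_locations):
    """Pick the friend location that makes the observed friendships most likely."""
    best, best_score = None, -math.inf
    for candidate in friend_locations:
        # Log-likelihood of all friendships if the user lived at `candidate`.
        score = sum(
            math.log(tie_probability(haversine_miles(candidate, friend)))
            for friend in friend_locations
        )
        if score > best_score:
            best, best_score = candidate, score
    return best

# Hypothetical friends: three in the San Francisco Bay Area, one in New York.
friends = [(37.77, -122.42), (37.80, -122.27), (37.44, -122.14), (40.71, -74.01)]
guess = guess_location(friends)
print(guess, haversine_miles(guess, (37.77, -122.42)) < 25)
```

Because the score sums log-probabilities, a cluster of nearby friends outweighs a single distant one, so the lone New York friend does not pull the guess away from the Bay Area. The final comparison mirrors the "within 25 miles" accuracy measure quoted above.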

The obvious implication of this research is that the historical approach of mapping IP addresses to lat-longs could be augmented with friendship-only data to improve results. More interesting, and potentially controversial: even where users opt not to share their location explicitly with vendors such as Skyhook, approaches exist that can determine YOUR location, and more accurately, if your friends share theirs (i.e. your friends are indirectly giving services such as Facebook your location when they 'check in').
