Communities, Blogs, and Social Networks
This page describes threads of research that my collaborators and I
have pursued. The goal is to put our work into a meaningful
sequence, instead of just a list of papers; the goal is not to give an
overview of the field. At some later time, this page may turn into a
survey, but at this point the many wonderful contributions of others
are not represented here.
Social network studies
Several colleagues and I have studied the structure of the social
network corresponding to an online blogging community, drawn from the
LiveJournal blog hosting organization. A high-level paper appeared in the
Communications of the ACM detailing the connections between the social
network and attributes such as age, gender, interests, and geography
of the members.
Since then, we've been working on models to capture the specific
relationship between geography and friendship. There's a paper in PNAS describing
the model, and a paper in ESA
with a more detailed presentation of theorems.
Identity
We have done some work studying bulletin board postings to understand
how effective privacy mechanisms are for keeping different aliases of
the same person distinct. Our findings indicate that there are
significant privacy concerns based on content analysis of posts by
different aliases of the same person. See the paper on anti-aliasing for
details.
Trust and Reputation
We worked on propagation of trust
and distrust through a social network. Introducing distrust adds a
challenge: standard iterative techniques may yield complex entries
in the principal eigenvector, because the Perron-Frobenius theorem
does not hold when distrust is modeled as negative trust.
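The eigenvector issue can be seen on a toy example. The sketch below
is only an illustration, not the method from our paper: the two
2x2 matrices are hypothetical, and the helper principal_eigenpair is
a name introduced here. It computes the largest-magnitude eigenvalue
and a matching eigenvector from the characteristic polynomial, first
for a non-negative matrix and then for the same matrix with one entry
flipped to negative.

```python
import cmath

def principal_eigenpair(m):
    """Largest-magnitude eigenvalue of a 2x2 matrix and a matching eigenvector.

    Assumes the off-diagonal entry m[0][1] is nonzero, so that
    (m[0][1], eigenvalue - m[0][0]) is a valid eigenvector.
    """
    (a, b), (c, d) = m
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    lam = max((trace + root) / 2, (trace - root) / 2, key=abs)
    return lam, (b, lam - a)

# Hypothetical matrices: entry [i][j] is how much person i trusts person j.
trust_only = [[0.5, 0.5],
              [1.0, 0.0]]
with_distrust = [[0.5, -0.5],   # person 0 now distrusts person 1
                 [1.0, 0.0]]

for name, matrix in [("trust only", trust_only), ("with distrust", with_distrust)]:
    eigenvalue, eigenvector = principal_eigenpair(matrix)
    print(name, eigenvalue, eigenvector)
```

For the non-negative matrix the principal eigenvalue and eigenvector
are real; once the negative entry is introduced, both acquire a
nonzero imaginary part, so there is no longer a real score vector to
read off.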
Visualizing communities
Within a social network graph, we often wish to understand the
"connection" between two individuals. All too often, this connection
is taken to be simply an edge between the individuals, or the shortest
path between them. In fact, two people who are "nearby" in a social
network are typically connected by a complex web of
interrelationships, and the problem of finding this web is more
accurately cast as a subgraph discovery problem. This paper gives such a
formulation, with a set of algorithms to address it.
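To illustrate the difference between a single shortest path and a
connecting subgraph, here is a minimal sketch. It is a simple stand-in
and not one of the algorithms from the paper: it keeps every person
who lies on some route between the two endpoints that is at most
"slack" hops longer than the shortest one. The friendship data and the
function names are hypothetical.

```python
from collections import deque

def bfs_distances(graph, source):
    """Hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def connection_subgraph(graph, s, t, slack=0):
    """Nodes on some s-t route of length at most shortest distance + slack."""
    from_s = bfs_distances(graph, s)
    from_t = bfs_distances(graph, t)
    if t not in from_s:
        return set()  # s and t are not connected at all
    budget = from_s[t] + slack
    return {v for v in graph
            if v in from_s and v in from_t and from_s[v] + from_t[v] <= budget}

# Hypothetical undirected friendship graph as adjacency lists.
friends = {
    "ann": ["bob", "cat", "dan"],
    "bob": ["ann", "eve"],
    "cat": ["ann", "eve"],
    "dan": ["ann", "fay"],
    "eve": ["bob", "cat"],
    "fay": ["dan"],
}

print(sorted(connection_subgraph(friends, "ann", "eve")))
```

A single shortest path from ann to eve would report only one of bob
or cat; the subgraph view keeps both mutual friends, while leaving out
dan and fay, who do not lie between the two endpoints.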
Blogs
In late 2002 we wished to understand the growth of blogspace, the set
of all blogs and their relationships. We introduced a new
combinatorial object called a time graph to perform this
study, and showed how to track the evolution of both macroscopic
properties of blogspace (like its connectivity) and microscopic
properties (like the burstiness of a particular community) over time.
The results are given here.
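As a rough illustration of the idea, the sketch below assumes a
simplified reading of a time graph: each link carries the time at
which it appeared, and the graph at time t consists of all links seen
so far. The class, its methods, and the sample links are hypothetical
and are not the definitions from the paper. It tracks one macroscopic
property, the size of the largest connected component, treating links
as undirected.

```python
from collections import defaultdict

class TimeGraph:
    """Graph whose links are stamped with the time they appeared (assumed model)."""

    def __init__(self):
        self.links = []  # list of (time, source_blog, target_blog)

    def add_link(self, time, source, target):
        self.links.append((time, source, target))

    def snapshot(self, time):
        """Undirected adjacency sets built from all links with timestamp <= time."""
        adjacency = defaultdict(set)
        for stamp, source, target in self.links:
            if stamp <= time:
                adjacency[source].add(target)
                adjacency[target].add(source)
        return dict(adjacency)

def largest_component_size(adjacency):
    """Size of the largest connected component, found by depth-first search."""
    seen, best = set(), 0
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        best = max(best, size)
    return best

blogspace = TimeGraph()
blogspace.add_link(1, "blog_a", "blog_b")
blogspace.add_link(2, "blog_c", "blog_d")
blogspace.add_link(3, "blog_b", "blog_c")

for t in (1, 2, 3):
    print(t, largest_component_size(blogspace.snapshot(t)))
```

In this toy data the two pairs of blogs stay separate until the third
link joins them into a single component, the kind of change in
connectivity over time that the study follows at much larger scale.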
Subsequently, we considered information flow through these blogs,
showing some simple techniques for tracking and factoring "memes"
flowing from one blog to another, and gave an approach for learning
the pathways through blogspace that are most commonly taken: the blogs
which most effectively introduce and disseminate ideas. The results
are here.
In a follow-on paper, we moved this analysis from the influence of
blogs on other blogs to the influence (or at least predictiveness) of
blogs on the outside world. We showed that spikes in blog postings
about a particular book may predict spikes in sales of the same book.
Results are here.
More recently, we looked at the dynamics of some global graph
structures in the social network graphs of Flickr and Yahoo! 360. We
showed that, over time, a significant number of users reside in small
but non-trivial components of the graph, and that these components are
well modeled as stars rather than as more well-connected structures.
The results are shown here.
Back to Andrew Tomkins' homepage.