Center for Nonlinear Studies
Thursday, December 06, 2012
2:00 PM - 2:45 PM
CNLS Conference Room (TA-3, Bldg 1690)

Postdoc Seminar

Inferring origin locations of tweets with quantitative confidence

Reid Priedhorsky
D-4 and CNLS

Twitter and other social internet systems offer a rich and voluminous stream of data that reflects the observations, mood, and knowledge of people distributed around the world. However, specific location information is missing from nearly all messages (e.g., only roughly 1% of tweets contain a geotag), which makes it very difficult to draw conclusions about specific locales. We are using the content of tweets to infer missing locations, training on the small fraction of tweets that do have a geotag. Specifically, we parse training tweets into (word, geopoint) pairs and then fit a Gaussian mixture model (GMM) to the points associated with each distinct word in the training data. The location estimate for a tweet is then a combination of the GMMs previously learned for the words in that tweet. This goes beyond prior work by offering probabilistic, geographic location estimates (rather than a single best point or suggested locale names), which we expect will be more accurate than current techniques. We also offer more robust metrics for accuracy, precision, and calibration. We expect these techniques to impact a wide variety of social internet analysis research and applications. This talk will present work in progress, so feedback, suggestions, and discussion will be greatly appreciated.
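
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the speaker's implementation. It rests on several simplifying assumptions: each word gets a single axis-aligned Gaussian (a one-component stand-in for a full GMM), latitude and longitude are treated as planar coordinates, and the per-word models are combined as an equal-weight mixture, since the abstract does not say how the combination is done. All function names and the toy training tweets are hypothetical.

```python
import math
from collections import defaultdict


def parse_pairs(tweets):
    """Turn (text, (lat, lon)) training tweets into (word, geopoint) pairs."""
    # Each distinct word in a tweet contributes one pair (duplicates dropped).
    return [(w, pt) for text, pt in tweets for w in set(text.lower().split())]


def fit_gaussian(points, min_var=1e-4):
    """Fit one axis-aligned 2-D Gaussian; ASSUMPTION: stands in for a GMM."""
    n = len(points)
    mean = tuple(sum(p[i] for p in points) / n for i in (0, 1))
    # Floor the variance so words seen at a single point stay well defined.
    var = tuple(
        max(sum((p[i] - mean[i]) ** 2 for p in points) / n, min_var)
        for i in (0, 1)
    )
    return mean, var


def density(model, pt):
    """Evaluate the fitted Gaussian density at a (lat, lon) point."""
    mean, var = model
    d = 1.0
    for i in (0, 1):
        d *= math.exp(-((pt[i] - mean[i]) ** 2) / (2 * var[i]))
        d /= math.sqrt(2 * math.pi * var[i])
    return d


def train(tweets):
    """Group geopoints by word and fit one model per distinct word."""
    by_word = defaultdict(list)
    for word, pt in parse_pairs(tweets):
        by_word[word].append(pt)
    return {word: fit_gaussian(pts) for word, pts in by_word.items()}


def tweet_density(models, text, pt):
    """ASSUMPTION: equal-weight mixture of the models of the known words."""
    known = [models[w] for w in set(text.lower().split()) if w in models]
    if not known:
        return 0.0  # no evidence: none of the words appeared in training
    return sum(density(m, pt) for m in known) / len(known)


# Toy geotagged training tweets (invented for illustration only).
training = [
    ("green chile stew", (35.7, -105.9)),
    ("chile ristras", (35.1, -106.6)),
    ("subway delay again", (40.7, -74.0)),
    ("subway pizza", (40.8, -73.9)),
]
models = train(training)
near = tweet_density(models, "chile tonight", (35.4, -106.2))
far = tweet_density(models, "chile tonight", (40.7, -74.0))
print(near > far)
```

Because the result is a density over locations rather than a single point, it could in principle be integrated over a region to give a probability for that region, which is the kind of probabilistic estimate the abstract refers to. A fuller version would fit multi-component mixtures and would likely weight words by how geographically informative they are.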

Host: Kipton Barros, T-4 and CNLS