Center for Nonlinear Studies
Thursday, December 06, 2012
2:00 PM - 2:45 PM
CNLS Conference Room (TA-3, Bldg 1690)

Postdoc Seminar

Inferring origin locations of tweets with quantitative confidence

Reid Priedhorsky
D-4 and CNLS

Twitter and other social internet systems offer a rich, voluminous stream of data that reflects the observations, mood, and knowledge of people distributed around the world. However, specific location information is missing from nearly all messages (e.g., only roughly 1% of tweets contain a geotag), which makes it very difficult to draw conclusions about specific locales. We use the content of tweets to infer missing locations, training on the small fraction of tweets that do have a geotag. Specifically, we parse training tweets into (word, geopoint) pairs and then fit a Gaussian mixture model (GMM) to the points associated with each distinct word in the training data. The location estimate for a tweet is then a combination of the GMMs previously learned for the words in that tweet. This goes beyond prior work by offering probabilistic, geographic location estimates (rather than a single best point or suggested locale names), which we expect will be more accurate than current techniques. We also offer more robust metrics for accuracy, precision, and calibration. We expect these techniques to impact a wide variety of social internet analysis research and applications. This talk presents work in progress, so feedback, suggestions, and discussion will be greatly appreciated.
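
As a rough illustration of the pipeline described above, the following is a minimal sketch and not the speaker's implementation. It makes several assumptions that the abstract does not state: tweets are tokenized by lowercasing and splitting on whitespace, (longitude, latitude) pairs are treated as points on a plane, each word's GMM uses spherical covariances fitted with plain EM, and the per-word GMMs are combined by an equal-weight average of their densities. All function names and the toy training tweets are hypothetical.

```python
import math
import random
from collections import defaultdict


def gauss2d(p, mean, var):
    """Density of a spherical 2-D Gaussian (same variance on both axes)."""
    d2 = (p[0] - mean[0]) ** 2 + (p[1] - mean[1]) ** 2
    return math.exp(-d2 / (2.0 * var)) / (2.0 * math.pi * var)


def fit_gmm(points, k=2, iters=50, seed=0, min_var=0.01):
    """Fit a k-component spherical GMM with plain EM.

    Returns a list of (weight, mean, variance) triples.
    """
    rng = random.Random(seed)
    k = min(k, len(points))            # never more components than points
    means = rng.sample(points, k)      # initialize means at random data points
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for p in points:
            dens = [w * gauss2d(p, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and variance per component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue               # leave an empty component unchanged
            mx = sum(r[j] * p[0] for r, p in zip(resp, points)) / nj
            my = sum(r[j] * p[1] for r, p in zip(resp, points)) / nj
            var = sum(r[j] * ((p[0] - mx) ** 2 + (p[1] - my) ** 2)
                      for r, p in zip(resp, points)) / (2.0 * nj)
            means[j] = (mx, my)
            variances[j] = max(var, min_var)  # floor avoids collapse to a point
            weights[j] = nj / len(points)
    return list(zip(weights, means, variances))


def gmm_density(gmm, point):
    """Evaluate a fitted mixture at a point."""
    return sum(w * gauss2d(point, m, v) for w, m, v in gmm)


def train_word_models(tweets, k=2):
    """Parse tweets into (word, geopoint) pairs and fit one GMM per word."""
    by_word = defaultdict(list)
    for text, point in tweets:
        for word in text.lower().split():
            by_word[word].append(point)
    return {word: fit_gmm(pts, k) for word, pts in by_word.items()}


def tweet_density(models, text, point):
    """Assumed combination rule: equal-weight average of the word GMMs."""
    known = [models[w] for w in text.lower().split() if w in models]
    if not known:
        return 0.0                     # no trained word, no estimate
    return sum(gmm_density(g, point) for g in known) / len(known)


# Toy geotagged tweets as (text, (longitude, latitude)); invented for illustration.
TRAINING = [
    ("beach surf", (-118.4, 34.0)),
    ("beach sun", (-118.5, 33.9)),
    ("surf beach", (-117.3, 32.8)),
    ("snow ski", (-105.0, 39.7)),
    ("ski snow", (-106.8, 39.2)),
    ("snow cold", (-105.3, 40.0)),
]
models = train_word_models(TRAINING)
```

Calling `tweet_density(models, "beach surf", (-118.4, 34.0))` evaluates the estimated density at one candidate location. Because the result is a density over the plane rather than a single point, it can in principle be integrated over a region to attach a probability to that region, which is the kind of probabilistic estimate the abstract describes.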

Host: Kipton Barros, T-4 and CNLS