Twitter incessantly produces a copious amount of data, and the locations of tweets can help answer some interesting questions. One that comes to mind, and one that I plan to tackle when the time is right, is: which parts of the world are interested in the Champions League final versus the World Cup final? The rationale is the amount of debate currently happening on this topic.
Basically, this post answers "where in the world are people tweeting about [something]?" All of the explanation is in the code comments themselves.



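# Packages used below: twitteR (Twitter API), dismo (geocoding),
# ggplot2 and maps (plotting; map_data("world") needs maps installed)
library(twitteR)
library(dismo)
library(ggplot2)
library(maps)

# Check that R is connected to Twitter before running the rest
# (the authentication step is not shown here)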

# searchString parameter for Twitter API
key <- "#UCLfinal"

# requesting Twitter API
tag <- searchTwitter(key, n = 2000, lang= "en") 

# Tweets data frame
# At this stage it would be possible to drop all the non-geotagged tweets.
# However, only a very small portion of users geotag their tweets, so a
# different approach is used here: in the next step, the location given in
# each user's profile is extracted instead.
df <- do.call("rbind", lapply(tag, as.data.frame))

# User data frame
userInfo <- do.call("rbind", lapply(lookupUsers(df$screenName), as.data.frame))

# Geocoding all users that have some sort of location in their profile.
# This also creates the "interpretedPlace" column.
# All rows with an invalid location are dropped after this step.
# geocode() comes from the dismo package.
# Although oneRecord = TRUE shrinks the result (one match per location),
# it produces a more reliable location data frame.
locations <- geocode(userInfo$location, progress = "text", oneRecord = TRUE)

# Getting rid of all rows that contain NA
locations <- locations[complete.cases(locations),]

# Also getting rid of all users whose location is only a country name.
# For example, this avoids the (very sparsely populated) center of Australia
# showing a massive number of tweets.
# The easiest way to do this is to drop all rows whose interpreted place
# does not contain a comma.
locations <- locations[grep("\\,",locations$interpretedPlace),]

# Map of the world
# It is also possible to do the same with country/places maps.
# However, it is necessary to make sure that the coordinates are correct
result <- ggplot(map_data("world")) + geom_path(aes(x = long, y = lat, group = group))

# Adding Tweet locations
result <- result + geom_point(data = locations, aes(x = longitude, y = latitude),
                              color = "red", alpha = .2, size = 3)
result <- result + ggtitle(key) + theme_minimal() + theme(axis.text=element_blank())

# Time Stamping the file name
filename <- paste(format(Sys.Date(),"%d%m%y"),format(Sys.time(), "%H%M%S"),".png",sep="")

# Saving the file (passing the plot explicitly instead of relying on the last plot)
ggsave(filename, plot = result, units="in", width=8.15, height=5.20, dpi=300)
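
The cleaning steps in the script (list to data frame, dropping NA rows, dropping country-only places) can be tried without a Twitter connection. The following is only a minimal sketch on made-up rows: the `records` list and the `toy` data frame are hypothetical, and only the column names `interpretedPlace`, `longitude` and `latitude` mirror what the script reads from the geocoder output.

```r
# Hypothetical geocoder-like records; the values are made up for illustration.
records <- list(
  list(interpretedPlace = "London, UK",    longitude = -0.13,    latitude = 51.51),
  list(interpretedPlace = "Australia",     longitude = 133.78,   latitude = -25.27),
  list(interpretedPlace = NA_character_,   longitude = NA_real_, latitude = NA_real_),
  list(interpretedPlace = "Madrid, Spain", longitude = -3.70,    latitude = 40.42)
)

# Same list-to-data-frame idiom as in the script: one row per record
toy <- do.call("rbind", lapply(records, as.data.frame, stringsAsFactors = FALSE))

# Drop rows that could not be geocoded (any NA in the row)
toy <- toy[complete.cases(toy), ]

# Drop country-only places: keep rows whose place contains a comma
toy <- toy[grepl(",", toy$interpretedPlace, fixed = TRUE), ]

# Only "London, UK" and "Madrid, Spain" remain
print(toy$interpretedPlace)
```

The comma test is a rough heuristic: it assumes that a city-level match is written as "City, Country", so a place that the geocoder reports without a comma is dropped even if it is more specific than a country.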

After combining the result with the wordcloud code from an earlier blog post and some Photoshop work, these are the products.

