
COVID-19 Sentiments

Authors: Ammar Plumber, Elaina Lin, Kim Nguyen, Ryan Karbowicz, and Meghan Aines

Acknowledgments: The tweets that we use in this analysis were obtained from the following GitHub repository by Emily Chen, a computer science Ph.D. student at USC: https://github.com/echen102/COVID-19-TweetIDs

Full Project on Storyboard: COVID-19 Tweets in Mar 2020 & 2021

This website was produced as a final project for BDS 516: Data Science and Quantitative Modeling, a graduate course taught by Alex Shpenev at the University of Pennsylvania.

A. Motivation

 

COVID-19 has wreaked unprecedented havoc around the world. From a data analytics perspective, it is also the first pandemic to occur at a time when almost anyone can publicly share their thoughts on a global platform. Twitter in particular offers real-time insight into the attitudes, beliefs, and general moods of a populace. In this analysis, we compare the sentiments of COVID-19-related tweets near the beginning of the pandemic, on March 30, 2020, with the sentiments exactly one year later, on March 30, 2021. We chose these dates to capture two key events of the pandemic: the onset of stay-at-home orders and vaccine availability. Unlike interviews and surveys, which are subject to numerous response biases, sentiment analysis of tweets can capture the raw and unfiltered emotions of people who feel the need to express their views.

By better understanding the general sentiment of a population, policy leaders can govern more effectively during times of crisis. For instance, if fear is high, politicians can offer words of reassurance to promote calm. If trust is low, they can try to mend it by strengthening accountability and transparency within the government. Essentially any policy decision can be better informed by knowing how the general populace feels about the issue at hand.

We also examine which COVID-19 topics are discussed most. Like sentiment analysis, topic analysis can show policy leaders which topics garner the most interest and need to be addressed. For instance, if a popular topic is the lack of ventilators, a policy leader would be wise to offer updates on the distribution and known efficacy of ventilators.

We also examine differences in geographic attention between the two dates. Specifically, we look at how often each country is mentioned in each sample period, as well as which words are associated with each country. For COVID-19, this information can help us understand general attitudes towards China. Because the coronavirus was first identified in Wuhan, China, some people around the world have unfortunately expressed negative sentiment towards Chinese people. Understanding how these attitudes have changed over time can help policy leaders judge whether extra measures are needed to protect and defend people of Chinese origin.

Lastly, we examine which features are most predictive of how many retweets a tweet gets. It is commonly believed that social media posts with more extreme positive or negative valence are more likely to go viral. The best-selling author Seth Godin has remarked that “One of the problems with social media is that the stronger the view you express, the more likely it will become amplified.” By examining the sentiments associated with more viral tweets, as well as the sources and textual elements of those tweets, we can test this common belief.

 

Impact: Our results can help policy leaders craft tweets containing important information in a way that maximizes the likelihood of reaching a large number of people. The analysis also shows how the virality of tweets has changed over time, so that policy leaders can adjust their approaches to the climate of the times.

 

B. Research Questions

 

  1. Are there differences in sentiments between the two sample periods, both in original tweets and in retweets?

  2. Which topics appear most often in original tweets, and which are more often the subject of retweets?

  3. Is there a difference in geographic attention between the two dates? For example, was China being discussed more in 2020 or 2021?

  4. Which features of a tweet (its sentiment, source, and textual elements) are most predictive of how many retweets it gets?

  5. Are there certain sentiments or topic-specific words that are most likely to attract retweets?
