Social Media Mining for Toxicovigilance: Automatic Monitoring of Prescription Medication Abuse from Twitter

Abeed Sarker (1), Karen O’Connor (1), Rachel Ginn (2), Matthew Scotch (2) (3), Karen Smith (4), Dan Malone (5), Graciela Gonzalez (1)

(1) Department of Biostatistics, Epidemiology and Informatics
Perelman School of Medicine
University of Pennsylvania
Philadelphia, PA, USA

(2) Department of Biomedical Informatics
Arizona State University
Scottsdale, AZ, USA

(3) Center for Environmental Security
Biodesign Institute
Arizona State University
Tempe, AZ, USA

(4) Rueckert-Hartman College for Health Professions
Regis University
Denver, CO, USA

(5) Department of Pharmacy Practice & Science
University of Arizona
Tucson, AZ, USA

Abstract

Introduction:

Prescription medication overdose is the fastest-growing drug-related problem in the USA. Its growth calls for improved monitoring strategies to investigate the prevalence and patterns of abuse of specific medications.

Objectives:

Our primary aims were to assess the possibility of utilizing social media as a resource for automatic monitoring of prescription medication abuse and to devise an automatic classification technique that can identify potentially abuse-indicating user posts.

Methods:

We collected Twitter user posts (tweets) associated with three commonly abused medications (Adderall, oxycodone, and quetiapine). We manually annotated 6400 tweets mentioning these three medications and a control medication (metformin) that is not the subject of abuse due to its mechanism of action. We performed quantitative and qualitative analyses of the annotated data to determine whether posts on Twitter contain signals of prescription medication abuse. Finally, we designed an automatic supervised classification technique to distinguish posts containing signals of medication abuse from those that do not and assessed the utility of Twitter in investigating patterns of abuse over time.
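As a rough illustration of what supervised classification of posts involves, the following is a minimal sketch of a multinomial Naive Bayes text classifier with add-one smoothing. It is not the classifier used in this study, which the abstract does not specify. The function names, the two label names, and the four training sentences are hypothetical placeholders invented for this example; they are not drawn from the annotated data set.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def train(examples):
    """Count labels and per-label word frequencies from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest smoothed log-probability."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy examples, invented for illustration only.
TOY_EXAMPLES = [
    ("took extra adderall to pull an all nighter", "abuse"),
    ("snorted an oxycodone to get high", "abuse"),
    ("doctor refilled my quetiapine prescription today", "non-abuse"),
    ("new study on metformin and diabetes", "non-abuse"),
]

model = train(TOY_EXAMPLES)
label_a = predict(model, "adderall to get high")
label_b = predict(model, "doctor refilled my prescription")
```

A real system would need far more training data, better tokenization for noisy social media text, and handling of the class imbalance reported in the Results.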

Results:

Our analyses show that clear signals of medication abuse can be drawn from Twitter posts and that the percentages of tweets containing abuse signals are significantly higher for the three case medications (Adderall: 23 %, quetiapine: 5.0 %, oxycodone: 12 %) than for the control medication (metformin: 0.3 %). Our automatic classification approach achieves 82 % accuracy overall (medication abuse class recall: 0.51, precision: 0.41, F measure: 0.46). To illustrate the utility of automatic classification, we show how the classification data can be used to analyze abuse patterns over time.
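The relationship between the reported precision, recall, and F measure can be checked with a short calculation. The sketch below assumes that "F measure" means the balanced F1 score, the harmonic mean of precision and recall; the abstract does not state this explicitly, and the function name `f_measure` is chosen here for illustration.

```python
def f_measure(precision, recall, beta=1.0):
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Rounded values reported for the medication abuse class.
f_from_rounded = f_measure(precision=0.41, recall=0.51)

# Largest values that would still round to 0.41 and 0.51 (assumed bounds).
f_upper = f_measure(precision=0.415, recall=0.515)
```

With the rounded inputs the result is about 0.45, slightly below the reported 0.46. The upper-bound case rounds to 0.46, so the reported figure is consistent with an F measure computed from unrounded precision and recall.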

Conclusions:

Our study indicates that social media can be a crucial resource for obtaining abuse-related information about medications, and that automatic approaches involving supervised classification and natural language processing hold promise for essential future monitoring and intervention tasks.

Quick Downloads

Data set
Download tweets script
Annotation guidelines (evolving)