SMM4H 2018 shared tasks

The proposed SMM4H shared tasks involve NLP challenges on social media mining for health monitoring and surveillance, and in particular for pharmacovigilance. This requires processing noisy, real-world, and substantially creative language expressions from social media. The proposed systems should be able to deal with the many linguistic variations and semantic complexities in the ways people express medication-related concepts and outcomes.
Similar to the first and second runs of the shared tasks, the data will consist of medication-related posts on Twitter. The training data (which includes the train and test sets from previous runs) is already prepared and will be available to the teams registering to participate. We will prepare the evaluation data in the following months. There will be a one-week window during which teams will be able to run their systems on a blind evaluation set. The shared task will include four subtasks:

  1. Automatic classification of tweets describing medication intake. Participants are expected to build a three-way classification system that distinguishes between tweets expressing definite intake, possible intake, and non-intake (tweets that mention a medication name without expressing intake). This is the second run of this task; approximately 18,000 annotated tweets will be made available for training and around 5,000 tweets for evaluation.
  2. Automatic classification of adverse effect mentions in tweets. The system designed for this subtask should be able to distinguish tweets reporting an adverse effect (AE) from those that do not, taking into account subtle linguistic variations between adverse effects and indications (the reason for using the medication). This is a rerun of the popular classification task organized in 2016 and 2017. The data set will consist of approximately 25,000 tweets for training and 5,000 for evaluation.
  3. Extraction of AE mentions. As a follow-up step to subtask 2, this task involves identifying the text span of each reported AE and distinguishing AEs from similar non-AE expressions. AEs are multi-token, descriptive expressions, so this subtask would require advanced named entity recognition approaches. The data for this subtask includes more than 2,000 tweets that are fully annotated for mentions of AEs and indications. This set contains a subset of the tweets from subtask 2 tagged as hasADR, plus a random set of 800 nonADR tweets. The nonADR subset was annotated for mentions of indications, in order to allow participants to develop techniques to deal with this confusion class.
  4. Normalization of adverse drug reaction (ADR) mentions. This is a mapping task where systems must map colloquial mentions of adverse reactions to standard concept IDs in the MedDRA vocabulary (preferred terms). It requires a concept normalization system that receives ADR mentions, interprets their meaning, and maps them to standard concept IDs. As we saw in the first run, this task is more challenging and requires a semi-supervised approach. About 9,000 annotated mappings will be made available for training and 5,000 for evaluation.
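
To make the input/output shape of subtasks 3 and 4 concrete, the following is a minimal sketch, not a proposed system. It finds AE spans by case-insensitive lookup in a tiny hand-written lexicon and attaches a concept ID to each span. The lexicon, the `AEMention` type, the function name, and the concept IDs are all hypothetical; the IDs are placeholders and are not real MedDRA codes. A competitive system would replace the lookup with a learned sequence tagger and a learned normalizer.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical toy lexicon: colloquial phrase -> placeholder concept ID.
# These IDs are invented for illustration and are NOT real MedDRA codes.
TOY_LEXICON = {
    "can't sleep": "PT_PLACEHOLDER_INSOMNIA",
    "feel like a zombie": "PT_PLACEHOLDER_SEDATION",
}


@dataclass(frozen=True)
class AEMention:
    start: int       # character offset of the span start (inclusive)
    end: int         # character offset of the span end (exclusive)
    text: str        # surface form exactly as written in the tweet
    concept_id: str  # normalized concept (placeholder ID in this sketch)


def extract_and_normalize(tweet: str) -> List[AEMention]:
    """Return lexicon matches in the tweet, ordered by start offset."""
    mentions = []
    for phrase, concept_id in TOY_LEXICON.items():
        # Match on the original string so offsets stay valid for the tweet.
        for m in re.finditer(re.escape(phrase), tweet, flags=re.IGNORECASE):
            mentions.append(AEMention(m.start(), m.end(), m.group(0), concept_id))
    return sorted(mentions, key=lambda mention: mention.start)


if __name__ == "__main__":
    example = "This med makes me feel like a ZOMBIE and I can't sleep"
    for mention in extract_and_normalize(example):
        print(mention)
```

Exact lookup of this kind misses misspellings and unseen paraphrases, which is why the subtasks call for approaches that generalize beyond a fixed phrase list.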

The shared task presents several interesting challenges, including the noisy nature of the data, the informal language of user posts, misspellings, and data imbalance. Past research has shown that automated systems frequently underperform when exposed to social media text because of novel or creative phrases, misspellings, and the frequent use of idiomatic, ambiguous, and sarcastic expressions. The tasks will thus act as a discovery and verification process for which approaches work best on social media data.
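
As an illustration of handling the surface noise described above, the sketch below shows one common kind of tweet preprocessing: masking URLs and user mentions, collapsing elongated character runs, and lowercasing. The function name, the placeholder tokens `<url>` and `<user>`, and the specific rules are assumptions chosen for illustration; the shared task does not prescribe any preprocessing.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
# Any character repeated three or more times in a row, e.g. "sooooo" or "!!!".
ELONGATION_RE = re.compile(r"(.)\1{2,}")


def normalize_tweet(text: str) -> str:
    """Apply a few simple, lossy normalization rules to a tweet."""
    text = URL_RE.sub("<url>", text)          # mask links
    text = MENTION_RE.sub("<user>", text)     # mask user handles
    text = ELONGATION_RE.sub(r"\1\1", text)   # "sooooo" -> "soo"
    return " ".join(text.lower().split())     # lowercase, squeeze whitespace


if __name__ == "__main__":
    print(normalize_tweet("@bob this med makes me sooooo sleepy!!! http://t.co/abc"))
```

Rules like these reduce vocabulary sparsity but do not address misspellings, sarcasm, or ambiguity, which remain the harder part of the problem.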