Task 1 : Classification, extraction and normalization of adverse effect (AE) mentions in English tweets

In this task, modified from previous years, participants must develop one or more components to classify tweets that contain an adverse effect (AE), also known as an adverse drug effect (ADE); detect the text spans of reported ADEs in tweets; and map these colloquial mentions to their standard concept IDs (preferred terms) in the MedDRA vocabulary. The task presents multiple challenges. First, classification must account for class imbalance: only around 7% of the tweets contain ADEs. Second, span detection will require advanced named entity recognition approaches. Finally, the resolution task additionally requires choosing a normalized concept from more than 23,000 MedDRA preferred terms.

Participants will be provided with a labeled training set containing tweet texts and ADE annotations, and may participate in one or more subtasks. The task contains three subtasks in increasing order of complexity (1a, 1b and 1c, described below). The dataset sizes are:

  • Training data: 18,000 tweets
  • Test data: 10,000 tweets

Register your team here : https://forms.gle/1qs3rdNLDxAph88n6
Link to Codalab : Available Feb 1 2021

Subtask 1a : ADE tweet classification

Given a tweet, participants of this subtask will be required to submit only the binary annotations ADE/noADE. A tweet should be assigned the label ADE if and only if it has one or more mentions of an ADE.

Evaluation Metric : Submissions will be ranked by Precision, Recall and F1-score for the ADE class
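
As an illustration of the metric above, here is a minimal sketch of Precision, Recall and F1-score computed for the ADE class only. The function name and the list-of-labels input are assumptions made for this example; the official evaluation script may differ.

```python
def ade_class_scores(gold, pred, positive="ADE"):
    """Precision, recall and F1 for the positive (ADE) class only.

    gold and pred are equal-length lists of labels such as "ADE" / "noADE".
    This input format is an assumption for the example.
    """
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

Because only the ADE class is scored, a system that labels every tweet noADE gets zero on all three values even though roughly 93% of its labels are correct.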

Subtask 1b : ADE span detection (includes annotations for Subtask 1a)

Participants of the ADE span detection subtask will be required to submit both the ADE classification labels and the spans of the expressed ADEs.

Evaluation Metric : Submissions will be ranked by Precision, Recall and F1-score for each ADE extracted where the spans overlap either entirely or partially.
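
The overlap criterion above can be sketched as follows. This is only an illustration: it assumes spans are half-open character offsets (start, end) grouped by tweet ID, and it pairs each predicted span with at most one gold span. The function names, the input format and the one-to-one pairing are all assumptions, not the official scorer.

```python
def spans_overlap(a, b):
    """True if two half-open (start, end) spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def overlap_scores(gold, pred):
    """Precision, recall and F1 under partial-or-full span overlap.

    gold and pred map a tweet ID to a list of (start, end) spans.
    Each gold span can be matched by at most one predicted span (assumption).
    """
    tp = fp = fn = 0
    for tweet_id in set(gold) | set(pred):
        unmatched = list(gold.get(tweet_id, []))
        for span in pred.get(tweet_id, []):
            hit = next((g for g in unmatched if spans_overlap(span, g)), None)
            if hit is None:
                fp += 1  # predicted span touches no remaining gold span
            else:
                unmatched.remove(hit)
                tp += 1
        fn += len(unmatched)  # gold spans that no prediction reached

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```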

Subtask 1c : ADE resolution (includes annotations for Subtask 1a and 1b)

Participants of the ADE resolution subtask will be required to submit ADE classification labels, spans and normalization labels.

Evaluation Metric : Submissions will be ranked by Precision, Recall and F1-score for each ADE extracted where the spans overlap either entirely or partially AND each span is normalized to the correct MedDRA preferred term ID.
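
A minimal sketch of the stricter criterion above: a prediction counts as correct only if its span overlaps a gold span and carries the same MedDRA preferred term ID. The tuple layout (start, end, pt_id), the function name and the one-to-one pairing are assumptions for illustration; the official scorer may handle edge cases differently.

```python
def resolution_scores(gold, pred):
    """Precision, recall and F1 requiring span overlap AND a matching PT ID.

    gold and pred map a tweet ID to a list of (start, end, pt_id) tuples,
    with half-open character offsets (assumed format).
    """
    tp = fp = fn = 0
    for tweet_id in set(gold) | set(pred):
        unmatched = list(gold.get(tweet_id, []))
        for p_start, p_end, p_id in pred.get(tweet_id, []):
            hit = next(
                (g for g in unmatched
                 if g[0] < p_end and p_start < g[1] and g[2] == p_id),
                None,
            )
            if hit is None:
                fp += 1  # no overlap, or overlap with the wrong PT ID
            else:
                unmatched.remove(hit)
                tp += 1
        fn += len(unmatched)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

Under this sketch, a correctly located span with the wrong preferred term ID counts as both a false positive and a false negative.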

Contact information: Arjun Magge (Arjun.Magge@pennmedicine.upenn.edu)

Frequently asked questions (FAQs):

Do I have to participate in all the subtasks?
No. You may choose to participate in one, two or all three subtasks. During evaluation, you will be allowed to make two submissions for each subtask.

How do I access MedDRA preferred terms (PT) and lower level terms (LLT)?
You can download MedDRA from https://www.meddra.org/. It is free for academic institutions and you may subscribe online at https://www.meddra.org/subscription/subscription-form. Once you download MedDRA, you will find PTs, LLTs and their mapping in the llt.asc file.
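
As a starting point for reading the mapping, here is a hedged sketch of a loader for llt.asc. It assumes the file is "$"-delimited and that the first three fields of each record are the LLT code, the LLT name and the code of the PT it maps to; please verify this against the file format documentation shipped with your MedDRA release. The function names are hypothetical.

```python
def parse_llt_lines(lines):
    """Build LLT -> PT and LLT -> name lookups from llt.asc records.

    Assumes "$"-delimited records whose first three fields are
    llt_code, llt_name, pt_code (check your MedDRA release notes).
    """
    llt_to_pt = {}
    llt_names = {}
    for line in lines:
        fields = line.rstrip("\r\n").split("$")
        if len(fields) < 3 or not fields[0]:
            continue  # skip blank or malformed records
        llt_code, llt_name, pt_code = fields[0], fields[1], fields[2]
        llt_to_pt[llt_code] = pt_code
        llt_names[llt_code] = llt_name
    return llt_to_pt, llt_names


def load_llt_file(path):
    """Read llt.asc from disk and parse it with parse_llt_lines."""
    with open(path, encoding="utf-8") as handle:
        return parse_llt_lines(handle)
```

With these lookups, an annotation made at the LLT level can be mapped to its preferred term with llt_to_pt[llt_code] before submission.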