Social Media Mining for Health Applications (#SMM4H) Shared Task 2021

Call For Participation – Shared Task

The Social Media Mining for Health Applications (#SMM4H) Shared Task involves natural language processing (NLP) challenges of using social media data for health research, including informal, colloquial expressions and misspellings of clinical concepts, noise, data sparsity, ambiguity, and multilingual posts. For each of the eight tasks below, participating teams will be provided with a set of annotated tweets for developing systems, followed by a three-day window during which they will run their systems on unlabeled test data and upload the predictions of their systems to CodaLab. Information about registration, data access, paper submissions, and presentations can be found in the individual competition pages listed below.

Timeline:

Training and validation set release: Dec 15 2020
Validation set submission due: Feb 26
Test set release: Feb 26 – Mar 1 2021 ^
- Task 1 and Task 2: Feb 26
- Task 3 and Task 4: Feb 27
- Task 5 and Task 6: Feb 28
- Task 7 and Task 8: Mar 1
Test set predictions due: Feb 28 – Mar 3 2021 ^
- Task 1 and Task 2: Feb 28
- Task 3 and Task 4: Mar 1
- Task 5 and Task 6: Mar 2
- Task 7 and Task 8: Mar 3
Test set evaluation scores release: Mar 8
System descriptions due: Mar 15
Acceptance notification: Apr 1
Camera ready system descriptions: Apr 12

^ See individual tasks for exact dates.
* All deadlines are 11:59 PM UTC (3:59 PM PST).
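
For teams working in other time zones, the deadline conversion can be checked with a short script. The following is a minimal sketch, not an official tool: the function names and the PREDICTIONS_DUE table are illustrative, the dates are copied from the test set prediction deadlines in the timeline, and PST is modeled as a fixed UTC-8 offset, which is an assumption that holds for these late-February and early-March dates.

```python
from datetime import datetime, timedelta, timezone

# Assumption: PST as a fixed UTC-8 offset. This is valid for the dates below,
# which fall before US daylight saving time begins in mid-March 2021.
PST = timezone(timedelta(hours=-8), name="PST")

# Test set prediction due dates (year, month, day), taken from the timeline.
PREDICTIONS_DUE = {
    "Task 1 and Task 2": (2021, 2, 28),
    "Task 3 and Task 4": (2021, 3, 1),
    "Task 5 and Task 6": (2021, 3, 2),
    "Task 7 and Task 8": (2021, 3, 3),
}


def deadline_utc(year, month, day):
    """Return the 11:59 PM UTC deadline on the given calendar date."""
    return datetime(year, month, day, 23, 59, tzinfo=timezone.utc)


def deadline_pst(year, month, day):
    """Return the same deadline instant expressed in PST (UTC-8)."""
    return deadline_utc(year, month, day).astimezone(PST)


if __name__ == "__main__":
    for tasks, date in PREDICTIONS_DUE.items():
        utc = deadline_utc(*date)
        pst = deadline_pst(*date)
        print(f"{tasks}: {utc:%b %d %H:%M} UTC = {pst:%b %d %H:%M} PST")
```

To convert to a different zone, replace PST with another fixed offset, or with a zoneinfo.ZoneInfo object where the system time zone database is available.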

Registration link: https://forms.gle/1qs3rdNLDxAph88n6

If you have trouble registering, please email Ivan.Flores@pennmedicine.upenn.edu and Arjun.Magge@pennmedicine.upenn.edu with the subject line “SMM4H 2021 Registration”. In the body of the email, please list (1) team name, (2) participant name(s), (3) participant email(s), (4) advisor name, (5) advisor email, (6) academic/institute affiliations, and (7) the tasks you are interested in participating in.

Task 1 : Classification, Extraction and Normalization of Adverse Effect mentions in English tweets

This task involves three subtasks: (1) classification of tweets containing adverse effects (AEs), (2) extraction of AE mentions, and (3) extraction and normalization of AE mentions. This is one of the oldest tasks, making its fifth reappearance. Teams may participate in one or more subtasks. More details available here.

Task 2 : Classification of Russian tweets for detecting presence of AE mentions

This task involves the binary classification of Russian tweets to detect those containing AEs. This is the second edition of the task, which was first hosted last year. More details available here.

Task 3 : Classification of change in medication regimen in tweets

This task involves binary classification: the system should detect tweets in which Twitter users self-declare a change in their medication regimen. More details available here.

Task 4 : Classification of tweets self-reporting adverse pregnancy outcomes

This task involves the binary classification of tweets to detect those that mention one or more adverse pregnancy outcomes. More details available here.

Task 5 : Classification of tweets self-reporting potential cases of COVID-19

This task involves binary classification of tweets to detect self-reports of COVID-19 cases. More details available here.

Task 6 : Classification of COVID-19 tweets containing symptoms

This task involves three-class classification: the system must determine whether a COVID-19 tweet is (1) a self-report, (2) a non-personal report, or (3) a literature/news mention. More details available here.

Task 7 : Identification of professions and occupations (ProfNER) in Spanish tweets

This task involves two subtasks: (1) classification of Spanish tweets containing mentions of occupations/professions and (2) span extraction of occupation/profession mentions. More details available here.

Task 8 : Classification of self-reported breast cancer posts on Twitter

This task involves binary classification of tweets to determine if the tweet contains a self-report of breast cancer. More details available here.