Task 2 : Classification of Russian tweets for detecting presence of adverse effect (AE) mentions

Systems designed for this subtask should distinguish tweets that report an adverse effect from those that do not, taking into account subtle linguistic variations between adverse effects and indications (the reason for using the medication).

  • Training data: 11,610 tweets
  • Test data: 1,000 tweets

Register your team here : https://forms.gle/1qs3rdNLDxAph88n6
After your registration is approved, you will be invited to join the Google group for the task. A link to the dataset is available in the Google group's banner. If you do not receive the invite, please request to join the Google group, stating your team name, using the link below.
Google groups : https://groups.google.com/g/smm4h21-task-2
Link to Codalab : https://competitions.codalab.org/competitions/28766

Evaluation Period for Task 2 :

Test Dataset Release : 26th Feb 2021 12:00am UTC
Predictions Due : 28th Feb 2021 11:59pm UTC (3:59pm PST)
All submissions are automated and time limits are enforced by Codalab. No extensions will be provided.

Submission format: Please use the format below for submission. Submissions should contain two columns, tweet_id and label, separated by a tab. All other columns will be ignored. Predictions for each task should be contained in a single .tsv (tab-separated values) file. This file (and only this file) should be compressed into a .zip file. Please upload this zip file as your submission.

tweet_id	label
123	0
543	1
231	0
135	1
486	0
247	0
Submission format for Task 2
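
The packaging steps described above can be sketched in Python as follows. This is only an illustration, not an official tool: the function name write_submission and the file names predictions.tsv and submission.zip are assumptions chosen for the example, and the labels are assumed to be 0 (no AE) or 1 (AE), as in the sample above.

```python
import csv
import io
import zipfile


def write_submission(predictions, zip_path="submission.zip",
                     tsv_name="predictions.tsv"):
    """Write (tweet_id, label) pairs to one .tsv file inside a .zip archive.

    The function name and default file names are hypothetical; the task
    description only requires a single tab-separated file with the columns
    tweet_id and label, compressed into a .zip file.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["tweet_id", "label"])  # header row with the two required columns
    for tweet_id, label in predictions:
        if label not in (0, 1):
            raise ValueError(f"label for tweet {tweet_id} must be 0 or 1, got {label!r}")
        writer.writerow([tweet_id, label])

    # The archive contains the .tsv file and nothing else.
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(tsv_name, buffer.getvalue())
    return zip_path


if __name__ == "__main__":
    # Illustrative IDs taken from the sample rows above.
    write_submission([("123", 0), ("543", 1), ("231", 0)])
```

Building the file in memory and writing it straight into the archive avoids accidentally zipping a folder or extra files along with the predictions.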

We thank Yandex.Toloka for supporting the shared task and providing credits for data annotation in Russian.

Evaluation Metric : F1-score for the AE (1) class
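
For local validation before submitting, the F1-score for the AE (1) class can be approximated with the sketch below. Official scoring is done by Codalab, so this is not the organizers' scorer: the function name f1_positive is hypothetical, and returning 0.0 when there are no true positives is an assumed convention.

```python
def f1_positive(gold, pred, positive=1):
    """F1-score of the positive class: the harmonic mean of precision and recall.

    gold and pred are equal-length sequences of labels (0 or 1).
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)

    if tp == 0:
        # Assumed convention: no true positives gives a score of 0.0.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    gold = [1, 0, 1, 1, 0]
    pred = [1, 1, 0, 1, 0]
    print(f1_positive(gold, pred))
```

Because only the AE class is scored, a system that predicts 0 for every tweet scores 0.0 under this sketch, however high its accuracy.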

Contact information: Elena Tutubalina (tutubalinaev@gmail.com)

References:

SMM4H 2020 proceedings

The Russian Drug Reaction Corpus and Neural Models

We provide two Python tutorials on how to run classifiers on SMM4H data:
CNN-based classifier on SMM4H Task 2 data (ADR classification)
BERT-based classifier on SMM4H Task 2 data