Task 5 : Classification of tweets self-reporting potential cases of COVID-19.

This new binary classification task involves automatically distinguishing tweets that self-report potential cases of COVID-19 (annotated as “1”) from those that do not (annotated as “0”). “Potential case” tweets include those indicating that the user or a member of the user’s household was denied testing for COVID-19, is symptomatic of it, was directly exposed to presumptive or confirmed cases, or has had experiences that pose a higher risk of exposure. “Other” tweets are related to COVID-19 and may discuss topics such as testing, symptoms, traveling, or social distancing, but do not indicate that the user or a member of the user’s household may be infected.

  • Training data: 7,181 tweets
  • Test data: 10,000 tweets

Register your team here : https://forms.gle/1qs3rdNLDxAph88n6
After your registration is approved, you will be invited to join the Google group for the task. A link to the dataset is available in the Google group's banner. If you do not receive the invite, please request to join the Google group, giving your team name, using the link below.
Google groups : https://groups.google.com/g/smm4h21-task-5
Link to Codalab : https://competitions.codalab.org/competitions/28766

Evaluation Period for Task 5 :

Test Dataset Release: 28th Feb 2021 12:00am UTC
Predictions Due: 2nd Mar 2021 11:59pm UTC (3:59pm PST)
All submissions are automated and time limits are enforced by Codalab. No extensions will be provided.

Submission format: Please use the format below for submission. Submissions should contain two columns, tweet_id and label, separated by tabs. All other columns will be ignored. Predictions for each task should be contained in a single .tsv (tab-separated values) file. This file (and only this file) should be compressed into a .zip file. Please upload this .zip file as your submission.

tweet_id	label
35656	0
34637	1
56844	0
12735	1
05745	0
24677	0
Submission format for Task4
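
A minimal sketch of how such a submission file could be produced in Python is shown here. The function name build_submission, the file names submission.zip and predictions.tsv, and the inclusion of a header row are assumptions made for illustration; the task description only requires two tab-separated columns (tweet_id and label) in a single .tsv file inside a .zip archive, so check the Codalab page for any stricter naming rules.

```python
import csv
import io
import zipfile


def build_submission(predictions, zip_path="submission.zip", tsv_name="predictions.tsv"):
    """Write (tweet_id, label) pairs to one TSV file inside a zip archive.

    The default file names are hypothetical; the task text does not prescribe them.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    # Header row mirrors the example table; assumed, not confirmed, to be expected.
    writer.writerow(["tweet_id", "label"])
    for tweet_id, label in predictions:
        if label not in (0, 1):
            raise ValueError(f"label must be 0 or 1, got {label!r}")
        writer.writerow([tweet_id, label])

    # The archive contains the TSV file and nothing else.
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(tsv_name, buffer.getvalue())
    return zip_path
```

Calling build_submission([("35656", 0), ("34637", 1)]) would write submission.zip to the current directory. Tweet IDs are passed as strings so that leading zeros are preserved.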

Evaluation Metric : F1-score for the “potential case” class
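
For local validation before submitting, the metric can be approximated with a short Python function. This is a sketch under the standard definition of F1 (the harmonic mean of precision and recall) computed for label 1 only; it is not the organizers' scoring script, and the function name f1_positive is hypothetical. Official scores are those reported by Codalab.

```python
def f1_positive(gold, predicted, positive=1):
    """F1-score for the positive ("potential case") class only."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    pairs = list(zip(gold, predicted))
    true_pos = sum(1 for g, p in pairs if g == positive and p == positive)
    false_pos = sum(1 for g, p in pairs if g != positive and p == positive)
    false_neg = sum(1 for g, p in pairs if g == positive and p != positive)

    # With no true positives, precision or recall is zero (or undefined),
    # so F1 is reported as 0.0 by convention.
    if true_pos == 0:
        return 0.0

    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)
```

Because only the “potential case” class is scored, correct predictions of “0” do not raise the score directly; they matter only by avoiding false positives.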

Contact information: Ari Klein (ariklein@pennmedicine.upenn.edu)