As a follow-up to #SMM4H 2020 Task 5, which focused on birth defect outcomes, this new binary classification task involves automatically distinguishing tweets that report a personal experience of an adverse pregnancy outcome, such as miscarriage, stillbirth, preterm birth, low birthweight, or neonatal intensive care (annotated as “1”), from tweets that do not (annotated as “0”).
- Training data: 6,487 tweets
- Test data: 10,000 tweets
Register your team here: https://forms.gle/1qs3rdNLDxAph88n6
Link to Codalab: Available February 1, 2021
Evaluation metric: F1-score for the “positive” class (i.e., tweets annotated as “1”)
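For local validation, the positive-class F1-score can be approximated with a short script. The following is a minimal sketch, not the official scorer: the function name `positive_class_f1`, the list-of-integers input format, and the convention of returning 0.0 when there are no true positives are all assumptions made for illustration. Official scores are those computed on Codalab.

```python
def positive_class_f1(gold, predicted, positive_label=1):
    """Compute F1 for the positive class only (hypothetical helper, not the official scorer)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    # Count true positives, false positives, and false negatives for the positive class.
    tp = sum(1 for g, p in zip(gold, predicted) if g == positive_label and p == positive_label)
    fp = sum(1 for g, p in zip(gold, predicted) if g != positive_label and p == positive_label)
    fn = sum(1 for g, p in zip(gold, predicted) if g == positive_label and p != positive_label)

    # Assumed convention: with no true positives, F1 is reported as 0.0.
    if tp == 0:
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy labels for illustration only; these are not task data.
    gold = [1, 0, 1, 1, 0, 0]
    predicted = [1, 0, 0, 1, 1, 0]
    print(f"{positive_class_f1(gold, predicted):.3f}")  # prints 0.667
```

Because only the “1” class is scored, correctly labeling tweets as “0” improves the result only indirectly, by avoiding false positives.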
Contact information: Ari Klein (ariklein@pennmedicine.upenn.edu)
References:
- Klein AZ, Cai H, Weissenbacher D, Levine LD, Gonzalez-Hernandez G. A Natural Language Processing Pipeline to Advance the Use of Twitter Data for Digital Epidemiology of Adverse Pregnancy Outcomes. Journal of Biomedical Informatics: X. 2020; 100076.
- Klein AZ, Gonzalez-Hernandez G. An Annotated Data Set for Identifying Women Reporting Adverse Pregnancy Outcomes on Twitter. Data Brief. 2020;32:106249.