This is the course page for the seminar course Challenges in Computational Linguistics at the Department of Linguistics, University of Tübingen.
Computational Linguistics and related fields have a well-established tradition of “shared tasks” or “challenges” where the participants try to solve a current problem in the field using a common data set and a well-defined metric of success. Participation in these tasks is fun and highly educational, as it requires the participants to put all their knowledge into practice and to learn and apply new methods to the task at hand. The comparison of the participating systems at the end of the shared task is also a valuable learning experience, both for the participating individuals and for the whole field.
This course takes its title literally. The students taking the course are required to participate in a shared task in the field and solve it as best they can. The requirements of the course include developing a system to solve the problem defined by the shared task, submitting the results, and writing a paper describing the system.
Requirements
The course requires good programming skills, a working knowledge of machine learning and NLP, and strong self-motivation. This typically means a highly motivated master’s or advanced bachelor’s student in computational linguistics or related departments (e.g., computer science, artificial intelligence, cognitive science). If you are unsure whether this course is for you, please contact the instructor.
Shared tasks
In principle, any shared task related to computational linguistics is acceptable. Here are a few pointers (some point to earlier events):
- SemEval competitions include a wide range of shared tasks.
- SIGMORPHON hosts shared tasks on morphology and phonology.
- CoNLL hosts regular high-profile shared tasks.
- CLEF also includes CL/NLP-related tasks.
- GermEval shared tasks are another option, particularly if you’d like to work on German.
- VarDial workshops host regular shared tasks related to dialects and closely related languages.
- NeurIPS competitions sometimes include NLP/CL-related competitions.
Other workshops in ACL, EMNLP, EACL, NAACL, and COLING often include relevant shared tasks (this year’s workshop schedule is not yet known).
This is only a small sample. Some of the shared tasks for the upcoming year are not yet announced. We recommend checking the earlier instances of these events and keeping an eye on the workshop pages.
Another interesting event, similar to the shared tasks above but with a different approach, is the ML Reproducibility Challenge. You are also welcome to participate in this event (you may also want to look at last year’s challenge website to see example reports).
For the purposes of the class, we prefer a shared task where you finalize your work with a system description paper. If all else fails, or if you have a strong preference, a CL-related Kaggle competition may also be an option (you are still required to write a system description paper).
Earlier instances of the course
The following are pointers to the papers that participants have published.
Contact
- Instructor: Çağrı Çöltekin
<ccoltekin@sfs.uni-tuebingen.de>
Office hours: Wednesday 16:00 - 17:00 (Keplerstr. 2, room 152)