SIGTYP 2020: The Second Workshop on Computational Research in Linguistic Typology
To be held at EMNLP 2020/Online
The huge diversity of human languages leads to inherent variation in cross-lingual data with respect to the categories and structures found at the languages’ surface level. This results in poor performance of NLP algorithms when transferring between languages. Linguistic typology provides a systematic, empirical comparison of the languages of the world with respect to a variety of linguistic properties, and therefore holds promise for solving this problem. However, typological information has not yet been fully exploited. Our workshop aims at bridging this gap by encouraging tighter collaboration between the scientific communities of linguistic typology and multilingual NLP.
The SIGTYP workshop is the first dedicated venue for typology-related research and its integration into multilingual NLP. The workshop is specifically aimed at raising awareness of linguistic typology and its potential in supporting and widening the global reach of multilingual NLP. The topics of the workshop will include, but are not limited to:
— Language-independence in training, architecture design, and hyperparameter tuning. Is it possible (and if so, how) to uncover unknown biases that hinder the cross-lingual performance of NLP algorithms and to leverage knowledge of such biases in NLP algorithms?
— Integration of typological features in language transfer and joint multilingual learning. In addition to established techniques such as “selective sharing”, are there alternative ways of encoding heterogeneous external knowledge in machine learning algorithms?
— New applications. The application of typology to currently uncharted territories, i.e. the use of typological information in NLP tasks where such information has not been investigated yet.
— Automatic inference of typological features. The pros and cons of existing techniques (e.g. heuristics derived from morphosyntactic annotation, propagation from features of other languages, supervised Bayesian and neural models) and discussion of emerging ones.
— Typology and interpretability. The use of typological knowledge for interpretation of hidden representations of multilingual neural models, multilingual data generation and selection, and typological annotation of texts.
— Improvement and completion of typological databases. Combining linguistic knowledge and automatic data-driven methods towards the joint goal of improving knowledge of cross-linguistic variation and universals.
— Submission Deadline: August 22, 2020 (AoE; extended from August 15, 2020)
— Retraction of workshop papers accepted for EMNLP: September 15, 2020
— Notification of Acceptance: September 29, 2020
— Camera-ready copy due from authors: October 10, 2020
— Workshop: November 19, 2020
WE ACCEPT EXTENDED ABSTRACTS
These may report on work in progress or may be cross-submissions that have already appeared in a non-NLP venue. Extended abstracts may be at most 2 pages long, plus references. These submissions are non-archival in order to allow submission to another venue. Selection will not be based on double-blind review, so submissions of this type need not be anonymized.
The abstracts should use EMNLP 2020 templates.
These should be submitted via softconf: https://www.softconf.com/emnlp2020/sigtyp/