Lab of the Month: The ITU NLP Group
by Gülşen Eryiğit
Istanbul Technical University’s Natural Language Processing Group is one of the oldest and largest groups in Turkey dedicated to Turkish natural language processing and its applications. The group was established in 2000 and is currently affiliated with the Language Technologies and Social Robotics Lab of the ITU Faculty of Computer and Informatics Engineering. The group is led by Dr. Gülşen Eryiğit and currently consists of 3 professors, 3 research assistants, and many PhD and MS students, as well as volunteer undergraduates. The publications of the group members may be found via a Google Scholar Group Page.
The group mainly focuses on different aspects of Turkish-related NLP tasks and on their applications in sectors such as telecom, banking and law. Some of the outstanding projects carried out with the participation of group members are the following:
Parsing Web 2.0 Sentences, TUBITAK, 2013-2015
A Signing Avatar System for Turkish to Turkish Sign Language Machine Translation, TUBITAK, 2014-2016
Turkish Mobile Personal Assistant, Turkey’s Ministry of Science, Industry and Technology SANTEZ, 2013-2014
Dependency Analysis of Turkish, ITU – Scientific Research Projects, 2005-2007
Machine Translation between Turkish-English (2011-2013) and Turkish-Turkmen (2006-2009)
The group has strong ties with industry and has participated in many industry-related projects, such as “Dialogue System for Automotive Support Channels”, “Dialogue System for Banking”, “Aspect Based Sentiment Analysis on Telecom Domain”, “Entity Detection and Relation Extraction on Noisy Texts (Banking)”, and “Development of Semantic Analysis Components on Legal Documents using NLP”.
The group provides many language tools and resources to researchers working in the field. Some of the basic NLP tools (such as text normalization and morphological analysis) are also provided as a web service via tools.nlp.itu.edu.tr. The important datasets (such as Turkish treebanks, MWE and named entity resources, and Turkish Sign Language resources) that were produced and shared with the community may be found under the following link.
The ITU NLP Group was an active member of the UD (Universal Dependencies), CLARIN (Common Language Resources and Technology Infrastructure, EU 7th Framework) and PARSEME (Parsing and Multiword Expressions, European COST Action IC1207) projects, and is now an active member of EnetCollect. One of the group's recent research areas is the use of crowdsourcing techniques to enrich language resources, and the use of those resources in computer-assisted language learning (CALL) tasks. The ITU NLP Web Service has already been integrated with the LARA platform for Turkish preprocessing tasks. The group will organize a CrowdFest 2020 task on MWE discovery and context collection via crowdsourcing in Coimbra in February 2020.