Lab of the Month:
The Ubiquitous Knowledge Processing Lab
by Ji-Ung Lee
The Ubiquitous Knowledge Processing (UKP) Lab is the endowed Lichtenberg Chair in the Department of Computer Science at the Technische Universität Darmstadt, located in the state of Hesse, Germany. We cover a broad range of topics in the field of natural language processing (NLP) with a strong emphasis on deep learning for natural language understanding and innovative uses of NLP to solve hard problems in social media, social sciences, and humanities. Our aim is to develop new approaches to automatically process and manage the knowledge represented in a variety of forms and repositories, with a strong focus on textual information processing and large-scale content analysis on the Web.
The UKP Lab has participated in major European infrastructure projects such as CLARIN, DARIAH, and OpenMinTeD, and in numerous other projects funded by the German Research Foundation, German ministries, and industry. Individual projects concern the following main research areas: Deep Learning in NLP, Argument Mining, Language Technology for Digital Humanities, Lexical-Semantic Resources and Algorithms, Text Mining and Analytics, and Writing Assistance and Language Learning.
In the PhD program KDSL, we developed novel methods for discovering knowledge in scientific literature and researched methods for mining argument structures from scientific articles. Furthermore, the group has published various large-scale datasets related to argument mining, which are extensively used by the international argument mining community. Whereas our past research focused strongly on large-scale knowledge graphs and, more recently, on argument mining, we also work on emerging topics such as NLP with a human in the loop, which raises new challenges. Two aspects of this work are of particular interest to us: 1) how to properly involve humans and their feedback in an interactive system and 2) how to aggregate annotations collected from experts and crowdworkers.
Our initial work on finding convincing arguments with Bayesian models derived from crowdsourced pairwise preferences led to further research on aggregating crowdsourced annotations, e.g., for humor and metaphor detection, and for sequential NLP tasks in general. At the same time, various members of our group are tackling the challenge of training machine learning systems with a human in the loop, gaining new insights into single- and multi-document summarization as well as text compression. Besides crowdsourcing and human-in-the-loop machine learning, our work on the generation and manipulation of second language learning exercises strongly connects us to the key motif of the enetCollect COST Action.
Our lab has a history of turning research results into an ecosystem of reusable software components. Large parts of this ecosystem are provided as open-source software on GitHub. In the INCEpTION project, funded by the German Research Foundation, we are developing an interoperable semantic text annotation platform with a focus on automatically supporting human annotators with machine learning models. In the VIP+ Validation Project ArgumenText, we are developing an open-domain argument mining platform to find, extract, and validate arguments for controversial topics from heterogeneous web documents. We are happy to contribute to enetCollect in various ways, from collaborative work in working group meetings to open-source software and resources.