LINGUISTIC ENGINEERING

Group Members
Adam Przepiórkowski, Ph.D., Associate Professor, Head of the Group
Anna Andrzejczuk, M.Sc.
Łukasz Degórski, M.Sc.
Elżbieta Hajnicz, Ph.D., Associate Professor
Łukasz Kobyliński, Ph.D.
Mateusz Kopeć, M.Sc.
Anna Kupść, Ph.D.
Małgorzata Marciniak, Ph.D.
Agnieszka Mykowiecka, Ph.D., Associate Professor
Maciej Ogrodniczuk, Ph.D.
Piotr Rychlik, Ph.D.
Jakub Waszczuk, M.Sc.
Aleksander Wawer, M.Sc.
Aleksandra Wieczorek, Ph.D.
Marcin Woliński, Ph.D.
Alina Wróblewska, M.Sc.

Foreign Associates

Tomasz Strzałkowski, Ph.D.
Computer Science Department, University at Albany
Stanisław Szpakowicz, Ph.D.
University of Ottawa

Post-doctoral Practice
Jakub Piskorski, Ph.D.


Research Domain
The Linguistic Engineering Group (Pol. Zespół Inżynierii Lingwistycznej; ZIL) deals with multiple aspects of Natural Language Processing.

ZIL's traditional area of interest is deep syntactic parsing of Polish, with the use of Definite Clause Grammars (DCG) and generative linguistic formalisms, such as Head-driven Phrase Structure Grammar (HPSG) and Lexical Functional Grammar (LFG). For each of these approaches, a grammar of Polish has been developed and implemented, with current work concentrating on DCG and LFG.

Another important focus of the Group's research is information extraction in a broad sense: many publications have been devoted to the automatic extraction of structured data from domain-specific texts, to named entity recognition and to shallow parsing in general. Related work includes the automatic acquisition of linguistic knowledge, including valence frames, from corpus data.

More recently, ZIL has also been working on the semantic processing of texts, focusing on word sense disambiguation, coreference resolution and sentiment analysis. Certain elements of semantic processing are present in the LFG parser mentioned above. More application-oriented work in this line of research concerns automatic summarisation and text categorisation.

The Group is also active in the area of corpus linguistics. ZIL coordinated the development of the 1.5-billion-word National Corpus of Polish (Pol. Narodowy Korpus Języka Polskiego; NKJP), which is partly based on the earlier IPI PAN Corpus. In the process, the Group created various tools for manual and automatic corpus annotation at multiple linguistic levels, an XML schema for corpus annotation, and a manually annotated 1-million-word subcorpus. This subcorpus is the empirical basis for the Składnica treebank, which is currently being developed; Składnica has already been used to train a dependency parser for Polish.

Various tools created by the Group are publicly available as open source software. These include morphosyntactic taggers, the shallow parser Spejd, the deep parser Świgra, the named entity recogniser Nerf, the word sense disambiguation platform WSDDE, and the corpus tools Poliqarp and Anotatornia. The Group is also responsible for the development of PoliMorf, an open morphological dictionary based on earlier dictionaries of this kind and intended for use in deep parsing and other applications. The above tools and resources are used in applications co-developed by ZIL, e.g., in a multilingual content management system.

ZIL has been, and remains, active in multiple national and international projects. For more information, please visit zil.ipipan.waw.pl.



  webmaster@ipipan.waw.pl Copyright by ICS PAS - 2003