|
Group Members
 |
Adam Przepiórkowski, Ph.D., Associate Professor, Head of the Group |
 |
Anna Andrzejczuk, M.Sc. |
 |
Łukasz Degórski, M.Sc. |
 |
Elżbieta Hajnicz, Ph.D. |
 |
Łukasz Kobyliński, M.Sc. |
 |
Mateusz Kopeć |
 |
Anna Kupść, Ph.D. |
 |
Michał Lenart |
 |
Małgorzata Marciniak, Ph.D. |
 |
Agnieszka Mykowiecka, Ph.D. |
 |
Maciej Ogrodniczuk, Ph.D. |
 |
Piotr Rychlik, Ph.D. |
 |
Zygmunt Saloni, Ph.D., Professor |
 |
Filip Skwarski, M.Sc. |
 |
Łukasz Szałkiewicz, M.Sc. |
 |
Jan Szejko, M.Sc. |
 |
Marek Świdziński, Ph.D., Professor |
 |
Jakub Waszczuk, M.Sc. |
 |
Aleksander Wawer, M.Sc. |
 |
Aleksandra Wieczorek, Ph.D. |
 |
Marcin Woliński, Ph.D. |
 |
Beata Wójtowicz, Ph.D. |
 |
Alina Wróblewska, M.Sc. |
 |
Bartosz Zaborowski, M.Sc. |
 |
Sebastian Żurowski, Ph.D.
|
|
|
Foreign Associates
Post-doctoral Practice
 |
Jakub Piskorski, Ph.D. |
|
|
Research Domain
The Linguistic Engineering Group (Pol. Zespół Inżynierii Lingwistycznej;
ZIL) deals with multiple aspects of Natural Language Processing.
ZIL's traditional area of interest is deep syntactic parsing of
Polish, with the use of Definite Clause Grammars (DCG) and generative
linguistic formalisms, such as Head-driven Phrase Structure Grammar
(HPSG) and Lexical Functional Grammar (LFG). For each of these
approaches, a grammar of Polish has been developed and implemented,
with current work concentrating on DCG and LFG.
Another important focus of the Group's research is information extraction
in the broad sense: many publications have been devoted to the automatic
extraction of structured data from domain texts, to named entity
recognition and to shallow parsing in general. Related work
includes the automatic acquisition of linguistic knowledge,
including valence frames, from corpus data.
More recently, ZIL has also worked on the semantic processing of texts,
focusing on word sense disambiguation, coreference resolution and
sentiment analysis. Certain elements of semantic processing are
present in the LFG parser mentioned above. More application-oriented work
in this area concerns automatic summarisation and text
categorisation.
The Group is also active in the area of corpus linguistics. ZIL
coordinated the development of the 1.5-billion-word National Corpus
of Polish (Pol. Narodowy Korpus Języka Polskiego; NKJP), based to some
extent on the earlier IPI PAN Corpus. In the process, the Group created various
tools for manual and automatic corpus annotation at multiple
linguistic levels, an XML schema for corpus annotation, and a manually
annotated 1-million-word subcorpus. This subcorpus is the
empirical basis for the Składnica treebank, which is currently under development;
Składnica has already been used to train a dependency parser for Polish.
Various tools created by the Group are publicly available as open source software.
They include morphosyntactic taggers, the shallow
parser Spejd, the deep parser Świgra, the named entity recogniser Nerf, the
word sense disambiguation platform WSDDE, and the corpus tools Poliqarp and
Anotatornia. The Group is also responsible for the development
of PoliMorf, an open morphological dictionary based on earlier dictionaries
of this kind and intended for use in deep parsing and other applications.
The above tools and resources are used in applications co-developed by
ZIL, e.g., in a multilingual content management system.
ZIL has been, and remains, active in multiple national and international
projects. For more information, please visit
zil.ipipan.waw.pl.
|
|
 |
 |