University of Southern Denmark
 
 

PANORAMA (PArallel NORdic Annotated Multilingual corporA) is a cross-Nordic project aimed at providing high-quality parallel text resources for all Nordic languages: Danish, Norwegian, Swedish, Finnish, Icelandic, Faroese, Sami and Greenlandic, with major EU languages such as English or German as optional reference languages. As part of the initiative, the necessary annotation tools (taggers/parsers), as well as corpus formatting and searching tools, will be enhanced or - if necessary - developed from scratch, depending on the language. Both in its methods and its goals, PANORAMA has a strong focus on Human Language Technology (HLT) and strives to further the development of independent Nordic HLT resources, such as multilingual dictionaries, internet search tools and machine translation. All corpus data will be made freely accessible and searchable through a specially designed web interface.

PANORAMA was launched in 2007 as a 4-year cooperation between the universities of Odense (University of Southern Denmark), Oslo and Tromsø. All participating researchers are experts in corpus linguistics and contribute expertise from earlier HLT projects. The project has been supported by the Nordic Council of Ministers with a 1-year networking grant under the NordplusSprog framework.

Administration and central programming are handled by VISL staff at the ISK, University of Southern Denmark. The following institutions participate directly in the project (responsible researcher in parentheses):

  • Institute of Language and Communication, University of Southern Denmark (Eckhard Bick): Danish, Swedish, English and German annotation
  • The Text Laboratory, University of Oslo (Janne Bondi Johannessen): Norwegian annotation (Kristin Hagen), parallel search interface (Lars Nygaard)
  • Tromsø University: Sami, Faroese and Greenlandic annotation (Trond Trosterud), Icelandic annotation (Gunnar Hrafn Hrafnbjargarson)

PANORAMA organized two workshops in 2007 and plans further regular workshops to coordinate activities. The workshops are in principle open to students and interested researchers from other institutions, and PANORAMA may actively invite HLT specialists for specific languages.

  • Odense, 9.-1. March 2007
  • Fefor, 16.-18. November 2007

PANORAMA-related tools and resources

  • The project has adopted the Glossa search interface infrastructure, which also handles parallel data. PANORAMA is contributing to the design and programming of the Glossa tool.
  • For its annotation and alignment work, PANORAMA is also using preexisting text collections, such as the Europarl corpus and the OPUS parallel text collection. Here, the project has a special interest in the OpenSubtitles collection.
  • PANORAMA is also collecting new parallel data, both from EU publications and from various Internet sites. In 2007, alignment experiments were conducted on a parallel text collection of EU Commission publications.
  • Alignment tools: PANORAMA uses several different approaches to sentence alignment, among them a new implementation of the Gale & Church algorithm and the Translation Corpus Aligner (TCA). For word and chunk alignment, we are currently testing various strategies, among them a new approach based on Constraint Grammar dependency annotation.
  • The project

PANORAMA annotation standards

For the alignment of bilingual corpora, an XML-style annotation is used, with corpus information in a header section and alignment links at three levels: 1. sentence (link), 2. word (wlink) and 3. chunk (clink). Each link carries an xtargets attribute that lists the source-side ids and the target-side ids, separated by a semicolon. Chunks reflect syntactic structural units and exploit constituent or dependency information. In the XML scheme, the word (wlink) and chunk (clink) lines are optional.

<link id="..." xtargets="1;1">
  <clink id="..." xtargets="2 3;3 4 5"/>
  <clink id="..." xtargets="4 5;1 2"/>
  <wlink id="..." xtargets="1;1"/>
  <wlink id="..." xtargets="2;3"/>
  <wlink id="..." xtargets="3;5"/>
  <wlink id="..." xtargets="5;2"/>
</link>
<link id="..." xtargets="2;2">
  <clink id="..." xtargets="7 8 9 10;9 11 12 13 14"/>
  <clink id="..." xtargets="9 10;9 12 13 14"/>
  <wlink id="..." xtargets="1;1"/>
  <wlink id="..." xtargets="2;2"/>
  <wlink id="..." xtargets="3 4;3"/>
  <wlink id="..." xtargets="5;4 5 8"/>
  <wlink id="..." xtargets="6;9"/>
  <wlink id="..." xtargets="7;10"/>
  <wlink id="..." xtargets="8;11"/>
  <wlink id="..." xtargets="10;14"/>
  <wlink id="..." xtargets="11;15"/>
</link>
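
As an illustration only, the link structure above could be read with a few lines of Python. This is a minimal sketch and not part of the PANORAMA tools: the <linkGrp> wrapper element, the id values and the function names split_xtargets and read_links are assumptions introduced here so that the fragment parses as a single XML document. It also assumes that every xtargets value contains exactly one semicolon.

```python
import xml.etree.ElementTree as ET

# Sample in the shape shown above. The <linkGrp> wrapper and the id values
# are assumptions, added only so the fragment is one well-formed document.
SAMPLE = """
<linkGrp>
  <link id="s1" xtargets="1;1">
    <clink id="c1" xtargets="2 3;3 4 5"/>
    <wlink id="w1" xtargets="1;1"/>
    <wlink id="w2" xtargets="3 4;3"/>
  </link>
</linkGrp>
"""


def split_xtargets(value):
    """Split 'a b;c d' into (['a', 'b'], ['c', 'd'])."""
    source, target = value.split(";")
    return source.split(), target.split()


def read_links(xml_text):
    """Return one dict per sentence link, with its chunk and word links."""
    root = ET.fromstring(xml_text)
    result = []
    for link in root.iter("link"):
        result.append({
            "sentence": split_xtargets(link.get("xtargets")),
            # clink and wlink are optional, so these lists may be empty
            "chunks": [split_xtargets(c.get("xtargets"))
                       for c in link.findall("clink")],
            "words": [split_xtargets(w.get("xtargets"))
                      for w in link.findall("wlink")],
        })
    return result


links = read_links(SAMPLE)
for entry in links:
    print(entry["sentence"], len(entry["chunks"]), len(entry["words"]))
```

Ids are kept as strings here, since the scheme above does not state whether they are always numeric.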

PANORAMA-related texts:

 


Copyright 1996-2014