Hercules Dalianis, KTH
Koenraad de Smedt, Univ in Bergen
Jürgen Wedekind, CST Copenhagen
Janne Bondi Johannessen, Univ in Oslo
Helge Dyvik, Univ in Bergen
Henrik Holmboe, Norfa
Victoria Rosén, Univ in Bergen
Veronika Haderlein, FAST Search and Transfer, Oslo
Lotte Weilgaard, Syddansk Univ, Kolding
Margrete H. Møller, Syddansk Univ, Kolding
Hercules described the current Swedish text summarizer; see also the
previous minutes from the first ScandSum meeting in Åre.
Janne Bondi Johannessen described the Oslo-Bergen tagger, which can be used
for tagging Norwegian texts. The Oslo-Bergen tagger can be adapted to generate
SGML-based tags. The coming new version of the SweSum summarization engine will
be able to read these tags and hence summarize Norwegian texts.
The tagger can be tested here: http://decentius.hit.uib.no:8005/cl/cgp/test.html
We plan to use a server-based tagger that can be reached over the SSH
protocol, so that the new summarization system will be fully distributed and hence
easy to support and update.
The same will apply to the Granska tagger.
The tagger format will be similar to the following:
<text>
<paragraph>
<sentence>
<clause>
<word lemma="ha" tag="verb">Har</word>
...
</clause>
</sentence>
</paragraph>
</text>
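As a minimal sketch of how a summarizer might read this format, the Python below parses a small sample with the standard library and collects the words of each sentence. The function name `read_tagged_text` and the second `<word>` element in the sample are assumptions made for illustration; they are not part of the tagger or of SweSum.

```python
import xml.etree.ElementTree as ET

# Sample in the format above. The first <word> is taken from the minutes;
# the second one is a made-up example added only to have two tokens.
SAMPLE = """<text>
<paragraph>
<sentence>
<clause>
<word lemma="ha" tag="verb">Har</word>
<word lemma="du" tag="pron">du</word>
</clause>
</sentence>
</paragraph>
</text>"""


def read_tagged_text(xml_string):
    """Return a list of sentences, each a list of (form, lemma, tag) triples."""
    root = ET.fromstring(xml_string)
    sentences = []
    for sentence in root.iter("sentence"):
        # iter() descends through <clause>, so clause nesting is ignored here.
        words = [
            (word.text, word.get("lemma"), word.get("tag"))
            for word in sentence.iter("word")
        ]
        sentences.append(words)
    return sentences


sentences = read_tagged_text(SAMPLE)
```

A summarizer would typically work on the lemma rather than the surface form, so that inflected forms of the same keyword are counted together.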
However, if the text structure is not always hierarchical, we must be able to
handle that, for example by adding these tags:
<div type="s"> between sentences
<word> eksemplet <lemma type="eksempel" cat="subst">
</word>
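One possible way to handle such a flat stream is sketched below in Python: the words are read in order and a new sentence is started at every sentence marker. The sketch assumes that the markers and the `<lemma>` elements are written as empty, self-closed elements (`<div type="s"/>`, `<lemma .../>`) so that the input is well-formed XML; the minutes do not settle this, and the function name `split_on_sentence_marks` and the second word in the sample are hypothetical.

```python
import xml.etree.ElementTree as ET

# Flat variant of the format: no <sentence> nesting, only boundary markers.
# Self-closed <div/> and <lemma/> are an assumption, not a decided format.
FLAT_SAMPLE = """<text>
<word>eksemplet<lemma type="eksempel" cat="subst"/></word>
<div type="s"/>
<word>Har<lemma type="ha" cat="verb"/></word>
</text>"""


def split_on_sentence_marks(xml_string):
    """Group a flat word stream into sentences at <div type="s"/> markers."""
    root = ET.fromstring(xml_string)
    sentences, current = [], []
    for element in root:
        if element.tag == "div" and element.get("type") == "s":
            # A marker closes the sentence collected so far.
            if current:
                sentences.append(current)
            current = []
        elif element.tag == "word":
            lemma = element.find("lemma")
            form = (element.text or "").strip()
            base = lemma.get("type") if lemma is not None else None
            category = lemma.get("cat") if lemma is not None else None
            current.append((form, base, category))
    if current:
        sentences.append(current)
    return sentences


flat_sentences = split_on_sentence_marks(FLAT_SAMPLE)
```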
Victoria Rosén described the lexicon from the SCARRIE project. This
lexicon contains relations between possible lexical variants in Bokmål, for
example between høyesterett and høgsterett. This offers great potential
for grouping variants of keywords. The SCARRIE lexicon can be used as a basis for
constructing a word list for the current SweSum architecture, or it can be used to
define keyword links on the output of the tagger as well as on user keywords.
A paper on SCARRIE is available at
http://ling.uib.no/~desmedt/papers/MONS8-paper.html
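To illustrate how variant relations could be used to group keywords, here is a minimal Python sketch that maps each variant to one representative form before counting. The table `VARIANT_TO_CANONICAL`, the choice of which variant acts as representative, and the function `count_keywords` are assumptions for illustration; the actual SCARRIE lexicon format is not shown in these minutes.

```python
from collections import Counter

# Hypothetical variant table. Only the pair høyesterett / høgsterett comes
# from the minutes; picking "høyesterett" as representative is arbitrary.
VARIANT_TO_CANONICAL = {
    "høgsterett": "høyesterett",
}


def count_keywords(lemmas, variants=VARIANT_TO_CANONICAL):
    """Count lemmas after replacing each known variant by its representative."""
    return Counter(variants.get(lemma, lemma) for lemma in lemmas)


# "dom" is just an unrelated filler lemma for the example.
counts = count_keywords(["høyesterett", "høgsterett", "dom"])
```

With this grouping, both spellings contribute to the same keyword count, which is the effect one would want when scoring sentences by keyword frequency.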
Helge Dyvik talked about word senses based on parallel corpora.
Helge described a method that goes back and forth between processed parallel
corpora: starting with one lexical item, it finds the most closely related items in
the corpus of the other language, and then goes back again to the first language,
hence finding the lexical items that are most closely related semantically. With
sufficiently large parallel corpora, this method can partly automate the building
of wordnets / ontologies.
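The back-and-forth step can be sketched roughly as follows in Python. This is a simplified illustration, not Helge's actual method: it assumes the parallel corpus has already been reduced to word-aligned pairs, and it scores relatedness simply by the number of shared translations. The toy alignments and the names `build_indexes` and `related_words` are invented for the example.

```python
from collections import Counter, defaultdict

# Invented toy alignments (source word, target word); not real corpus data.
ALIGNED_PAIRS = [
    ("stor", "big"),
    ("stor", "large"),
    ("svær", "big"),
    ("svær", "huge"),
    ("liten", "small"),
]


def build_indexes(aligned_pairs):
    """Build translation lookups in both directions."""
    forward, backward = defaultdict(set), defaultdict(set)
    for source, target in aligned_pairs:
        forward[source].add(target)
        backward[target].add(source)
    return forward, backward


def related_words(word, forward, backward):
    """Go to the other language and back; count shared translations."""
    related = Counter()
    for target in forward.get(word, ()):
        for source in backward[target]:
            if source != word:
                related[source] += 1
    return related


forward, backward = build_indexes(ALIGNED_PAIRS)
related = related_words("stor", forward, backward)
```

In this toy data, "stor" and "svær" come out as related because they share the translation "big", while "liten" shares none and is not linked.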
Viggo Kann KTH
Bergen people
Janne Bondi Johannessen / Paul Meurer Univ of Bergen
Ari Pirkola or Kalervo Järvelin University in Tampere Finland
Kristin Bjaradottir, Inst. of Lexicography, Iceland
Tiit Roosma University of Tartu, Estonia
Everita Milconoka or Inguna Skadina Univ of Latvia
Vidas Daudaravicius CLC at Vytautas Magnus University, Kaunas, Lithuania.
Veronika Haderlein FAST Search and Transfer Oslo
* 13-15 Sept 2002, Skagen, Denmark
* 25-28 Jan 2003, Geilo or Voss, Norway
* 5-8 April 2003, Åre
Invite new nodes / people to Skagen
Make the Norwegian tagger work
Prepare discussion of Danish resources
Hercules OH-slides (SweSum) and slideshow
Janne Bondi Johannessen OH-slides
Victoria Rosén
Helge Dyvik Documentation: A Translational Basis for Semantics*
Latest change Aug 15, 2002