Semantics in Automatic Language Understanding (Seminar)

Contents

Given the huge amount of unstructured text in today’s growing information society, the automatic understanding of natural (human) language is becoming increasingly important. To improve web search, the automatic summarization and translation of text documents, and language-based human-computer interaction, computers must understand the semantics of words. A standard approach to automatically extracting semantic knowledge is the analysis of large corpora, i.e. comprehensive text collections. In contrast, more recent methods use machine-readable language resources such as WordNet, Wikipedia, or Wiktionary. Collaboratively constructed knowledge sources (Wikipedia and Wiktionary) are of special interest here, as they grow quickly and provide an increasing amount of valuable knowledge for automatic language understanding.

The goal of the seminar is to learn about different approaches to extracting and applying semantic knowledge for automatic language understanding. Topics covered include, among others:

  • Automatic understanding of word senses
  • Semantic relatedness measures
  • Generation of paraphrases
  • Ontologies for automatic language understanding
  • Wikipedia and Wiktionary as knowledge sources

The seminar gives students the opportunity to acquire key skills, such as the basics of scientific work and presentation skills.

Timetable

  • Seminar sessions: Tuesday, 15:20 – 17:00, Room S2/02/E102, starting Oct 13, 2009
  • Office hours: Thursday, 15:00 – 16:00, Room S2/02/E202, starting Oct 15, 2009. Please send an email in advance! Minor issues may also be discussed using our forum.
  • Submission deadline for written reports: Monday, March 15, 2010

Presentation

  • For general advice on presenting your topic, please have a look at these guidelines.
  • Templates: Presentation (PowerPoint), Report (LaTeX)

Sessions

All presentations can be found in our Moodle!

  • Oct 13, 2009: Introduction & Topic Assignment
    Language Technology: An Overview (Prof. Dr. Iryna Gurevych)
  • Oct 20, 2009: Introduction II
    Introduction to Natural Language Processing (Daniel Bär)
  • Oct 27, 2009: Introduction III
    Introduction to Lexical-Semantic Resources (Elisabeth Wolf)
  • Nov 3 + 10, 2009: Preparation Phase
    No seminar sessions!
  • Nov 17, 2009
    Alexander Stumpf:
    D. Milne and I. H. Witten:
    An Effective, Low-Cost Measure of Semantic Relatedness Obtained from Wikipedia Links
    In Proceedings of the First AAAI Workshop on Wikipedia and Artificial Intelligence, Chicago, IL, 2008.
    Mateusz Parzonka:
    E. Gabrilovich and S. Markovitch:
    Computing Semantic Relatedness Using Wikipedia-Based Explicit Semantic Analysis
    In Proceedings of the 20th International Joint Conference on Artificial Intelligence, pp. 1606–1611, 2007.
  • Nov 24, 2009
    Andre Wisplinghoff:
    T. Zesch, C. Müller, and I. Gurevych:
    Using Wiktionary for Computing Semantic Relatedness
    In Proceedings of the 23rd AAAI Conference on Artificial Intelligence, Chicago, IL, pp. 861–867, 2008.
  • Dec 1, 2009
    Holger Pontow:
    Magnini, B., Strapparava, C., Pezzulo, G., and Gliozzo, A.
    The Role of Domain Information in Word Sense Disambiguation.
    Natural Language Engineering 8(4), pp. 359–373, Dec. 2002.
  • Dec 8, 2009
    Benedikt Conrad:
    Banerjee, S. and Pedersen, T.
    An Adapted Lesk Algorithm for Word Sense Disambiguation Using WordNet.
    In Proceedings of the Third International Conference on Computational Linguistics and Intelligent Text Processing. A. F. Gelbukh, Ed. Lecture Notes in Computer Science, vol. 2276. Springer-Verlag, London, pp. 136–145, 2002.

    Rizwana Yasmeen:
    R. Mihalcea.
    Using Wikipedia for automatic word sense disambiguation.
    In Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics, Rochester, New York, April 2007.
  • Dec 15, 2009
    Christian Kirschner:
    Agirre, E. and Soroa, A.
    Personalizing PageRank for Word Sense Disambiguation.
    In Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics (EACL). Athens, Greece. 2009.

    Christian Fritz:
    E. Yeh, D. Ramage, C. D. Manning, E. Agirre, and A. Soroa:
    WikiWalk: Random Walks on Wikipedia for Semantic Relatedness
    In Proceedings of the Workshop on Graph-based Methods for Natural Language Processing, Singapore, August 2009.
  • Jan 12, 2010
    Anqi Wang:
    Lapata, M. and Keller, F.
    An information retrieval approach to sense ranking.
    In Proceedings of NAACL-2007, pp. 348–355, Rochester, 2007.

    Daniel Reker:
    McCarthy, D., Koeling, R., Weeds, J., and Carroll, J.
    Finding predominant word senses in untagged text.
    In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Morristown, NJ, 2004.
  • Jan 19, 2010
    Rouven Röhrig:
    C. Corley and R. Mihalcea:
    Measuring the Semantic Similarity of Texts
    In Proceedings of the ACL Workshop on Empirical Modeling of Semantic Equivalence and Entailment, Ann Arbor, June 2005.

    Tobias Wieschnowsky:
    L. C. Wee and S. Hassan:
    Exploiting Wikipedia for Directional Inferential Text Similarity
    In Proceedings of the 5th International Conference on Information Technology: New Generations, pp. 686–691, April 2008.
  • Jan 26, 2010
    Christoph Korn:
    Navigli, R. and Lapata, M.
    Graph Connectivity Measures for Unsupervised Word Sense Disambiguation.
    In Proceedings of the 20th International Joint Conference on Artificial Intelligence, pp. 1683–1688. Hyderabad, India. 2007.

    Artem Uspenskiy:
    Erk, K. and McCarthy, D.
    Graded Word Sense Assignment.
    In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, pp. 440–449, 2009.
  • Feb 2, 2010
    Henning Koes:
    Ö. Uzuner, B. Katz, and T. Nahnsen:
    Using Syntactic Information to Identify Plagiarism
    In Proceedings of the 2nd Workshop on Building Educational Applications Using NLP, Ann Arbor, pp. 37–44, June 2005.

    Sebastian Freutel:
    M. D. Lee, B. Pincombe, and M. Welsh:
    An Empirical Evaluation of Models of Text Document Similarity
    In Proceedings of the 27th Annual Conference of the Cognitive Science Society, pp. 1254–1259, 2005.

Lecturers

  • Prof. Dr. Iryna Gurevych
  • Daniel Bär
  • Elisabeth Wolf

Literature

  • Iryna Gurevych. Das World Wide Web als computerlinguistische Ressource. In: Ralf Klabunde, Kai-Uwe Carstensen, Christian Ebert, Cornelia Endriss, Hagen Langer, and Susanne Jekat: Computerlinguistik und Sprachtechnologie – Eine Einführung, p. (to appear), Springer Verlag, January 2009.
  • Torsten Zesch, Iryna Gurevych, and Max Mühlhäuser. Analyzing and Accessing Wikipedia as a Lexical Semantic Resource. In: Data Structures for Linguistic Resources and Applications, pp. 197–205, Gunter Narr, Tübingen, April 2007.
  • Torsten Zesch, Christof Müller, and Iryna Gurevych. Extracting Lexical Semantic Knowledge from Wikipedia and Wiktionary. In: Proceedings of the Conference on Language Resources and Evaluation (LREC), electronic proceedings, May 2008.