Text Analytics: Cross-lingual Link Discovery

Description

Cross-lingual text analysis is an essential research topic in both Natural Language Processing (NLP) and Information Retrieval (IR), and it has long been well received in the research communities. This seminar course studies the task of cross-lingual link discovery (CrossLink), which has become an active research topic in recent years.

CrossLink is the task of “automatically finding potential links between documents in different languages,” where such a system “actively recommends a set of meaningful anchors in the source document and … establishes links with documents in other languages” (NTCIR-9 CrossLink task description).
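To make the task definition concrete, here is a minimal sketch of a dictionary-based baseline in Python. It is not the NTCIR-9 reference method or the system expected in this course. The function name recommend_links, the max_len parameter, and the toy EN_TO_DE mapping are all assumptions made for illustration. The sketch scans an English source text for phrases that appear in a title dictionary (the anchors) and pairs each one with a German article title (the link target).

```python
import re

# Hypothetical toy mapping from lowercased English anchor phrases to German
# article titles. A real system might derive such a table from Wikipedia's
# interlanguage links; these few entries are for illustration only.
EN_TO_DE = {
    "machine translation": "Maschinelle Übersetzung",
    "information retrieval": "Information Retrieval",
    "germany": "Deutschland",
}


def recommend_links(text, title_map, max_len=4):
    """Greedy longest-match anchor detection over word n-grams.

    Returns a list of (anchor_text, target_title) pairs in document order.
    """
    # Record each word together with its character offsets in the text.
    tokens = [(m.start(), m.end()) for m in re.finditer(r"\w+", text)]
    links = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            start, end = tokens[i][0], tokens[i + n - 1][1]
            phrase = text[start:end]
            target = title_map.get(phrase.lower())
            if target is not None:
                links.append((phrase, target))
                i += n  # skip past the matched anchor
                matched = True
                break
        if not matched:
            i += 1
    return links


source = "Machine translation and information retrieval are studied in Germany."
for anchor, target in recommend_links(source, EN_TO_DE):
    print(anchor, "->", target)
```

A baseline like this links every dictionary match, so it cannot tell meaningful anchors from incidental ones; ranking and disambiguating candidate anchors is where the approaches covered in the student presentations come in.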

The course consists of two parts. In the first few weeks, student presentations cover the general approaches to CrossLink. In the second part, students gain hands-on experience by participating in an in-class evaluation workshop: by the end of the semester, each student will develop their own CrossLink system using a subset of the English and German Wikipedia corpora and submit runs on a set of test topics. Submissions will be evaluated automatically, and the systems will be analyzed based on their results.

Expectations

Students are expected to perform the following tasks:

  • present a talk on a piece of previous work of their choice
  • develop their own CrossLink system
  • submit a run produced by their CrossLink system
  • present a talk and submit a report on their CrossLink system

Organization

  • Lectures are on Thursdays, 13:30 – 15:10 in S2|02|C110.
  • The course is taught in English; student presentations and term papers must also be in English.
  • TUCaN
  • Moodle
  • Paper sign-up for student presentation on previous work: 25.04.2013
  • CrossLink run, system report, and presentation due: 11.07.2013

Literature

Timetable

Week Date Topic
1 Apr 18 Welcome and course organization
2 Apr 25 Introduction to CrossLink (paper sign-up for student presentation on previous work)
3 May 02 Introduction to CrossLink
4 May 09 No Class (Ascension)
5 May 16 Student presentations on previous work
6 May 23 Student presentations on previous work
7 May 30 No Class (Corpus Christi)
8 Jun 06 Student presentations on previous work
9 Jun 13 Student presentations on previous work
10 Jun 20 No Class (Lecturer on Business Trip)
11 Jun 27 Technical Q & A
12 Jul 04 Technical Q & A
13 Jul 11 Student presentations on their CrossLink system (Due: CrossLink runs and system reports)
14 Jul 18 Evaluation results released / discussion / wrap-up

Lecturers

  • Dr. Jungi Kim (office hours: Monday 4-5pm, please register by e-mail)
  • Prof. Dr. Iryna Gurevych