Together with TU Darmstadt, researchers at DIPF were involved in several projects.

Extracting, analyzing, and visualising information

The field of “extraction and visualisation of information” focused on the automated extraction of metadata and other information relevant for search and analysis from unstructured texts, as well as on the effective visualisation and interpretation of such information. This enabled users to carry out targeted search queries on the web and in the DIPF information systems.

The automated procedures developed include, for instance, keyword extraction methods, automated indexing, search query expansion, and the identification of research methods described in the relevant publications. Furthermore, clustering and classification were applied to structure information, and informetric procedures served to deliver information on educational research and educational science, based on the literature and research information available at DIPF.
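
One of the procedures named above, keyword extraction, can be illustrated with a minimal sketch. It assumes TF-IDF weighting, a common baseline that is not necessarily the method used in these projects; the function names `tokenize` and `tfidf_keywords` are hypothetical and chosen only for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_keywords(docs, top_k=3):
    """Return the top_k highest-scoring TF-IDF terms for each document.

    Illustrative baseline only (an assumption, not the projects' method).
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    results = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Sort by descending score, then alphabetically for stable output.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:top_k])
    return results
```

Terms that occur in every document receive a score of zero, so frequent but uninformative words drop to the bottom of the ranking.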

In this field, research was carried out in close co-operation with the DIPF graduate programme “Knowledge Discovery in Scientific Literature” (KDSL), TU Darmstadt, the DIPF infrastructure and its users.

Educational Text Analytics: Automatically Grading Text Response

Initial work in automatically grading text responses using text analytics concerned participation in the 2013 SemEval challenge on textual entailment, applying automatic grading methods to a novel corpus of children's essays, and publishing a survey article on the state of the art of short answer grading methods.

Educational Monitoring on the web – Identifying and Following Educationally Relevant Controversies

The project aimed to put into practice a system for monitoring public opinion on the most important educationally relevant topics that are the subject of controversy on the internet. To this end, tools were developed that provide new opportunities for tracking and analyzing controversies relevant to education.

Feature-based Visualization and Analysis of Natural Language Documents (VisADoc)

This project, implemented in cooperation with the University of Konstanz, aimed to investigate novel textual features for modeling content-related text properties, to develop an interactive feature engineering approach for complex user-defined semantic properties, and to develop visual analysis tools that support the exploration of large document collections with respect to a certain text property.

Knowledge Discovery in Scientific Literature

The main topic of this PhD program was knowledge discovery in the vast amount of scientific literature ubiquitously available on the Web and in historical texts. This research employed methods of intelligent identification and analysis of structures in scientific texts on all scales, enabling completely new, previously unforeseen forms of access to scientific information.

Visualising Complex Data in Education

The project aimed to support educational information and educational research in the assessment of complex data. The focus was on processing natural language data, which occurs in many forms in the field of education, such as free-text questions in studies or publications.

Text mining and information search

This field focused on a search technique, as intuitive and high-quality as possible, for the web and for the information portals offered by the Information Center for Education. Furthermore, editors of the German Education Server were supplied with intelligent tools to support their work on compiling content.

In the field of search, the scientists investigated semantic methods that enable users to find more relevant documents. Here, the large-scale lexical-semantic resource UBY for German and English was applied, drawing on its semantic links.
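
The idea of using semantic links to find more relevant documents can be sketched as simple query expansion. The example below is a minimal illustration under stated assumptions: `SYNONYMS` is a hypothetical toy lexicon standing in for a lexical-semantic resource, and `expand_query` is a hypothetical helper; neither reflects the actual UBY interface.

```python
# Hypothetical toy lexicon; a real system would look up related terms
# in a lexical-semantic resource instead of a hard-coded dictionary.
SYNONYMS = {
    "teacher": ["educator", "instructor"],
    "school": ["academy"],
}


def expand_query(query, lexicon=SYNONYMS):
    """Return the query terms followed by related terms, without duplicates."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        for related in lexicon.get(term, []):
            if related not in expanded:
                expanded.append(related)
    return expanded
```

A search engine could then match documents against any of the expanded terms, so that a query for "teacher" also retrieves documents that only mention "educator".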

Concerning the creation of content, high-quality editing as performed by the German Education Server required considerable effort from its editors. This field of research investigated methods that can support work that has so far mostly been done manually. Examples include the retrieval of further links on the basis of existing ones, the assignment of index terms, and the automated compilation of summaries of websites relevant to education. Such methods are based on machine learning and linguistically inspired text characteristics.

The high quality of the resources found and of the content created was assured by the development of automated quality analysis procedures as well as by final quality checks carried out by the editors of the German Education Server.

Automatic Coding of Free Text Formats for Elaborated Educational Measurement

AKTeur pursued the central objective of creating an interdisciplinary co-operation network and a broad technological basis for automated coding of free text responses in manifold scenarios.

Information Extraction From Spoken and Informal Language

In this project, various aspects of extracting and using information from data that contains spoken or informal language were examined. These aspects included, among others, classification (what makes a good answer?), segmentation and keyphrase extraction (in the context of transcripts of school lessons), and summarization of data for Eduserver. In each sub-project, either the data came from the educational domain or the goal was to provide information or tools to researchers in the area of educational research.

Contextualized Information Processing of Educational Content Using Automatic Summarization (Methods)

In co-operation with DIPF, a Master’s thesis at Technical University Darmstadt presented initial approaches to a method that offers machine-based support for manually summarizing data collections in the field of educational research.

Innovative services for educational research data

The innovative services for educational research data aimed to reduce the manual analysis effort required of educational researchers, thus giving them more time for the research itself. Research in this area was conducted in cooperation with DIPF.