Demo Paper on Ad-hoc Structured Text Exploration accepted to ACM SIGMOD/PODS 2022

2022/03/07

Our demo paper "Demonstrating ASET: Ad-hoc Structured Exploration of Text Collections" by Benjamin Hättasch and Jan-Micha Bodensohn will be presented at the International Conference on Management of Data 2022 in Philadelphia.

In this demo, we present ASET, a novel tool to explore the contents of unstructured data (text) by automatically transforming relevant parts into tabular form.

ASET works in an ad-hoc manner without the need to curate extraction pipelines for the (unseen) text collection or to annotate large amounts of training data. The main idea is a new two-phase approach. The first phase extracts a superset of information nuggets from the texts using existing extractors such as named entity recognizers. The second phase leverages embeddings and a novel matching strategy to match the extractions to a structured table definition as requested by the user. This demo features the ASET system with a graphical user interface that allows people without machine learning or programming expertise to explore text collections efficiently. They can do so in a self-directed and flexible manner, and ASET provides an intuitive impression of the result quality.
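To give a rough feeling for the two phases described above, here is a minimal toy sketch in Python. It is not ASET's actual implementation: the regular-expression extractors merely stand in for real extractors such as named entity recognizers, the bag-of-words vectors stand in for learned embeddings, and plain cosine similarity stands in for the matching strategy. All names (`EXTRACTORS`, `extract_nuggets`, `match_to_table`) are hypothetical and chosen for illustration only.

```python
import math
import re
from collections import Counter

# Phase 1 stand-in: illustrative regex "extractors" (assumption; a real
# system would plug in extractors such as named entity recognizers).
EXTRACTORS = {
    "date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "person": re.compile(r"[A-Z][a-z]+ [A-Z][a-z]+"),
}


def extract_nuggets(text):
    """Collect every candidate nugget with its label and neighbouring words."""
    nuggets = []
    for label, pattern in EXTRACTORS.items():
        for m in pattern.finditer(text):
            before = re.findall(r"\w+", text[:m.start()])[-1:]  # one word before
            after = re.findall(r"\w+", text[m.end():])[:1]      # one word after
            nuggets.append(
                {"text": m.group(), "label": label, "context": before + after}
            )
    return nuggets


def embed(tokens):
    """Toy embedding: a lowercase bag-of-words vector (assumption)."""
    return Counter(t.lower() for t in tokens)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def match_to_table(nuggets, attributes):
    """Phase 2 stand-in: fill each requested column with the best-scoring nugget."""
    row = {}
    for attr in attributes:
        query = embed(attr.split())
        best, best_score = None, 0.0
        for n in nuggets:
            score = cosine(query, embed([n["label"]] + n["context"]))
            if score > best_score:
                best, best_score = n["text"], score
        row[attr] = best  # stays None when no nugget resembles the column
    return row


doc = "Report filed on 2021-05-04 by pilot Jane Doe."
row = match_to_table(extract_nuggets(doc), ["date", "pilot", "airline"])
print(row)
```

In this sketch, the user-requested columns `date`, `pilot` and `airline` play the role of the structured table definition. A column for which no extracted nugget looks similar is simply left empty instead of being filled with a guess.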

Learn more about ASET in this video:

Demo Video
