Demo Paper on Ad-hoc Structured Text Exploration accepted to ACM SIGMOD/PODS 2022

Our demo paper “Demonstrating ASET: Ad-hoc Structured Exploration of Text Collections” by Benjamin Hättasch and Jan-Micha Bodensohn will be presented at the International Conference on Management of Data 2022 in Philadelphia.

2022/03/07

In this demo, we present ASET, a novel tool to explore the contents of unstructured data (text) by automatically transforming relevant parts into tabular form.

ASET works in an ad-hoc manner without the need to curate extraction pipelines for the (unseen) text collection or to annotate large amounts of training data. The main idea is a new two-phase approach. In the first phase, ASET extracts a superset of information nuggets from the texts using existing extractors such as named entity recognizers. In the second phase, it leverages embeddings and a novel matching strategy to match the extractions to a structured table definition as requested by the user. This demo features the ASET system with a graphical user interface that allows people without machine learning or programming expertise to explore text collections efficiently. They can do so in a self-directed and flexible manner, and ASET provides an intuitive impression of the result quality.
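
To make the two phases more concrete, here is a minimal sketch in Python. It is not ASET's implementation: the regex extractors stand in for real named entity recognizers, the character-trigram vectors stand in for learned embeddings, and the names `EXTRACTORS`, `extract_nuggets`, `embed`, and `fill_row` are hypothetical and chosen only for illustration. The sketch only shows the general shape of the idea: first collect a superset of candidate nuggets, then match them to the attributes of a user-defined table by similarity.

```python
import math
import re
from collections import Counter

# Phase 1 (assumption): regex extractors stand in for real NER components.
EXTRACTORS = {
    "date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "person name": re.compile(r"[A-Z][a-z]+ [A-Z][a-z]+"),
}


def extract_nuggets(text):
    """Return a superset of candidate nuggets as (label, value) pairs."""
    return [
        (label, match.group())
        for label, pattern in EXTRACTORS.items()
        for match in pattern.finditer(text)
    ]


def embed(phrase):
    """Toy embedding: character-trigram counts of the padded, lowercased phrase."""
    padded = f"  {phrase.lower()}  "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def fill_row(text, attributes):
    """Phase 2: for each requested attribute, pick the most similar nugget."""
    nuggets = extract_nuggets(text)
    row = {}
    for attribute in attributes:
        target = embed(attribute)
        # Score every nugget by the similarity of its label to the attribute.
        scored = [(cosine(target, embed(label)), value) for label, value in nuggets]
        best_score, best_value = max(scored, default=(0.0, None))
        row[attribute] = best_value if best_score > 0 else None
    return row


if __name__ == "__main__":
    document = "Jane Doe reported the incident on 2021-05-04."
    print(fill_row(document, ["name", "date"]))
```

Here the user asks for the columns `name` and `date` without configuring anything in advance, and the matching step decides which extracted nugget fills which cell. A document with no matching nuggets yields `None` for the attribute, which corresponds to an empty table cell.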

Learn more about ASET in this video:

Demo Video
