Visual cluster analysis and dimensionality reduction, combined with visual-interactive interfaces, form the basis for exploratory data analysis support. Many of these techniques are closely related to data mining and interactive machine learning. Research into example-based and sketch-based querying, supported by visual information retrieval, adds the search component. Combining exploration and search leads to powerful interactive user interfaces, tools, and systems that have been put into practice in several research projects.
One example project is VisInfo, a digital library project that enables domain experts in Earth and environmental science to access time-oriented research data from large data repositories. For the first time, domain experts are now able to conduct content-based search and exploration in time-oriented primary data, making use of the Exploratory Search principle.
Exploratory Search explained
Exploration can be described as an information-seeking concept in which analysts initially know neither what to seek nor where to seek it. For large and complex data sets, this is a frequent but particularly challenging task, and one that is also difficult to support from a data scientist's perspective. A central goal of explorers is to gain an overview of large data collections. Insights gained from the overview often include structural information about data sets, such as patterns (clusters, outliers) or relations within and between the data.
Search refers to the formal description of a precise information need (such as a textual query in a search engine). Challenges arise in particular when the information need of searchers relates to the content of the documents itself rather than to associated metadata. While this challenge is manageable for textual content, most other types of content are still subject to intensive research. One line of research addresses visual-interactive interfaces that allow users to formalize content-based queries (query-by-sketch, query-by-example). With such an interface, users can, for example, draw the shape of a time series, execute the query, and automatically retrieve the time series patterns in a database that match the sketch.
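The query-by-sketch idea described above can be illustrated with a minimal sketch in Python. It is not the retrieval method used in VisInfo, which the text does not specify; it assumes a simple stand-in technique in which the sketch and every stored time series are resampled to a common length, z-normalized so that only the shape matters, and ranked by Euclidean distance. All names (resample, z_normalize, shape_distance, query_by_sketch) and the toy database are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def resample(series: Sequence[float], length: int) -> List[float]:
    """Linearly interpolate a series to `length` points (length >= 2)."""
    if len(series) == 1:
        return [float(series[0])] * length
    out = []
    for i in range(length):
        pos = i * (len(series) - 1) / (length - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(series) - 1)
        frac = pos - lo
        out.append(series[lo] * (1 - frac) + series[hi] * frac)
    return out


def z_normalize(series: Sequence[float]) -> List[float]:
    """Remove offset and scale so that only the shape of the curve remains."""
    mean = sum(series) / len(series)
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / len(series))
    if std == 0:
        # A constant series has no shape; map it to all zeros.
        return [0.0] * len(series)
    return [(x - mean) / std for x in series]


def shape_distance(a: Sequence[float], b: Sequence[float], length: int = 32) -> float:
    """Euclidean distance between the resampled, z-normalized series."""
    a_n = z_normalize(resample(a, length))
    b_n = z_normalize(resample(b, length))
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a_n, b_n)))


def query_by_sketch(
    sketch: Sequence[float],
    database: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k (name, distance) pairs, most similar first."""
    scored = [(name, shape_distance(sketch, s)) for name, s in database.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:top_k]


# Hypothetical toy database standing in for a time series repository.
database = {
    "rising": [1, 2, 3, 4, 5, 6],
    "falling": [10, 8, 6, 4, 2, 0],
    "peak": [0, 2, 5, 9, 5, 2, 0],
    "flat": [3, 3, 3, 3, 3],
}

# A user-drawn sketch of a single peak, on an arbitrary scale.
sketch = [0.0, 0.1, 0.5, 1.0, 0.5, 0.1, 0.0]
results = query_by_sketch(sketch, database, top_k=2)
print(results)
```

Because of the normalization, the sketch can be drawn at any scale or offset and still be ranked closest to the stored series with the peak shape. Real systems typically use more robust similarity measures and index structures; those choices are outside the scope of this sketch.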