Further Reading

Since this application was developed in a research project, several publications are available that discuss the concepts used in the visualization.

Exploring Big Data Landscapes with a Glyph-based Zoomable User Interface (September 2018)

High-dimensional data sets are hard to explore using common spreadsheet environments. However, data scientists need to develop appropriate clustering and classification algorithms to make sense of big data repositories. Even sophisticated analysis tools often focus on mathematical tasks and offer only basic data visualization with few interactive features. In order to gain more sophisticated insights and test hypotheses with regard to high-dimensional data sets, we developed an interactive zoomable user interface using glyph-based visualizations. The visualization is based on a two-dimensional plot of the data space using multi-dimensional reduction. The resulting Big Data Landscapes are then explored with various controls, filters, and details on demand.

Dietrich Kammer, Mandy Keck, and Thomas Gründer. 2018. Exploring Big Data Landscapes with a Glyph-based Zoomable User Interface. In: Dachselt, R. & Weber, G. (Eds.), Mensch und Computer 2018 – Workshopband. Bonn: Gesellschaft für Informatik e.V.

Big data landscapes: improving the visualization of machine learning-based clustering algorithms (May 2018)

With the internet, massively heterogeneous data sources need to be understood and classified to provide suitable services to users such as content observation, data exploration, e-commerce, or adaptive learning environments. The key to providing these services is applying machine learning (ML) in order to generate structures via clustering and classification. Due to the intricate processes involved in ML, visual tools are needed to support designing and evaluating the ML pipelines. In this contribution, we propose a comprehensive tool that facilitates the analysis and design of ML-based clustering algorithms using multiple visualization features such as semantic zoom, glyphs, and histograms.

Dietrich Kammer, Mandy Keck, Thomas Gründer, and Rainer Groh. 2018. Big data landscapes: improving the visualization of machine learning-based clustering algorithms. In Proceedings of the 2018 International Conference on Advanced Visual Interfaces (AVI ’18). ACM, New York, NY, USA, Article 66, 3 pages. DOI: https://doi.org/10.1145/3206505.3206556.

Exploring Big Data Landscapes with Elastic Displays (September 2017)

In this paper, we propose a concept to help data analysts quickly assess parameters and results of cluster algorithms. The presentation and interaction on a flexible display makes it possible to grasp the functioning of algorithms and focus on the data itself. Two interaction concepts are presented, which demonstrate the strength of elastic displays: a layer concept that allows the recognition of differences between various parameter settings of cluster algorithms, and a Zoomable User Interface, which encourages the in-depth analysis of clusters.

Kammer, D., Keck, M., Müller, M., Gründer, T. & Groh, R. (2017). Exploring Big Data Landscapes with Elastic Displays. In: Burghardt, M., Wimmer, R., Wolff, C. & Womser-Hacker, C. (Eds.), Mensch und Computer 2017 – Workshopband. Regensburg: Gesellschaft für Informatik e.V. DOI: http://doi.org/10.18420/muc2017-ws08-0342.

Towards Glyph-based Visualizations for Big Data Clustering (August 2017)

Data analysts have to deal with an ever-growing amount of data resources. One way to make sense of this data is to extract features and use clustering algorithms to group items according to a similarity measure. Algorithm developers are challenged when evaluating the performance of the algorithm since it is hard to identify features that influence the clustering. Moreover, many algorithms can be trained using a semi-supervised approach, where human users provide ground truth samples by manually grouping single items. Hence, visualization techniques are needed that help data analysts achieve their goal in evaluating big data clustering algorithms. In this context, Multidimensional Scaling (MDS) has become a prominent visualization tool. In this paper, we propose a combination with glyphs that can provide a detailed view of specific features involved in MDS. In consequence, human users can understand, adjust, and ultimately improve clustering algorithms. We present a thorough glyph design, which is grounded in a comprehensive survey of related work, and report the results of a controlled experiment, where participants solved data analysis tasks with both glyphs and a traditional textual display of data values.

Mandy Keck, Dietrich Kammer, Thomas Gründer, Thomas Thom, Martin Kleinsteuber, Alexander Maasch, and Rainer Groh. 2017. Towards Glyph-based visualizations for big data clustering. In Proceedings of the 10th International Symposium on Visual Information Communication and Interaction (VINCI ’17). ACM, New York, NY, USA, 129-136. DOI: https://doi.org/10.1145/3105971.3105979.