PhenoLines: Phenotype Comparison Visualizations for Disease Subtyping via Topic Models (2:54 min.)
PhenoLines is a visual analysis tool for the interpretation of disease subtypes, derived from the application of topic models
to clinical data. Topic models enable one to mine cross-sectional patient comorbidity data (e.g., electronic health records) and construct
disease subtypes—each with its own temporally evolving prevalence and co-occurrence of phenotypes—without requiring aligned
longitudinal phenotype data for all patients. However, the dimensionality of topic models makes interpretation challenging, and de
facto analyses provide little intuition regarding phenotype relevance or phenotype interrelationships. PhenoLines enables one to
compare phenotype prevalence within and across disease subtype topics, thus supporting subtype characterization, a task that involves
identifying a proposed subtype’s dominant phenotypes, ages of effect, and clinical validity. We contribute a data transformation workflow
that employs the Human Phenotype Ontology to hierarchically organize phenotypes and aggregate the evolving probabilities produced
by topic models. We introduce a novel measure of phenotype relevance that can be used to simplify the resulting topology. The design
of PhenoLines was motivated by formative interviews with machine learning and clinical experts. We describe the collaborative design
process, distill high-level tasks, and report on initial evaluations with machine learning experts and a medical domain expert. These
results suggest that PhenoLines demonstrates promising approaches to support the characterization and optimization of topic models.
Glueck, M., Naeini, M. P., Doshi-Velez, F., Chevalier, F., Khan, A., Wigdor, D., & Brudno, M. (2018). PhenoLines: Phenotype Comparison Visualizations for Disease Subtyping via Topic Models. IEEE Transactions on Visualization and Computer Graphics. (To Appear)
Visual data representations leverage the power of human perception to process complex information and, through interaction, help people garner new insights. Our research focuses on visualizing data from a wide variety of domains and fundamentally tackles the question: what makes a visualization effective? We explore novel visual encodings and interaction techniques, multiscale approaches, and even simulation to bridge human and automated analysis of multivariate, time-series, and graph data, ultimately aiding hypothesis generation, testing, and sensemaking.