Course: CS 271: Topics in Data Visualization
Course Level: Graduate
Course Description: "This course covers advanced topics in data visualization. Over the course of the semester, we will examine seminal works and recent state-of-the-art research in information visualization, scientific visualization, and visual analytics. Students are encouraged to bring in ongoing or related research. Topics covered in this class include interaction, storytelling, evaluation, color, volume rendering, vector field visualization, visualization in sciences, big data visualization, uncertainty visualization, and visualization for machine learning. Students will work on a semester-long visualization project that will allow them to visualize their own data sets. We will take a structured approach to how to read, analyze, present, and discuss research topics. Furthermore, we will employ peer feedback and formal design critiques to analyze each other’s work."
Module Topic: The Ethics and Politics of Data Visualization
Module Author: Marion Boulicault
Semesters Taught: Spring 2020
Module Overview:
In this module, we discuss the ethical and political dimensions of data visualization. The module sets the stakes by beginning with a discussion of the social and epistemic power of data visualization in today’s world. It then focuses on a set of commonplace principles for effective data visualization, considering and debating the political and ethical dimensions of each. Finally, students are asked to expand on the meaning of ‘effective’ by brainstorming alternative data visualization principles that center these ethical and political dimensions.
Connection to Course Material: The course teaches technical skills, strategies, and principles for effective data visualization. The module examines the ethical and political dimensions of these skills, strategies, and principles. For example, one suggested strategy for effective data visualization is to ‘reduce cognitive load’ for the audience. As part of the module, the Embedded EthiCS TA leads a discussion about one way cognitive load could be reduced: taking advantage of (and thereby potentially reinforcing) problematic commonplace stereotypes, such as using the color blue to indicate male and the color pink to indicate female. By highlighting examples like these, the module provides a lens and a set of tools for identifying and analyzing the ethical dimensions of the technical practice of data visualization.
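The encoding choice in that example can be made concrete in a short sketch. This is a hypothetical illustration (the palette names, hex values, and `colors_for` helper are invented for this sketch, not taken from the course materials): the same categorical mapping step can either reinforce the stereotype or sidestep it.

```python
# Two ways to map a binary gender category to chart colors.
# The hex values are illustrative choices, not from the module.
STEREOTYPE_PALETTE = {"male": "#0000ff", "female": "#ffc0cb"}  # blue / pink
NEUTRAL_PALETTE = {"male": "#7b3294", "female": "#008837"}     # purple / green

def colors_for(categories, palette):
    """Look up the plot color for each category label, in order."""
    return [palette[c] for c in categories]

# A plotting library would consume this list as, e.g., per-bar colors.
bar_colors = colors_for(["male", "female", "female"], NEUTRAL_PALETTE)
```

Both palettes reduce cognitive load equally well in a technical sense; the ethical difference lies entirely in which associations the color mapping leans on.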
Key Philosophical Questions:
Key Philosophical Concepts:
Assigned Readings:
Lundgard, Alan, Crystal Lee, and Arvind Satyanarayan. “Sociotechnical Considerations for Accessible Visualization Design.”
Sample Class Activity:
In order to get students to feel the force of the ethical questions about predictive algorithms used in recidivism prediction, the module begins with two polls. In the first poll, students are asked to consider a scenario in which they are a judge making a pre-trial decision: they must decide whether to make that decision based on their own judgment or based on a risk assessment produced by a predictive algorithm. In the second poll, they are asked to make the same decision, but from the perspective of the detainee about whom the pre-trial decision is being made. After the polls are complete, students discuss the reasons for their answers. They are then asked to consider the common assumption that predictive tools allow us to pass the buck on certain kinds of responsibility in high-stakes cases, and to discuss how the responsibility of an algorithm’s creators is heightened by the degree to which such algorithms are relied upon.
Module Assignment:
In a post-module assignment, students are asked to explore recidivism data and corresponding COMPAS scores published by ProPublica. They are then asked to: (1) find correlations and differences between a defendant’s race and various other variables in the data; (2) write a short response to the question, “With respect to these variables, how could bias in the data or data collection be impacting or causing these differences?”; (3) build three predictive models from the data that leave out race and other correlating variables in different ways, in order to see what impact different variables have on the model; and (4) discuss the resulting false positive rates among different racial groups in each of their models and what implications these have for the fairness of predictive algorithms.
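Steps (3) and (4) can be sketched as follows. This is a hypothetical illustration: the column names echo ProPublica’s published COMPAS data, but the tiny DataFrame is synthetic, and a crude threshold rule stands in for whatever models students actually build.

```python
import pandas as pd

# Synthetic stand-in for the COMPAS data (column names modeled on
# ProPublica's release; values invented for illustration only).
df = pd.DataFrame({
    "race":           ["A", "A", "A", "A", "B", "B", "B", "B"],
    "priors_count":   [0,   1,   5,   6,   0,   2,   4,   7],
    "two_year_recid": [0,   0,   1,   1,   0,   1,   0,   1],
})

# A stand-in "model" that leaves race out entirely: flag a defendant
# as high risk whenever priors_count exceeds 3.
df["pred"] = (df["priors_count"] > 3).astype(int)

def false_positive_rate(group):
    """Among defendants who did not reoffend, the fraction flagged high risk."""
    negatives = group[group["two_year_recid"] == 0]
    return float((negatives["pred"] == 1).mean()) if len(negatives) else float("nan")

# Compare false positive rates across racial groups.
fpr_by_race = {race: false_positive_rate(g) for race, g in df.groupby("race")}
```

Even on this toy data, the race-blind rule yields unequal false positive rates across the two groups because the input variable correlates with group membership, which is the phenomenon step (4) asks students to examine.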
Lessons Learned:
Since the course had recently covered the technical concepts of calibration and false positive rates, we assumed that spending time reviewing these concepts would be unnecessary. In practice, however, we found that some students were not fluent enough with these concepts to readily apply them in the context of a new discussion about algorithmic fairness. When we teach the module again, we plan to spend more time reviewing these concepts before introducing new material.
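One way to review these concepts is to pin them to small worked computations. A minimal sketch, using made-up risk scores and outcomes (the definitions are the standard ones; none of these numbers come from the course data):

```python
# Toy data: predicted risk scores and observed outcomes (1 = reoffended).
scores = [0.2, 0.4, 0.4, 0.8, 0.8, 0.9]
labels = [0,   0,   1,   1,   0,   1]

def calibration(scores, labels, lo, hi):
    """Observed positive rate among cases whose score falls in [lo, hi).
    A calibrated model's bin-wise positive rates track the scores themselves."""
    in_bin = [y for s, y in zip(scores, labels) if lo <= s < hi]
    return sum(in_bin) / len(in_bin) if in_bin else float("nan")

def false_positive_rate(scores, labels, threshold):
    """Among true negatives, the fraction scored at or above the threshold."""
    flags = [s >= threshold for s, y in zip(scores, labels) if y == 0]
    return sum(flags) / len(flags) if flags else float("nan")
```

Computing these per demographic group, as in the assignment, makes concrete the tension students encounter: a model can be calibrated within each group while still producing different false positive rates across groups.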