Embedded EthiCS™ @ Harvard: Bringing ethical reasoning into the computer science curriculum
Research Topics in Human-Computer Interaction (CS 279r) – Fall 2023
The ⓘ symbol marks marginalia, such as reflections from the module designer, pedagogical decisions, and additional sources.
Module Topic: Contextual Pressures in Human-AI Interaction
Module Author: Dasha Pruss
Course Level: Graduate
AY: 2023-2024
Course Description: “Students will read and discuss papers from HCI and related fields that inform our fast-moving current understanding of how AI systems can work with—or clash against—the strengths and weaknesses of human cognition. Required activities will include pre-class comments and conversations anchored on assigned readings in Perusall, selected relevant application and AI programming toolkit tutorials, in-class student and instructor-led discussions, lectures on relevant methodologies, guest lectures, and a semester-long project, in which students will work together in groups to design, build, and evaluate novel interfaces for human-AI communication and collaboration.
Designed for PhD students from all areas; a diversity of disciplinary backgrounds has greatly benefited past student teams. Several student teams have subsequently iterated on and published their projects in top-tier venues. Masters students and advanced undergraduates are welcome, particularly those who wish to write a thesis or apply for a PhD in an area related to Human-Computer Interaction” (CS 279r course website)
Semesters Taught: Fall 2019, Fall 2020, Fall 2022, Fall 2023
Tags
ⓘ
- human-computer interaction [CS]
- inductive risk [phil]
- algorithmic bias [both]
- algorithmic decision-making systems [CS]
- responsibility gap [phil]
Module Overview: The ways in which humans interact with AI systems can be difficult to measure in a laboratory, and systems designed without taking into account the pressures users face in real-world settings can have unanticipated consequences. Through in-depth discussion of case studies of algorithmic decision-making systems used by judges and child welfare call screeners, students learn about two examples of contextual pressures that affect how these tools are used: pressure from public elections, and pressure from managers and other workers. We analyze how these pressures mediate the tools’ use in practice by examining the roles of inductive risk and the responsibility gap. In light of these issues, students are invited to brainstorm alternate designs and follow-up research studies for these high-stakes tools.
Connection to Course Technical Material
ⓘ
The content of this module builds on research conducted by the TA about human users of algorithmic decision-making tools in real-world settings. The professor encouraged the TA to craft a module around this topic.
CS 279R focuses largely on individual-level HCI design challenges arising from limitations in human cognition. The module complicates this picture by examining how HCI systems operate in real-world organizational settings. One of the readings for the module was published in CHI, one of the major human-computer interaction conferences. Students put what they have learned so far in the course into practice by proposing alternate designs for high-stakes HCI tools.
Goals
Module Goals
- Understand that workers who use algorithmic decision-making tools face contextual pressures.
- Understand that how much (and which kinds of) error human decision-makers are willing to tolerate depends on these contextual pressures, which in turn influences how people adhere to algorithmic recommendations.
- Identify ways that contextual pressures can lead to unanticipated consequences in the tools’ implementation (e.g., amplifying or introducing algorithmic bias, minimizing algorithmic bias, producing algorithm aversion, or enabling managerial surveillance).
- Practice designing algorithmic decision-support systems that take into account the contextual pressures faced by human decision-makers.
Key Philosophical Questions
ⓘ
These questions build on a more general philosophical reflection on inductive risk (potential consequences of type I/II errors) by considering how the relative cost of error is influenced not only by individually held non-epistemic values but also by organization-level factors beyond the control of individual decision-makers.
- How can algorithmic bias be affected by human-algorithm interaction?
- How do contextual pressures change the relative costs of different kinds of errors (inductive risk)?
- How should HCI designers be attentive to the pressures faced by human decision-makers?
Materials
Key Philosophical Concepts
ⓘ
Even within HCI, it’s often assumed that human judgment will effectively be automated by algorithmic decision-making tools. But in practice, human decision-makers differ widely in their adherence to recommendations and follow them inconsistently in different situations. This inconsistent adherence can exacerbate or minimize algorithmic bias, or people might avoid using algorithms altogether (algorithm aversion).
In the assigned papers, human decision-makers wrestle with the relative costs of different kinds of errors, or inductive risk. Child welfare call screeners balance erring on the side of child safety against erring on the side of family preservation; judges balance erring on the side of public safety against erring on the side of decarceration. The relative costs of these errors depend on contextual pressures beyond the control of individual decision-makers, such as scrutiny from managers or voters.
The relative cost of errors is also influenced by the responsibility gap: when algorithms are used in decision-making, it becomes less clear who bears responsibility for a bad outcome. Judges take advantage of the responsibility gap to make more lenient decisions, while call screeners want more of a responsibility gap so they can justify screening decisions that appear overly harsh. A schematic version of this cost trade-off is sketched after the concept list below.
- Algorithmic bias
- Inductive risk
- Responsibility gap
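To make the inductive-risk idea concrete, it can be written as a simple expected-cost comparison. The sketch below is an illustration added for this write-up, not a formalism from the assigned papers; the risk score p and the error costs C_FN and C_FP are assumed, illustrative quantities.

```latex
% Schematic inductive-risk threshold (illustrative; not taken from the assigned papers).
% p     : estimated probability of the harmful outcome (e.g., pretrial misconduct)
% C_FN  : perceived cost of a false negative (e.g., releasing a defendant who then reoffends)
% C_FP  : perceived cost of a false positive (e.g., detaining a defendant unnecessarily)
\[
  \text{choose the lenient option} \iff p \, C_{FN} < (1 - p)\, C_{FP}
  \iff p < \frac{C_{FP}}{C_{FP} + C_{FN}} .
\]
```

On this reading, contextual pressures act on the perceived costs: electoral scrutiny inflates C_FN for judges, shrinking the range of risk scores for which leniency looks acceptable, while the reputational cover of an algorithmic recommendation (the responsibility gap) deflates C_FN and widens it. The same structure applies to call screeners weighing child safety against family preservation.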
Assigned Readings
ⓘ
The readings were chosen to illustrate two examples of contextual pressures that users of algorithmic decision-making systems may face.
Kawakami et al.’s article discusses Pennsylvania’s Allegheny Family Screening Tool (AFST), an algorithm that assesses the risk of a child’s future out-of-home placement in the foster system. It is meant to help child protection hotline call screeners prioritize among referred cases. Kawakami et al. shadowed and interviewed call screeners and supervisors to analyze their experiences with the AFST. They find that call screeners minimize the algorithm’s bias against Black children by making holistic judgments and adjusting for the algorithm’s limitations, but that they feel pressure from managers and case workers to conform to or ignore the algorithm’s recommendations against their own judgment.
Albright’s article is an economics paper about the Kentucky Pretrial Risk Assessment (KPRA), which assesses the risk of misconduct by an arrested person awaiting trial. The KPRA is intended to reform the cash bail system by reducing judges’ errors and allowing more people to be released. The paper finds that algorithmic recommendations increase judges’ lenient decisions for low- and moderate-risk cases because judges perceive lower costs to making an error, thanks to the reputational cover that the algorithmic recommendation gives them. However, judges deviate from lenient recommendations more often for Black defendants than for white defendants with the same risk score. The paper contains several technical sections, which students were instructed to skip.
- Kawakami, Anna, Venkatesh Sivaraman, Hao-Fei Cheng, Logan Stapleton, Yanghuidi Cheng, Diana Qing, Adam Perer, Zhiwei Steven Wu, Haiyi Zhu, and Kenneth Holstein. “Improving human-AI partnerships in child welfare: understanding worker practices, challenges, and desires for algorithmic decision support.” In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems, pp. 1-18. 2022.
- Albright, Alex. “The Hidden Effects of Algorithmic Recommendations.” Manuscript, July 2023.
Implementation
Class Agenda
- Introduction and tie-in
- Background on human-AI interaction in real organizational settings
- Think-pair-share
- Summary of child welfare and sentencing case studies
- Analysis of inductive risk and responsibility gap in both cases
- Design brainstorming (small groups)
- Discussion (large group)
Sample Class Activity
ⓘ
The list of potential design solutions is used as a springboard for discussing the ethical challenges intrinsic to each context (child welfare, criminal justice), many of which are beyond the control of HCI developers, as well as the ethical challenges that can be fruitfully addressed through more participatory and critical design practices.
After summarizing the two papers and explaining the significance of inductive risk and the responsibility gap, the TA instructs students to discuss what they would do if they had to work on the AFST or the KPRA.
In small groups, students are asked to consider:
- What design changes would you make?
- If you feel you need more information, what follow-up studies would you do?
After discussing in small groups, the TA calls on groups to share what they talked about and collects a list of possible design changes and follow-up studies on the board for large-group discussion.
Module Assignment
ⓘ
Perusall allowed the TA to see in advance areas of confusion and particular interest from the students, which was helpful for structuring class discussion.
Students received credit for the module by engaging with the assigned readings in Perusall, which allows students to highlight and leave comments on sections of the paper.
Lessons Learned
Students listened attentively to the lecture segment and were engaged during the discussion.
- The think-pair-share at the beginning of the class warmed up students to talking, which can be valuable when students are interacting with an instructor they are unfamiliar with.
- Writing the design ideas discussed in class into the slides encourages students to share and allows the TA to document the ideas and distribute them to students after the module.
- In the future, it would be better to have an evaluation component following the class.
- If an assigned paper contains technical parts, as the Albright paper does, it’s important to communicate clearly to students that they can skip those sections.
- Future iterations of the module could focus on only one case study (Kawakami et al.), which would allow more time for discussion and an alternate design activity.
Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution 4.0 International License.
Embedded EthiCS is a trademark of President and Fellows of Harvard College.