Embedded EthiCS @ Harvard: bringing ethical reasoning into the computer science curriculum.

Course: CS 61: Systems Programming and Machine Organization

Course Level: Upper-level undergraduate

Course Description: “CS 61 is a first course in computer systems programming, meaning the creation of high-performance programs that use computer hardware effectively. Although many programs today are written in high-level programming languages—and many programs simply glue together existing components—the best programmers are craftspeople who understand their tools. For software builders, this requires a working knowledge of computer internal organization. It means understanding how machines interpret instructions, how compilers turn programming languages into instructions, and how operating systems combine programs and libraries to create running code. And it requires understanding the factors that affect code performance.

CS 61 introduces you to the tools you need to build robust, efficient software and the mental tools you need to understand software systems written by others. We hope you'll discover that systems software development is fun and worth the effort. We intend the course to be broadly accessible, though it will be easier for those who have some experience with systems programming in C++ or other C-like languages.”

Module Topic: ASCII, Unicode, and the Ethics of Natural Language Representation

Module Author: Cat Wade

Semesters Taught: Fall 2018


Module Tags: systems (CS); Unicode (CS); natural language encoding (CS); harm (phil); representational harm (phil); allocative harm (phil); stereotypes (phil)

Module Overview: In this module, we consider the ethics of natural language representation in modern software systems. Software systems play an increasingly central role in how we communicate with one another, and the computer scientists who design these systems are sometimes faced with difficult choices about what representational resources they should make available to their users. To what extent, for instance, should social media platforms support the vast array of languages used throughout the world? To what extent should the developers of smartphone operating systems provide their users with emoji reflecting the diverse identities and communicative needs of members of minority groups?

This module is co-taught by the professor for the course and the Embedded EthiCS TA. After an introduction to the ethical issues considered in the module from the TA, the professor gives a brief presentation on the technical dimensions of the module’s core case study: the shift from ASCII to Unicode, and the associated choices developers made about which languages to support. The TA then leads a discussion of the effects these choices had on members of different linguistic communities, and why those effects matter from an ethical perspective. Finally, students consider various strategies the developers of Unicode might adopt in order to better address the needs of minority communities, consistent with other needs the system is designed to satisfy and relevant technical constraints.

Connection to Course Technical Material: This module occurs during the course’s first unit on data representation and storage. The professor’s presentation during the module expands on the technical material already covered in this unit, considering how it applies to the module’s central case study: the shift from ASCII to Unicode. This sets the TA up to lead a discussion of how the technical issues discussed by the professor interact with broader social and ethical concerns.
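As a rough illustration of the data-representation issue behind this case study, the Python sketch below checks whether a string fits in 7-bit ASCII and how many bytes its UTF-8 encoding needs. It is not part of the module materials; the sample words and the function names `fits_in_ascii` and `utf8_profile` are assumptions chosen for illustration.

```python
# Illustrative samples (an assumption, not course material).
# Escapes are used so the exact code points are unambiguous.
SAMPLES = [
    "hello",               # English, pure ASCII
    "caf\u00e9",           # "café": U+00E9 lies outside ASCII
    "\u65e5\u672c\u8a9e",  # "日本語" (Japanese)
]


def fits_in_ascii(text):
    """Return True if every character has a code point below 128."""
    return all(ord(ch) < 128 for ch in text)


def utf8_profile(text):
    """Return (character, code point, UTF-8 byte count) for each character."""
    return [(ch, ord(ch), len(ch.encode("utf-8"))) for ch in text]


if __name__ == "__main__":
    for word in SAMPLES:
        total_bytes = len(word.encode("utf-8"))
        print(
            f"{word!r}: ascii={fits_in_ascii(word)}, "
            f"code points={len(word)}, utf-8 bytes={total_bytes}"
        )
```

Under these assumptions, "hello" occupies 5 bytes, "café" occupies 5 bytes for 4 characters, and "日本語" occupies 9 bytes for 3 characters. ASCII text is unchanged under UTF-8, while text in other scripts costs two to four bytes per character, which is one concrete place where technical design choices affect linguistic communities differently.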

We have found that it is important to build modules around real-world case studies that both connect strongly to technical material discussed in the course and raise ethical issues that students can readily appreciate. The shift from ASCII to Unicode has both features. Further, Unicode is the standard for encoding emoji, which provide a particularly intuitive and relatable way to illustrate the module’s core philosophical concepts (see the sample class activity below).

© 2018 by Cat Wade. "ASCII, Unicode, and the Ethics of Natural Language Representation" is made available under a Creative Commons Attribution 4.0 International license (CC BY 4.0).

For the purpose of attribution, cite as: Cat Wade and Eddie Kohler, "ASCII, Unicode, and the Ethics of Natural Language Representation" for CS 61: Systems Programming and Machine Organization, Fall 2018, Embedded EthiCS @ Harvard. CC BY 4.0.


Module Goals:

  1. Familiarize students with the technical aspects of ASCII and Unicode, as well as the social and technical considerations that drove the shift from ASCII to Unicode.
  2. Introduce students to two philosophical concepts that are useful for evaluating formal systems for representing natural language: allocative harm and representational harm.
  3. Give students practice applying these concepts to evaluate choices made by software developers about what representational resources to provide to their users.
  4. Give students practice identifying and evaluating possible strategies for alleviating representational harms in the design of formal systems for representing natural languages.

Key Philosophical Questions:

  1. How should software developers decide what representational resources to make available to their users for use in communication?
  2. In what ways can the choices developers make about what representational resources to make available negatively affect the members of different communities, including minority communities?
  3. What is the difference between 'representational' and 'allocative' harm?
  4. What are stereotypes, and in what ways can relying on stereotypes harm others?
  5. Were the choices made by the developers of ASCII and Unicode the right choices, given the constraints they were operating under, or were there other choices that would have been better from an ethical perspective?


Key Philosophical Concepts:

  • Harm and intent.
  • Representational harms and allocative harms.
  • Stereotypes.


Class Agenda:

  1. An introduction to the ethics of character encoding: should emoji be more inclusive?
  2. Technical material – ASCII, Unicode, UTF-8.
  3. Representational harm vs. allocative harm.
  4. Active learning exercise: how could developers make the current set of emoji more inclusive?
  5. Representational and allocative harms in the development of ASCII and Unicode.
  6. Remaining ethical issues with Unicode, and how best to address them.

Sample Class Activity: After being introduced to the concept of representational harm, students are presented with a slide containing the current set of ‘yellow’ emoji representing families of different kinds. In small groups, students discuss what kinds of families are left out of the current set, and whether those omissions constitute representational harms.
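For instructors who want to connect this activity back to the encoding material, the Python sketch below shows one way a family emoji can be represented in Unicode: as a sequence of individual person emoji joined by the ZERO WIDTH JOINER character (U+200D). This is an illustrative sketch, not part of the module's slides; the helper names `join_family` and `describe` are hypothetical. Whether a given sequence is drawn as a single glyph depends on the platform's fonts, so an unsupported combination typically falls back to separate person emoji.

```python
import unicodedata

ZWJ = "\u200d"  # ZERO WIDTH JOINER

# Individual person emoji, written as escapes to make the code points explicit.
MAN = "\U0001F468"
WOMAN = "\U0001F469"
GIRL = "\U0001F467"
BOY = "\U0001F466"


def join_family(*members):
    """Join person emoji with ZWJ to form a candidate family sequence."""
    return ZWJ.join(members)


def describe(sequence):
    """Return the official Unicode name of each code point in the sequence."""
    return [unicodedata.name(ch) for ch in sequence]


if __name__ == "__main__":
    family = join_family(MAN, WOMAN, GIRL)
    print(describe(family))
    print("code points:", len(family))
    print("utf-8 bytes:", len(family.encode("utf-8")))
```

The man, woman, girl sequence built here consists of 5 code points and 18 UTF-8 bytes. Any combination of people can be joined this way at the encoding level, so which families users can actually see as a single image is a choice made by the standard's recommendations and by platform vendors, not a hard limit of the byte format.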