Mapping the shifting canon: Stanford researchers use data to rethink English education

Oct. 31, 2024, 11:33 p.m.

What if the English Language Arts (ELA) canon wasn’t a fixed list of revered classic texts, but a dynamic network, constantly shifting across institutions and regions?

On Tuesday, Sarah Levine, assistant professor in the Graduate School of Education, and Nichole Nomura, English lecturer and associate director of the Literary Lab, explored this idea in their seminar titled “Connecting English Language Arts (ELA) Canons and Curricula.” Sponsored by the Center for Spatial and Textual Analysis (CESTA), the talk took place in Wallenberg Hall with an introduction by senior English lecturer Alice Staveley.

Levine began by giving the audience a rare look behind the scenes, explaining that “research is not linear.” Their project started in 2016 with an exploration of existing data sets, focusing on what was missing or inaccurate. Drawing on their diverse backgrounds in English and education, Levine and Nomura collaborated to navigate and synthesize key questions. As they delved deeper, their questions evolved, shaping an iterative research process.

Nomura then addressed the literary canon itself. “One of the ways we think about the canon is as a list,” she remarked, referring to items like a table of contents, a bibliography, or a works cited section. She then argued that this limited perspective overlooks the canon’s broader implications — especially how it is adapted across diverse educational contexts. Their research introduces a fresh approach: seeing the canon as a network of interconnected authors, texts and institutional influences.

To bring this concept to life, Levine and Nomura introduced their Course Description Archive for Research (CDAR), a tool designed to map relationships across ELA curricula nationwide. This digital archive enables them to turn static lists of texts into visual networks, revealing patterns that both reflect and challenge cultural and educational priorities. Through CDAR, they uncover unexpected connections and contradictions that suggest the canon is anything but fixed; instead, it shifts and adapts with each educational setting.

The pair’s findings reveal clusters of canonical authors who frequently appear together in college syllabi, while others stand isolated. Emily Dickinson and Walt Whitman, for example, sit at the center of a dense network of American literature authors, whereas authors like Hans Christian Andersen and the Brothers Grimm exist on more isolated “islands.” That isolation stems not from their canonical status but from how selectively they are included in course materials, perhaps representing a single class in a single term at a single university in the dataset.
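For readers curious how lists of texts become a network with clusters and "islands," the idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the researchers' CDAR code: the course names and author lists are invented toy data, and the helper functions `build_network` and `components` are hypothetical names introduced here.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: each course maps to the authors named in its description.
courses = {
    "American Poetry": ["Emily Dickinson", "Walt Whitman", "Langston Hughes"],
    "American Lit Survey": ["Walt Whitman", "Emily Dickinson", "Toni Morrison"],
    "Fairy Tales": ["Hans Christian Andersen", "Brothers Grimm"],
}

def build_network(courses):
    """Link two authors whenever they appear in the same course."""
    graph = defaultdict(set)
    for authors in courses.values():
        for author in authors:
            graph[author]  # make sure every author is a node, even if alone
        for a, b in combinations(sorted(set(authors)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def components(graph):
    """Group authors into connected clusters, largest first."""
    seen, clusters = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, cluster = [start], set()
        while stack:
            node = stack.pop()
            if node in cluster:
                continue
            cluster.add(node)
            stack.extend(graph[node] - cluster)
        seen |= cluster
        clusters.append(cluster)
    return sorted(clusters, key=len, reverse=True)

graph = build_network(courses)
clusters = components(graph)
# Degree: how many distinct authors each author is taught alongside.
degree = {author: len(neighbors) for author, neighbors in graph.items()}
```

In this toy example, Dickinson and Whitman each connect to three other authors, while Andersen and the Brothers Grimm form a separate two-author cluster because they share only a single course.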

The team’s research also sheds light on the discrepancies between two- and four-year colleges. Using metrics such as author pronouns and race representation, they found that female and non-white authors are more commonly taught at four-year universities, an equity gap that underscores how the canon and curriculum choices vary with institution type.

An essential aspect of Levine and Nomura’s work lies in their computational approach, a new and evolving method in the humanities that is complicated by the ephemeral nature of digital course descriptions. Many universities delete online course listings each term, posing a challenge for researchers collecting historical data. To tackle this, Levine and Nomura use a combination of web scraping and web archiving — tools that allow them to both collect timely data and maintain a permanent archive. Levine described this dual approach as essential for building a reliable, scalable data pipeline.

Through this method, Levine and Nomura compared ELA curricula across eight U.S. states, including public, private and community colleges. They analyzed what classes actually fall under the class code of “English” at various institutions, creating a model that captures the nuances of each college’s approach to the subject. Their analysis even revealed surprises, such as Toni Morrison’s centrality in college syllabi networks. In one network model, Morrison is more central than even William Shakespeare, bridging gaps between otherwise disparate clusters and enabling students to engage with texts from a new perspective.

“Students may be empowered to take another class because they already know Morrison,” Nomura noted. “She might be the first Black author a student encounters in college.”

Levine and Nomura’s study reflects a broader question: how does the ELA canon adapt to modern concerns of representation and diversity? Their network analysis shows that authors like Morrison not only intersect with a wide range of other literary works in classrooms but also provide a framework that encourages broader exploration in college-level English. The project’s next steps involve expanding its scope to California’s public universities and secondary schools, examining syllabi and district-level course descriptions to trace the canon’s evolution from high school through higher education.

In the Q&A that followed, faculty and students probed the researchers on the alignment between course descriptions and actual classroom content. The discussion reflected a shared curiosity among attendees about the complex, sometimes contradictory nature of curricular design in English education.

Levine and Nomura’s work provides an eye-opening look at the ELA canon, asking whether today’s curricular choices offer students a genuinely inclusive literary landscape or simply replicate long-standing hierarchies. By rethinking the canon as a network rather than a static list, they offer a fresh perspective on how educational institutions might better adapt their English curricula to a rapidly changing world.
