Shannon Honl

What does it mean to do history at scale?



Historians can analyze their sources in many ways. For example, last week I blogged about spatial analysis. This week, I focus on how distance – metaphorically speaking – can shape research and interpretation.

Doing history at scale requires a methodological choice, one frequently dictated by the source. As researchers, we have numerous tools and methods at our disposal. Sometimes a tool is chosen out of preference or familiarity, but often the sources or data limit the possibilities. It makes sense, for example, for a wildlife biologist to examine a cell culture with a microscope. If that same biologist studies the social behavior of a pod of whales, however, a microscopic approach will not reveal much; the naked eye offers a far more productive vantage point for analysis.


The same analogy works for historians conducting textual analysis. When working with a small or limited number of texts, historians can choose to zoom in for a “close” (or microscopic) reading or zoom out for a “distant” perspective. Sometimes the synergy of the two scales reveals interpretations that neither would on its own. The term “distant reading” was coined by literary historian and theorist Franco Moretti.


Close reading allows historians to carefully investigate, study, or unpack a limited body of evidence in order to interpret something much larger. For example, Edward Miller’s A Conspiratorial Life (2021) examines Robert Welch and the John Birch Society (JBS) bulletins. A close evaluation of one figure, and the rhetoric he generated through a mid-20th-century organization, allows Miller to interpret American conservatism writ large and to argue for Welch’s impact on the 21st-century political climate.


Other historians in this field have embraced different methodologies. So why did Miller choose a close textual reading over a broader social or cultural approach? When asked, Miller explained that he needed to limit the scope if he was going to manage the analysis himself.


But not all projects can be scaled down for human feasibility, and many sources do not lend themselves to close reading. It would be like whale watching through a microscope. When that scenario presents itself, only distant reading will do.


“Big data,” discussed in Geoffrey Rockwell and Stefan Sinclair’s Hermeneutica, is one example of a source too big for human scale. Rockwell and Sinclair describe it as “a lot (volume) of heterogeneous data (variety), sometimes coming in very quickly (velocity), from which business or governments can extract new truths (veracity) and make money (value).” In short, digital data grows exponentially, and researchers risk drowning in it.



How do we grapple with, analyze, and interpret such large quantities of data? When source volume becomes so large that it is humanly impossible to analyze, researchers must step back – distance themselves – and look at the whole. Forgive the figure of speech, but it is a way to see the forest for the trees. With traditional texts, this is accomplished through sampling or (like Miller) a close reading of a small section representative of the whole. With digital data, it is done through computerized data processing, which allows researchers to see larger patterns and otherwise indiscernible changes.


Computer-aided distant reading – specific to digital data – generates an entirely new methodology. As Rockwell and Sinclair argue, it is an approach both in response to and made possible by the bulk of data. The data begins to drive the method or tool of choice.
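For readers curious what this looks like in practice, here is a toy sketch of my own (not drawn from Rockwell and Sinclair, and the miniature “corpus” is invented for illustration) of distant reading at its simplest: tracking a term’s relative frequency across many documents instead of reading any one of them closely.

```python
from collections import Counter
import re

# Hypothetical miniature corpus keyed by year. In a real distant reading,
# this would be thousands of digitized documents -- far too many to read.
corpus = {
    1950: "the committee warned of conspiracy and subversion in government",
    1960: "members circulated bulletins alleging conspiracy at every level",
    1970: "the movement faded but talk of conspiracy never disappeared",
}

def term_frequency(text: str, term: str) -> float:
    """Share of tokens in `text` that match `term` -- a crude distant-reading metric."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return tokens.count(term) / len(tokens)

# The distant question: how prominent is a term across the whole corpus,
# year by year, without closely reading any single document?
trend = {year: term_frequency(text, "conspiracy") for year, text in corpus.items()}
```

Real toolkits (including Rockwell and Sinclair’s own Voyant Tools) layer far more sophistication on top of this idea, but the core move is the same: reduce each text to countable features, then look for patterns across the aggregate.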


When this occurs, Rockwell and Sinclair warn researchers of the risks. Distant reading at this scale can produce false results or misapplied algorithms. The solution lies in “checking your work” – just like your math teacher said. Miller can do this easily by returning to a close reading of his textual sources. But close reading is difficult, if not impossible, when working with big data and computer analysis. Humans therefore need to take responsibility for testing and judging the experiment’s validity. The process requires both humans and machines.


Like most things in life, there are pros and cons. Computer-aided “reading” certainly presents possibilities unattainable with other tools. Yet, as Rockwell and Sinclair point out, it has inherent flaws to be aware of. Moreover, I am uncertain how this “new form of history,” as Frederick Gibbs phrased it, fits with my own practice. But I am eager to explore how others have leveraged the approach, in hopes that their precedents will inspire my own dabbling.
