DH and the Woman’s Exponent

By June 14, 2019

“The techno-revolution has begun! Soon, robots will scour women’s words and discover the truth about everything.” Or, at least, that’s what I imagine Brigham Young would have said if he had read the University of Utah’s Digital Matters Lab and BYU’s Office of Digital Humanities’ preliminary report on topic modeling the Woman’s Exponent. Sounds like something he’d say.

The “Quick and Dirty Topic Model” is a sneak peek at a larger project that will be released with Better Days 2020, which celebrates the sesquicentennial of women’s suffrage in Utah and the centennial of the 19th Amendment. It sounds like the results of the later slow and thorough topic model will be released in a digital and explorable format with the Better Days celebrations.

If you’re new to topic modeling, a few quick words of introduction are worth reading. Topic modeling is one way of “distant reading” (think of normal reading as close reading) a body (corpus) of texts in relation to one another. Topic modeling, in particular, assumes that a corpus contains a set number of “topics” or “themes.” The digital humanist sets this number of topics for a corpus, which takes some fiddling and tinkering, since different numbers produce different kinds of results. The model then looks at which words appear near one another in the corpus, say within the same 100-word stretch, and figures out which words are likely to co-occur within each topic. This happens word by word. At the end of the study, the model produces a series of topics (typically visualized in word clouds) that hopefully say something about your corpus.[1] The strengths of topic modeling are its ability to read across documents within a corpus, to make associations that might not appear in a close reading context, and to expand the scope of a project. Also, it’s just a fun and cool way to enter a set of texts, and it’s kind of like being a word witch or wizard.
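For the curious, here is a minimal sketch of the word-by-word process described above, written in plain Python. It uses collapsed Gibbs sampling for LDA, which is one common way to fit a topic model; I am not claiming it is what the report's authors ran (they worked in R). The function names, the settings (alpha, beta, iterations), and the tiny corpus are all made up for illustration.

```python
import random
from collections import Counter


def fit_topics(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Toy LDA via collapsed Gibbs sampling.

    docs is a list of token lists. Returns (doc_topic, topic_word), where
    doc_topic[d][k] counts the words in document d assigned to topic k and
    topic_word[k] is a Counter of the words assigned to topic k.
    """
    rng = random.Random(seed)
    vocab_size = len({word for doc in docs for word in doc})
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Start by giving every word a random topic.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for word in doc:
            k = rng.randrange(num_topics)
            doc_assignments.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word] += 1
            topic_total[k] += 1
        assignments.append(doc_assignments)

    # Revisit each word in turn and resample its topic.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][word] -= 1
                topic_total[k] -= 1
                # A topic is more likely if this document already leans
                # toward it and if this word already appears in it often.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][word] + beta)
                    / (topic_total[t] + beta * vocab_size)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][word] += 1
                topic_total[k] += 1
    return doc_topic, topic_word


def top_words(topic_word, topic, n=5):
    """Return up to n of the most frequent words assigned to a topic."""
    return [w for w, count in topic_word[topic].most_common(n) if count > 0]


# Made-up mini corpus; a real run would use the tokenized articles.
toy_docs = [
    "islands natives hawaiian laie language islands".split(),
    "suffrage vote women rights suffrage convention".split(),
    "hawaiian islands sandwich natives language".split(),
    "women vote rights convention suffrage".split(),
]

doc_topic, topic_word = fit_topics(toy_docs, num_topics=2, seed=1)
for k in range(2):
    print(k, top_words(topic_word, k))
```

The number of topics is the knob the digital humanist fiddles with: rerunning with a different `num_topics` or `seed` can regroup the words, which is part of why the results take interpretation.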

The “Quick and Dirty Topic Model” presented a few interesting topics. I’ll only highlight one here. They found what they’ve labeled as “The Pacific Islands” topic. Many of the words in this topic make sense: islands, natives, hawaiian, laie, sandwich, language, etc. Some of the words, though, might take a little more teasing to understand why the model has lumped them together. Appearing in the model, albeit fairly small, are: india, coptic, crosby, and koran. Historians’ questions, then, might follow the questions that Laurie Maffly-Kipp asks in her chapter “Looking West: Mormonism and the Pacific World.”[2]

Working in the R language also allows the authors to track the occurrence of a topic throughout a corpus over time. For the Pacific Islands topic, there seems to be a fairly consistent proportion of the topic in the corpus throughout the 1880s and 1890s. But, in the mid-1890s, the topic dips and then jumps. It then stays sporadic in the years immediately after the turn of the century. This could be for a variety of reasons. It could have to do with the size of the text itself in relation to each year: maybe the Woman’s Exponent has twice as many articles starting in 1901 as in the years prior (although I don’t think this is the case). It might have to do with a rhetorical shift in language: maybe Mormons started using a different set of words to describe the Pacific Basin as a “marketing strategy” after 1901 (although topic models are pretty good at picking up on this kind of thing, especially if you find a similar-but-different topic that might be nonexistent before 1901 and then become meaningful after 1901). More likely, it has to do with a series of Pacific-facing missionary enterprises and their conversion successes around the turn of the century.

I’d love to see how this dataset pairs with some more local concerns, like those raised at Charlotte Hansen Terry’s presentation “Creating Good Mormons in the Pacific During the Nineteenth and Early Twentieth Centuries” at this past MHA.
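To make the earlier point about the size of each year's text concrete, here is a small Python sketch (the report itself used R) that compares a topic's raw word count per year with its share of all the words printed that year. The function name, the tuple layout, and every number below are invented for illustration; none of them come from the Woman’s Exponent data.

```python
from collections import defaultdict


def topic_share_by_year(articles):
    """Compute a topic's share of each year's text.

    articles is an iterable of (year, total_words, topic_words) tuples,
    where topic_words is how many of the article's words were assigned
    to the topic of interest. Returns {year: share}, sorted by year.
    """
    total_by_year = defaultdict(int)
    topic_by_year = defaultdict(int)
    for year, total_words, topic_words in articles:
        total_by_year[year] += total_words
        topic_by_year[year] += topic_words
    return {
        year: topic_by_year[year] / total_by_year[year]
        for year in sorted(total_by_year)
        if total_by_year[year] > 0
    }


# Made-up numbers: 1901 has twice as many articles as 1900,
# but the same number of words assigned to the topic.
articles = [
    (1900, 1000, 50),
    (1900, 1000, 50),
    (1901, 1000, 50),
    (1901, 1000, 50),
    (1901, 1000, 0),
    (1901, 1000, 0),
]

shares = topic_share_by_year(articles)
for year, share in shares.items():
    print(year, share)
```

With these invented numbers the raw topic count is 100 words in both years, but the share drops from 0.05 to 0.025 because the 1901 text is twice as long. A chart of proportions already accounts for that kind of growth, while a chart of raw counts would not.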

Other topics include a comparison of National Women’s Suffrage against Utah Women’s Suffrage, and Monogamy & Polygamy. I’m sure the slow and thorough study will reveal as many avenues for study as this quick and dirty sneak peek did. I, for one, am super-excited to see the use of DH in Mormon Studies and an engagement beyond some insular Mormon-centric avenues. Cheers, Better Days 2020, the U and BYU DH labs, and your future study of the Woman’s Exponent!

[1] Check out these blog posts for further explanation: Matthew Jockers, “The LDA Buffet: A Topic Modeling Fable,” http://www.matthewjockers.net/macroanalysisbook/lda/; and Ted Underwood, “Topic modeling made just simple enough,” https://tedunderwood.com/2012/04/07/topic-modeling-made-just-simple-enough/.

[2] Laurie F. Maffly-Kipp, “Looking West: Mormonism and the Pacific World,” in Laurie F. Maffly-Kipp and Reid L. Neilson, eds., Proclamation to the People: Nineteenth-Century Mormonism and the Pacific Basin Frontier (Salt Lake City: University of Utah Press, 2008).



  1. Thanks for this great post and run-down of topic modeling. Cheers Jeff!

    Comment by Hannah — June 14, 2019 @ 2:23 pm

