The Emergence of Seventh-day Adventism

I told the view to our little band in Portland, who then fully believed it to be of God. It was a powerful time. The solemnity of eternity rested upon us. About one week after this the Lord gave me another view, and shewed me the trials I must pass through, and that I must go and relate to others what he had revealed to me, and that I should meet with great opposition, and suffer anguish of spirit by going. But said the angel “The grace of God is sufficient for you: he will hold you up.”
Ellen White, 18511

And it shall come to pass afterward, that I will pour out my spirit upon all flesh; and your sons and your daughters shall prophesy, your old men shall dream dreams, your young men shall see visions
Joel 2:28, King James Bible

The early years of the nineteenth century were a period of religious innovation and revival in the United States, witnessing the rapid growth of previously small denominations such as Methodists and Baptists as well as the creation of new religious movements such as Barton Stone’s Christians and Joseph Smith’s Latter-day Saints. Known for revivals marked by enthusiastic and emotional displays, the period was characterized by the “democratization” of authority regarding religious truths, with careful individual study of the Bible rather than religious training or position held up as the primary source of authority.2 It was also a period of heightened millennial expectation, as believers from a wide variety of Protestant backgrounds devoted themselves to bringing about the fulfillment of God’s plans through the conversion of sinners, social reform, and personal piety. One often-cited example of this flurry of interest in the second coming was the end-times preaching of William Miller, a one-time “deist” and grandson of a Baptist minister whose study of the Bible led him to conclude that the second coming would take place between 1843 and 1844.3 Coming at the end of the period known as the Second Great Awakening, Miller’s predictions and the responses to them shaped American Protestantism, encouraging conversions and revivals in the years leading up to 1844, while also reinforcing a cultural shift back toward more structured and respectable expressions of religion as time continued.

Miller’s teachings also gave rise to new religious denominations, as during the years after 1844 those who had embraced his message wrestled with the meaning of the continuation of time and how to reconcile it with their belief that Miller’s interpretations were correct. The largest and most successful of these new denominations was Seventh-day Adventism, which grew from a very modest 3,500 estimated members at the time of its incorporation in 1863 to over 20 million members worldwide as of 2016.4 Despite its ongoing growth, Seventh-day Adventism has received little sustained attention from scholars of American religious history. When mentioned, it is generally in relation to the denomination’s roots in Miller’s adventism, as an example of the continuing reach of Miller’s teaching even after the failure of his 1844 prophecies.5 Beyond these roots in Miller’s millennial teaching, the study of Seventh-day Adventism offers a lens on the relationship between religion and gender within nineteenth-century America, and particularly on the role of end-times expectation in the cultural development of religious movements.

The development of Seventh-day Adventism presents a unique opportunity to examine the relationship between end-times beliefs and religious culture in American history, particularly as seen in ideas about gender, health, and salvation. On the one hand, Seventh-day Adventism fits many of the patterns that marked religious revivalism of the early to mid-nineteenth century: the embrace of a new understanding of the Bible that brought all into alignment, the charismatic leadership of Miller and then the husband-and-wife pair of James and Ellen White, a willingness to suspend typical gender norms as part of the evidence of the last days and the mission of the denomination, and an urgent emphasis on salvation and the belief that the end times were at hand. However, such similarities also obscure some of the instructive differences between the denomination and nineteenth-century revivalism generally. While echoing the growing emphasis on women’s role in the home, Seventh-day Adventists continued to honor a woman, Ellen White, as prophet and made space for women’s labor as lecturers, missionaries, medical professionals, and teachers. Their approach to health included both an early emphasis on divine healing and a gradual embrace of medical intervention within the framework of following God’s natural law. Their understanding of end-times and salvation positioned the government of the United States as the anticipated adversary, rather than the protagonist of God’s unfolding plan.

This chapter provides a broad historical context for understanding the development of the Seventh-day Adventist denomination. Seventh-day Adventists were embedded within, and drew from, the revival movements of the nineteenth century. At the same time, their particular approaches to the social and theological problems of the day help illuminate the variety of American religious expression, as the religious culture they developed stands in contrast to many of the standard categories used to describe nineteenth-century religion. While the embrace of Miller’s teaching fundamentally shaped their understanding of the Bible, the unfolding of time, and their place within God’s plan for the world, the development of the denomination was also shaped by the melding of the diverse religious backgrounds of converts to the SDA, as well as the ongoing problem of reconciling the persistence of time with their interpretations of Scripture.

A note on terms: The labels used to describe the different groups of believers who anticipated the second coming in 1843/1844 vary in usage and require definition. For this study, I use “Millerite” to refer to those Protestant Christians who embraced William Miller’s interpretation of the Bible and expected the second coming (or second advent) during the period from 1843 to 1844. In the years following, a number of religious groups formed out of the Great Disappointment, and I use “adventist” to refer to religious groups with roots in the Millerite cause. Finally, Seventh-day Adventist or SDA refers to those believers who, after 1844, embraced the combination of adventist beliefs about end-times with Seventh Day Baptist teachings on Saturday as the day set aside for Christian worship.

William Miller and the Rise of Adventism

Of the variety of religious movements that developed during the nineteenth century, one of the more notable was the Millerites. These were American Protestants drawn largely from Baptist and Methodist congregations who believed that the Bible revealed the date of the second coming, or second advent, to be around the year 1843. Miller, a commissioned militia officer during the War of 1812 and, if his biographer is to be believed, a leading citizen of Poultney, Vermont and later the nearby border town of Low Hampton, New York, began his study of the Bible shortly after his conversion from deism following his military service. His goal was to prove the reliability of the Bible by reconciling all apparent inconsistencies he discovered, and in so doing, counter arguments against religion, such as those popularized by Thomas Paine.6 In 1818, he first came to the conclusion that the biblical prophecies, particularly those in the book of Daniel, revealed that the second coming would take place around the year 1843. After spending five years confirming his interpretations and calculations, he began to share his conclusions, starting with family and friends, then reaching out to local ministers. Miller understood his findings as reinforcing the teachings of the Protestant churches and as part of the prevailing belief in the second coming. He slowly began to receive invitations to preach in a variety of settings, including Baptist, Methodist, and Congregational churches.7 In 1832, he published his findings in the Vermont Telegraph and in 1833, he was granted a preaching license from the Baptist church in Hampton, New York.8 He preached and lectured in churches throughout New England during the 1830s and into the 1840s, publishing a 64-page pamphlet of his interpretation in 1834 and a series of sixteen lectures in 1836. Between 1834 and 1839, Miller recorded that he had given about 800 lectures on his interpretation of prophecy and his belief that the time of the second coming was at hand.9

Despite publishing, traveling, and preaching widely, Miller would have remained one of many little-known local religious teachers of the nineteenth century if it were not for Joshua V. Himes. Prior to meeting Miller in 1839, Himes had a long history of reform work aiming to bring about the kingdom of God. Licensed to preach by the Christian Church, he was pastor of the Second Christian Church in Boston, was a temperance lecturer, started a school for boys, was on the board of the Massachusetts Anti-Slavery Society, was an early supporter of William Lloyd Garrison, and distributed periodicals and other printed materials of the Disciples of Christ.10 Anticipating the coming millennium and already working hard to bring it about, Himes found Miller’s teaching that the end would arrive in a mere four years’ time both striking and deeply motivating. With Miller’s blessing, Himes became his chief publicist, launching a print campaign and a series of general conferences and camp meetings aimed at disseminating Miller’s work as quickly and widely as possible.11 Himes flooded the religious landscape with material expounding Miller’s teachings: movement newspapers such as Signs of the Times, The Midnight Cry, the Philadelphia Alarm, and The Western Midnight Cry; tracts, hymns, and prophetic charts; and Miller’s The Second Advent and memoirs. In doing so, he fostered a sense of religious community among Miller’s followers and created the mechanisms by which Miller’s message could spread.12

Himes clearly had an excellent sense of marketing: he knew how to engage people in a cause and how to gain, if not converts, at least attention. He worked to ensure the widest possible reach for Miller’s teachings, establishing periodicals in the key cities of Boston, New York, Philadelphia, and Cincinnati. He also understood the value of showmanship, acting quickly on a decision to build a “great tent,” one that seated four thousand people and was one of the largest to be seen in America. He published and sold colorful charts to illustrate the intricate details of Miller’s interpretation and convinced Miller to offer a more definitive date. And he held regular meetings for those convinced by Miller’s teachings. These gatherings provided a forum for the dispersed community to come together, to work to reach consensus on some of the contested details of Miller’s teaching, and to agree on a strategy for spreading the word.13

Figure 1.1: William Miller’s “Chronological Chart of the World to which is added, a Chart of Daniel’s Visions” (1842). From the Adventist Digital Library. https://adventistdigitallibrary.org/adl-421835/chronological-chart-world. Accessed 6 September 2017.

Figure 1.2: Charles Fitch’s “A Chronological Chart of the Visions of Daniel and John.” (1843). From the Adventist Digital Library. https://adventistdigitallibrary.org/adl-421834/chronological-chart-visions-daniel-and-john. Accessed 6 September 2017.

The efforts of Himes launched Miller and his teachings into the public consciousness and created a community around the belief that the second coming would occur between March 21, 1843, and March 21, 1844. In 1842, just two years after he joined the cause, Himes claimed fifty thousand readers for Signs of the Times and between three and four hundred ministers distributing the materials.14 By 1844, Himes had overseen the foundation of a number of serial publications, tracts, and pamphlets and had organized, attended, and spoken at conferences around the United States and Canada. This scale of publication and organization is notable. While most evangelical movements of the time utilized the press to spread their message and unite believers, under the leadership of Himes, the Millerites “published tracts, memoirs, and newspapers on a scale never before imagined.”15 As a result of the successful publishing campaigns, Miller became a highly sought-after speaker, claiming to have given about four thousand lectures in five hundred towns between 1840 and 1844. In addition to Miller and Himes, by Miller’s calculations some two hundred ministers in the United States and Canada had embraced his interpretation, and an additional five hundred public lecturers circulated his views.16

Miller did not set out to form a new movement or denomination. Rather, he saw his teachings as within the bounds of orthodoxy, as the logical conclusion of prevailing Protestant beliefs. He made no effort to start his own church and did not wish to be identified as a sectarian leader. He encouraged followers to stay within their home denominations, noting in his description of his work that, as a result of his early lecturing, “many churches thereby greatly added to their numbers.”17 Miller saw his teaching of the imminent second coming to be a truth of general interest for all denominations, and one that should be embraced regardless of denominational affiliation.18

Local communities of Millerites emphasized regular attendance at local Protestant Sunday services, but also attended interdenominational Millerite meetings for those who believed the second coming was at hand. Dedicated to sharing the news of the second coming with their neighbors and religious fellows, members of Millerite associations often found themselves at odds with their local religious leaders.19 Finding community among other Millerites and meeting resistance in their local churches was often too much for individual believers, prompting them to leave their home denominations and claim a Millerite identity. While Miller himself was quite distressed by this trend, it set the stage for new denominations to form in the wake of the disappointments of 1844.20

Because of the non-denominational aspects of Miller’s preaching and because multiple dates were eventually given for the Second Advent, it is difficult to determine the total number of people who joined Miller in anticipating the second coming and the end of the world. In the years after 1844, the New York Tribune estimated that there were thirty to forty thousand Millerites at the movement’s peak. The high end of estimates puts the peak at one hundred thousand, though Miller himself claimed that about fifty thousand embraced his message, a calculation based on the number of conversions he witnessed in his lectures.21 Many were not “poor, powerless, or marginalized” but rather were largely “above average in wealth” and tended to be “ordinary Americans from all walks of life.”22 Additionally, given the massive amount of literature produced by Himes, it is likely that many more quietly marked the passage of 1844 with interested skepticism. What is clear is that, although Miller was able to preach to nearly five hundred thousand people, what propelled his teaching from local anticipation to an international movement was the coordinated publishing efforts of Himes.23

As would be the case for the early Seventh-day Adventists, Himes recognized the power of print and effectively deployed it as the center around which the Millerite movement formed. Regular publications spread Miller’s interpretation, raising awareness of his teachings. Those same publications provided the forum by which Himes organized the regular conferences, inviting interested people to gather together to coordinate their efforts to spread the news of the coming millennium, and advertised upcoming tent meetings. Reports about the successes of the camp meetings helped the community to take part in the events regardless of geographic proximity, and served as further proof of the truth and success of Miller’s message, for if God were not behind the movement, the events would not have been successful. In addition, Himes strategically started multiple publications in different cultural and geographic centers of the U.S., increasing the coverage that the movement could achieve and catering to regional interests. This was a strategy that the Seventh-day Adventists would later emulate, first as they came together around the publication of a new interpretation of Miller’s teachings and the events of 1844, and later as they sent out missionaries, establishing printing houses in each new geographic region they entered.

The Religious Roots of Adventism

Building on the wider revival traditions of the time, adventism shared a number of key features that marked the religious awakenings of the period. Key among these were an emphasis on recapturing the religion of the early church, a commitment to a literal understanding of scripture, the embrace of millennialism, and an affinity for reform movements, from temperance and health reform to abolitionism. These shared commitments helped make Miller’s teaching legible to his audiences, and his promise that the second coming would soon occur fit within the religious understanding of many. Additionally, Miller’s followers came from a number of Protestant traditions, and brought aspects of their theological roots with them into what would develop into the Seventh-day Adventist church.

Shared across many of the religious movements that developed during the period of the Second Great Awakening was an underlying commitment to identifying and recovering the Christianity of the early church, an impulse known as primitivism or restorationism. This impulse was not new to the reformers of the early nineteenth century, but the range and scope of its application are noteworthy. While the impulse to remove elements of religious faith and practice that were seen as unscriptural additions had marked Protestant Christianity since the Reformation, by the early nineteenth century seemingly all aspects of faith were open to challenge and reevaluation, as religious leaders sought to recover and reinstate “the truth” of Christianity as expressed in the days of the apostles.

In exploring the strong hold of this approach to the past within the nineteenth century and American Protestantism more broadly, scholars have sought to identify the distinctive features of the environment that made primitivism particularly potent. Historian Nathan Hatch has attributed the strong manifestation of restorationism in the period to two new conditions of the time: the breakdown of older systems of authority with the democratic revolutions of the age and the experience of religious pluralism. On the one hand, the establishment of a democratic system and the rejection of monarchical forms of government was seen as having implications within the sphere of religion in the need to overthrow more authoritative forms of religious organization. On the other hand, the very abundance of forms of religious expression seemed to indicate that the truth was not yet found. As a result, religious leaders and innovators sought to bring order to the surrounding chaos, preaching that they had found, through careful study of Scripture or through direct revelation, God’s true purpose and desired form of religious observance.24

The Bible served revival leaders as the primary guide for efforts to recover and restore the early church. Dating back to the eighteenth-century revivals of the Great Awakening and the earlier Reformation creed of Sola Scriptura or “Scripture Alone,” Protestant Christians had long taught and internalized the belief that the Bible offered a clear guide to the Christian life, and more broadly, revealed God’s plan for the world. Rather than creeds or traditions, the Bible was held up as the sole arbiter of true faith and the primary mechanism for salvation.25 For the Bible alone to serve as the central guide of Christian life, however, it needed to be a reliable and clear source for religious truth. Religious leaders such as Alexander Campbell (Disciples of Christ) preached the Bible as “a book of ‘plain facts’ to be read and apprehended by all,” with careful and engaged study as the solution to all differences of interpretation.26 By the nineteenth century, believers had come to embrace the Bible, and individual interpretation thereof, as the final authority in matters of religious truth.27

One topic where this shift to approaching the Bible as a literal guide to truth was felt particularly strongly was the embrace of millennialism, or the belief in the second coming and thousand-year reign of Christ on earth. While a minority belief during much of church history, interest in end times and the millennium increased in the years following the American and French Revolutions, as the power of the Roman Catholic Church seemed to be on the decline. The embrace of a literal approach to the Bible fueled the rise in end-times belief. Where earlier interpretations of the prophetic books of Daniel and Revelation focused on their content as symbolic or allegorical, eighteenth- and nineteenth-century millennialists focused on the same passages as communicating a literal second coming marked by signs both strange and miraculous.28

Nineteenth-century revivalism tended to embrace one of two different visions of the millennium. The first, associated with more reform-minded religious leaders, anticipated the thousand-year reign of Christ on earth, to be brought about by the increasing piety and conversion of the American people. Encouraged by the success of the American Revolution in bringing about a dramatic change in style of government, these believers anticipated a similar revolution in the system of government, one that would bring about a culture marked by justice, equality, and virtuous behavior.29 The gradual conversion and salvation of the world was the historical arc described in Scripture and it was the role of believers to help bring about the gradual redemption of society. The second, embraced by Miller and others, took a more pessimistic view and posited that the second coming of Christ would precede the establishment of the thousand-year period of peace. For Miller, this second coming would arrive swiftly and for later Seventh-day Adventists, would come at the close of a period of intense persecution of those who followed the law and kept the Sabbath on Saturday. For Millerites and Seventh-day Adventists, the historical arc was one of struggle between the faithful and the world, between God and Satan, and would end in the sudden return and triumph of Jesus. The role of believers in this vision was to be prepared and faithful, as well as to convert others to the faith, for the second coming could not occur until all destined for salvation had the opportunity to be saved.30

End-times expectation was not unique to Miller and his followers. In the face of the social and political changes that shaped post-Revolutionary American society, many religious revivalists embraced some form of millennial rhetoric and expectation. Noted preacher Charles Finney went so far as to claim in his 1835 Lectures on Revival that should the church embrace revivals and bring about the necessary conversions, “the millennium may come in this country in three years.”31 Even among religious movements that positioned themselves in contrast to the general culture, such as Latter-day Saints, there still existed an underlying impulse to nationalistic millennialism, as members believed that “God’s kingdom would yet rise in America … and their endeavors would serve as decisive leaven.”32 Committed to the truth of their particular religious interpretation, believers from multiple traditions set out to make ready the way ahead of their soon-to-be-returning savior and to prepare themselves in advance of the coming judgment.

One final cultural contributor to the development of Seventh-day Adventism was reform, from abolitionism to temperance, health reform to women’s rights. In addition to a flurry of religious innovation, the early nineteenth century also saw the rise of numerous reform efforts, groups of citizens organizing together to bring about societal change. These efforts were often framed in religious terms, as reformers saw themselves as part of the divine plan for the unfolding of the world. The beginning and end of this plan were known: creation began in the garden of Eden and would end with the establishment of God’s eternal kingdom. Human actors lived through and contributed to that progression, setting in motion the events that propelled God’s plan and using their vision of the unfolding “cosmic drama” to guide their actions and interpret the events of their lives.33

These themes were prevalent throughout the revival movements of the nineteenth century, though emphasized differently among different denominations and religious groups. Those who were attracted to Miller’s teaching, and later to Seventh-day Adventism, came from a variety of denominational backgrounds, and brought with them elements of their home traditions, all of which contributed to shaping the distinctive approach and theology of the SDA. As a result, understanding the eventual development of Seventh-day Adventism requires a brief accounting of those major traditions. Of particular significance are Methodism, and particularly the “shouting Methodist” tradition, in which Ellen White was raised; the Baptist tradition, the ecclesiastical home of William Miller, and particularly Seventh Day Baptist traditions; and the Christian Church or Christian Connection movement of the early nineteenth century, the tradition of James White, Joseph Bates, and Joshua Himes.34

Converts from Methodism

One of the largest and most discussed denominations that contributed members to the SDA was Methodism. The denomination traces its roots back to John Wesley and his emphasis on “the witness of the Spirit,” or a perceptible experience of the divine, as the central component of the Christian experience.35 Educated and ordained within the Church of England in the 1720s, John and his brother Charles merged rigorous devotional practices with an emphasis on religious experience. While initially a revival movement within the church, Methodism eventually separated to form an independent denomination.36 The early Methodist community was organized in a hierarchical structure, with the local “bands” or “class meetings” organized into “societies,” which in turn reported to an itinerant preacher assigned to the area. That itinerant preacher reported to the annual conference, and specifically to the general superintendents — Francis Asbury and Thomas Coke — who reported to John Wesley.37 This structure enabled local communities to develop even when there were few ordained ministers, with those in leadership positions responsible for local and regional meetings that included multiple local communities, a structure later echoed in the organization of Seventh-day Adventism.

Within the young Methodist tradition, there was a variety of opinion about the proper role of “enthusiasm” in the religious life. While some amount of “extravagant emotions and bodily exercises” was generally accepted, particularly within the context of conversion or private devotion, Methodist leaders typically stressed more reserved religious behavior in the context of public worship.38 However, not all agreed with that distinction; some stressed experiential and emotional religious expression as central to worship. This form of Methodism, which came to be known as the shout tradition or Shouting Methodism, developed out of the multicultural context of the revivals of the eighteenth and early nineteenth centuries, shaped particularly by the combination of African and European styles around “singing, preaching, the use of the body, and the level and meaning of interaction in worship.”39 This more enthusiastic style of worship was often found in the camp meetings of the nineteenth century and was the source of some division between those who desired more “respectable” expressions of religion and those they deemed “fanatical.”40

Ellen White and her family were members of the Methodist church in Portland, Maine, and it was into that tradition that White was converted and baptized when she was eleven and twelve years of age. In her early memoir, White recounts how she experienced an outpouring of grace, a sought-after experience within the Methodist tradition, upon embracing and speaking publicly regarding the Millerite message. White’s emphasis on religious experience, her visions, and her embrace of such practices as foot-washing in the years after 1844 have led scholars such as Ann Taves to identify her with the Shouting Methodist tradition.41

Converts from the Baptists

The second primary contributor to adventism was the Baptist tradition. Due to the strong emphasis on the autonomy of local churches, the history of the Baptist movement is both varied and contested. The roots of the Baptist movement trace to England and to efforts in the seventeenth century to reform the Church of England. Baptist preachers emphasized the Church as the community of those who professed individual belief and the baptism of believers as a sign of that profession, as well as the Bible as the guide for the Christian life. Early Baptists were frequently the target of government and social censure, as their criticism of the Church of England and their particular interpretation of the Christian faith were threatening to existing cultural norms and power structures. As a result, early Baptists emphasized religious liberty, arguing that belief should be judged by God and not by the state. They also emphasized the autonomy of the local church, relying on the mechanism of voluntary societies for organizing their efforts to collaborate on missions or other shared endeavors. These emphases, particularly on the community of professing individuals, the Bible as the guide of Christian practice, religious liberty, and on organizing for shared concerns through the mechanism of associations, appear again in the development of Seventh-day Adventism.42

Among Millerites, Baptists made up a substantial percentage of the ministers and congregants. Studies of those who identified as Millerite suggest that between twenty-seven and sixty-three percent of ministers and lecturers were affiliated with the Baptist church.43 Most significantly, Miller himself was affiliated with the Baptist church, joining the denomination during the period of his conversion back to Christianity, and receiving a preaching license from the Hampton, New York, Baptist Church in 1833.44

While Miller and the Adventists followed more traditional Baptist beliefs and practices, one Baptist sect, the Seventh Day Baptists, had a particular influence on the development of Seventh-day Adventism. The Seventh Day Baptist tradition also dates back to seventeenth-century England. While similar in practices to Baptists, Seventh Day Baptists interpreted elements of the Old Testament law to apply to the current time, particularly the need to honor the Sabbath (Saturday) as well as some of the dietary laws.45 Early Seventh Day Baptists were among the first to establish churches in the American colonies, with communities in Rhode Island, New Jersey, and Pennsylvania.46 The traditions of the Seventh Day Baptists and the Millerites began to intersect as Baptist members embraced the interpretations of Miller and began to advocate for a Seventh Day understanding of the Sabbath within adventist communities. A small subset of adventists began to observe Saturday Sabbath and to publish on the issue, which in turn converted one of the prominent preachers within the Millerite movement, Joseph Bates. Bates is credited with introducing Ellen and James White to the arguments for Sabbatarianism, which they adopted in 1846; he was also an early adopter of hygienic principles, having already given up meats and stimulating foods in 1843.47 While the Seventh Day Baptist tradition continues to the present, the Seventh-day Adventist church has become one of the largest proponents of the Saturday Sabbath interpretation within the modern Protestant churches.

Converts from the Christian Church

A number of converts made their way to Seventh-day Adventism through other emerging revival traditions, particularly the Christian Church or “Christians.” Broadly speaking, the label is applied to three or four different revival traditions that developed during the period of the Second Great Awakening and eventually merged. Led by Elias Smith in New England, James O’Kelly in Virginia, Barton Stone in Kentucky, and Alexander Campbell in Pennsylvania, Protestant Christians from Baptist, Methodist, and Presbyterian churches began to question the denominational structure. Those who joined the Christian movement rejected formal ministerial training and denominations in general, focusing instead on the “priesthood of all believers” and the need to return to the purity of the early church.48 For those who converted to the Christian churches, of central concern were liberty, both political and religious, and the sovereignty of “the people” in contrast to the “tyranny” of elites, the clergy, and institutions.49 They advocated a complete break with the recent past in order to return to the Christianity of the early church. One key element of the success of the Christians was their embrace of print. Leaders such as Smith and Campbell early on identified the advantages of print for spreading their teachings, and put it to use in promoting their cause. From starting the first religious newspaper in the U.S. to producing pamphlets and papers designed for circulation and mass readership, Christian leaders sought to disrupt existing denominations by bringing their ideas to lay readers directly.50 This focus on liberty, on the Bible and its plain meaning, as well as the use of print, shaped the religious intuitions of future Seventh-day Adventist leaders such as James White and Joseph Bates, who were both initially preachers in the Christian Church, as well as Joshua Himes, Miller’s primary publisher and promoter.

The early nineteenth century was generally a period of religious unrest and innovation, as individuals struggled to make sense of the changes brought about by the American Revolution and by the proliferation of religious expression. A product of the end of this period of revival, both Miller’s Adventists and Seventh-day Adventism were built upon and reflect a number of the traditions and innovations of the period. Drawing on believers from across denominational boundaries, the new movements were faced with reconciling the differences between multiple understandings of the Christian tradition. For the early Seventh-day Adventists the process was challenging as a new set of theological commitments needed to be formed out of the various source traditions of its leaders and the end-times teachings of Miller. Rather than a straightforward product of any one tradition, Seventh-day Adventism represents the culmination of various strands of Second Great Awakening revivalism and the merging of those beliefs into something new.

Becoming Seventh-day Adventists

As the period Miller identified between March 21, 1843, and March 21, 1844, passed, adventist believers began to develop a number of alternative theories to address the persistence of time. A temporary reprieve from the crisis was granted when a young man named Samuel Snow announced that he had identified the error in Miller’s calculations and that October 22, 1844, was the date for the second coming. While acceptance of that interpretation brought unity back to the movement for a time, the passing of October 22, which came to be known as the Great Disappointment, exposed tensions between the various groups of adventists, which had become too great for maintaining a unified movement.

The adventist community did not collapse immediately after the final disappointment of October 22, 1844. While Himes briefly suspended publication after the October 16 issue of The Advent Herald and Signs of the Times Reporter in anticipation of the second coming, when the day passed without event, he quickly regathered, releasing a new issue on October 30, 1844. Still convinced of the truth at the core of the adventist message, that the time of the second coming was at hand, he offered a retelling of the history of the adventist movement that emphasized the nuances of their teachings and downplayed the significance of the October date.51 However, understanding and explaining the persistence of time posed a significant challenge for the community.

Among those who remained committed to Miller’s teachings, the movement split into two main groups: those who held that the end was indeed near but they had been mistaken about the date; and those who held that the end was indeed near but that they had been mistaken about the significance of the date. The debates between these groups took place on the pages of the main adventist papers, as well as in the pages of new periodicals started to promote various interpretations of the nature and timing of the Second Advent. These two groups were also characterized by distinct religious cultures, with those who reinterpreted the meaning of the October 22nd date embracing more radical forms of religious expression, including more exuberant forms of worship, foot-washings, and exchanging the “holy kiss.”52 In 1845, some of the major figures in the adventist movement met in Albany to formalize an official interpretation of the events of 1843-1844 and to clarify acceptable and unacceptable theology and religious practices. Led by Joshua Himes, and supported by William Miller, the Albany Conference marked the beginning of the Adventist Christian Church and the formation of a more “traditional” religious structure, one that “discouraged visionary enthusiasm, established a professional clergy, and forbade women to serve as evangelists.”53

In the years following 1844, William Miller continued to believe that his interpretation was correct and the second coming was at hand, but he ceased to offer or support any date predictions after 1844 until his death in 1849. He never endorsed any of the denominations formed in response to his teaching; instead he persisted in his belief that the truth of his message was one that would bring an end to religious sects.54 While Miller remained outside of the denominational disputes that unfolded in the 1840s, Joshua Himes continued the work of publishing adventist materials and traveling around the dispersed community. One of the leading figures in the Albany Conference, he, like Miller, backed away from additional date setting and worked with the Evangelical Adventists and the Advent Christians for many years after 1844. He was critical of the more theologically radical groups within adventism, including the Whites and the developing Seventh-day Adventist Church. Eventually Himes was ordained in the Episcopal Church in 1878 and served in South Dakota until his death in 1895.55

Those who would come to form Seventh-day Adventism were part of the second group of movements to come out of adventism, believing that Miller had been correct in his interpretation of the Bible, and his endorsement of October 22, 1844, as the date prophesied but incorrect about what that date signified.56 They embraced what were considered some of the more radical theological positions discussed in the wake of 1844.

  • First, they embraced the teaching of “conditional immortality,” viewing the standard teaching of an immortal soul to be extra-biblical, arguing instead that people would be granted immortality (or not) at the resurrection.
  • Second, this group of Adventists embraced the Seventh Day Baptist position on keeping Saturday, rather than Sunday, as the Sabbath.
  • Third, rather than question the date Miller and Snow had given for the second coming, the founders of Seventh-day Adventism questioned the event that the date marked, holding what became known as the Sanctuary Doctrine. The prophecies that Miller interpreted spoke of the “sanctuary” being cleansed at the end of 2300 days. This sanctuary was assumed by the adventist believers to be the earth, which needed to be cleansed of sin for the millennium to begin. When October 22, 1844, passed with no second coming and no dramatic “cleansing” of the earth, those who embraced the Sanctuary Doctrine posited that the sanctuary was instead in heaven, and that starting October 22, 1844, Jesus was undertaking the work of “blotting out sins” ahead of his second coming.
  • Fourth, they embraced the existence of prophetic gifts, the belief that God spoke directly through an individual to provide guidance and new revelation for a community. For the early Seventh-day Adventists, this gift was primarily limited to Ellen White, though early on a number of people claimed to have received visions.
  • Finally, although this was abandoned around 1850, they embraced the “shut door” doctrine, holding that October 22, 1844, marked a closing of the period of salvation and only those who had accepted adventism prior to the date could be saved.57

As with others who continued in the Adventist faith after 1844, all of these beliefs hinged on a core belief that the waiting time they were now in would be short and that the second coming would take place soon.

The Gathering of Seventh-day Believers

As with many of their adventist contemporaries, publishing was central to the efforts of the nascent Seventh-day Adventist community to articulate and share their views. While there were a number of figures who contributed to what would become the main points of Seventh-day Adventist theology, the key figures for understanding the development of the denomination and the use of print to unite and grow its community of believers were James and Ellen White. White established herself early on as an authoritative prophetic voice for the adventist community struggling to understand how to reconcile their interpretation of the Bible and the events they experienced. James White, who had worked as a Millerite lecturer and correspondent, was one of her earliest converts and became her primary publicist. It was through their combined efforts that the Seventh-day Adventist church coalesced and grew in the years after 1844.

Both James and Ellen White were familiar with the adventist strategies for evangelism, which combined print; itinerant lecturing and testimony sharing; and periodic tent meetings and local conferences. James White continued as an active correspondent with adventist periodicals in the years after the Great Disappointment, including the Advent Herald, the Advent Herald and Bible Advocate, and The Second Advent Watchman, papers that represented three of the major groups that developed out of adventism.58 In these, he reported back to the community about camp-meetings held and argued for theological positions, such as conditional immortality. Ellen had also been active in the local Millerite community, speaking at local churches and class meetings prior to the events of 1844 about her experience and her belief in the approaching second coming.59

In addition to giving testimony in person, White wrote of her visions to a number of adventist leaders and publishers. In December 1845, one year after her first post-Disappointment visions, Ellen wrote a letter to Enoch Jacobs, editor and publisher of The Day Star, originally the adventist publication, Western Midnight Herald, describing her vision of “the Advent people” on their journey along the narrow way toward heaven and affirming that those who had embraced Miller’s teachings were indeed on the “narrow path” to heaven.60 As noted in the issue of The Day Star, and a subsequent letter by White, this letter was purportedly not intended for publication. However, White quickly followed up with a second publication to further clarify her message and affirm that the church had now entered a waiting period before Christ’s second coming. In 1846, White met Joseph Bates, a leader among the Millerites and a proponent of “Seventh-day” sabbatarianism, or the belief that Saturday, rather than Sunday, was the commanded day of Christian observance. While initially skeptical of White’s visions, Bates was eventually convinced, and published an account of them in 1847 with his own introduction of her as a prophet to the adventist community.61

With her visions circulating among the adventist communities through The Day Star and other periodicals, the couple also began to produce their own materials to share Ellen’s visions. In 1847, noting that the publication in which they had intended to publish was no longer producing new issues, James began to produce pamphlets and broadsheets offering his exegesis and Ellen’s visions to help the adventist community make sense of their current position in the unfolding of the end times.62 The evangelistic culture of the Millerites and adventists provided James and Ellen a template for how to reach the community with their particular message, and the opportunity to leverage existing networks of print and public speaking to reach a wider audience with their good news.63

While the Whites’ early publishing efforts took the form of singular publications, or relied on existing adventist outlets, in 1849 James White began to produce his own title, The Present Truth. His goal for the title, as described in its initial issue, was to reach the “scattered flock” with the “present truth” as quickly as possible, for the second coming was at hand.64 Central to that “present truth” was the necessity of keeping the Decalogue (the ten commandments) and especially keeping the commandment to “honor the Sabbath day,” understood as requiring Christian observance on Saturday rather than Sunday.65 In 1850, and at least partially in response to some criticism of the Whites’ efforts to create a new publication, Ellen published an account of her vision on September 23 of that year where she was shown that they were in the “gathering time” where God was bringing his people together and to a shared understanding of the adventist message, an effort that required “that the truth be published in a paper, as preached.”66

James and Ellen initially saw these publications as a short-term endeavor and anticipated the quick success of their mission to bring together the adventist community around the belief that October 22 had marked the beginning of Christ’s “investigative judgment.”67 But as time continued and interest in the publication grew, it became clear that this would be a longer endeavor. While the initial issues were overall uni-directional, offering arguments for the Sabbath and accounts of Ellen’s visions, by the third and fourth issues of The Present Truth, the publication began to take on a community-building function. In Volume 1, No. 3, published August 1849, James printed a notice about upcoming conference meetings to be held in Vermont and Maine by Joseph Bates. In Volume 1, No. 4, published September 1849, he included a letter from “J. C. Bowles” who wrote from Jackson, Michigan, to send word that they had received The Present Truth, report on the success of Bates’ trip west, and suggest that James “insert extracts of the letters you may receive from the brethren” as it would be “comforting … to hear from others” and “may induce some to examine the subject, that would not otherwise.”68 While James noted that he did not intend initially to publish letters from the “brethren,” such letters quickly became a regular feature of The Present Truth and its successor, The Second Advent Review and Sabbath Herald (generally referred to as The Review and Herald). By the last issue of The Present Truth – Volume 1, No. 11, published November 1850 – over half of the publication was devoted to letters from readers and fellow seventh-day proponents, responses to controversies within the community (such as the eating of pork), news about recent and upcoming meetings, and the sale of other literature, including hymnals and prophetic charts.69

The publishing locations of the Whites’ initial periodicals reflect the family’s, and the religious community’s, movement from upper New England toward the West. Initially based out of Maine, where the Harmon family was located, their teaching had been brought as far west as Michigan as early as 1849 through the preaching of Joseph Bates. In 1851, in order to make it easier to travel to the far reaches of the dispersed community and to more cheaply distribute the periodical, the Whites moved their publishing operation to upstate New York, first to Saratoga Springs, and in 1852 to Rochester.70 Repeating the Millerite strategy of combining publications, traveling lecturers, and tent meetings to create and hold the community together, the editors of The Review and Herald understood the role of the paper as central to their efforts, that for a people “thus scattered, and surrounded by unbelief and opposition, you certainly need the weekly visits of a paper devoted to the present truth.”71 In 1855, the community in Battle Creek, Michigan, offered to establish a more permanent home for the denomination’s publishing efforts, including a press. By October of that year, James and Ellen relocated to Battle Creek, and the community called a general conference to decide who would run the daily operations of the publication.72 Battle Creek would remain the cultural and political center of the denomination through the rest of the century.

So central was publishing to the religious movement that by 1860, the logistical concerns of running a successful publishing operation pushed James White and others toward discussing legal incorporation of the press and the denomination. During a general conference held in Battle Creek in the fall of 1860, called in order to discuss “the proper method of holding church property, the want of our Office of publication, &c.,” James White presented the case for incorporation.73 Noting that, as it stood, he was the sole owner of the “REVIEW office” and his desire was to see the property owned instead by the church, James White led the argument for legal incorporation. However, this proved a culturally contentious issue for the community of adventists, many of whom understood separation from the state and the rejection of denominations as central to their charge to be a separate people. While individual churches might establish a legal presence for building ownership, the press presented a different challenge for the emerging denomination: a commonly held asset that required some form of central management. Eventually a compromise was reached, creating an “association” for the purposes of legal incorporation of the publishing office and its holdings, while making membership in that association an optional aspect of church membership.74 On May 3, 1861, the Seventh-day Adventist Publishing Association was legally incorporated in Battle Creek, Michigan.75 In May 1863, during the annual meeting in Battle Creek, delegates voted to create the General Conference of Seventh-day Adventists to formally organize the structure of the denomination and outline the relationship between the smaller state conferences and the larger national organization. This structure enabled them to coordinate efforts for the ordination of ministers and the sharing of the Seventh-day Adventist message, including the distribution of denominational literature.76

Developing a Distinctive Theology

Within the pages of the denominational periodicals, church leaders debated the developing theological commitments of the emerging denomination. Building on the varied religious traditions and assumptions of its founders and members, Seventh-day Adventism developed less as a branch of a single denomination, and more as a melding of the religious upheavals of the early nineteenth century.77 As a result of coming into existence through a combining of traditions, their theological commitments were in flux in ways that differ from groups that split off from a single denomination.

Some of the initial debates within the adventist community were around the issues of the Sabbath and of the immortality of the soul. These were the key theological elements that distinguished Seventh-day Adventists from other Adventist denominations that developed after 1844. As noted earlier, Sabbath-keeping came into adventism through the Seventh Day Baptist tradition and particularly through the writing and teaching of Joseph Bates. Once adopted, Sabbath-keeping quickly became the distinguishing characteristic for the emerging religious movement. James and Ellen White’s first publication, The Present Truth, devoted most of its early pages to arguments for the keeping of Saturday Sabbath as the current “test” delineating true believers from non-believers. Over time, the understanding of Sabbath-keeping expanded, seen as “a memorial of the six-day creation described in Genesis,” as “a continuing symbol of loyalty to God’s law,” as a sign of valuing “the Bible above the authority of the papacy,” and as “at the center of the conflict between Christ and Satan.”78

The doctrine of conditional immortality was also a prominent theme in the early periodicals. Rather than viewing humans as composed of a physical body and a separate immortal soul, early Seventh-day Adventists viewed humans as “indivisible beings who did not possess natural immortality.”79 Immortality was a gift that would be granted or denied at the Second Coming.80 This view of human beings as unitary and material beings and of immortality as a gift to be granted or denied was one of the emerging denomination’s more significant points of departure from standard Protestant theology. It also set the intellectual stage for the development of their emphasis on health as part of the religious life of believers.

As time continued, the Seventh-day Adventist theological literature began to consider the nature of revelation, the ways by which God communicated with his people. This included both a defense of the prophetic role of Ellen White and the development of the idea of “progressive revelation” to describe God’s incremental unveiling of his plan over time. While Ellen White’s prophetic gifts were foundational to establishing her religious authority in the years after 1844, by the 1850s, there was no clear consensus within the adventist communities regarding the role of visions.81 Within the Review and Herald, as published by her husband James, Ellen’s visions were largely absent between 1851 and 1855 as James emphasized the Bible as the guide for the Christian life.82 By the 1860s, however, and in part due to pressure from other leading figures in the movement, the newly forming denomination reached the consensus that would become part of their theological foundation: that “prophetic counsel” or “the spirit of prophecy” was a “gift to the church,” one specifically delivered through the visions of Ellen White.83 While formally always secondary to the Bible, Ellen White’s writings functioned as a secondary source of divine revelation, further distinguishing the young Seventh-day Adventists from their Adventist and Protestant peers.

Another concern that shaped Seventh-day Adventist theology was the problem of incorporating shifts in their understanding of time and the Christian life. Theological flexibility was part of the group’s basic character from early on, as believers successfully shifted from understanding October 22, 1844, as the date of Jesus’ physical second coming to understanding it as the beginning of his work of atonement within the heavenly Sanctuary and to embracing Sabbath-keeping as the new sign or test distinguishing true believers. While Seventh-day Adventist historian LeRoy Froom is credited with first acknowledging the slow shifts in Seventh-day Adventist theology and attributing them to “progressive revelation,” the framework for expecting understanding to change over time had deep roots.84 Writing in the third issue of The Present Truth, Ellen White laid out the beginning of this framework when she noted that those who passed away before 1844 and the revelation of the true Sabbath would not be judged for the failure to keep the Sabbath, “for they had not the light, and the test on the Sabbath, which we now have …”85 While believers would now be judged on their keeping of the commandments, specifically the seventh-day sabbath, that requirement did not apply to those who had died before hearing the message, for it had not yet been revealed. While God and the Bible were consistent, human understanding was limited and error prone; eventually, all would become clear. This framework provided the theological flexibility needed for the community to continue to cope with the delay of the second coming, and it reinforced the need for and value of Ellen White’s prophetic ministry as a source of additional light to gradually clarify the group’s role in the unfolding of sacred history.

One final distinctive element of early Seventh-day Adventist theology was their understanding of the role of the United States government in the unfolding of the divine plan. Whereas most Protestant and revivalist groups of the nineteenth century interpreted the American Revolution and the creation of the United States as a key positive development in the unfolding of divine history, early Seventh-day Adventists adopted an opposing interpretation. While the United States still featured as a key player in the events of the last days, the role of the nation was as a persecutor of the saints, rather than as a positive force bringing about the millennium. Interpreting the creatures described in the book of Revelation, SDA theologians identified the Roman Catholic Church as “the first beast … The United States, in this view, is the second beast and has “two horns like a lamb” but speaks “as a dragon,” and Babylon is composed of the papacy, Protestantism, and spiritualism.”86 As with many emerging religious traditions, early Seventh-day Adventists saw themselves rather than the nation as “God’s chosen people” and so their theology reframed the prophetic role of the state, not denying that the United States had a particular role in the events to come, but reinterpreting what that role would be.87

These distinctive elements of their theology (Sabbath-keeping, conditional immortality, prophecy, progressive revelation, and their reinterpretation of eschatology) placed early Seventh-day Adventists in tension with the broader non-Seventh-day Adventist community and with American Protestantism at large. Described as fanatics by those who joined the Adventist church and as heretical at best and a cult at worst by those within Protestant denominations, early Seventh-day Adventists were located on the cultural fringes of Protestant America during the nineteenth and early twentieth centuries. On the one hand, this tension proved reaffirming to the young community, as proof that they were indeed on the narrow road to the kingdom of heaven. Over time, however, those tensions lessened, particularly in the years after the death of Ellen White in 1915. While more recent theologians such as Froom attribute these shifts to progressive revelation moving the SDA closer to evangelical Protestantism, others have seen these shifts as signs of declension, of the falling away of the SDA from its initial “purity” through a process of accommodation and secularization (or modernization).88 Even as the denomination has found common ground with evangelical and fundamentalist forms of Protestantism, particularly on issues of Biblical literalism and conservative gender roles, they have also retained these distinctive aspects of their theology as well as their particular emphasis on health.

Conclusion: Living at the End of Time

Seventh-day Adventism poses an instructive puzzle for standard accounts of nineteenth-century religion. On the one hand, the emphasis on prophecy, the role of Ellen White, and their use of periodical literature and camp meetings to bring together a community of believers fit within the standard pattern of the Second Great Awakening revival tradition. Their roots in Miller’s adventism reinforce this connection. And yet, the place of Seventh-day Adventism within the history of religion in the United States and the broader cultural trends that they illuminate are not straightforwardly those of early nineteenth-century revivalism.

Historians James Bratt and Ruth Doan have both argued that Miller and the adventist movement should be interpreted as marking the culmination and turning point for the revivalism of the Second Great Awakening. In his study of the Bible, William Miller in many ways reflected the religious and cultural impulses of religion in the early republic. The rapid social and political changes brought about by the revolution made feasible the idea that the second coming and religious revolution were also on the horizon. However, the enthusiasm that marked the revivals of the period also spurred the development of anti-revival sentiment among religious leaders. Bratt locates the climax of anti-revivalism in 1845, right as Miller’s adventists were coming to terms with the failure of their prophetic interpretation.89 Additionally, Bratt identifies Miller’s teaching as precipitating a re-evaluation of enthusiasm and revival methods within Methodism.90 Miller’s teaching that the second coming was not just close but to be anticipated at a close-at-hand date was linked to a wave of revivals between 1843 and 1844, as “the promise and threat of meeting the Lord at any moment brought audiences to a pitch of excitement.”91 Miller functions, in this interpretation, as the logical conclusion of the revivalism of the early republic period.

The pervasiveness of Miller’s teachings raised an instructive challenge for the established Protestant denominations of the early republic, and pushed many Protestant groups to their breaking point, articulating a vision too dissonant with the prevailing norms to be embraced. In response to Miller, the emphasis within the established churches shifted from individuals back to institutions, and from sudden transformations to the slow process of character development.92 While expectations for a sudden return of Jesus continued among many Protestant believers, overall the culture of millennial expectation shifted to one of a “millennium of the heart” and toward a “more steady, gradual, and unimpeded evolution” of the individual and society.93 Rather than part of an ongoing thread of revivalistic religion, Bratt and Doan locate Miller at the end of a period of upheaval, as marking the beginning of a cultural shift away from the exuberant, individualistic, and emotional religion of the revivals and toward the more structured, institutionalized, and character-focused religion of the mid and late nineteenth century.

As a product of Miller’s teachings as well as of these new cultural emphases, Seventh-day Adventism is less a late expression of Second Great Awakening revivalism than a result of the ongoing reshaping of that tradition in light of these later cultural shifts. The ongoing renegotiation of religious and cultural norms can be seen particularly strongly in the theological shift from the primacy of experiencing assurance of salvation to the importance of the law, including Sabbath law and “laws of nature” around issues of health. Similar to Phoebe Palmer’s emphasis on personal piety and perfection, Ellen White and the early Seventh-day Adventists focused on Sabbath-keeping and the Old Testament law as the crux of faith.94 And “although White … managed to preserve the Millerite legacy of female evangelism,” she and other adventist women “moved in increasingly conservative directions during the 1850s and 1860s.”95 Sabbath-keeping and health became central to the professed requirements for salvation, a shift away from the more emotion-focused assurance of salvation and belief in the second coming that marked adventism. This corresponded with an increase in institution building and an increasingly robust denominational structure.

Early Seventh-day Adventists continued to believe that they were living in the last days of human history, that October 22, 1844, marked the beginning of the end and that the events of the present were part of the ongoing fulfillment of prophecy ahead of Jesus’ return. That belief fundamentally shaped their experience of time, their approach to the events of the present, and the development of their faith. Culturally, it provided a long-lasting link to the revival culture of the early nineteenth century, framing their embrace of the prophetic role of Ellen White. But the edge of anticipation is a difficult state to maintain, and as time continued Seventh-day Adventist believers found themselves needing to reexamine their understanding of their faith and their position in divine history. In doing so, they drew from prevailing cultural norms, using a framework of progressive revelation to interpret God’s truth as unchanging but human understanding as partial and progressive.


  1. Ellen Gould Harmon White, A Sketch of the Christian Experience and Views of Ellen G. White (Saratoga Springs, NY: James White, 1851), http://adventistdigitallibrary.org/adl-366537/sketch-christian-experience-and-views-ellen-g-white, pp. 5-6.

  2. The seminal work on religion and the early republic is Nathan O. Hatch, The Democratization of American Christianity (New Haven: Yale University Press, 1989).

  3. For example, Miller appears sporadically in Hatch’s account of the Second Great Awakening, as one example among many of the use of print, of music, and of the theological features of the period’s revival movements. Catherine Brekus describes the Millerites as representing “both the culmination and the exhaustion of antebellum revivalism,” casting the movement as one final surge in the revival traditions of the First and Second Great Awakenings. Ibid., pp. 145, 159, 167. Catherine A. Brekus, Strangers and Pilgrims: Female Preaching in America, 1740-1845 (Gender and American Culture), 1st New edition (Chapel Hill: The University of North Carolina Press, 1998), p. 309.

  4. 1905 Year Book of the Seventh-Day Adventist Denomination (Washington, D.C.: Review and Herald Publishing Association, 1905), http://documents.adventistarchives.org/Yearbooks/YB1905.pdf, p. 14; D.J.B. Trim, Kathleen Jones, and Lisa Rasmussen, 2017 Annual Statistical Report (Office of Archives, Statistics, and Research, 2018), http://documents.adventistarchives.org/Statistics/ASR/ASR2017.pdf, p. 4.

  5. Brekus addresses the shift from the Millerite movement to Seventh-day Adventism in her study of Female Preaching in America, noting that Ellen White, along with other women Millerite preachers, “managed to preserve the Millerite legacy of female evangelism” but “in increasingly conservative directions.” Jonathan Butler notes in his essay on Seventh-day Adventism in The Disappointed: Millerism and Millenarianism in the Nineteenth Century that the “boundlessness of millenarian beginnings” of Adventism has been favored in historical analysis over the “later quietistic and consolidated stage of these movements,” such as the process that gave rise to the Seventh-day Adventist church. And in her discussion of religious publishing, Candy Brown mentions the publishing efforts of both William Miller and James White as examples of the use of print by new religious movements, with White picking up where Miller left off after the Great Disappointment. Brekus, Strangers and Pilgrims, pp. 333-334. Jonathan M. Butler, “The Making of a New Order: Millerism and the Origins of Seventh-Day Adventism,” in The Disappointed: Millerism and Millenarianism in the Nineteenth Century, ed. Ronald L. Numbers and Jonathan M. Butler (Bloomington: Indiana University Press, 1987), 189–208, p. 190. Candy Gunther Brown, “Religious Periodicals and Their Textual Communities,” in History of the Book in America, Vol 2, ed. Scott E. Casper, Stephen W. Nissenbaum, and Jeffrey D. Groves (University of North Carolina Press, 2007), 270–78, http://site.ebrary.com/lib/georgemason/reader.action?docID=10460908&ppg=270, pp. 271-272.

  6. The language of “deism” here is used by his biographer, Sylvester Bliss, as well as by James White in his telling of Millerite history. Sylvester Bliss, Memoirs of William Miller (Boston: J. V. Himes, 1853), https://hdl.handle.net/2027/loc.ark:/13960/t6m05ng21, pp. 18-25, 31, 63. James White, Life Incidents: In Connection with the Great Advent Movement as Illustrated by the Three Angels of Revelation XIV (Battle Creek, Mich.: Steam Press of the Seventh-day Adventists Pub. Association, 1868), https://archive.org/details/lifeincidentsin00whitgoog, pp. 30-31. As Miller’s response to the challenge was to prove the rationality of the Bible and by extension the Christian faith, it is likely that “deism” refers to a Thomas Paine-style celebration of reason and skepticism of the authority of the Bible. See Amanda Porterfield, Conceived in Doubt: Religion and Politics in the New American Nation (Chicago: University of Chicago Press, 2012), particularly pp. 14-47, for a history of Paine and the skeptical tradition in early America.

  7. Everett N. Dick, “The Millerite Movement, 1830-1845,” in Adventism in America: A History, ed. Gary Land (Berrien Springs, MI: Andrews University Press, 1998), p. 4.

  8. Wayne R. Judd, “William Miller: Disappointed Prophet,” in The Disappointed: Millerism and Millenarianism in the Nineteenth Century, ed. Ronald L. Numbers and Jonathan M. Butler (Bloomington: Indiana University Press, 1987), pp. 22, 25; Miller gives the date of his being granted a license as 1834 in Mr. Miller’s Apology and Defense.

  9. William Miller, “Mr. Miller’s Apology and Defense,” The Advent Herald and Morning Watch 10, no. 1 (1845): 1–6, https://archive.org/details/WilliamMiller.Mr.MillersApologyAndDefence1845/page/n1, p. 3.

  10. David T. Arthur, “Joshua V. Himes and the Cause of Adventism,” in The Disappointed: Millerism and Millenarianism in the Nineteenth Century, ed. Ronald L. Numbers and Jonathan M. Butler (Bloomington: Indiana University Press, 1987), pp. 37-39.

  11. Ibid., p. 39.

  12. Ibid., pp. 42-48.

  13. Judd, “William Miller,” pp. 39-48.

  14. Dick, “The Millerite Movement, 1830-1845,” p. 6; Arthur, “Joshua V. Himes and the Cause of Adventism,” p. 46.

  15. Brekus, Strangers and Pilgrims, p. 323.

  16. Miller, “Mr. Miller’s Apology and Defense,” p. 4.

  17. Ibid., p. 3.

  18. Judd, “William Miller,” pp. 30-31.

  19. Dick, “The Millerite Movement, 1830-1845,” p. 9.

  20. Judd, “William Miller,” p. 32.

  21. Dick, “The Millerite Movement, 1830-1845,” p. 27; Miller, “Mr. Miller’s Apology and Defense,” p. 4.

  22. Brekus, Strangers and Pilgrims, pp. 313-314.

  23. Dick, “The Millerite Movement, 1830-1845,” p. 6.

  24. Hatch, The Democratization of American Christianity, pp. 167-170.

  25. Jon Butler, Awash in a Sea of Faith: Christianizing the American People (Cambridge: Harvard University Press, 1990), p. 172.

  26. Quoted in Hatch, The Democratization of American Christianity, p. 163.

  27. See ibid., pp. 179-184.

  28. Ernest A. Sandeen, “Millennialism,” in The Rise of Adventism: Religion and Society in Mid-Nineteenth-Century America, ed. Edwin S. Gaustad (New York; Evanston; San Francisco; London: Harper & Row, 1974), p. 113.

  29. Of course, one ongoing challenge for this approach is that each group had a different vision of the characteristics that marked this millennial society.

  30. In his essay on “Millennialism” in The Rise of Adventism, Ernest Sandeen offers a useful overview of the variations of Christian millennial belief as split on “near or distant,” “silent or cataclysmic,” and “gradual or swift,” with the combination of “near, cataclysmic, and swift” being defined as “apocalypticism.” Ibid., pp. 105-106.

  31. Quoted by James Bratt in James D. Bratt, “Religious Anti-Revivalism in Antebellum America,” Journal of the Early Republic 24, no. 1 (2004): 65–106, http://www.jstor.org/stable/4141423, p. 83.

  32. Hatch, The Democratization of American Christianity, p. 188.

  33. The relationship between reformers and this religious vision of the unfolding of human history is discussed in Robert H. Abzug, Cosmos Crumbling: American Reform and the Religious Imagination, 1st ed. (New York; Oxford: Oxford University Press, 1994), pp. 4-8 and 30-35.

  34. This list is echoed in Bull and Lockhart’s survey of the SDA. See Malcolm Bull and Keith Lockhart, Seeking a Sanctuary: Seventh-Day Adventism and the American Dream, 2nd ed. (Bloomington: Indiana University Press, 2006), http://mutex.gmu.edu/login?url=http://www.jstor.org/stable/10.2307/j.ctt1b349jq, p. 101.

  35. Ann Taves, Fits, Trances, & Visions: Experiencing Religion and Explaining Experience from Wesley to James (Princeton, N.J: Princeton University Press, 1999), pp. 50-51.

  36. Sydney E. Ahlstrom, A Religious History of the American People (New Haven; London: Yale University Press, 1972), pp. 234-237.

  37. See Ann Taves’ account of Methodism in Taves, Fits, Trances, & Visions, p. 84.

  38. Ibid., p. 76.

  39. Ibid., p. 80.

  40. Ann Taves, “Visions,” in Ellen Harmon White: American Prophet, ed. Terrie Dopp Aamodt, Gary Land, and Ronald L. Numbers (New York: Oxford University Press, 2014), p. 30.

  41. Taves argues that one of the sources of tension between White and more moderate Adventists such as Himes is her background among the Shouting Methodists, particularly with respect to the general acceptance of visions and other signs among Methodists, where such things were held suspect among other religious traditions of the time. See ibid., pp. 35-39.

  42. For a brief history of the Baptist movements, see Bill Leonard, Baptists in America (New York: Columbia University Press, 2005), pp. 7-32.

  43. Range shaped by whether one looks at national-level data or regional, such as New York State. See David L. Rowe, “Millerites: A Shadow Portrait,” in The Disappointed: Millerism and Millenarianism in the Nineteenth Century, ed. Jonathan M. Butler and Ronald L. Numbers (Bloomington: Indiana University Press, 1987), p. 9.

  44. Judd, “William Miller,” p. 25.

  45. Leonard, Baptists in America, p. 10.

  46. Ibid., p. 97.

  47. Godfrey T. Anderson, “Sectarianism and Organization: 1846-1864,” in Adventism in America: A History, ed. Gary Land, Revised Edition (Berrien Springs, MI: Andrews University Press, 1998), 29–52, p. 31; Joseph Bates and James Springer White, The Early Life and Later Experience and Labors of Elder Joseph Bates (Battle Creek, MI: Steam Press of the Seventh-day Adventist Publishing Association, 1877), https://www.adventistdigitallibrary.org/adl-366576/early-life-and-later-experience-and-labors-elder-joseph-bates?solr_nav%5Bid%5D=c62cab1951275e32383b&solr_nav%5Bpage%5D=0&solr_nav%5Boffset%5D=0, p. 314. For a denominational history of Bates and his adoption of Sabbatarian principles, see George R. Knight, Joseph Bates, the Real Founder of Seventh-Day Adventism (Hagerstown, MD: Review and Herald Publishing Association, 2004), pp. 93-111.

  48. Hatch, The Democratization of American Christianity, p. 69.

  49. Ibid., pp. 76-77.

  50. Ibid., pp. 73-74.

  51. Joshua V. Himes, “The Advent Herald,” The Advent Herald, and Signs of the Times Reporter 8, no. 11a (1844): 81, http://documents.adventistarchives.org/AdvRelated/AHM/AHM18441016-V08-11a.pdf, p. 81; Joshua V. Himes, “The Advent Herald,” The Advent Herald, and Signs of the Times Reporter 8, no. 12 (1844): 92–93, https://adventistdigitallibrary.org/adl-422050/advent-herald-and-signs-times-reporter-october-30-1844, pp. 92-93.

  52. This is described in David Tallmadge Arthur, “‘Come Out of Babylon’: A Study of Millerite Separatism and Denominationalism, 1840-1865” (PhD thesis, University of Rochester, 1970), pp. 85-145, and particularly 128-129. See also Dick, “The Millerite Movement, 1830-1845,” p. 25. The interpersonal tensions between the Adventist leaders are also discussed in Arthur, “‘Come Out of Babylon’,” chapter 6. For a description of the three major sects that emerged from the adventist movement, see ibid., chapters 7-9.

  53. Brekus, Strangers and Pilgrims, p. 332.

  54. Although Miller never endorsed any of the groups that developed after 1844, each group was keenly interested in claiming his legacy. Framing themselves as descendants of Miller was one of the primary goals of the Whites’ early publishing work, so much so that Miller’s home in New York is included among the “Adventist Heritage Sites” maintained by the SDA. Judd, “William Miller,” pp. 30-33; Bliss, Memoirs of William Miller, p. 383; Adventist Heritage Ministry, “Adventist Heritage : Welcome to Adventist Heritage Ministry,” 2018, http://www.adventistheritage.org/.

  55. Arthur, “Joshua V. Himes and the Cause of Adventism,” p. 56.

  56. Also among these adventists, and a group from whom SDA leaders such as Ellen White attempted to distinguish themselves, were those who claimed that Christ had returned spiritually on October 22, 1844, and that they were, as a result, both holy and immortal, a belief that seemed to give license for activities of questionable morality. See Laura Lee Vance, Seventh-Day Adventism in Crisis: Gender and Sectarian Change in an Emerging Religion (University of Illinois Press, 1999), p. 26.

  57. Ibid., pp. 26-27; Anderson, “Sectarianism and Organization,” pp. 30-33. This teaching originated with two other Seventh-day Adventist “pioneers,” Hiram Edson and O. R. Crozier, who re-examined Miller’s teaching and concluded that the “sanctuary” to be “cleansed” was not the earth (in the second coming) but was in heaven. And, as did so many of their contemporaries, they spread that conclusion through their own publication, the Day-Dawn.

  58. For example, James White, “Letter from Bro. White,” The Day-Star 9, nos. 7, 8 (1846); J.S. White, “Definite Time,” Advent Herald, 1846, 150, American Antiquarian Society (AAS) Historical Periodicals Collection: Series 3; J.S. White, “Letter from Bro. J.S. White,” Advent Herald, 1846, American Antiquarian Society (AAS) Historical Periodicals Collection: Series 3; J.S. White, “The Times of Restitution,” Second Advent Watchman 3, no. 23 (1852): 180, American Antiquarian Society (AAS) Historical Periodicals Collection: Series 3.

  59. Ronald L. Numbers, Prophetess of Health: A Study of Ellen G. White, 3rd ed. (Grand Rapids, MI: Wm. B. Eerdmans Publishing Co., 2008), p. 53; White, A Sketch of the Christian Experience and Views of Ellen G. White, p. 4.

  60. Ellen G. Harmon, “Letter from Sister Harmon,” The Day-Star 9, nos. 7, 8 (1846): 31–32, http://documents.adventistarchives.org/AdvRelated/WMC/WMC18460124-V09-07,08.pdf, p. 31.

  61. Ellen Gould Harmon White and Joseph Bates, A Vision, American Antiquarian Society (AAS) Historical Periodicals Collection: Series 3, vol. 1 (1) (Topsham, ME: Benjamin Lindsey, 1847).

  62. Often these pamphlets combined multiple previously published accounts, gathered together for ease of distribution. See James White, Ellen Gould Harmon White, and Joseph Bates, A Word to the “Little Flock” [Reprint] (Battle Creek, MI: Frank E. Belden, 1932), https://adventistdigitallibrary.org/adl-366760/word-little-flock.

  63. In this, I suggest more intentionality in the Whites’ use of publishing than Arthur Patrick suggests in his account of Ellen White. See Arthur Patrick, “Author,” in Ellen Harmon White: American Prophet, ed. Terrie Dopp Aamodt, Gary Land, and Ronald L. Numbers (New York: Oxford University Press, 2014), p. 91.

  64. “[Title Page],” The Present Truth 1, no. 1 (1849): 1, http://documents.adventistarchives.org/Periodicals/PT-AR/PT-AR-Part1-01.pdf; James White, “Dear Brethren and Sisters —,” The Present Truth 1, no. 1 (1849): 6, http://documents.adventistarchives.org/Periodicals/PT-AR/PT-AR-Part1-01.pdf.

  65. In Seventh-day Adventist terminology, the Sabbath and the requirements for Sabbath observance in separating the saved from the lost is known as the third angel’s message.

  66. Ellen Gould Harmon White, “Dear Brethren and Sisters,” The Present Truth 1, no. 11 (1850): 86–87, http://documents.adventistarchives.org/Periodicals/PT-AR/PT-AR-Part1-11.pdf. See also Floyd Greenleaf and Jerry Moon, “Builder,” in Ellen Harmon White: American Prophet, ed. Terrie Dopp Aamodt, Gary Land, and Ronald L. Numbers (New York: Oxford University Press, 2014), pp. 126-7; p. 140, n.2 and n.6. Subsequent versions of this history present a more direct link between Ellen’s visions and James’ publishing.

  67. Fritz Guy, “Theology,” in Ellen Harmon White: American Prophet, ed. Terrie Dopp Aamodt, Gary Land, and Ronald L. Numbers (New York: Oxford University Press, 2014), p. 152.

  68. J.C. Bowles, “Dear Brother White —,” The Present Truth 1, no. 4 (1849): 32, http://documents.adventistarchives.org/Periodicals/PT-AR/PT-AR-Part1-04.pdf.

  69. In addition to publishing The Present Truth, James White published a five-issue series called The Advent Review, which featured pieces originally published by Millerite leaders in the years surrounding the end-times anticipation of 1843 and 1844. Through this publication, James White and the rest of the editorial committee connected the teaching of the seventh-day adventists to the early leaders of the adventist movement, positioning their interpretation as an (if not “the”) authentic inheritor of the adventist tradition.

  70. “The Paper,” Second Advent Review and Sabbath Herald 1, no. 13 (1851): 104, http://documents.adventistarchives.org/Periodicals/RH/RH18510609-V01-13.pdf; “The Paper,” The Advent Review and Sabbath Herald 2, no. 14 (1852): 108, http://documents.adventistarchives.org/Periodicals/RH/RH18520323-V02-14.pdf.

  71. “The Review and Herald. To the Brethren.” The Advent Review, and Sabbath Herald 2, no. 13 (1852): 104, http://documents.adventistarchives.org/Periodicals/RH/RH18520302-V02-13.pdf.

  72. D.R. Palmer, Henry Lyon, and Cyrenius Smith, “General Conferences,” The Advent Review and Sabbath Herald 7, no. 8 (1855): 64, http://documents.adventistarchives.org/Periodicals/RH/RH18551016-V07-08.pdf.

  73. James White, “Battle Creek Conference,” Advent Review and Sabbath Herald 16, no. 17 (1860): 136, http://documents.adventistarchives.org/Periodicals/RH/RH18600911-V16-17.pdf.

  74. An example of this can be seen in James White’s response to a letter from a “Brother R. Miles” calling for the end of the Review in light of their organization efforts. See James White, “I Want the Review Discontinued,” Advent Review and Sabbath Herald 16, no. 19 (1860): 148, http://documents.adventistarchives.org/Periodicals/RH/RH18600925-V16-19.pdf. The debates regarding church organization were published in issues 16.21-23 of the Advent Review and Sabbath Herald for October 1860. For more on this, see Jonathan M. Butler, “Adventism and the American Experience,” in The Rise of Adventism: Religion and Society in Mid-Nineteenth-Century America, ed. Edwin S. Gaustad (New York: Harper & Row, 1974), 173–206, pp. 178-180.

  75. “The Seventh-Day Adventist Publishing Association …,” Advent Review and Sabbath Herald 17, no. 25 (1861): 200, http://documents.adventistarchives.org/Periodicals/RH/RH18610507-V17-25.pdf.

  76. John Byington and Uriah Smith, “Report of General Conference of Seventh-Day Adventists,” Advent Review and Sabbath Herald 21, no. 26 (1863): 204–6, http://documents.adventistarchives.org/Periodicals/RH/RH18630526-V21-26.pdf.

  77. Bull and Lockhart claim that Seventh-day Adventism is “not the estranged child of Methodism or any other mainstream American Protestant body. It is rather an orphaned offspring of the brief liaison of the several Protestant groups that made up the Millerite movement.” This metaphor suggests a high degree of separation between the SDA and its denominational sources, which requires further study, but reinforces the SDA self-perception as a unique religious movement, rather than a branch of an existing denomination. Bull and Lockhart, Seeking a Sanctuary, p. 102.

  78. Ibid., p. 42.

  79. Ibid., p. 89.

  80. Additionally, those saved, “the saints,” would receive immortality at the Second Coming (a process described as translation) and would reign with Christ during the millennium. Only after this period would all the dead be raised, judged, and destroyed. See ibid., pp. 89-90.

  81. Bates’s defense of White’s visions in his introduction of them indicates that skepticism was the expected response. White and Bates, A Vision.

  82. Anderson, “Sectarianism and Organization,” p. 35.

  83. Ibid., pp. 35-36; Bull and Lockhart, Seeking a Sanctuary, p. 44.

  84. Ibid., p. 103.

  85. Ellen Gould Harmon White, “Dear Brethren and Sisters —,” The Present Truth 1, no. 3 (1849): 21–24, http://documents.adventistarchives.org/Periodicals/PT-AR/PT-AR-Part1-03.pdf, p. 21.

  86. Bull and Lockhart, Seeking a Sanctuary, p. 54.

  87. For additional analysis, see ibid., pp. 57-58.

  88. Ibid., p. 103.

  89. Bratt, “Religious Anti-Revivalism in Antebellum America,” p. 78.

  90. Ibid., p. 90.

  91. Ruth Alden Doan, “Millerism and Evangelical Culture,” in The Disappointed: Millerism and Millenarianism in the Nineteenth Century, ed. Ronald L. Numbers and Jonathan M. Butler (Bloomington: Indiana University Press, 1987), p. 122.

  92. Bratt, “Religious Anti-Revivalism in Antebellum America,” p. 102; Ruth Alden Doan, The Miller Heresy, Millennialism, and American Culture (Philadelphia: Temple University Press, 1987), p. 29.

  93. This approach reached its maturity with the Social Gospel movement of the late nineteenth and early twentieth centuries. Doan, “Millerism and Evangelical Culture,” p. 130; Doan, The Miller Heresy, Millennialism, and American Culture, p. 29.

  94. Bratt, “Religious Anti-Revivalism in Antebellum America,” p. 93.

  95. Brekus, Strangers and Pilgrims, p. 333.

The Structure of A Gospel of Health and Salvation

Making the bi-directional argument of the dissertation required that I use a different medium than print. Books, and historical monographs in particular, operate according to genre conventions that encourage linear narrative development. Web documents, by contrast, operate according to different conventions, such as hyperlinking, interactivity, and layering, that create space for different kinds of argumentation. Because I am pursuing two inter-related arguments in this dissertation, and relying on data, code, and interactive visualizations to make those arguments, the conventions of web documents are a better fit for the work I am undertaking.

For readability, the dissertation website does mirror some of the conventions of print dissertations. The main content of the dissertation is structured into four chapters, an introduction, conclusion, and the traditional supporting text of a bibliography and acknowledgments. However, although the chapters of this dissertation build sequentially, unlike a traditional narrative history, they are not structured to move forward through time or to unfold as a story. Rather, each chapter provides a different aspect of the overall analysis.

Additionally, the dissertation leverages the conventions of web documents to expose and connect the layers that make up the analysis of the dissertation. The site contains a collection of code notebooks that document the computational work behind the interpretive analysis as well as an interactive “browser” that provides access to the topic model at the heart of the dissertation. These components are linked to within the essays to expose the underlying computational analysis or can be explored independently for readers interested primarily in the methodological aspects.

As this dissertation relies on different conventions than traditional narrative texts, the interfaces require some explanation. The dissertation can be read in a number of ways. First, it can be read according to either of the two tracks identified on the website home page: “Ellen White and Gender in Seventh-day Adventism” for readers interested in the historical interpretation or “Computational Methods in History” for readers interested in the methodological arguments of the dissertation. Additionally, the project can be read according to the links in the top navigation, which guide the reader from the topic model browser, to the essays, code, and conclusion. Each of these elements is introduced below to explain how the pieces fit within the overall argument of the dissertation.

Topic Model Browser

At the core of the project is the topic model browser of the SDA literature, accessible at http://browser.dissertation.jeriwieringa.com. This model, built using MALLET and visualized using Andrew Goldstone’s DfR Browser, provides a window on the language used across the corpus of Seventh-day Adventist texts that I assembled. The interface is organized according to my interpretation of those topics, as reflected in the topic labels. I discuss my interpretive process for creating the labels in Chapter 3.

Essays

The four chapters that make up the primary text of the dissertation establish the context for and interpret the data from the topic model. These chapters build on one another and together offer the argument of the dissertation in a more traditional narrative form.

The first chapter, “The Emergence of Seventh-day Adventism,” provides an overview of Seventh-day Adventism in light of standard accounts of nineteenth-century religion and culture. This chapter situates the denomination historically, tracing its roots in the Second Great Awakening and William Miller’s predictions of the second coming as well as providing an overview of its distinctive theological and cultural features in relation to its development from a “sect” to a “denomination.”

The second chapter, “Constructing Computational Models from Historical Texts,” provides methodological arguments for the sourcing, processing, modeling, and visualization aspects of the project. This chapter is the most technical in content, tying the computational work to the historical interpretation.

The third chapter, “Anticipating the End of Time,” establishes the problem of time within early Seventh-day Adventism. As a case study in using topic models to identify and reveal overarching trends in a corpus for historical analysis, the chapter traces the prevalence of “end-times” topics over time. The study reveals three main cycles of end-times expectation that shaped the cultural development of the denomination.

The fourth and final chapter, “The Gendered Work of Salvation at the End of Time,” builds on the structure of time identified in chapter three to argue that the belief in the impending end of time created space for the development of a culture that was in tension with surrounding norms. The nearness of the second coming necessitated that men and women cooperate in the work of salvation, both within the home and in the missionary activities of the church, resulting in an expansive understanding of gender within the denomination.

Notebooks

As a project based on the computational analysis of text, a large portion of the intellectual and scholarly work that makes up the dissertation exists in code. In considering the methodological processes around the use of computational analysis in history, this dissertation makes the claim, both explicitly and implicitly, that this technical work is of central importance to the scholarly object of the dissertation. It does this through the inclusion of code and data files in a collection of notebooks, which document the computational work that went into creating and interpreting the topic model.

Where traditional methods of review rely on the use of clear and well-formatted footnotes so that the trail of evidence can be retraced and evaluated, the review of computational work requires the use of documented and accessible code, so that both the execution of the techniques and the assumptions embedded within the technical choices can be seen and evaluated. The code elements can be viewed through the dissertation’s web interface or can be downloaded and run for verification or for adaptation to other projects. The work of making the technical elements of the dissertation visible and extensible is key to establishing processes for validating computationally based scholarship in the humanities.

About: Process Statement and Bibliography

The final elements of the dissertation are the process statement and bibliography. Following the requirements of the Department of History and Art History at George Mason University, I provide an account of the methods and technologies used in the creation of the dissertation in the process statement, including the final websites, the modeling algorithms, and the dissertation sources. The bibliography lists the primary and secondary sources, as well as the computational tools used in the dissertation. This includes a full listing of the periodical issues used to create the topic model. Together, all of these aspects of the project constitute the body of work that is A Gospel of Health and Salvation.

Constructing Computational Models from Historical Texts: A Consideration of Methods

To identify periods of millennial expectation in the development of Seventh-day Adventism and show the effects of those periods on the culture of the religious movement, I used the computational method of topic modeling to identify patterns in the language of the denomination over time. I chose this method in consideration of the quality of the available data, the assumptions regarding time embedded in existing algorithms, and the research questions of the dissertation. Three methodological aspects, foundational to my research and shaping the interpretive possibilities of the project as a whole, require more detailed discussion.

Discussion of methods, such as what follows here, and a commitment to reproducible code provide the information necessary to evaluate research based on computational analysis. One persistent critique of arguments based on computational analysis is that it is difficult to assess the soundness of the arguments.1 Even between projects that use similar modes of computational analysis, variations in corpus selection and processing can lead to conclusions that are difficult to compare. Some of that difficulty can be addressed by discussing the methodological aspects of the project, such as the match between sources and question, the applicability of algorithms to the questions at hand, the process of applying the algorithm to the data, and the evaluation of the resulting model. Additionally, executing and documenting code in ways that can be reproduced by others makes visible the implicit and explicit assumptions enacted in the code, opening it to evaluation and critique. By describing my sources, computational approach, and methods of interpretation, I am providing the means by which the project can be verified and assessed, similar to how footnotes and discussions of research methods support the verification and assessment of historical research presented in monograph form.
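As a rough illustration of the kind of aggregation involved in tracing topics over time, the sketch below averages the weight of a chosen set of topics across the documents published in each year. It is a minimal sketch under stated assumptions, not the code used in the dissertation: the function name `yearly_topic_prevalence`, the sample records, and the topic indices are all hypothetical, and it assumes that a topic model has already produced a list of topic proportions for each document.

```python
from collections import defaultdict


def yearly_topic_prevalence(docs, topic_ids):
    """Average the combined weight of `topic_ids` over the documents of each year.

    `docs` is an iterable of (year, proportions) pairs, where `proportions`
    is a list of per-topic weights for one document (hypothetical input shape).
    """
    totals = defaultdict(float)  # summed weight of the selected topics per year
    counts = defaultdict(int)    # number of documents per year
    for year, proportions in docs:
        totals[year] += sum(proportions[t] for t in topic_ids)
        counts[year] += 1
    # Mean share of the selected topics per year, in chronological order.
    return {year: totals[year] / counts[year] for year in sorted(totals)}


# Invented sample data: three documents, four topics each.
sample_docs = [
    (1850, [0.5, 0.25, 0.25, 0.0]),
    (1850, [0.25, 0.25, 0.5, 0.0]),
    (1851, [0.0, 0.5, 0.25, 0.25]),
]

# Suppose topics 0 and 2 had been labeled as "end-times" topics.
prevalence = yearly_topic_prevalence(sample_docs, topic_ids=[0, 2])
for year, share in prevalence.items():
    print(year, share)
```

Treating every document equally is only one possible choice; weighting each document by its length would be a reasonable alternative, and the decision affects how much short items contribute to each yearly figure.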

Developing a Corpus for A Gospel of Health and Salvation

To identify significant and meaningful patterns in the development of the religious culture of Seventh-day Adventism, this project requires a set of sources that is appropriate in both content and format. As I discuss in the introduction and Chapter 1, periodicals played a key role in the development of the religious culture of early Seventh-day Adventism, and as such provide content well-suited to the intellectual question at hand. While most nineteenth-century religious denominations utilized print to connect their members, as a medium for evangelism, and as a forum for articulating theological differences, periodical literature played a particularly important role in the development of early Seventh-day Adventism.2 It was to periodicals that Ellen and James White first turned in their work to unite the adventists in the years after Miller’s failed prediction of the second coming on October 22, 1844.3 Ellen White defended the use of periodicals in 1850, reporting that she was shown in a vision that God was seeking to bring his people together around the truth of the seventh-day message, an effort that required “that the truth be published in a paper, as preached.”4 The numerous publications of the denomination functioned as the force that brought into being the movement’s “imagined community.”5

While the content of the periodicals provides a window onto the denomination’s developing theology and culture, the denomination has also invested in sharing its teachings as widely as possible through the large-scale digitization of its historical materials. This commitment includes the official editing and publication work of the Ellen G. White Estate, which has focused primarily on White’s writings, lay efforts to digitize the denomination’s periodicals, and current work to aggregate the digital resources of the denomination in the Adventist Digital Library.6 As a result of these efforts, a large collection of the historical materials of the denomination is openly available in digital formats, making computational analysis of this material feasible.

The main collection of denominational periodicals is available through the website of the Office of Archives, Statistics, and Research, part of the General Conference of the denomination. This collection privileges the bureaucratic documents of the denomination, including statistical documents, minutes and reports, and the major periodical titles of the denomination. Despite this institutional focus, the collection provides access to a wide variety of periodical literature produced by the denomination, including the central organs of the denomination as well as regional and college publications.

Union conferences included in study

Lake Union Conference Boundaries in 1908. From the Yearbook from 1908.

Pacific Union Conference Boundaries in 1908. From the Yearbook from 1908.

Columbia Union Conference Boundaries in 1908. From the Yearbook from 1908.

Southern Union Conference Boundaries in 1908. From the Yearbook from 1908.

I focused my study on the periodical literature produced from four geographic regions within the United States: the Great Lakes (organized within the denomination as the Lake Union Conference), the West Coast (Pacific Union Conference), the Mid-Atlantic (Columbia Union Conference), and the Southern Mississippi River Valley (Southern Union Conference). These four regions were home to the major publishing and health reform centers of the denomination, and were sites of missionary activity as well as contention over the leadership of the denomination. Focusing on these four regions also provides a useful balance between two of the largest regions of the SDA (Lake Union and Pacific Union) and two of the smaller regions (Columbia Union and Southern Union).

Using the boundaries of the regions as defined by the denomination in 1920, my initial research step was to gather the publications that were produced within these four regions and which have been digitized by the denomination. Because this is a specialized historical collection, only a few of the Seventh-day Adventist periodical titles have been digitized as part of large digitization projects, such as HathiTrust, and a few titles are included in the Historical Periodicals Collection produced by the American Antiquarian Society. The most extensive and easiest to access collection of the periodicals has been produced by the denomination itself.7

Limiting my selection to periodicals published from within the above four regional conferences and digitized by the Seventh-day Adventist denomination, I gathered a corpus composed of thirty different periodical titles, of which there were 13,340 issues and 197,943 pages of material. For a full list of the titles included in this study, see Table 2.2. This scale of materials is important, as it enabled me to pursue a richer understanding of Adventist discourse than I could achieve were I to rely on one or two titles alone. But it also requires approaching the materials differently, relying on computational tools to identify areas of interest and to surface general patterns in the discourse of the denomination, with an awareness of the strengths and limitations of both the data and the methods.

Figure 2.1: Total number of tokens (words) from each periodical title in the corpus. The four largest titles are RH (Review and Herald), ST (Signs of the Times), YI (Youth’s Instructor), and HR (Health Reformer). Graph generated with the Plot.ly library.

The largest periodicals in my study are the four centrally produced denominational titles: the Review and Herald (RH), the Youth’s Instructor (YI), Signs of the Times (ST), and the Health Reformer (HR). In these titles, the leaders of the denomination sought to guide beliefs and practices, publishing pieces on theology, health, and the everyday life of Seventh-day Adventists. In these publications, community members also responded to the leadership with letters and submissions of their own. The corpus contains twice as many pages of the largest title, the Review and Herald, as of the next most significant contributor, the Health Reformer. Together, the Review and Herald, Youth’s Instructor and Signs of the Times, plus the General Conference Bulletin, The Present Truth and Advent Review constitute half of the entire corpus, weighting the dataset heavily toward the centrally produced publications of the denomination.

However, relying on the central publications would provide only a partial picture of the diversity of thought and practice within the denomination. While vital to understanding the development of the institution over time, the viewpoints expressed in these publications privilege the official positions of the denomination. Covering a number of topical and regional focuses, including education, health, missions, and “religious liberty,” the other half of my corpus offers a different range of voices.8 Many of the newly-formed regional and area conferences began producing their own periodicals in the early 20th century as a way to connect the local community, as well as to report back to the international denomination. These regional publications, although often short-lived, provide colloquial information on the activities of the denomination, featuring reports from the local conferences, letters, and updates on the activities of the conference, particularly with regard to selling denominational literature.

The proliferation of topical and regional publications also created challenges for denominational leaders, particularly when the viewpoints presented came into conflict with the official position of the denomination. Traces of these conflicts can be seen in the remaining archival record, as the efforts to shape and control the denomination’s message continue into the present. A useful example of this can be seen in the digitization and distribution of the denomination’s health periodicals. The Health Reformer was one of the early publications started by James and Ellen White, and was the first new publication launched after the denomination organized in 1863. By 1875, John Harvey Kellogg had taken the lead as the principal editor and his control over the publication was such that the content of the periodical reflects the development of Kellogg’s own understanding of health as it changed over time. In 1907, Kellogg was disfellowshipped by the denomination, though he retained control over the publication of the Health Reformer and continued its publication well into the 20th century. The archivists and digitization teams at the Seventh-day Adventist church, however, did not include any issues of the Health Reformer (later known as Good Health) after 1907, an archival enactment of denominational politics and reflective of the fact that after 1907, Kellogg was no longer considered to be a voice of the denomination. Rather, Life and Health, which was published out of Washington, D.C., became the central publication for the denomination’s health message, one that was under closer control by the religious leaders of the church than the Health Reformer ever was.

Figure 2.2: Total words in the corpus per year, with color designating the periodical title. The total number of words increases each year as the publishing work of the denomination expanded over time. Created with the Plot.ly library.

Beginning in 1883, twenty years after they formally organized, the SDA began publishing yearbooks to record and disseminate information about the denomination, including information on who held leadership roles within the different state and regional conferences and the various institutions connected to and run by the denomination. These publications contain a plethora of information about the organization, finance, and leadership of the early church, and, usefully for my purposes, information about the different publishing ventures undertaken. Using the data in the yearbooks, it is possible to provide a sketch of the publications that are missing from this study.

The budget reports and advertisements published in the yearbook from 1883 indicate that the main periodicals produced by the S.D.A. Publishing Association, then based in Battle Creek, Michigan, were the Review and Herald, Youth’s Instructor, and Good Health (Health Reformer), along with four non-English titles — Advent Tidende (Danish), Advent Harolden (Swedish), Stimme der Wahrheit (German), and De Stem der Waarheid (Dutch). The Pacific Seventh-day Adventist Publishing Association, based in Oakland, California, produced one English-language title, Signs of the Times, and one additional title, Les Signes des Tempes, was produced in French in Switzerland.9 I chose not to include the foreign language titles in my study to focus on the development of the English-language discourses within the denomination. With those excluded, the titles listed in the 1883 yearbook indicate that the titles I have included in my study encompass all of the English-language periodicals produced by the denomination at that point in time and presumably for the first 30 years of its development.

In the years after 1883, the number of publishing houses and topical and regional titles began to increase precipitously. As some of these titles had relatively short publication histories, I examined the yearbooks from 1890, 1904, and 1920 to identify the new reported titles and to compare those to the available digitized titles. Again focusing on only the English-language titles produced from within my four regions of study, I identified twenty relevant titles that had not been digitized by the denomination at the time of my study. These were: The Southern Watchman, Bible Students’ Library, Apples of Gold Library, Our Little Friend, East Michigan Banner, The Haskell Home Appeal, The Illinois Recorder, Southern Illinois Herald, The Wisconsin Reporter, Signs of the Times Magazine, The Present Truth, The Medical Evangelist, and Field Tidings. In addition, of the eight college publications, only The Sligonian has been digitized by the denomination.10 Although it has also been digitized, I did not locate the Sabbath School Quarterly in time for inclusion in this study. In total, out of the 52 titles produced by the denomination between 1849 and 1920, 29 of the titles, or roughly 56%, were available in digital formats and included in my study. If the college publications are excluded, 28 of the 44 main denominational periodicals, or roughly 64%, were available digitally.11

In addition to these missing titles, there are a number of holes to note in the digitized record as it stood at the time of this study. Whether because the original materials have been lost or due to digitization priorities, the denomination does not have full runs available for all of its digital titles. Most notably, the digital coverage of the Youth’s Instructor is sporadic, especially for the 19th century, with holes in coverage between 1856-1859; 1863-1870; 1873-1879; 1891-1895; and 1896-1898. This publication, the second denominational title produced by James White, contained material aimed at educating children in the Seventh-day Adventist faith. This is also a title that was frequently written for and edited by the women leaders of the denomination. Additional titles with significant gaps in coverage are The Welcome Visitor (later retitled the Columbia Union Visitor) and The Indiana Reporter. The gaps in digital materials from these publications require creative strategies to use the limited issues available to reveal the distinctive voice these editors provided to the developing community.

I have documented the discontinuities between the sources digitized by the Seventh-day Adventist denomination and the record of their publication history in order to model a critical engagement with a digital collection of materials. By documenting what is included and excluded from the digital collection, I am able to understand the limitations of the data I have at hand and begin to develop strategies for addressing those limitations in my analysis. The documentation also reveals opportunities for the extension of my analysis. While my study is focused on the rhetoric within the U.S. faction of the denomination, the denomination had a significant presence globally, particularly in Australia. For those interested in the religious culture within other regions of the U.S., the regional titles from those areas could be added. An analysis of the non-English titles could be added to explore how the denomination shifted its message when addressing European immigrant groups in the United States. Additional work could be done comparing the health literature of the denomination to that produced by other health reformers of the day.

Preparing Text for Analysis

For scholars working with digital sources, particularly for computational analysis such as text mining and natural language processing, there is much to be gained by attending to the quality of the data and, where possible, improving that quality. The strength of computational algorithms is that they perform consistently and logically upon whatever data they are given. The weakness is that, unlike a human reader who will generally infer that the “IN S TRUCT OR” they encounter in a periodical entitled “Youth’s Instructor” most likely should be read as “INSTRUCTOR,” the computer, unless trained to do so, will make no such inference. The data given to it will be processed literally, in this case as four distinct words. As a result, the quality of the output depends directly on the quality of the initial data. If that data is riddled with errors, then the models created will reflect those errors, and frequently exacerbate them. As the adage goes, “garbage in, garbage out.”12
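The literal processing described above can be illustrated with a short Python sketch. It assumes simple whitespace tokenization, which is only one of several ways a pipeline might split text, and the sample strings are illustrative rather than drawn from the corpus files.

```python
def tokenize(text):
    """Split text on whitespace, as a naive tokenizer would."""
    return text.split()


clean = "YOUTH'S INSTRUCTOR"
noisy = "YOUTH'S IN S TRUCT OR"  # OCR has broken one word into fragments

clean_tokens = tokenize(clean)
noisy_tokens = tokenize(noisy)

# Everything after the first token is a fragment of the word INSTRUCTOR.
fragments = noisy_tokens[1:]

print(clean_tokens)    # ["YOUTH'S", 'INSTRUCTOR']
print(noisy_tokens)    # ["YOUTH'S", 'IN', 'S', 'TRUCT', 'OR']
print(len(fragments))  # 4
```

Any count, frequency, or model built on the noisy tokens would treat "IN" and "OR" as ordinary words and would never see "INSTRUCTOR" at all.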

Because of the significant effect transcription errors can have on the ability to search, classify, and analyze texts, scholars in computer science and information retrieval, as well as across the digital humanities, have developed a number of strategies for identifying and correcting regularly occurring errors. Much of this work has focused on texts produced before the 19th century, as differences in typographical conventions, such as the long s (ſ), create known challenges for modern OCR engines. In addition to developing a series of regular corrections to received textual data, current work in information retrieval is focused on using probability and machine learning to estimate the most likely correct substitution for errors using general language patterns.13 The work of computationally correcting the errors generated during character recognition is an important step in developing a source base that is reliable for tasks from information retrieval to computational analysis, and is particularly important when working with the peculiarities of historical documents.

Optical Character Recognition and the Creation of Digital Texts

Institutional investment in the creation of digitized historical documents has reached the point where there is a sufficient amount of materials available to begin to engage questions of how digital and digitized resources might be used in computational historical studies. In addition to the resources of Google Books, HathiTrust, the Internet Archive, and Project Gutenberg, historians looking at the United States have the collections of the Library of Congress, the National Archives, and the Digital Public Library of America, along with a multitude of state, local, and organization-level digitization efforts, as sources for digital content. When vendor controlled resources, such as those from Gale, ProQuest, and Ancestry.com, are added to the list, there seems to be an overwhelming abundance of digital materials available for historians to work with.

It is tempting to look at this scale of digital material and to conclude that “Historians, in fact, may be facing a fundamental paradigm shift from a culture of scarcity to a culture of abundance,” though shifting political realities raise concerns regarding the fragility of the digital record.14 The digital record presents a wide range of challenges for current archival and research practices. On the one hand, questions of the stability of digital objects and the media upon which they depend require engagement with the technical work of migration, emulation, and digital forensics as part of archival practice to mitigate the problems of bit rot and technological obsolescence that threaten digital records. At the same time, an abundance of digital materials does not necessarily equate to the quality of those materials or their usefulness for computational analysis. The apparent abundance of the current digital record elides numerous problems with the scope and quality of the materials that have already been translated into digital media.

This is due to the processes by which the physical materials of the past are translated into their digital counterparts. The first stage of digitization of a material object is imaging — the creation of a digital facsimile of the object through the use of some sort of imaging technology, most often through the use of a digital camera or a scanner. While the best facsimiles are created using the original printed materials, often digitizing agencies use previously created microfilm and microfiche materials as the source for digital objects as a cost and time saving measure. These digital files are displayed online or distributed on various media such as CDs or flash-drives, and, assuming the image quality is at least mediocre and the original document was in good physical condition, can be deciphered by human readers.

However, the computer only recognizes these files as image files: collections of pixel information with no text information. This brings us to the second phase of digitization: transcription. The creation of a text layer for images of textual materials can be done in a variety of ways. Manual transcription involves the work of a human actor who reads the original document and creates a corresponding digital document. This work is slow, though at times when the original document was handwritten or has physical features that make the text difficult to read, it may be the only currently viable way to create a digital text. In most cases, however, researchers rely on Optical Character Recognition (OCR) to decipher text from image files. OCR uses various algorithms to match patterns in an image file to the most likely corresponding letter, and the error rate for a recognition task reflects the percentage of characters that are incorrectly identified. The advantage of OCR is that it can be used to process a large number of image files quickly and, on the whole, accurately, enabling computational access to written words that were previously only interpretable by a human reader.
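How recognition quality is measured at the character level can be made concrete with a small sketch. It assumes that a trusted transcription of the same passage is available, and it uses one common formulation, the edit distance between the two strings divided by the length of the trusted text. This is an illustration under those assumptions, not a description of how any particular OCR engine reports its results, and the sample strings are invented.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance: fewest insertions, deletions, and substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # delete a reference character
                current[j - 1] + 1,      # insert a hypothesis character
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """Edit distance divided by the length of the trusted reference text."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


reference = "the truth be published"   # trusted transcription (22 characters)
ocr_output = "thc truth bc published"  # "e" misread as "c" twice

rate = character_error_rate(reference, ocr_output)
print(f"{rate:.1%}")  # 9.1%
```

Two wrong characters out of twenty-two give an error rate of about 9%, or, equivalently, a character accuracy of about 91%.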

While at the core of modern efforts to bring the past online, the challenges of working with text generated through OCR compound as one moves backward through time, as the documents become further removed from the modern materials used to develop and train the software and the standardization of everything from font and layout to spelling decreases. Because OCR engines were created for (and trained on) printed text from the second half of the 20th century, the error rates for materials published earlier tend to be higher. Differences in typography, in layout conventions, and lack of standardized spelling all contribute to the errors that OCR engines are likely to make when processing historic documents. In addition, physical blemishes, such as tears, stains, and discoloration in the case of the material objects, low image quality, and previous scanning errors introduce additional challenges for the OCR engine. It is this textual layer, particularly when it is created using OCR software, its use in digital corpora, and its limitations and their potential remediation that both enables and constrains the use of computational methods with historic documents.

Within historical scholarship, the limitations of the current digital materials are often first encountered within the context of search and retrieval, as these are modes of interaction that define the majority of scholars’ interactions with the digital and digitized ecosystem. While the promise of the online database is that millions of texts are merely a keyword search away, how well that promise is being fulfilled is not easy to ascertain. Studies of the material included in the Eighteenth Century Collections Online (ECCO) collection and within the Burney Collection, which is a Gale product containing 17th and 18th century English newspapers, raise significant concerns about the representativeness of the documents included in these prominent subscription databases and the quality of the available data.15 For historians working from digital materials, regardless of method, tackling questions of data transparency and accuracy is of increasing importance as part of documenting the process of historical research.

These data challenges are not limited to large subscription sources, as documented by the Mapping Texts project from the University of North Texas and the Bill Lane Center for the American West at Stanford University, and supported by the National Endowment for the Humanities. Originally conceived with the goal of developing tools and strategies for identifying meaningful patterns within the millions of newspaper pages of the Chronicling America project, the teams quickly realized that their ability to ask research questions of the data was dependent upon understanding which questions “could be answered with the available digital datasets …”16 To address this problem, the Mapping Texts team created an interface that highlights the coverage of the newspaper collection, particularly in terms of geography and dates, as well as the quality of the data.

Figure 2.3: Map of Texas created by the Mapping Texts team. Size of the circle indicates the number of words associated with the newspapers from a particular city, while color of the circle indicates the quality of those words as evaluated by the Mapping Texts team.

The resulting interface shows an overall low “good word” rate across the Texas newspapers in Chronicling America, with most of the best performing areas hovering around 80% while worse performing areas hover around 60% of the identified words being recognized.17 This data enables readers to temper their expectations about what can be known from the analyzed data, a useful counterbalance to the seeming expansiveness and authority of digital materials.

The situation encountered by the Mapping Texts team is not unique to the Chronicling America data. A quick browse through the “text view” of content in HathiTrust and Google Books reveals that it does not take very many OCR errors to make the text illegible. In the case of vendor resources, it is harder to ascertain the quality of the data, as noted previously, because they tightly control their content through contracts with research libraries. However, unless the vendors have devoted massive resources to the correction of their textual data, it is safe to assume that their data suffers similar error rates. This is also true for other third-party data sources, such as the online archives of the Seventh-day Adventist church.

In preparing my corpus for analysis, I considered two of the challenges that historical documents raise for OCR and automated text extraction: errors in character identification and errors in layout recognition. Of these, I found that character recognition errors are the primary error discussed in the computer science and information retrieval literature, as these can be addressed after the fact more easily than errors in layout recognition and are more detrimental for standard search functions. However, errors in layout recognition pose potential challenges for algorithms that represent relationships between words, a representation that is infrequently used in current databases, but which is increasingly prominent in computational text analysis.

Character Recognition

Errors in character recognition are perhaps the most common, and best known, problem encountered when moving from a physical to a digital object. From decorative borders that are interpreted as strings of “i”s and “j”s to confusion between characters such as “e”, “c”, and “o”, these are the errors readers first notice when glancing at the “text view” of digitized documents. Recognition errors create problems across a wide range of processing functions, including search and retrieval and natural language processing tasks, and as a result there is a significant field of literature from both computer science and digital humanities describing strategies for improving the quality of text generated through OCR.18 Some of these errors, such as mis-recognized line endings, out of place special characters, and the like, are possible to address with various forms of search and replace. Errors within words, however, are a more difficult problem to address. Because of the complexities of the word errors, simplistic strategies, such as creating lists of error-correction pairs or using standard spelling programs, have been shown not only to fail to improve OCR quality in historical documents, but to result in higher recorded error rates.19 By and large, the scholarship in the field indicates that approaches that combine information about word-use probabilities and the surrounding context are most successful at replacing errors with the correct substitution.
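The simpler class of repairs mentioned above, those that can be handled with search and replace, might look something like the following sketch. The three rules are assumptions chosen for illustration (rejoining words hyphenated across a line break, dropping runs of stray symbols, and collapsing repeated spaces); they are not a record of the corrections applied in this project, and the sample string is invented.

```python
import re


def basic_cleanup(text):
    """Apply a few regular-expression repairs to raw OCR text."""
    # Rejoin a word split by a hyphen at a line ending: "denomi-\nnation".
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    # Replace runs of two or more stray symbols with a single space.
    text = re.sub(r"[|~^*_]{2,}", " ", text)
    # Collapse repeated spaces and tabs left behind by the steps above.
    text = re.sub(r"[ \t]+", " ", text)
    return text.strip()


raw = "The denomi-\nnation published ~~|| many   papers."
cleaned = basic_cleanup(raw)
print(cleaned)  # The denomination published many papers.
```

Even rules this simple carry risk: the first one would also turn a genuine compound broken at a line ending, such as "Seventh-" followed by "day", into "Seventhday". This is one reason such replacements need to be checked against the sources they are applied to.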

There are two basic approaches for evaluating the accuracy of transcribed data: comparison of the generated text to some “ground truth” or text that is known to be accurate and comparison of the words in the generated text to some bank of known words. The first method is used by Simon Tanner, Trevor Muñoz, and Pich Hemy Ros in their evaluation of the OCR quality of the British Library’s 19th-Century Online Newspaper Archive. Using a sample of 1% of the 2 million pages digitized by the British Library, the team sought to calculate the highest areas of OCR accuracy achieved by comparing the generated XML text to a sample that had been “double re-keyed.”20 While providing very accurate results about the quality of the generated texts, this approach is highly labor intensive. For my much smaller corpus of 197,000 documents, creating a ground truth dataset of 1% represents nearly 1,000 hours of work, assuming that each document could be transcribed with 100% accuracy in 30 minutes. The second approach, and the one pursued by the Mapping Texts team, is to compare the resulting text to some authoritative wordlist. This approach is easier to implement, as it takes much less time to compile a relevant wordlist than to rekey nearly 2000 pages of text. However, the results are much less accurate as the method is blind to places where the OCR engine produced a word that, while in the wordlist, does not match the text on the page or where spelling variations that occur on the page are flagged as OCR errors because they are not included in the wordlist.
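The labor estimate for a ground truth sample can be worked through explicitly. The sketch below uses the page count reported for this corpus, the 1% sampling rate from the British Library study, and the assumed thirty minutes per page; the variable names are illustrative.

```python
import math

pages_in_corpus = 197_943  # pages reported for this corpus
sample_fraction = 0.01     # a 1% sample, as in the British Library study
minutes_per_page = 30      # assumed time to transcribe one page accurately

# Round up, since a partial page still has to be transcribed in full.
sample_pages = math.ceil(pages_in_corpus * sample_fraction)
hours_needed = sample_pages * minutes_per_page / 60

print(sample_pages)  # 1980
print(hours_needed)  # 990.0
```

At forty hours a week, 990 hours is roughly six months of full-time transcription before any analysis could begin.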

Weighing the disadvantages of both approaches, I chose to use the second method, comparing the generated OCR to an authoritative wordlist. The creation of a source-specific wordlist also opens up possibilities for error correction using a probabilistic approach, such as that used by Lasko and Underwood.21 However, creating a wordlist that is both sufficiently broad to recognize the more obscure words of the literature but not so broad as to miss a large percentage of errors is a challenge.
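A wordlist comparison of this kind, along with its blind spot, can be sketched in a few lines. The tokenizing rule, the miniature wordlist, and the sample sentence are all assumptions made for the example; a working wordlist would hold many thousands of entries.

```python
import re


def tokens_from(text):
    """Lowercase the text and keep runs of letters and apostrophes as tokens."""
    return re.findall(r"[a-z']+", text.lower())


def recognized_rate(text, wordlist):
    """Return the share of tokens found in the wordlist, and the misses."""
    tokens = tokens_from(text)
    misses = [token for token in tokens if token not in wordlist]
    if not tokens:
        return 0.0, misses
    return (len(tokens) - len(misses)) / len(tokens), misses


# Hypothetical miniature wordlist; "mar" is a legitimate English word.
wordlist = {"the", "sabbath", "was", "made", "for", "man", "mar"}

# Suppose the printed page reads "The Sabbath was made for man".
ocr_text = "The Sabbath was rnade for mar"

rate, misses = recognized_rate(ocr_text, wordlist)
print(misses)  # ['rnade']
```

The check catches "rnade" but passes "mar", because "mar" is a real word even though it is not the word on the page. The reported rate of five in six therefore overstates the accuracy of the transcription, which is the trade-off accepted in choosing this method.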

To solve this, I took an iterative approach, first creating a base wordlist, and slowly adding the more specialized vocabulary used frequently in the denominational periodicals. I chose to use Spell Checker Oriented Word Lists (SCOWL) to generate the base wordlist, rather than the more commonly used NLTK (Natural Language ToolKit) wordlist. This provided me the option to pull from multiple dictionary sources and set various parameters for alternate spellings and range of words.22 To this list, I added words from the King James Bible (sourced from the Christian Classics Ethereal Library), people and place names from the denominational yearbooks, and place names from the National Historical Geographic Information System.

With these wordlists in place, I logged the reported errors for each title and then worked title by title to identify the most frequently reported errors in order to improve the coverage for community-specific language. There were two advantages to this step. First, it helped me identify regular error patterns generated by the OCR process, such as extra characters and detached word endings. While the overall quality of the OCR was surprisingly good, the smallness of the font coupled with the tightness of the columns, particularly for the mid-19th century issues, caused the most confusion for character recognition. In addition, the density of the earlier pages seems to have created a challenge for the original publishers in terms of letter availability. A striking number of reported “errors” were due, not to character mis-recognition, but to the intentional use of incorrect characters in the original type (“c” in place of “e”s, for example, or regular ‘misspellings’ of particular words), errors that the human reader easily “corrects” for but that the machine reader does not (nor would we necessarily want it to). These regular errors, which are due not to OCR transcription failures but to the historical realities of a limited supply of letters and editorial license with spelling, raise questions of normalization when working with historical materials.
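The title-by-title review of frequently reported errors can be approximated with a simple tally. This is a sketch under assumed inputs: the function name, the tiny wordlist, and the token lists are hypothetical stand-ins for the per-title error logs described above.

```python
from collections import Counter


def frequent_unknowns(tokens_by_title, wordlist, top_n=3):
    """For each title, tally tokens missing from the wordlist, most common first."""
    report = {}
    for title, tokens in tokens_by_title.items():
        unknown = Counter(t for t in tokens if t.lower() not in wordlist)
        report[title] = unknown.most_common(top_n)
    return report


# Hypothetical wordlist and token samples for two titles.
wordlist = {"the", "health", "of", "and", "water"}
tokens_by_title = {
    "HR": ["the", "hydropathy", "health", "hydropathy", "tbe",
           "water", "hydropathy", "tbe"],
    "RH": ["the", "tbe", "and", "of"],
}

report = frequent_unknowns(tokens_by_title, wordlist)
print(report["HR"])  # [('hydropathy', 3), ('tbe', 2)]
print(report["RH"])  # [('tbe', 1)]
```

A tally like this separates the two kinds of finding discussed here: a recurring OCR confusion ("tbe" for "the") that calls for correction, and a legitimate specialized term ("hydropathy") that calls for an addition to the wordlist.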

Second, it enabled me to identify useful additions to my set of wordlists, additions that also shifted my conception of the type of writing these early Seventh-day Adventists produced. Particularly in the area of health, the Seventh-day Adventist writers used a specialized, and at times, inventive vocabulary for describing ailments and their cures. Their vocabulary, however, was not created from within a vacuum. While the scholarship on 19th-century popular religion tends to emphasize the populism and enthusiasm of revival preachers and adherents, that enthusiasm was also in conversation with literature on theology, educational theory, and medicine.23 The “errors” identified using the default wordlists reveal references to various educational theorists, a wide variety of theologians, and different medical theories and their proponents. Working systematically through the thirty titles of this study, I created a Seventh-day Adventist vocabulary as the wordlist against which to compare the OCR tokens.

Layout Recognition

The second challenge for creating textual data from scans of 19th-century newspapers is the problem of layout recognition when the documents do not match the layout and design patterns of 20th-century (and electronically produced) documents. With space at a premium, publishers packed each page with multiple columns of type in compact fonts. This strategy, while more or less readable to the original human audience, has resulted in images that strain the layout recognition algorithms of OCR software. With thin column lines and narrow, if existent, margins between the text, the recognition algorithm makes reasonable assumptions about the blocks of text, but with the result that the columns are blurred and the text jumbled.24 While this textual soup still provides the necessary data for problems of search and retrieval, the unreliability of sentence and paragraph structures makes standard natural language processing techniques, such as part-of-speech tagging and named-entity recognition, difficult with scanned historical documents. Additionally, the impact of the irregular word order on newer algorithms for textual analysis, such as word-embeddings and deep learning, remains an open question.
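The effect of blurred columns on word order can be shown with a toy example. It assumes a page with two columns of equal length and models the failure as reading each printed row straight across the page; the sample lines are invented, and real layout errors are less regular than this.

```python
left_column = ["The Sabbath was", "made for man,", "and not man"]
right_column = ["Letters from", "the brethren", "in Michigan"]


def read_by_column(columns):
    """Read each column top to bottom before moving on to the next."""
    return " ".join(line for column in columns for line in column)


def read_across(columns):
    """Read each printed row across every column, ignoring column breaks."""
    return " ".join(" ".join(row) for row in zip(*columns))


correct = read_by_column([left_column, right_column])
jumbled = read_across([left_column, right_column])

print(correct)
# The Sabbath was made for man, and not man Letters from the brethren in Michigan
print(jumbled)
# The Sabbath was Letters from made for man, the brethren and not man in Michigan
```

Both readings contain exactly the same words, which is why keyword search still works on the jumbled text. Only their order differs, and order is what sentence-level tools and context-window models depend on.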

For the documents scanned and OCRed by the Seventh-day Adventist church, such problems of layout recognition are a recurring feature within the textual data. To highlight the challenge that layout recognition poses for the OCR of historic newspapers, I used Adobe Acrobat Pro to reanalyze one of the early Review and Herald PDF files. The program identifies and highlights the different sections of text that it recognizes on the page (see Figures 2.4 and 2.5). While the analysis is sometimes successful, the peculiarities of nineteenth-century documents, including the narrow columns and the small font sizes, often stretch the recognition algorithms well beyond their capacity.

Figure 2.4: Layout of page 7 from The Advent Review and Sabbath Herald, volume 5(5) as analyzed by Adobe Acrobat Pro. Note that, although articles are split into multiple sections, overall the columns have been correctly parsed.

Figure 2.5: Layout of page 6 from The Advent Review and Sabbath Herald, volume 5(5) as analyzed by Adobe Acrobat Pro. Note that the columns have not been correctly parsed, with most lines spanning two columns, and some spanning three.

Errors in layout recognition are harder to correct than mistakes in character recognition, in that they are best addressed during the recognition stage of the document processing and, as such, require time. Databases that organize content based on the article (rather than the issue), such as those produced by the American Antiquarian Society, are incentivized to ensure that the boundaries of each article are correctly detected. However, for large-scale digitization projects, particularly those with limited budgets, few resources are typically devoted to training the OCR engine. For many of these projects, such as the digital archive of the SDA, the primary “reader” of the document is assumed to be human and the primary object is the image file of the original document. The text layer is a secondary feature, useful primarily for search and retrieval.

OCR Correction in Action

What does this look like in practice? To highlight the problems that frequently appear in OCRed documents and the improvements that I made with relatively simple cleaning mechanisms, here is an example of the text from the first volume and issue of The Health Reformer.

The Health Reformer, Vol. 1, no. 1, (August 1866), page 3.

DUTY TO KNOW OURSELVES.
preserve it in a healthy condition. The in their physical organism, will not be present generation have trusted their bod less slow to violate the law of God ies with the doctors, and their souls with spoken from Sinai. Those who will not, the ministers. Do they not pay the min after the light has come to them, eat and ister well for studying the Bible for them, drink from principle, instead of being that they need not be to the trouble ? and controlled by appetite, will not be tena is it not his business to tell them what cious in regard to being governed by they must believe, and to settle all doubt principle in other things. The agitation ful questions o f theology without special
investigation on their part? If they are
sick, they send for the doctorÑbelieve
whatever he may tell, and swallow any a Ò god of their bellies.Ó
thing he may prescribe ; for do they not Parents should arouse, and in the fear pay him a liberal fee, and is it not his of God inquire, what is truth ? A tre business to understand their physical ail mendous responsibility rests upon them. ments, and what to prescribe to make
them well, without their being troubled
with the matter ?
Children are sent to school to be taught
the sciences; butthe science of human life
is wholly neglected. That which is of the
most vital importance, a true knowledge
of themselves, without which all other
science can be of but little advantage, is
not brought to their notice. A cruel
and wicked ignorance is tolerated in re laws that govern physical life. She gard to this important question. So should teach her children that the indul closely is health related to our happiness, gence of animal appetites, produces a that we cannot have the latter without
the former. A practical knowledge of
the science of human life, is necessary in
order to glorify God in our bodies. It is
therefore o f the highest importance, that
among the studies selected for childhood,
Physiology should occupy the first place. be to their children, both teacher and How few know anything about the struc physician. They should understand na ture and functions o f their own bodies, tureÕ s wants ancl natureÕ s laws. A care and of NatureÕs laws. Many are drifting ful conformity to the laws God has im about without knowledge, like a ship planted in our being, will insure health, at sea without compass or anchor; and and there will not be a breaking down what is more, they are not interested to o f the constitution, which will tempt the learn how to keep their bodies in a healthy afflicted to call for a physician to patch condition, and prevent disease. .them up again.
The indulgence of animal appetites has > Many seem to think they have a right degraded and enslaved many. Self-deni to treat their own bodies as they please; al, and a restraint upon the animal appe but they forget that their bodies are not tites, is necessary to elevate and establish their own. Their Creator who formed an improved condition of health and mor them, has claims upon them that they als, and purify corrupted society. Every cannot rightly throw off. Every need violation of principle in eating and drink
ing, blunts the perceptive faculties, mak ing it impossible for them to appreciate or place the right value upon eternal things. It is of the greatest importance that mankind should not be ignorant in regard to the consequences of excess. Temperance in all things is necessary to health, and the development and growth of a good Christian character.
Those who transgress the laws of God
less transgression of the laws which God has established in our being, is virtually a violation of the law of God, and is as great a sin in the sight of Heaven as to break the ten commandments. Igno rance upon this important subject, is sin ; the light is now beaming upon us, and we are without excuse if we do not cherish the light, and become intelligent in regard to these things, which it is our highest earthly interest to understand.
o f the subject o f reform in eating and drinking, will develop character, and will unerringly bring to light those who make
They should be practical physiologists, that they may know what are and what are not, correct physical habits, and be enabled thereby to instruct their children. The great mass are as ignorant and indif ferent in regard to the physical and mor al education o f their children as the ani mal creation. And yet they dare assume the responsibilities of parents. Every mother should acquaint herself with the
|morbid action in the system, and weakens their moral sensibilities. Parents should seek for light and truth, as for hid treas ures. To parents is committed the sa cred charge of forming the characters of their children in childhood. They should

To verify that the content generally matches the original document, let’s compare this text to the PDF scan of the original document.25

In comparing the two representations of the text, we can see that, although the individual words seem to have been captured fairly well, the OCR engine was inconsistent in recognizing the column layout of the page.

Evaluating the text against the dictionary I constructed, I generated the following error report:

'>',
 'al',
 'als',
 'ancl',
 'ani',
 'appe',
 'butthe',
 'cious',
 'doctorñbelieve',
 'f',
 'ferent',
 'ful',
 'gence',
 'ies',
 'igno',
 'indif',
 'indul',
 'ister',
 'mak',
 'mal',
 'mendous',
 'ments',
 'mor',
 'na',
 'natureõ',
 'natureõs',
 're',
 'sa',
 'self-deni',
 'struc',
 'tena',
 'tites',
 'tre',
 'ture',
 'tureõ',
 'ures',
 '|morbid',
 'ò',
 'ó'

As initially produced, the transcription of this page reveals a number of the typical errors that I found in the digitized SDA corpus: problems recognizing punctuation, such as apostrophes; the addition of special characters, such as “ñ,” where the character does not appear in the text; words split into two tokens; and lines spanning two columns of content. To address these, I worked through a series of steps, including correcting the apostrophes, standardizing and removing special characters, fixing words broken due to line endings, and rejoining split words.26 At the end of the process, the transcribed text and the reported errors are as follows:

DUTY TO KNOW OURSELVES.
preserve it in a healthy condition. The in their physical organism, will not be present generation have trusted their bod less slow to violate the law of God ies with the doctors, and their souls with spoken from Sinai. Those who will not, the ministers. Do they not pay the min after the light has come to them, eat and ister well for studying the Bible for them, drink from principle, instead of being that they need not be to the trouble ? and controlled by appetite, will not be tena is it not his business to tell them what cious in regard to being governed by they must believe, and to settle all doubt principle in other things. The agitation ful questions o f theology without special
investigation on their part? If they are
sick, they send for the doctor believe
whatever he may tell, and swallow any a   god of their bellies. 
thing he may prescribe ; for do they not Parents should arouse, and in the fear pay him a liberal fee, and is it not his of God inquire, what is truth ? A tre business to understand their physical ail mendous responsibility rests upon them. ments, and what to prescribe to make
them well, without their being troubled
with the matter ?
Children are sent to school to be taught
the sciences; butthe science of human life
is wholly neglected. That which is of the
most vital importance, a true knowledge
of themselves, without which all other
science can be of but little advantage, is
not brought to their notice. A cruel
and wicked ignorance is tolerated in re laws that govern physical life. She gard to this important question. So should teach her children that the indul closely is health related to our happiness, gence of animal appetites, produces a that we cannot have the latter without
the former. A practical knowledge of
the science of human life, is necessary in
order to glorify God in our bodies. It is
therefore o f the highest importance, that
among the studies selected for childhood,
Physiology should occupy the first place. be to their children, both teacher and How few know anything about the struc physician. They should understand nature and functions o f their own bodies, ture  s wants ancl nature  s laws. A care and of Nature s laws. Many are drifting ful conformity to the laws God has im about without knowledge, like a ship planted in our being, will insure health, at sea without compass or anchor; and and there will not be a breaking down what is more, they are not interested to o f the constitution, which will tempt the learn how to keep their bodies in a healthy afflicted to call for a physician to patch condition, and prevent disease. .them up again.
The indulgence of animal appetites has   Many seem to think they have a right degraded and enslaved many. Self-deni to treat their own bodies as they please; al, and a restraint upon the animal appe but they forget that their bodies are not tites, is necessary to elevate and establish their own. Their Creator who formed an improved condition of health and mor them, has claims upon them that they als, and purify corrupted society. Every cannot rightly throw off. Every need violation of principle in eating and drink
ing, blunts the perceptive faculties, making it impossible for them to appreciate or place the right value upon eternal things. It is of the greatest importance that mankind should not be ignorant in regard to the consequences of excess. Temperance in all things is necessary to health, and the development and growth of a good Christian character.
Those who transgress the laws of God
less transgression of the laws which God has established in our being, is virtually a violation of the law of God, and is as great a sin in the sight of Heaven as to break the ten commandments. Ignorance upon this important subject, is sin ; the light is now beaming upon us, and we are without excuse if we do not cherish the light, and become intelligent in regard to these things, which it is our highest earthly interest to understand.
o f the subject o f reform in eating and drinking, will develop character, and will unerringly bring to light those who make
They should be practical physiologists, that they may know what are and what are not, correct physical habits, and be enabled thereby to instruct their children. The great mass are as ignorant and indifferent in regard to the physical and moral education o f their children as the animal creation. And yet they dare assume the responsibilities of parents. Every mother should acquaint herself with the
 morbid action in the system, and weakens their moral sensibilities. Parents should seek for light and truth, as for hid treas ures. To parents is committed the sacred charge of forming the characters of their children in childhood. They should

'al',
 'als',
 'ancl',
 'appe',
 'butthe',
 'cious',
 'f',
 'ful',
 'gence',
 'ies',
 'indul',
 'ister',
 'mendous',
 'ments',
 'mor',
 're',
 'self-deni',
 'struc',
 'tena',
 'tites',
 'tre',
 'ture',
 'ures'

Through this process, I significantly improved the text, but the errors that remain highlight the challenges of working with text when columns have been incorrectly identified. Because the columns were merged into lines during character recognition, it is difficult to address one of the most common errors, words that have been split due to line endings. While it might be possible to use the average line length to identify and remedy some of the errors, it is difficult to do at scale due to the multiple variations in layout across years and periodical titles.
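Two of the cleaning steps named above, standardizing special characters and rejoining split words, can be sketched in Python as follows. This is an illustrative approximation under stated assumptions: the character mapping, the function names, and the rule for deciding when to rejoin two tokens are all hypothetical, and the project's own cleaning code may differ.

```python
import re

# Hypothetical replacements for mis-encoded characters; the pairs are
# assumptions based on the sample page, not the project's actual table.
CHAR_FIXES = {"Õ": "'", "Ò": '"', "Ó": '"', "Ñ": " "}
PUNCTUATION = ".,;:?!\"'"


def fix_characters(line):
    """Standardize mis-encoded punctuation and drop stray symbols."""
    for bad, good in CHAR_FIXES.items():
        line = line.replace(bad, good)
    return re.sub(r"[|>]", "", line)


def rejoin_split_words(line, vocabulary):
    """Join adjacent tokens when the first is unknown and the pair forms a known word."""
    tokens = line.split()
    cleaned = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens):
            first = tokens[i].lower().strip(PUNCTUATION)
            joined = (tokens[i] + tokens[i + 1]).lower().strip(PUNCTUATION)
            if first not in vocabulary and joined in vocabulary:
                cleaned.append(tokens[i] + tokens[i + 1])
                i += 2
                continue
        cleaned.append(tokens[i])
        i += 1
    return " ".join(cleaned)


# Hypothetical miniature vocabulary for the example.
vocabulary = {"ignorance", "upon", "this", "subject"}
fixed = rejoin_split_words("Igno rance upon this sub ject", vocabulary)
```

Because the rule only inspects adjacent tokens, it repairs pairs such as "Igno rance" but cannot reach halves like "bod" and "ies" that the merged columns have pushed apart, which is the limitation described above.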

In working through my corpus for A Gospel of Health and Salvation, I processed each title separately in order to account for the peculiarities of each title, including layout, language, and typography. This approach enabled me to clean the text more carefully than I would have with a batch approach. Working iteratively through the titles, both to generate a list of vocabulary words distinctive to the denomination and to clean the text, informed my understanding of the strengths and weaknesses of the textual data.

Historical Documents, OCR, and Computational Text Analysis

Current developments in computational textual analysis represent words in relation to their surrounding words and their grammatical functions.27 This form of representation is powerful, resulting in significant improvements in computer translation and enabling researchers to explore the relationships between words in new ways.28 The cutting edge of computational text analysis is to be found in analysis that considers language as networked, contextual, and relational.

However, it is unclear whether the textual data I have from the denomination’s digitization efforts can support this sort of analysis. Research is needed to consider the effects of poorly recognized document layouts on the performance of these more complex algorithms, but given that the first rule of analysis is “Garbage in, garbage out,” I am skeptical that I would achieve reliable results from data such as that which I extracted from the periodicals of the SDA. Not that the data cannot be used, but the types of analysis that the data can support are constrained by its quality. Should this prove to indeed be the case, scholarly attention will need to turn again to the digitization of historical documents, prioritizing the textual layer for algorithmic processing.

Careful analysis of the data quality from scanned historical sources is still rare among digital humanities projects, with the notable exception being studies into OCR quality for the purpose of improving information retrieval for large-scale digitized collections. This absence is both curious and not unexpected. On the one hand, the quality of the data is critically linked to the potential success of experiments using machine learning algorithms with historical data, a connection that it seems should weigh heavily on researchers’ minds. On the other hand, the work involved in diagnosing and addressing errors in character recognition is not trivial and has few rewards within the current academic structure. As a result, researchers interested in the intersection of computational text analysis and historical sources have tended to pursue one of two tracks: experimenting with high-quality data sets, as in the cases of Martha Ballard’s Diary or Mining the Richmond Dispatch, or pursuing modes of analysis considered more resilient to data errors, such as the text reuse work of Viral Texts and America’s Public Bible.29

Such work has opened the conversation about the use of computational techniques in historical research and provides the necessary first steps in the effort to bring these two methodological approaches together. However, in order for computational methods for history to continue to mature as a form of analysis, the limitations of the available data need to become a research question to be addressed head-on. With a clear understanding of the data and the types of analysis it reliably supports, the academic community can begin to improve the available data and develop algorithms designed to support the complex, interpretive forms of analysis of the humanities.

Creating Models of Religious Language

Computational text analysis spans a wide range of different strategies for working with textual data, such as computing word frequencies and the correlation between words, identifying and extracting key features such as person and place names, predicting sentiment, identifying text reuse, categorizing texts, and clustering textual features. These techniques can be used as a way to develop data for use as evidence within a larger algorithm or as the core for arguments about the relationship between different aspects of a text or among different factors across an entire corpus. Increasingly of interest in the computational social sciences, textual analysis is used in fields from political science to history to explore legal, political, and cultural language over time as a window onto questions of social and cultural change.30

In a 2014 talk about machine learning and the social sciences, Hanna Wallach described three major categories of modeling tasks: “prediction” of new or missing information based on what is known, which is often the primary concern of computer scientists; “explanation” of the observed patterns, which is the concern of social scientists; and “exploration” of unknown patterns in observed data, which is done by both, and I would add is one of the primary areas of interest for computational humanities scholars.31 While some scholars within literary studies have embraced statistical models to explain patterns in textual data, most work to date in history using computational text analysis has been exploratory or has served as a mechanism to generate data for other uses.32 One advantage of the exploratory approach is that it fits well within existing disciplinary practices of humanities research, functioning as a way to generate insights into larger data sets or to extract names and places for further analysis. This information can be used to frame research questions or as supporting evidence for a larger interpretive claim.

As part of exploratory analysis, computational methods have been used in history in a range of projects, from those highlighting networks of feminist authors to surfacing patterns in criminal court proceedings.33 Such projects use a range of computational techniques in order to surface, describe, and present aspects of the communities or events of the past, rather than to model or statistically correlate features. Using data analysis to develop an understanding of a larger whole, such projects bridge traditional methods for humanities research and scholarship with the affordances of digital data and computational algorithms. In so doing, they also start the process of identifying the key features within that historical data for further analysis, as well as surfacing the limitation of the data within the much larger context of historical research.

This project continues in the tradition of exploratory data analysis for historical research, in that rather than looking for clear associations between data points, my interest is in generating data that suggests areas for further historical analysis into the development of Seventh-day Adventism. This includes developing a broad view of overall patterns in the language of the denomination, as I demonstrate in Chapter 3, as well as using those patterns to explore the relationship between gender and the perceived nearness of the second coming in their cultural development, seen in Chapter 4. The bringing together of computational and historical methodologies is still rare in historical and digital humanities scholarship, and those projects that utilize both often obscure the contributions of the computational methods, for reasons ranging from institutional norms to constraints of the current publishing model. However, there are productive models for this joining of computational and theoretical approaches.34

To find patterns within my corpus, I used topic modeling, an algorithmic technique that groups words into an arbitrary number of “topics” based on the probability of their co-occurrence within documents. This method is an example of an “unsupervised” algorithm, as I passed all the documents to the algorithm with no contextual or category information other than the number of topics to divide the words into and some additional parameters controlling the way the algorithm processes the data. The main alternative to unsupervised learning is supervised learning, where a researcher identifies pages as containing, say, “domestic” content, “health” content, or “theological” content, uses an algorithm to identify the features that distinguish those categories from one another, and then uses that model to predict the category of future materials. While supervised algorithms are often more reliable modes of computational text analysis, they require the use of a “labeled” dataset, where the documents have already been assigned to different categories, often by content experts. With the scale of the periodical literature of the denomination and the constraints of the single-author dissertation, this type of analysis is a future goal for the project, but not one that could reasonably form the basis of the dissertation research. Unsupervised learning, by contrast, does not require such up-front categorization work on the part of the researcher, and provides a way to view broad patterns across large bodies of texts, patterns which could be used in subsequent research to create labeled data.

Topic models generate a high-level overview of a body of literature, using patterns in word frequency and co-occurrence to identify different topics or subjects of discourse. Developed initially to address the problem of identifying relevant content within the exponentially growing universe of scientific literature, topic modeling algorithms are optimized for problems of information retrieval and summary. Although they work best with the more regular and topically focused content of academic journals, they have been used to explore the content of a range of textual artifacts, including novels, poetry, newspapers, and diaries.35 As researchers have experimented with topic models for modeling relationships between textual features and variables such as the gender of an author, the time of publication, and other metadata aspects, new and more complex versions of the algorithm have been released. These libraries, such as Structural Topic Model (STM) and Dynamic Topic Model (DTM), factor in these different relationships as part of the model, and provide tools for computing the effect of the different variables on the topic distribution.36 This enables a statistical calculation of the effect of different variables on the structure of the model, such as the effect of news sources on the frequency and type of coverage for a topic.37

My choice of topic modeling as my primary exploratory method was informed by the research questions I was pursuing and the quality of the data available. While there are many forms of computational text analysis that the periodical data of the SDA might support, not all of those methods were clearly appropriate either to the questions at hand or to the still messy data extracted from the scanned periodicals. Examining the role of time in the development of the culture of Seventh-day Adventism is a question that requires the exploration of language use as well as development of “latent” or implicit patterns in discourse over time. This work lends itself to an exploratory approach, where computational methods are used to surface broad patterns in the textual data and to identify places for further close reading. While topic modeling has its disadvantages, in that as a probabilistic model it is not repeatable in ways required by more explanatory research and its “accuracy” is difficult to evaluate, it offers a useful mechanism for quickly summarizing and clustering large bodies of text in ways that can be used to guide further research and analysis.38

I used the more basic topic modeling algorithm of MALLET to reduce the complexity of the algorithmic assumptions at play in tracking the discourse of the denomination over time.39 Modeling the relationship between discourse and time is not straightforward and the type of change over time assumed in more complex topic modeling algorithms is one of gradual and continuous change. Such models do not account well for moments of historic rupture or for communities where a cyclical pattern is operative.40 For early Seventh-day Adventists, time was not a static category of experience — time and the experience of temporality were part of what denominational members contested and were striving to understand. While more complex topic modeling algorithms can provide additional nuance in classifying large collections of documents and in measuring the statistical strength of relationships between different aspects of texts, such as date and word usage, these algorithms simultaneously impose external assumptions about textual change over time, making the results more difficult to interpret as a part of historical analysis.

Choice of Preprocessing Techniques

Using computational methods in historical research involves not only the application of computational algorithms to the materials and questions of the past, but is also shaped by the processing steps both before and after the creation of the computational model. While this work is often ignored or only obliquely referenced as part of the processes that led to the more interesting final model, as Rhody and Burton have argued, the work of text preparation and post-analysis greatly shapes the resultant model and the shape of the developing analysis.41 Particularly as one moves toward arguments based on computational analysis, details such as tokenization, stopwords, and phrase-handling require clarification, as through them one encodes assumptions about how the language is functioning rhetorically for a community and how that should be modeled. In this section and the one to follow, I provide a descriptive outline of the steps I took to prepare the textual data for analysis with the topic modeling algorithm. For this project, I chose three preprocessing strategies to streamline the data for analysis and improve the quality of the resulting model: joining of noun phrases, using a customized stopword list, and filtering the documents for training the model on length and accuracy.

Determining where words begin and end is a significant first step in preparing a text for computational analysis. By default, MALLET parses each word of a text as a separate entity, on the assumption that each individual word carries a unique semantic value. While a useful rubric, this approach has its limitations. The most apparent limitation is the disconnect between the use of individual words and the common use of “noun phrases” to refer to particular people or concepts. For example, the concept of “old” is included in but is different from the concept of “old age,” a separate referent used to discuss the condition of those in the final years of life. One advantage of attending to phrases is that it helps provide additional specificity to a model. Because it groups together words that tend to co-occur in documents, topic modeling will often generate topics where repeated pairs, such as “old” and “age,” occur as part of the same topic. However, it is difficult in such topics to know whether “old age” as a particular concept is being deployed, or if the author is speaking more generally about the past. By identifying and grouping together noun phrases, it is possible to introduce additional data points into the equation, so that in the text the concepts of “old” and “age” coexist with a concept of “old age” that is built from but is distinct from those component parts.42

In order to identify and work with the noun phrases within the periodical literature, I used the Python library TextBlob to identify noun phrases within the corpus and calculate their frequencies. I saved the two thousand most common phrases to a list, which I used while preparing the texts for MALLET. For each document to be fed to the model, I used TextBlob to identify the noun phrases in the document, and for those phrases that also were in the top two thousand list, I joined the words with an “_”, thereby creating a single combined token for each of these high frequency noun phrases. The result, which you can see in the topic model browser, is that some of the common names and phrases of the denominational literature, such as “jesus christ” and “sabbath school” are processed in the model as single entities.43
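The joining step can be sketched in Python as follows. The sketch assumes that the noun phrases for each document have already been extracted (the project used TextBlob for this) and covers only the frequency count and the underscore joining; the function names and the sample phrases are hypothetical.

```python
import re
from collections import Counter


def most_common_phrases(phrases_per_document, limit=2000):
    """Count phrases across all documents and keep the most frequent ones."""
    counts = Counter(p for doc in phrases_per_document for p in doc)
    return [phrase for phrase, _ in counts.most_common(limit)]


def join_phrases(text, phrases):
    """Replace each listed phrase in the text with one underscore-joined token."""
    text = text.lower()
    # Longest phrases first, so a longer phrase is joined before any
    # shorter phrase contained within it.
    for phrase in sorted(phrases, key=len, reverse=True):
        words = phrase.split()
        pattern = r"\b" + r"\s+".join(re.escape(w) for w in words) + r"\b"
        text = re.sub(pattern, "_".join(words), text)
    return text


# Hypothetical extracted phrases for two short documents.
extracted = [["jesus christ", "old age"], ["jesus christ", "sabbath school"]]
top = most_common_phrases(extracted, limit=1)
joined = join_phrases("Jesus Christ spoke at the Sabbath school.",
                      ["jesus christ", "sabbath school"])
```

In this sketch `joined` reads `jesus_christ spoke at the sabbath_school.`, so a tokenizer that splits on whitespace would treat each joined phrase as a single token.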

My second intervention into the preprocessing of the textual data was the creation of a subject-specific stopword list for the Seventh-day Adventist periodical literature. Rather than use an existing list, which is the default MALLET process for removing high-frequency but low-meaning words, I used another topic modeling library, Gensim, to identify those words that occurred in more than sixty percent of documents (high-frequency words) and those that occurred in fewer than twenty total documents (low-frequency words), and used that combined list as my stopword list. This approach had two advantages for the particularities of this corpus. First, the language of the denominational periodicals is unusual by twentieth century standards, and very repetitive. While standard wordlists focus on removing common function words (“the”, “and”, etc.), noise for this corpus included what would otherwise be considered meaningful nouns, such as “god.” Scholars such as Rhody have raised concerns regarding the practice of relying on standard stopword lists when processing texts for humanities research.44 This approach provides an alternative to relying on standard lists or manually curating a subject-specific list, building on the particularity of a given text and using that to automate the identification of words to remove. Second, while I took a good deal of care in the identification and removal of OCR errors, the methods I pursued are far from comprehensive, leaving a high number of errors generated through failed character recognition. Rather than attempting to identify all of the permutations of errors in recognition, I set the cutoff at fewer than twenty documents as a way of identifying those words that are too rare to be useful at the abstract level of a model, including the generated OCR errors.
By providing a mechanism to address both high-frequency words and errors that appear in few documents, the Gensim library provided a way to reduce the words for analysis while following the contours of the historical text. Through this process, I generated a stopword list particular to the specialized language of the SDA and the eccentricities of my data.
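The two thresholds can be sketched in plain Python as a document-frequency filter. This is an illustration of the logic only, not the Gensim code used for the dissertation (Gensim's `Dictionary.filter_extremes` exposes comparable `no_above` and `no_below` thresholds); the function name, the toy corpus, and the lowered `min_docs` value are assumptions made so the example is small enough to read.

```python
from collections import Counter

def frequency_stopwords(docs, max_share=0.6, min_docs=20):
    """Return words in more than `max_share` of docs or in fewer than `min_docs` docs."""
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each word once per document
    n_docs = len(docs)
    return {
        word
        for word, df in doc_freq.items()
        if df / n_docs > max_share or df < min_docs
    }

# Toy tokenized corpus (hypothetical). min_docs is lowered to 2 so that the
# rare-word rule is visible with only four documents.
docs = [
    ["god", "sabbath", "law"],
    ["god", "health", "diet"],
    ["god", "sabbath", "tbe"],  # "tbe" stands in for an OCR error
    ["god", "health", "law"],
]
stopwords = frequency_stopwords(docs, max_share=0.6, min_docs=2)
print(sorted(stopwords))
```

In this toy case “god” is removed for appearing everywhere, while “diet” and the OCR-style token “tbe” are removed for appearing in a single document.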

In these first two preprocessing steps, I considered what a “word” was within the context of the Seventh-day Adventist literature and which of those words were most likely to carry the distinctive meaning within the body of the text. My final preprocessing step was to narrow the corpus for model creation to those documents with sufficient words and low error rates. As a form of probabilistic textual analysis, topic modeling is sensitive to OCR errors, particularly errors that inflate the number of words in the documents, as they skew the “weights” assigned to the words.45 To counterbalance this problem, particularly as the documents in my study, while clean, were not error free, I limited my training set to documents with more than three hundred words and error rates under ten percent. Once I had generated the model, I used it to classify the excluded “hold-out” documents, so that they too were visible to my analysis, but the model itself is based only on the high-quality documents.
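A minimal sketch of this selection rule, assuming each document carries a precomputed word count and error rate. The field names, document identifiers, and function name are hypothetical; only the two thresholds come from the text.

```python
def split_training_set(docs, min_words=300, max_error_rate=0.10):
    """Split document ids into a training set and a hold-out set."""
    train, hold_out = [], []
    for doc in docs:
        # Keep documents with more than `min_words` words and an error
        # rate under `max_error_rate`; everything else is held out.
        is_clean = (
            doc["word_count"] > min_words
            and doc["error_rate"] < max_error_rate
        )
        (train if is_clean else hold_out).append(doc["id"])
    return train, hold_out

# Hypothetical page-level records.
docs = [
    {"id": "page-a", "word_count": 850, "error_rate": 0.03},
    {"id": "page-b", "word_count": 120, "error_rate": 0.02},  # too short
    {"id": "page-c", "word_count": 900, "error_rate": 0.25},  # too noisy
]
train_ids, hold_out_ids = split_training_set(docs)
print(train_ids, hold_out_ids)
```

The model would be trained on `train_ids` only, and the trained model then applied to the documents in `hold_out_ids`.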

This process of moving from text to data is one that can be done in multiple ways depending on research questions, features of the text, and the like. All of these decisions affect the patterns that can be seen in standard machine learning algorithms, and yet rarely are these steps carefully documented in research in the digital humanities.46 However, for these methods to continue to develop within the humanities, they must be documented, examined, and engaged. Preprocessing methods such as these change the shape of the final models, and the research cannot be reproduced without including these seemingly mundane steps that shape the data into its form for analysis. My inclusion of this discussion, together with code examples, is part of an explicit and implicit argument that this intellectual work is as much a part of A Gospel of Health and Salvation as the prose and visualizations to follow.

Evaluating and Analyzing the Topic Model

Of the methods for statistical analysis of a corpus, topic modeling presents some unique challenges in that the algorithm is probabilistic, not deterministic. If you were to run a model on the same corpus multiple times, the results would vary in the words that constitute a topic and the weights assigned to topics within documents. And as the method is unsupervised, there is no “ground truth” against which to evaluate the results. These aspects of topic models have led many researchers to be justifiably cautious about the usefulness of the method for generating historical and interpretive insights. Additionally, the human tendency to find patterns, to “read tea leaves,” means that topic models are highly susceptible to seeming more coherent than they in fact are.47 These drawbacks are particularly significant when topic models are used as data for calculating the relationship between variables. As a form of exploratory data analysis, topic modeling provides different “readings” of the patterns of word usage. While holding too closely to the results will lead to error, these readings are helpful in identifying general themes and overall patterns.

This is not to say that there are no mechanisms for evaluating the quality and stability of a given topic model. In fact, there is a growing range of strategies used to evaluate and improve the quality of topic models as scholars have worked to find ways to make the results of the model more stable and coherent.48 These strategies provide useful mechanisms for examining the internal structure of a model and increasing confidence that said model provides a useful abstraction of the underlying literature. For the dissertation, I pursued three general strategies for evaluating the quality and stability of the topic model. These were to visualize the relationship between topics within the model, to measure the overlap between topics when the model was run on different permutations of the corpus, and to evaluate the legibility of the topics, or the ease with which they could be labeled, and the match between the topic and the originating content. Together these methods provide a picture of a model with some redundancy in topics but with enough distinction to trace the shifting focus of the denomination’s rhetoric over time.

Visual representations of topic models provide a useful initial mode of entry into understanding how the words within the denominational literature have been grouped in relation to each other. For the dissertation, I used the pyLDAvis library to graph the topics in two-dimensional space using principal component analysis, as well as the Plot.ly graphing library to create a dendrogram of the topics, defined in terms of a vector of their top fifty words.49 Each of these visualizations is focused on the relationship between the topics as defined by the words that compose them, rather than by their prevalence in the documents of the corpus.

Figure 2.6: Visualization of the 250 topic model (view full version). Note that the topic numbers in this visualization are in descending order of overall prominence, and not related to the topic assignment used throughout the rest of the site. The interface enables the viewer to modify which words are displayed, defaulting to the most frequent words assigned to the topic with lambda equal to 1, or the most distinctive words for each topic with lambda equal to 0. This provides a means of distinguishing between similar topics. Visualization created with the pyLDAvis library.
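The effect of the lambda slider can be illustrated with the relevance measure defined for LDAvis, which weights a word's probability within a topic against its lift over the word's overall probability in the corpus. The sketch below is a standalone illustration, not a call into the library, and the probabilities are invented for the example.

```python
import math

def relevance(p_word_given_topic, p_word, lam):
    """LDAvis-style relevance: lam * log p(w|t) + (1 - lam) * log(p(w|t) / p(w))."""
    return (
        lam * math.log(p_word_given_topic)
        + (1 - lam) * math.log(p_word_given_topic / p_word)
    )

# Hypothetical values: "god" is frequent everywhere, "diet" is rarer overall
# but concentrated in this topic.
words = {
    "god": {"p_topic": 0.05, "p_corpus": 0.04},
    "diet": {"p_topic": 0.03, "p_corpus": 0.001},
}

def rank(lam):
    """Order the words from most to least relevant at a given lambda."""
    return sorted(
        words,
        key=lambda w: relevance(words[w]["p_topic"], words[w]["p_corpus"], lam),
        reverse=True,
    )

print(rank(1.0))  # ordering by in-topic frequency
print(rank(0.0))  # ordering by distinctiveness (lift)
```

With lambda at 1 the ranking follows raw frequency within the topic; with lambda at 0 it follows how much more likely the word is in the topic than in the corpus as a whole.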

The visualization generated with pyLDAvis depicts a model with no clear predominant topic, and a number of topical clusters. For example, the top center of the graph shows a closely grouped cluster of topics related to issues of food and diet, derived from the health reform literature of the denomination. The bottom left corner of the graph shows a clustering of topics around issues of faith, including theology and biblical quotations. The bottom right corner, in contrast, clusters topics around more bureaucratic aspects of the denomination, including conference reports and missions. Finally, at the near center of the graph is the (here labeled) topic 22, which includes “study,” “instruction,” and “plan.” The position of this cluster of words at the center of the graph suggests that these words appear across all of the different discursive areas of the denominational literature. As a people of the book, deeply committed to studying the divine law in both scripture and in nature, the centrality of that concept to all of their discourse would be appropriate.

Figure 2.7: Dendrogram of the 250 topic model. (view full version) Visualization created with the Plot.ly library.

A dendrogram provides another window onto the relationships between the topics of the model. Created using the default options within the Plotly library, this chart represents each topic by its top fifty words, computes the distances between each pair of topics, and clusters the topics accordingly. The resulting chart again indicates a high degree of similarity between the topics, with some definite clusters of related topics. At the bottom of the chart the reader can find a small collection of education-focused topics, while topic 20, related to church organization, is displayed at the top of the graph, with language distinct from that of all the other topics. Within the large block of maroon topics, clear clusters appear relating to conference reports, to church and state legislation, and to theological topics such as the sabbath. While the pyLDAvis graph relies only on the topic words, I incorporated topic labels into the dendrogram in addition to showing relationships between topics, and used it as a partial guide for the work of labeling topics and identifying related topics for further study.
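The distance computation that underlies a dendrogram of this kind can be sketched as follows. The text specifies only that each topic is a vector of its top words; the binary (present or absent) encoding, the Euclidean distance, the three-word topics, and the topic names here are assumptions chosen for illustration.

```python
import math
from itertools import combinations

def topic_vectors(topic_words):
    """Encode each topic as a 0/1 vector over the combined vocabulary."""
    vocab = sorted({w for words in topic_words.values() for w in words})
    return {
        topic: [1 if w in set(words) else 0 for w in vocab]
        for topic, words in topic_words.items()
    }

def pairwise_distances(vectors):
    """Euclidean distance between every pair of topic vectors."""
    return {
        (a, b): math.dist(vectors[a], vectors[b])
        for a, b in combinations(sorted(vectors), 2)
    }

# Hypothetical topics with three top words each (the chart uses fifty).
topic_words = {
    "diet": ["food", "diet", "health"],
    "health": ["health", "food", "disease"],
    "conference": ["conference", "report", "meeting"],
}
distances = pairwise_distances(topic_vectors(topic_words))
for pair, dist in distances.items():
    print(pair, round(dist, 3))
```

A hierarchical clustering routine would then merge the closest topics first, which is why topics that share many top words appear on neighboring branches.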

The second strategy I pursued to evaluate the model generated through MALLET was to measure the stability of the topical word clusterings over different permutations of the corpus. For this, I used the topic keys, which consist of the top twenty words for each topic, from four topic models created from four different variations on the corpus: a random selection of documents within the corpus, a corpus where the error-rate cutoff was set at twenty-five percent, a corpus where no error-rate cutoff was set, and the target configuration with the error-rate cutoff set at ten percent. Using Levenshtein distance to compute the similarity between the words in the topic keys, I computed the distances between all of the words in each topic key pair, used the lowest distances to indicate the most similar words, and computed the percentage of tokens with a match within a low number of edits.50 In the random and twenty-five percent error rate cases, ninety-six percent of the topics had a close match, while ninety-four percent of topics in the sample with no controls on error rates had a close match. This result indicated to me that although there is some shifting in the relative weights of words across the different runs of the topic model, overall there was a close match for each of the main topics. Additionally, the lower return from the model with no constraints on the error rates suggests that the effort to control for those documents did make a measurable difference in the shape of the topics. While additional work should be done around the evaluation and optimization of topic models for use in historical analysis, these rough measures provide an initial measure of confidence regarding the stability and usefulness of the topic model.51
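The word-matching step can be sketched with a standard dynamic-programming Levenshtein distance. The text does not state the edit threshold or how topic pairs were aligned, so `max_edits=1`, the function names, and the shortened topic keys below are assumptions for illustration.

```python
def levenshtein(a, b):
    """Number of single-character insertions, deletions, or substitutions."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                       # deletion
                curr[j - 1] + 1,                   # insertion
                prev[j - 1] + (char_a != char_b),  # substitution
            ))
        prev = curr
    return prev[-1]

def share_with_close_match(key_a, key_b, max_edits=1):
    """Fraction of words in key_a whose nearest word in key_b is within max_edits."""
    matched = sum(
        1 for word in key_a
        if min(levenshtein(word, other) for other in key_b) <= max_edits
    )
    return matched / len(key_a)

# Hypothetical topic keys from two runs (real keys hold twenty words).
run_one = ["lord", "truth", "bro", "sister"]
run_two = ["lord", "truths", "brother", "sister"]
share = share_with_close_match(run_one, run_two)
print(share)
```

Allowing a small number of edits lets near-identical tokens, such as a word and its plural or an OCR variant, count as the same word when comparing runs.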

My final strategy for evaluating the topic model was interpretive, drawing upon my position as a subject expert to evaluate the coherence and usefulness of the model. To do this, I looked directly at the topics, particularly the internal coherence of the topic words and the overlap between topics, as well as at the connections between topics and the documents where the topics featured predominantly. Reading the topics at these three scales, I assigned interpretive labels to the topics. These labels serve to indicate the content that is represented within the model by the topic and are used to track the prevalence of different topics over time. While many of the topics were composed of words with a clear subject matter focus, a number of topics captured particular modes of communication, rather than particular content. For example, one topic that was prevalent in the early years of the denominational publications, topic 56, falls into this category. Although the underlying tie between the topic words (“lord truth bro sister feel jesus”) is not immediately apparent, an examination of the pages where this topic is prevalent quickly indicates that this topic has captured the confessional language used by those who wrote to James and Ellen White in the early days of Seventh-day Adventism. Of the strategies for evaluating the model, this interpretive work was the most time consuming, as it required bringing together the different scales of the model and the text to develop an understanding of the language pattern that was captured. However, it is also this step that most firmly embeds my particular use of the topic model within the realm of historical interpretation. I return to my process of labeling the topics in Chapter 3.

By combining these three strategies, I evaluated the model with reference to visual, mathematical, and interpretive frameworks and came to two hundred and fifty as a workable number of topics for the final model. This scale of topic model enabled me to capture some of the more nuanced aspects of the periodical literature, while also limiting the fracturing of themes between multiple topics. The larger number of topics also provides space for topics to shift, with similar topics capturing different topical associations as the literature of the denomination developed over time. By carefully attending to the relationship between the words, topics, and documents, I have developed a clearer understanding of the contours of the literature the model describes, along with a degree of confidence in the usefulness of the model both for describing the overall corpus and for directing the reader back to the particular texts. Those two roles of description and guidance are what I require from the topic model for this historical study.

Presenting the Topic Model

Generating a topic model is the first part of the challenge; using and interpreting a topic model within historical analysis is a second and less discussed project. For classification problems, where a researcher uses the model to generate and assign topical categories to different documents, outputs such as the breakdown of topic percentages for each document allow the researcher to identify the most prevalent topics in each document, and use the associated label as a descriptive tag or category. Topic models also provide an overview of topic distributions across the entire set of documents, providing a snapshot of the major and minor themes in a corpus. As a form of machine learning, topic models can be used to classify new content, enabling researchers to use a previously generated model to classify previously unseen content.

Researchers in the digital humanities have pursued several different strategies to bridge the computational and abstract data of topic models with the interpretive questions of the humanities. My goal in working with a topic model is not to argue for correlations between discourse patterns and metadata variables, but to surface broad patterns in SDA discourse and identify areas for further research. This use of a computational model fits within the epistemological commitments of the humanities, with their emphasis on complexity, multiple causality, and the idiographic. My primary areas of interest are within the humanities: to find ways to explore particular aspects of religious culture, to understand how a belief system functions and changes, and to consider how those patterns of thought have been built upon to shape the current cultural landscape. I believe computational models and methods can help in exploring these questions, but as one method among others, including traditional archival research and narrative construction.

Topic models provide vast amounts of information describing their respective corpora, information that can be used to identify patterns across the documents as well as to identify content on particular themes. One advantage of working in a digital medium is that it is possible to provide an interactive interface for the model, one that allows the researcher to highlight particular aspects of the model or the corpus, but that also opens the possibility for readers to use the model to further explore those or other themes. For A Gospel of Health and Salvation I used a topic model browser as part of the dissertation to enable topic exploration. This platform, while sufficient for the research of the dissertation, also serves as a “rough draft” for a future interface for the periodical literature of the denomination, one that supports the integration of the research and scholarly presentation.

The topic model browser I am using for the dissertation, hosted at http://browser.dissertation.jeriwieringa.com/, is an adaptation of Andrew Goldstone’s DfR browser. The browser is built using the D3 JavaScript library and works from model data generated using the MALLET library.52 The browser provides the reader with interfaces that highlight the topical trends over time, the document composition, and the relationships among words and between words and topics. These different views enable the reader to explore when a topic was predominant in the corpus, examine the other topics associated with the documents where the topic is prevalent, and inspect the words that compose the topic, to see in which other topics those words are prevalent. To compute topics over time, the data processing function aggregates the topic assignments by year, smooths those values by the “alpha” hyperparameter used in MALLET, and displays the percentage of tokens that constitutes each topic per year. This method of computing topics over time is one that I continue within the other interfaces of the dissertation. I discuss the different views of the browser in the browser about page.
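The topics-over-time calculation can be sketched as below. Whether the browser applies the alpha smoothing per document or per year is not stated in the text; this sketch assumes it is added once to each yearly total before normalizing, and the function name, counts, and alpha values are hypothetical.

```python
import math

def topic_shares_by_year(doc_topic_counts, doc_years, alpha):
    """Aggregate per-document topic token counts by year and normalize.

    doc_topic_counts: {doc_id: [tokens assigned to topic 0, topic 1, ...]}
    doc_years:        {doc_id: year}
    alpha:            one smoothing value per topic
    """
    yearly = {}
    for doc_id, counts in doc_topic_counts.items():
        totals = yearly.setdefault(doc_years[doc_id], [0.0] * len(counts))
        for topic, n_tokens in enumerate(counts):
            totals[topic] += n_tokens
    shares = {}
    for year, totals in yearly.items():
        # Assumption: alpha is added to each yearly total before normalizing.
        smoothed = [n + a for n, a in zip(totals, alpha)]
        denom = sum(smoothed)
        shares[year] = [value / denom for value in smoothed]
    return shares

# Hypothetical two-topic model with three documents.
doc_topic_counts = {"d1": [8, 2], "d2": [1, 9], "d3": [0, 10]}
doc_years = {"d1": 1850, "d2": 1850, "d3": 1851}
shares = topic_shares_by_year(doc_topic_counts, doc_years, alpha=[0.5, 0.5])
for year in sorted(shares):
    print(year, [round(s, 3) for s in shares[year]])
```

The smoothing keeps a topic with no assigned tokens in a given year from dropping to exactly zero, which makes sparse years less jagged when plotted.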

In using this interface, I present the topic model as an abstraction of and a guide to the broad periodical literature of the denomination. Even with this degree of abstraction, there is more information in these interfaces than can be accounted for in a singular narrative. So while the interfaces enable my exploration of end-times and the cultural development of the denomination, by exposing the larger model I invite the reader to explore whether those patterns do indeed hold within the larger, undiscussed context of the literature. I do not claim, however, that this model or these interfaces provide a neutral entry point into the denomination’s literature. Rather, as discussed throughout the chapter, all of the aspects that make up the model (the selection of sources, the data preparation steps, the choice and optimization of the algorithm, and the design of the interfaces) are oriented toward the interpretive goals of the project. The usefulness of the interfaces, the model, or the data for other projects would be dependent upon whether those choices support the desired analysis. This approach presents an alternative to the model of a neutral data interface that supports multiple research projects.

Conclusion

While all historical research involves the selection of materials, the analysis of those materials, and the organization of that analysis into some form of presentation, the incorporation of computational methods into that process is not a simple addition. Rather, the complexity of the analysis increases substantially, as the layers of code and algorithms enact and embed interpretive and analytical assumptions throughout the seemingly neutral processes of preparing and parsing the data. These processes can be undertaken in multiple ways and their documentation is necessary for that work to be inspected, reproduced, and evaluated. As a result, the computational work involved in digital scholarship expands the prevailing model of scholarship in ways that do not have clear parallels in traditional research practices. From source selection to preprocessing to choice of algorithm and modes of interaction, the technical work of computational analysis is all part of the scholarly work that constitutes a digital humanities project, intertwined with the interpretive questions and constraining the very contours of what can be seen through the resultant model. For that reason, this work is properly part of the scholarly output of computational research in history.


  1. For example, see Matt Burton, “Blogs as Infrastructure for Scholarly Communication.” (PhD thesis, University of Michigan, 2015); Nan Z. Da, “The Computational Case Against Computational Literary Studies,” Critical Inquiry 45, no. 3 (2019): 601–39, https://www.journals.uchicago.edu/doi/10.1086/702594.

  2. Historian Catherine Brekus notes that while “earlier evangelicals had realized the potential of the press, the Millerites published tracts, memoirs, and newspapers on a scale never before imagined.” Catherine A. Brekus, Strangers and Pilgrims: Female Preaching in America, 1740-1845 (Gender and American Culture), 1st New edition (Chapel Hill: The University of North Carolina Press, 1998), p. 323.

  3. James White, “Dear Brethren and Sisters —,” The Present Truth 1, no. 1 (1849): 6, http://documents.adventistarchives.org/Periodicals/PT-AR/PT-AR-Part1-01.pdf, p. 6.

  4. Ellen Gould Harmon White, “Dear Brethren and Sisters —,” The Present Truth 1, no. 11 (1850): 86–87, http://documents.adventistarchives.org/Periodicals/PT-AR/PT-AR-Part1-11.pdf, pp. 86-87. In subsequent tellings of this history, a more direct link is told between Ellen’s vision and the start of The Present Truth. See Floyd Greenleaf and Jerry Moon, “Builder,” in Ellen Harmon White: American Prophet, ed. Terrie Dopp Aamodt, Gary Land, and Ronald L. Numbers (New York: Oxford University Press, 2014), pp. 126-7 & 140 n.2 and n.6.

  5. Benedict Anderson, Imagined Communities: Reflections on the Origin and Spread of Nationalism, Revised Edition (London; New York: Verso, 2006), p. 6.

  6. Information about the history of the denomination’s periodical digitization efforts comes from an email exchange with the archivist for the Adventist Digital Library. Eric Koester, “Inquiry Regarding Data from the Adventist Digital Library.” Email, 2017; Eric Koester and Henry Gomes, eds., “About,” Adventist Digital Library, 2017, http://adventistdigitallibrary.org/about.

  7. For reference see the Series 4 and 5 lists of the titles included in the AAS’s collection at https://www.ebscohost.com/archives/aas-historical-periodicals-collection. The dates for the issues of the Youth’s Instructor suggest that even for titles where the denomination’s coverage has gaps, their collection is more extensive than what is available through the AAS database.

  8. “Religious liberty” for Seventh-day Adventist writers refers primarily to resisting the passage of Sunday observance laws. While anticipating future cooperation between the Catholic church and the United States government on Sunday observance in the events leading up to the second coming, they also worked to raise awareness about these pieces of legislation and sought to use the first amendment to defend themselves against the passage of these laws.

  9. The Seventh-Day Adventist Year Book, 1883 (Battle Creek, MI: Seventh-day Adventist Publishing Association, 1883), http://documents.adventistarchives.org/Yearbooks/YB1883.pdf, pp. 55, 57.

  10. 1905 Year Book of the Seventh-Day Adventist Denomination (Washington, D.C.: Review & Herald Publishing Association, 1905), http://documents.adventistarchives.org/Yearbooks/YB1905.pdf, pp. 100-105; H. Edson Rogers, ed., Year Book of the Seventh-Day Adventist Denomination (Takoma Park, Washington, D.C.: Review & Herald Publishing Association, 1921), pp. 183-186.

  11. See Table 2.1 for a full listing of the periodicals I identified and their digitization status.

  12. See https://en.wikipedia.org/wiki/Garbage_in,_garbage_out

  13. For example, see John Evershed and Kent Fitch, Correcting Noisy Ocr: Context Beats Confusion (New York: ACM Press, 2014), Thomas A. Lasko and Susan E. Hauser, Approximate String Matching Algorithms for Limited-Vocabulary Ocr Output Correction, 2000, and Ted Underwood, “The Challenges of Digital Work on Early-19c Collections.” The Stone and the Shell, 2011, https://tedunderwood.com/2011/10/07/the-challenges-of-digital-work-on-early-19c-collections/.

  14. Roy Rosenzweig, “Scarcity or Abundance? Preserving the Past in a Digital Era,” The American Historical Review 108, no. 3 (2003): 735–62, http://www.jstor.org/stable/10.1086/529596, 739.

  15. Patrick Spedding assesses ECCO for the document base, the search interface, and the accuracy of the textual layer in Patrick Spedding, “‘The New Machine’: Discovering the Limits of ECCO,” Eighteenth-Century Studies 44, no. 4 (2011): 437–53, http://www.jstor.org/stable/41301590, while Tim Hitchcock evaluates the Burney Collection for word accuracy in Tim Hitchcock, “Confronting the Digital: Or How Academic History Writing Lost the Plot,” Cultural and Social History 10, no. 1 (2015): 9–23, http://www.tandfonline.com/doi/full/10.2752/147800413X13515292098070. In both studies, the authors raise significant concern about the quality of the data upon which the search and recovery interfaces are built, as well as the lack of transparency on the part of the database providers, in both cases Gale, regarding the underlying data.

  16. Andrew J. Torget et al., “Mapping Texts: Combining Text-Mining and Geo-Visualization to Unlock the Research Potential of Historical Newspapers,” Mapping Texts, 2011, http://mappingtexts.org/whitepaper/MappingTexts_WhitePaper.pdf, 7

  17. For the project, “good” and “bad” words are determined by comparison to a dictionary of words.

  18. For examples, see Evershed and Fitch, Correcting Noisy Ocr; Lasko and Hauser, Approximate String Matching Algorithms for Limited-Vocabulary Ocr Output Correction; Thomas L. Packer, “Performing Information Extraction to Improve Ocr Error Detection in Semi-Structured Historical Documents,” Proceedings of the 2011 Workshop on Historical Document Imaging and Processing, 2011, 67, https://dl.acm.org/citation.cfm?id=2037354; Carolyn Strange et al., “Mining for the Meanings of a Murder: The Impact of Ocr Quality on the Use of Digitized Historical Newspapers,” Digital Humanities Quarterly 8, no. 1 (2014), http://www.digitalhumanities.org/dhq/vol/8/1/000168/000168.html; Torget et al., “Mapping Texts.”; Underwood, “The Challenges of Digital Work on Early-19c Collections.”; Ted Underwood and Loretta Auvil, “Basic Ocr Correction,” The Uses of Scale in Literary Study, 2014, https://usesofscale.com/gritty-details/basic-ocr-correction/.

  19. Evershed and Fitch, Correcting Noisy Ocr, 45.

  20. Simon Tanner, Trevor Muñoz, and Pich Hemy Ros, “Measuring Mass Text Digitization Quality and Usefulness: Lessons Learned from Assessing the Ocr Accuracy of the British Library’s 19th Century Online Newspaper Archive,” D-Lib Magazine 15, no. 7/8 (2009), http://dx.doi.org/10.1045/july2009-munoz, §5.

  21. Lasko and Hauser, Approximate String Matching Algorithms for Limited-Vocabulary Ocr Output Correction; Underwood, “The Challenges of Digital Work on Early-19c Collections.”

  22. Kevin Atkinson, “Spell Checking Oriented Word Lists (Scowl),” 2016, http://wordlist.aspell.net/scowl-readme/

  23. Nathan O Hatch, The Democratization of American Christianity (New Haven: Yale University Press, 1989), 5.

  24. More recent versions of OCR software have become better at dealing with the vagaries of older print documents. However, much of the currently circulated PDF scans and textual data were created with earlier iterations of the software. Another solution, one that is often featured in larger subscription collections, is to devote the time and energy to identify and catalog each individual article within the digitized document. This approach often introduces other complications around the more marginal aspects of older documents, including advertisements, which are often less differentiated than the main content.

  25. http://documents.adventistarchives.org/Periodicals/HR/HR18660801-V01-01.pdf#page=3

  26. All of these steps are outlined in a Jupyter notebook for readers interested in the technical details.

  27. This includes word embeddings, neural networks, and key words in context (KWIC). Benjamin M. Schmidt, “Word Embeddings for the Digital Humanities,” 2015, http://bookworm.benschmidt.org/posts/2015-10-25-Word-Embeddings.html; Akemi Ueda, “Stanford Literary Lab Uses Digital Humanities to Study Why We Feel Suspense,” 2016, https://news.stanford.edu/2016/02/18/literary-lab-suspense-021816/.

  28. Benjamin M. Schmidt, “Interactive Visual Bibliography: Describing Corpora,” 2018, http://creatingdata.us/techne/bibliographies/; Yonghui Wu et al., “Google’s Neural Machine Translation System: Bridging the Gap Between Human and Machine Translation,” CoRR abs/1609.08144 (2016), http://arxiv.org/abs/1609.08144.

  29. The problem of OCR is the focus of a recent report from Northeastern University, funded by the Mellon Foundation. While a little discussed problem when I began this dissertation, the problem of data quality from scans of historical documents is becoming increasingly recognized in the field. David A. Smith and Ryan Cordell, “A Research Agenda for Historical and Multilingual Optical Character Recognition,” 2018, https://ocr.northeastern.edu/report/.

  30. For a list of projects using computational text analysis in various fields, see Brendan O’Connor, David Bamman, and Noah A. Smith, “Computational Text Analysis for Social Science: Model Assumptions and Complexity,” Second NIPS Workshop on Computational Social Science and the Wisdom of Crowds, 2011, https://homes.cs.washington.edu/~nasmith/papers/oconnor+bamman+smith.nips-ws11.pdf, pp. 1-2.

  31. Hanna Wallach, “Big Data, Machine Learning, and the Social Sciences: Fairness, Accountability, and Transparency,” 2014, https://medium.com/@hannawallach/big-data-machine-learning-and-the-social-sciences-927a8e20460d

  32. Examples include the work of Matthew Jockers, Ted Underwood, and Andrew Piper. Gabi Kirilloff et al., “From a Distance ‘You Might Mistake Her for a Man’: A Closer Reading of Gender and Character Action in Jane Eyre, the Law and the Lady, and a Brilliant Woman,” Digital Scholarship in the Humanities 33, no. 4 (2018), https://academic.oup.com/dsh/article-abstract/33/4/821/5004302?redirectedFrom=fulltext; David Bamman, Ted Underwood, and Noah A. Smith, “A Bayesian Mixed Effects Model of Literary Character,” Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014, 370–79; Andrew Piper, Enumerations: Data and Literary Study (Chicago: University of Chicago Press, 2018).

  33. For example, historian Michelle Moravec addresses the limitations of network analysis for historical scholarship in Michelle Moravec, “Network Analysis and Feminist Artists,” Artl@s Bulletin 6, no. 3 (2017): Article 5, https://docs.lib.purdue.edu/cgi/viewcontent.cgi?referer=&httpsredir=1&article=1124&context=artlas.

  34. For example, in his dissertation on the Digital Humanities community, Matt Burton uses topic modeling along with digital ethnography to explore the character and development of digital humanities blogs. Burton, “Blogs as Infrastructure for Scholarly Communication.”

  35. Some high-profile projects in the digital humanities that use topic modeling on a range of different types of content include Ted Underwood and Andrew Goldstone’s exploration of the archives of the PMLA journal, Lisa Rhody’s study of ekphrasis poetry, Robert Nelson’s work with the Richmond Daily Dispatch, Sharon Block’s work with the Pennsylvania Gazette, Cameron Blevins’s explorations of Martha Ballard’s diary, and Micki Kaufman’s exploration of the Kissinger Collection. Andrew Goldstone and Ted Underwood, “What Can Topic Models of Pmla Teach Us About the History of Literary Scholarship?” Journal of Digital Humanities 2, no. 1 (2013), http://journalofdigitalhumanities.org/2-1/what-can-topic-models-of-pmla-teach-us-by-ted-underwood-and-andrew-goldstone/; Lisa M. Rhody, “Topic Modeling and Figurative Language,” Journal of Digital Humanities 2, no. 1 (2013), http://journalofdigitalhumanities.org/2-1/topic-modeling-and-figurative-language-by-lisa-m-rhody/; Robert K. Nelson, “Mining the Dispatch,” 2007, http://dsl.richmond.edu/dispatch/pages/home; Sharon Block, “Doing More with Digitization,” Common-Place the Interactive Journal of Early American Life 6, no. 2 (2006), http://www.common-place-archives.org/vol-06/no-02/tales/; Cameron Blevins, “Topic Modeling Martha Ballard’s Diary,” 2010, http://www.cameronblevins.org/posts/topic-modeling-martha-ballards-diary/, Micki Kaufman, “‘Everything on Paper Will Be Used Against Me’: Quantifying Kissinger. A Computational Analysis of the National Security Archive’s Kissinger Collection Memcons and Telcons,” 2018, http://blog.quantifyingkissinger.com/.

  36. Molly Roberts et al., “The Structural Topic Model and Applied Social Science,” NIPS 2013 Workshop on Topic Models: Computation, Application, and Evaluation, 2013, https://scholar.princeton.edu/files/bstewart/files/stmnips2013.pdf; David M. Blei and John D. Lafferty, “Dynamic Topic Models,” Proceedings of the 23rd International Conference on Machine Learning, 2006, 113–20, https://doi.org/10.1145/1143844.1143859

  37. For an example of this type of research, see Margaret E. Roberts, Brandon M. Stewart, and Edoardo M. Airoldi, “A Model of Text for Experimentation in the Social Sciences,” Journal of the American Statistical Association 111, no. 515 (2016): 988–1003, https://www.tandfonline.com/doi/full/10.1080/01621459.2016.1141684.

  38. Jonathan Chang et al., Reading Tea Leaves: How Humans Interpret Topic Models, vol. 22 (Neural Information Processing Systems Foundation, Inc., 2009). One additional advantage of using topic modeling as part of an exploratory approach to the data from the SDA periodicals is the flexibility the algorithm provides for dealing with the messy textual data of the denomination. While studies have shown that algorithms that use weights in evaluating textual data, such as computing the relative uniqueness of given terms in each document, are especially sensitive to errors in the data, particularly those introduced by OCR, there are clear strategies for mitigating the effects of such errors. For example, one can separate the work of model creation from the categorization of all documents, so that the model is trained on one set of documents but applied to the full collection. By limiting the training set to documents with low error rates and higher word counts, one can improve the quality of the resulting model by excluding some of the “noise” introduced by error-filled documents. Gudila Paul Moshi et al., “An Impact of Linguistic Features on Automated Classification of OCR Texts,” Proceedings of the 9th IAPR International Workshop on Document Analysis Systems, 2010, 287–92, https://dl.acm.org/citation.cfm?doid=1815330.1815367; Daniel Walker, William Lund, and Eric Ringger, “Evaluating Models of Latent Document Semantics in the Presence of OCR Errors,” Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, 2010, 240–50, http://hdl.lib.byu.edu/1877/3561.

  39. Andrew Kachites McCallum, “MALLET: A Machine Learning for Language Toolkit (2.0.8),” 2002, http://mallet.cs.umass.edu

  40. For a detailed discussion of time in topic modeling algorithms, see Benjamin M. Schmidt, “Words Alone: Dismantling Topic Models in the Humanities,” Journal of Digital Humanities, 2013, http://journalofdigitalhumanities.org/2-1/words-alone-by-benjamin-m-schmidt/.

  41. Rhody, “Topic Modeling and Figurative Language”; Burton, “Blogs as Infrastructure for Scholarly Communication.”

  42. This approach is discussed by David Mimno as one strategy for generating more meaningful models. David Mimno, “Using Phrases in Mallet Topic Models,” 2015, http://www.mimno.org/articles/phrases/.

  43. This was not a foolproof system, as the phrase “thou_art” illustrates. Here the twentieth-century grammar expectations of the part-of-speech tagger clash with the seventeenth-century religious language of the denomination, creating a noun phrase out of a common pronoun-verb pair. Additionally, errors in word order or, at times, the length of a text caused phrases to be inconsistently recognized by the TextBlob library. However, on the whole, it is a useful strategy for identifying and tagging noun phrases at scale.

  44. cite

  45. Kazem Taghva, Thomas Nartker, and Julie Borsack, Information Access in the Presence of OCR Errors, ed. Kirk Lubbes (New York: ACM Press, 2004).

  46. Matt Burton makes a similar argument in his work translating Underwood’s “Pace of Change” article into an executable code document. Matt Burton, “Defactoring ‘Pace of Change’,” 2018, https://github.com/interedition/paceofchange/blob/c86e25eb1849d2b02677f82f0198cfd6f567ceb8/defactoring-pace-of-change.ipynb. Reproducible research is a problem outside of the digital humanities as well, as computational methods are increasingly relied upon in the sciences and social sciences. Movements such as Open Science, supported by the Center for Open Science, are working to encourage scientists to make the data and code available as part of standard scientific practice.

  47. Chang et al., Reading Tea Leaves.

  48. This includes newer varieties of composite models (combining a series of runs into a composite model) and user mediated models, where the user makes corrections on the model. Experimenting with these forms of topic modeling was beyond the scope of what I could accomplish within the dissertation. Mark Belford, Brian Mac Namee, and Derek Greene, “Stability of Topic Modeling via Matrix Factorization,” Expert Systems with Applications: An International Journal 91, no. C (2018): 159–69, https://dl.acm.org/citation.cfm?id=3170753; David Mimno et al., “Optimizing Semantic Coherence in Topic Models,” Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2011, 262–72, https://dl.acm.org/citation.cfm?id=2145462.

  49. These representations work by measuring the distance between topics, where each topic is understood as a vector of words. Principal component analysis provides a mathematical representation of the most distinguishing features of a dataset. For the topic model, measuring two principal components enables plotting the topics along x and y axes, where similar topics appear in close proximity within the grid. For an in-depth explanation of how pyLDAvis computes the relationships between the topics, see Carson Sievert and Kenneth E. Shirley, LDAvis: A Method for Visualizing and Interpreting Topics (Baltimore, Maryland: Association for Computational Linguistics, 2014).

  50. Levenshtein distance is a measure of the difference between two strings (blocks of text), computed as the minimum number of single-character edits (insertions, deletions, or substitutions) needed to transform one string into the other.

  51. Measuring topic stability is a relatively new area of research within topic modeling, and holds promise as a method for both acknowledging and working with the probabilistic shift in topic models over time. For an example, see Mika V. Mantyla, Maelick Claes, and Umar Farooq, “Measuring LDA Topic Stability from Clusters of Replicated Runs,” Proceedings of the 12th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, 2018, 1–5, https://doi.org/10.1145/3239235.3267435. Further explorations of the stability of the topic model would involve rerunning the model on the same corpus and running an analysis of the resulting topic clusters.

  52. The default topic modeler for the browser is the R wrapper for MALLET, but Goldstone also provides a script to transform the data from the command-line MALLET interface for use in the browser.

Creating a Digital Dissertation

The process for creating a narrative dissertation in history is, on the whole, well-established. A researcher chooses a topic of study and a theoretical framework (or two), identifies the relevant archives, spends years in those archives reading and analyzing the materials, and then uses that evidence to construct an account of a particular time, place, person, or event that had some influence on the development of culture, politics, and the like. Because the genre is well-defined, there is generally little attention paid to the process beyond the traditional acknowledgment in introductions and footnotes of the relevant theoretical frameworks and the consulted archival sources.

With the rise of digital technologies, many of the assumptions regarding the standard processes for archival research, analysis, and even publication are open to challenge and revision. Where the standard methodology for interpretation was reading and the application of theory and logic, computational algorithms enable new forms of analysis of data to ground interpretive claims.1 Additionally, the use of computers to process more traditional forms of historical evidence creates new challenges for how that research is evaluated and extended. Where scholarly arguments based on interpretation could be evaluated through a rereading of the historical objects and a consideration of the logic of the initial interpretation, evaluating work that relies on computation requires engagement with the source code. A reader cannot adequately assess an argument when the author notes only a source base and a general computational approach, since part of the logic of the analysis is embedded in the implementation and is visible only through the code.2

Additionally, the very process of using computational tools in the analysis of historical data and the crafting of an interpretive narrative expands what has traditionally been considered the work of an individual scholar into a work that is more obviously a composite of the intellectual work of multiple authors.3 Whereas the written dissertation weaves together evidence and theory with well-constructed prose in order to convince readers of a particular interpretation, digital work weaves together evidence, theory, software libraries, and display frameworks in creating the final result, where the software and display contribute intellectual work of their own in shaping the final product.

As a result of these concerns, the Department of History and Art History at George Mason University has included the requirement for a self-reflective process statement as part of digital dissertation projects submitted to the department. This statement, which follows, is required to give a “full accounting of the technical and analogue work that went into building the digital dissertation” as well as the “code and software employed to produce the final dissertation.”4 This process of documenting the technical structure of the project aids in its evaluation as a complex and multi-layered work of scholarship. Much of the information found here is also documented throughout the dissertation itself, as part of my argument is that the technical elements are as much a part of the intellectual work of the dissertation as the more traditional narrative prose.5 By providing an overview of the technical whole of the project in this space, readers can quickly orient themselves to the project.

The process statement provides a summary and accounting of the three primary layers of the dissertation: the data, the analysis, and the presentation interfaces. In it, I document the software used for developing each layer and provide the information necessary for running the different aspects of the project on a local machine. This provides a mechanism for future viewers to reconstruct the project should it cease to live online, to test elements as part of an evaluation of the work, or to extend parts of the project for other applications.

Data for Historical Analysis

As with any digital project, A Gospel of Health and Salvation is dependent on the availability of digital sources upon which to operate. Due to the SDA’s commitment to making their historical materials as widely available as possible, a large percentage of the denomination’s periodical literature has been digitized and is available through the church’s websites. At the time of writing, the periodical scans were hosted at the Online Archive of the SDA’s Office of Archives, Statistics, and Research. The files are also increasingly available through the Adventist Digital Library, a compilation archive for a range of historic SDA materials. A full listing of the included periodicals, with links to the original PDF files, is included in the online bibliography.

In Chapter Two of the dissertation I describe the processes by which I selected, evaluated, and cleaned the textual data from the digital files. That work is also documented in the Gather, Clean, and Preprocess Notebooks. I do not offer my own interface for accessing the digital files as part of this project, recommending instead that they be accessed and viewed through the infrastructure of the SDA. I can make the processed text that was used for the topic modeling phase available on request. However, that text should also be re-creatable using the original scans and the associated notebook files.

I used additional data sources to support the analysis of the periodicals. This included place and people names from the SDA’s Yearbooks, place names from USGS, and third-party spelling lists. The data that I compiled or generated are available for download through the site, while external data sources are documented in the online bibliography.

Computational Analysis

From automating the download of the denominational periodicals to visualizing the topic model, this project has relied on computational methods at each stage. The primary coding language for the project is Python, as documented in the included notebooks section of the dissertation. Additionally, I used AntConc for computing keyness values as part of my evaluation of the strength of the topic model, as I discuss in Chapter 3. Other than the work in AntConc and my initial pass at downloading the PDF documents and extracting the text, all the computational pieces of the dissertation were done using Python libraries rather than desktop or online software so that the work could be easily documented and reproduced.

The topic model was created using MALLET through the command-line interface, as documented in the Model notebook. Some preprocessing steps, such as connecting noun phrases and creating the stopword list, were completed using Gensim. I analyzed the model with Python, using the Pandas, Plotly, and pyLDAvis libraries for data analysis and visualization.
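The idea of connecting noun phrases before modeling can be illustrated with a minimal sketch. This is not the code from the project notebooks and does not use the Gensim or TextBlob APIs; the function name `join_phrases` and the phrase list are hypothetical, and the sketch assumes the phrases have already been identified by some other step. It only shows how a multi-word phrase can be rewritten as a single underscore-joined token so that the topic model treats it as one word.

```python
import re


def join_phrases(text, phrases):
    """Replace each known multi-word phrase in `text` with one underscore-joined token.

    `phrases` is a hypothetical, pre-computed list of noun phrases. Matching is
    case-insensitive, and the output token takes its casing from the phrase list.
    """
    # Handle longer phrases first so that a longer phrase is not broken up
    # by a shorter phrase that it contains.
    for phrase in sorted(phrases, key=len, reverse=True):
        words = phrase.split()
        # Match the words in order, separated by any run of whitespace.
        pattern = r"\b" + r"\s+".join(re.escape(word) for word in words) + r"\b"
        text = re.sub(pattern, "_".join(words), text, flags=re.IGNORECASE)
    return text


sample = "the second coming of christ is near"
print(join_phrases(sample, ["second coming", "second coming of christ"]))
```

Processing the longest phrases first is a design choice of this sketch: once a phrase has been joined with underscores, shorter phrases no longer match inside it, so each stretch of text is tagged at most once.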

A list of the major software libraries utilized in this project is available in the bibliography.

Digital Interfaces

Selecting and developing the interfaces for the dissertation proved to be a challenge. While a single integrated application is the goal for future iterations of the project, the constraints of the dissertation led me to take a more modular approach to the different components of the project. There are four main interfaces for interacting with the dissertation. The first is the main project website, located at dissertation.jeriwieringa.com. The second is the topic model browser, located at browser.dissertation.jeriwieringa.com. The third is Jupyter Notebooks, which serve the dual functions of documenting the code used throughout the project and acting as executable documents, meaning that the provided notebook files can be used to execute the code locally. The final interface for the project is GitHub, with repositories for the main sites, the code notebooks, and a Python library I created for frequently used functions.

I created the main dissertation website using Nikola, a Python static site generator. This site generator enabled me to include a variety of different formats as part of the single project, including notebook, HTML, Markdown, and reStructuredText files. The default styling for the site uses the Bootstrap framework, which I have adapted. The notebook pages are included within the body of the dissertation in a static format: they can be viewed as last run but not executed. Executing the notebook files requires a Jupyter server.

The main topic model browser is an application of Andrew Goldstone’s Topic Model Browser, which relies on D3.js. Using the output from MALLET and the scripts provided as part of the browser, I transformed the data so that it could be interacted with in this format. I manually created the labels for the topics, as I describe in Chapter 3.

Finally, I collected a number of the most frequently used functions into a Python library for ease and stability of use. This library is necessary for testing the code included in the project notebooks and offers examples for adoption and extension in other contexts. It can be downloaded from GitHub and installed locally using pip, as described in the library README file.

Archiving and Reconstructing A Gospel of Health and Salvation

Submitting a project such as this as a dissertation raised a myriad of questions regarding how to use existing processes and platforms for complex digital objects. The current system for archiving dissertations at George Mason relies on the use of a DSpace repository, which is optimized for the collection and preservation of singular, preferably PDF, files. Additionally, the formatting requirements for dissertations assume a textual final product, one that can be created with Microsoft Word. For the submission and archiving of this dissertation, I chose to pursue a hybrid strategy. This essay, along with the introduction, website overviews, and bibliography, makes up the “dissertation object” for this project: the primary object that is properly formatted, cataloged, and archived in the repository.

For archiving and preserving the digital aspects of the project, I pursued a two-tiered strategy. First, to preserve the appearance of the project at the point of submission, I captured the web interfaces using WebRecorder.io and submitted the sites to the Internet Archive. This includes dissertation.jeriwieringa.com and browser.dissertation.jeriwieringa.com. These interfaces are preserved within .warc (WebARChive) files and are viewable through a web archive player. Second, to capture the code and data of the project, I archived the code and data files for the different dissertation components and documented the required software and versions for running the code. These files can be downloaded and run on a local machine to test different aspects of the project or to modify them for other uses. This approach captures both the reading experience of the digital dissertation and its technical underpinnings. These collections are preserved in the George Mason University Archival Repository, MARS.

Reproducing the Research

The computational aspects of the dissertation are split into four components, two for computational processing, and two for presentation:

Because of this structure, there is some duplication of files between the repositories. I managed the moving of files using the DoIt Automation tool.

To recreate the computational work that underlies the dissertation, one will need three primary components:

To run the notebooks on a personal laptop, you will need Jupyter running locally and the supporting Python libraries installed. Those libraries are documented in the environment.yml file in the Notebooks directory. The notebook files can also be uploaded to a third-party Jupyter server, such as Microsoft Azure Notebooks or Google Colaboratory, for users who do not wish to set up a local Jupyter server.
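Before opening the notebooks, it can be useful to confirm that the supporting libraries are actually importable in the active environment. The following is a minimal sketch of such a check using only the Python standard library. The helper name `missing_modules` and the `REQUIRED` list are hypothetical; the list simply repeats library names mentioned in this statement, and the authoritative list of dependencies and versions remains the environment.yml file.

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if find_spec(name) is None]


# Hypothetical list for illustration; see environment.yml for the real dependencies.
REQUIRED = ["pandas", "plotly", "gensim", "pyLDAvis"]

missing = missing_modules(REQUIRED)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All listed libraries are importable.")
```

This check only reports whether a module can be located; it does not verify that the installed version matches the one recorded for the project.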

Together these components make it possible for the intellectual work of preparing and analyzing the text to be examined and duplicated.

Rebuilding the Websites

To recreate the website portions of the dissertation, one will need the files for both the main website and the model browser. The main website uses Nikola to create HTML pages from a collection of Markdown, reStructuredText, and notebook files. To run locally, use the nikola serve command from the root of the site directory.

The model browser is a single-page JavaScript application, built using D3.js. To run locally, run /bin/server from the root of the browser directory to launch a basic Python 3 webserver.
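For readers unfamiliar with what a basic Python 3 webserver involves, the sketch below serves a directory of static files using only the standard library. It is an illustration under assumptions, not the contents of the browser's server script: the function name `make_server`, the default port, and the choice of `http.server` are all mine, and the script distributed with the browser may work differently.

```python
import functools
import http.server


def make_server(directory, port=8000):
    """Create an HTTP server that serves static files from `directory` on localhost.

    Passing port=0 asks the operating system to pick any free port.
    """
    # Bind the request handler to the chosen directory rather than the
    # current working directory.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Calling `make_server("path/to/browser").serve_forever()` would then make the files available at http://127.0.0.1:8000 until the process is stopped, where the path shown is a placeholder for wherever the browser files live locally. Binding to 127.0.0.1 keeps the server reachable only from the local machine.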

Both sites require Python 3. I recommend using a package and environment management system for running these elements locally. I used Miniconda for the dissertation.

Technical Support

When I started working on the main part of the dissertation, my programming experience to date had been the two required digital history courses at George Mason University, an additional introduction to programming course, some hands-on experience in web development through my research assistantship at the Roy Rosenzweig Center for History and New Media, and a Rails Girls workshop. In retrospect, embarking on a technical project from that starting point was a bit over-ambitious. My initial design of the project included network analysis from the denominational Yearbooks and geospatial analysis of people, publications, and ideas, along with text analysis. While I still think that these additional modes of analysis would help illuminate the development of this particular group of people, these ideas have been bracketed for future iterations of the project.

I have been overly committed in this project to doing my own computational work, both because I believe that one needs to grapple with the assumptions and implementation of computational and historical analysis when bringing the two modes of inquiry together and because of the gender politics of the field. I am, however, deeply indebted to many people who have given their time and energy to help me understand and troubleshoot the technical aspects of this project. Chief among these are Fred Gibbs; the Experimental Humanities Group at the Iliff School of Theology; Lincoln Mullen, who consulted on the network analysis piece of the project that was unfortunately tabled due to time constraints; Amanda Regan; Taylor Arnold; and Lauren Tilton. The computational work in this project is primarily my own, aside from a myriad of snippets gleaned from Stack Overflow and the libraries and resources noted above. The one major external contribution was the workflow for moving model files from DigitalOcean, where I ran MALLET due to the size of the corpus, to Amazon S3 for storage and to my local machine for use; that workflow was set up by Jason Wieringa.


  1. Tanya A. Clement, “Where Is Methodology in Digital Humanities?” in Debates in the Digital Humanities (University of Minnesota Press, 2016), http://dhdebates.gc.cuny.edu/debates/text/65 argues that for digital humanities to be situated “within a humanist epistemological framework” it “must also entail an explicit articulation of … how our techniques are tied to theory.” She notes that “the hermeneutical methods associated with reading,” the default methodology in humanities studies, “remain largely unarticulated,” which further complicates the work of introducing new methods.

  2. This problem of source code is one of the core elements of Da’s critique of recent work in computational literary analysis. Nan Z. Da, “The Computational Case Against Computational Literary Studies,” Critical Inquiry 45, no. 3 (2019): 601–39, https://www.journals.uchicago.edu/doi/10.1086/702594. These concerns are not unique to history or the humanities. As computation and data science gain ground in the sciences as a mechanism for knowledge production, similar questions around reproducibility and code are increasingly of central concern.

  3. Whether that myth of the individual author has ever been true is another question, as all intellectual work relies on a robust intellectual community.

  4. Department of History and Art History, “Digital Dissertation Guidelines,” 2019, https://historyarthistory.gmu.edu/graduate/phd-history/digital-dissertation-guidelines.

  5. Additionally, traditional dissertations may also benefit from such statements, particularly as software makes complex computational processes easier for non-technical users and such work is incorporated into traditional narrative prose.

  6. A list of the titles I used for the dissertation is included in the bibliography of the project.