David Golding is a PhD student in the History of Christianity at Claremont Graduate University and a co-editor (with Loyd Ericson) of the Claremont Journal of Mormon Studies, and he assisted in creating the new MHA website. He also wrote the leading book on the web programming framework CakePHP. He has been kind enough to share a little bit about an exciting new primary source project.
We all have become hybrids in this day and age, haven’t we? In another life—and it still manages to remain with me no matter what I might do to shake it off—I worked in software development and desktop publishing. I can’t help but return to systems theory and technology as I build my own research agenda as a historian. For years now, I’ve anticipated historians taking advantage of what software engineers work with every day: open source data and logic. And yet nothing quite like open source technology has taken root in the archival and historical professions. It’s time for us to consider the benefits of pushing our research into a collective and open system, a system already possible (and free of charge) thanks to advances in social media, software versioning, and cloud computing.
And Cappuccino’s codebase was actually written by 83 contributors in well over 6,000 commits of code. Such is the magic of open source technology. Cappuccino is one of millions of projects hosted as a repository to which contributors commit changes to the code. Because these repositories are built on software versioning systems, nothing that was previously committed is destroyed and every new commit is preserved. This is “Save As…” on steroids. Not only is everything saved, but everyone has open access to every contribution made. This prevents contributors from writing over the top of each other’s work while also maintaining a single source. Over time, the collaborative efforts of dozens of people using their spare time to provide a small contribution to a single codebase accomplish what would otherwise require a company-wide effort. We all benefit. Open source software is freely available; by definition, it’s open. That doesn’t at all mean that open software is less valuable. On the contrary, sometimes the open environment takes the possibilities to a higher level than closed environments. Motorola just bought the rights to Cappuccino for $20 million, and yet any one of us can download, use, and even contribute to the Cappuccino codebase.
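For readers who want to see the “nothing is destroyed” idea in concrete terms, here is a minimal sketch in Python. It is not how Git or GitHub is actually implemented; the `Commit` and `History` names and the placeholder entries are assumptions made up for illustration. It only shows that each commit is added to the end of a record, so earlier versions remain readable.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Commit:
    """One saved change: who made it, why, and the full text after the change."""
    author: str
    message: str
    content: str


@dataclass
class History:
    """A hypothetical append-only record of every version of one file."""
    commits: list = field(default_factory=list)

    def commit(self, author, message, content):
        # A new commit is added to the end; earlier commits are never altered.
        self.commits.append(Commit(author, message, content))

    def latest(self):
        # The current state is simply the most recent commit.
        return self.commits[-1].content

    def at(self, index):
        # Any earlier version can still be read back by its position.
        return self.commits[index].content


# Placeholder contributors and entries, invented for this illustration.
history = History()
history.commit("alice", "Add first entry", "Entry A")
history.commit("bob", "Add second entry", "Entry A\nEntry B")
history.commit("carol", "Replace first entry", "Entry C\nEntry B")

print(history.latest())  # the newest version
print(history.at(0))     # the very first version, still intact
```

Even after the third contributor replaces the first entry, the original version can still be retrieved, which is the sense in which versioning is “Save As…” on steroids.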
So what would an open-source historical monograph look like? I’m not sure. But I am sure that our profession can benefit from open source technology, especially when it is freely available and universally accessible (provided you have an Internet connection and a computer that is less than 10 years old). We have all seen how social media have brought a tremendous degree of interconnectivity and collaboration to the profession. It’s time to utilize these media and technologies to further our research goals as a field.
Jonathan Stapley has pointed out recently how an extraordinary resource, the Studies in Mormon History database, is under review as a viable project. He mentioned how the task of maintaining this database exceeds the capacity of any one individual due to the increasing flood of information. A solution to this difficulty in databasing Mormon history is to open the database to the collective energy of the users themselves. Wikipedia has done something similar with exceptional results. But giving just anyone access to this kind of data store risks compromising the reliability of the data. How could we ensure that the content meets our standards when anyone could add, edit, or delete records in the database? Open source technologies have an answer in how versioning systems and repositories work.
Distributed repositories function differently from what most of us are used to. When we use web databases, we usually interact with records or rows in a table. In an open source environment, anyone can “fork” the repository, which creates a personal copy of it, and then “clone” that copy so that a local version of the database is saved on the user’s own computer. Each user has an exact copy of the repository, history included. Users can then make any additions or edits to the local files to suit their own purposes, and they “commit” those changes to their own copy. When they wish to contribute their changes to the shared repository, they submit their commits for review (on GitHub, this is called a “pull request”). Repository administrators have the ability to “merge” these commits into the master branch. Once changes are merged into the master branch, they become available to all users, who can then “merge” the update into their own local copies. This process of downloading and contributing information creates a mirror between the distributed local repositories of all the users and the remote, centralized master repository. The effect is compounded the more users participate: each contribution is collated into a single master branch, and that branch grows at the pace of the whole community of contributors. None of the information is destroyed during this process, ensuring that no single user can adversely affect the quality of the data. Repository administrators function as gatekeepers of the data without having to individually maintain all of the records in the database.
The open source model accomplishes a wiki level of open collaboration while also maintaining a closed system protected from malicious or low-quality manipulation of the data.
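The fork, commit, and merge cycle described above can be sketched as a toy model in Python. This is only an illustration under stated assumptions, not Git itself: the `Repository` class, the `merge` function, the `approve` rule, and the placeholder citations are all hypothetical. It shows how contributors work freely on their own copies while an administrator decides what reaches the master copy.

```python
class Repository:
    """A toy repository: a list of citations plus a log of who added each one."""

    def __init__(self, records=None, log=None):
        self.records = list(records or [])
        self.log = list(log or [])
        # Position in the log where this copy's own commits begin.
        self.base = len(self.log)

    def fork(self):
        # Each contributor receives a complete, independent copy.
        return Repository(self.records, self.log)

    def commit(self, author, citation):
        # Commits change only the copy they are made on.
        self.records.append(citation)
        self.log.append((author, citation))

    def local_commits(self):
        # Everything committed on this copy since it was forked.
        return self.log[self.base:]


def merge(master, contribution, approve):
    """Add a contributor's commits to the master, keeping only approved ones."""
    accepted = []
    for author, citation in contribution.local_commits():
        if approve(citation):
            master.commit(author, citation)
            accepted.append(citation)
    return accepted


def approve(citation):
    # Hypothetical review rule: reject empty entries.
    return bool(citation.strip())


# Placeholder citations, invented for this illustration.
master = Repository()
master.commit("admin", "Placeholder citation A")

alice = master.fork()
bob = master.fork()
alice.commit("alice", "Placeholder citation B")
bob.commit("bob", "   ")  # a low-quality contribution

merge(master, alice, approve)
merge(master, bob, approve)

print(master.records)  # the master holds only reviewed citations
print(bob.records)     # bob's own copy is untouched by the review
```

In this sketch, the rejected entry never reaches the master copy, yet nothing is taken away from the contributor’s own copy. That is the gatekeeping role the administrators play in the real workflow, where a person rather than a one-line rule does the reviewing.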
After years of building my own personal research database, it dawned on me that I was continually reinventing the wheel. I would labor to assemble a comprehensive list of sources relating to a particular topic or event and then discover a monograph or a bibliography that had done much of the same work. Archives and bibliographies are amazing resources, but they remain unevenly distributed across libraries, publications, and institutions. I believe open source technology and software versioning repositories can overcome these systemic hurdles we face as researchers. And so I have begun a project to “open source” my own personal database. The database is hosted as a code repository on GitHub, the same leading open source website used by Cappuccino and several major products you may have already used, like Etsy, World of Warcraft, the New York Times, and several mobile apps for Twitter and Facebook.
This Mormon Primary Sources project on GitHub aims to consolidate citations of all primary sources relating to Mormonism. It accomplishes this by remaining an open system, meaning that it can evolve and adjust (what software engineers call scaling) as the source materials themselves change over time.
The project is in its infancy, but the database structure and the overall system are in place and ready to grow. The repository is ready for browsing, cloning, contributing, and merging. Anyone can browse the database from a web browser, and anyone can contribute without worrying that the database will suffer. In a real sense, the repository behaves like a sandbox environment: no one except the repository’s administrators can destroy the data. Feel at liberty to experiment and tinker with your local copies of the database or with the Git versioning software.*
For those interested in joining this collaborative effort, or even those curious about this repository as a research reference, I invite you to visit the repository itself. Further details will be posted there as the project progresses. I hope to see you there. Of course, reinventing the wheel is always an available option, too.
* For Mac users, GitHub simplifies this process with its own specialized app. After installing GitHub for Mac, users only need to click the “Clone in Mac” button on the Mormon Primary Sources repository page and a local copy of the database will be saved to their computer. They can immediately begin editing the files of this database, and GitHub for Mac tracks all of their changes. The “synchronize” button in the application makes sure the remote and the local versions of the repository mirror each other. PC users can accomplish the same tasks, but the process runs a little differently.