Adventures in crowdsourcing

When I first began my Library and Information Science studies at Pitt in 2000, the idea that people could one day assist in the digitization of archival projects without leaving their homes would have astounded me. And yet, here we are today, over a decade into the field of crowdsourcing transcription, which according to Brumfield1 has become a “standard part of library infrastructure.” Given the transcription possibilities now available through crowdsourcing and the closure of institutions because of COVID-19, one could argue that the opportunities are endless. Nonetheless, Brumfield, the developer of the cloud-based FromThePage tool that many archives, libraries, universities, and organizations have invested in for their digital humanities projects, also points out the importance of moral discussions among practitioners about “appropriate access” to violent or sensitive content and about “immersing volunteers in archives” of such material.

Even though it may be feasible to crowdsource the transcription of an archival collection in this day and age, my gut reaction tells me that it may not be ethical, depending on the collection’s content. For instance, I feel that a collection that is currently restricted to researchers is a prime example of what not to digitize. If a collection is deemed off-limits because of confidential material or personnel files, it probably shouldn’t be transcribed through crowdsourcing either; however, there are certainly exceptions to that rule, especially if there are items in the collection that need to be exposed or situations that call for bearing witness.

For example, a potential DH project that both unsettles and intrigues me is the mapping of clergy abuse reports within a diocese, archdiocese, state, and nation. Although I am not a survivor of abuse, I think that documenting dates and locations might be helpful to adults who have endured this type of abuse or who, like me, are compassionate and concerned citizens who want to track abuse statistics. Some victims may not be willing to come forward and share their experiences, but for those who are ready, doing so may be a cathartic form of healing. It would also benefit the criminal justice system and the Catholic Church to clearly visualize where and when clergy abuse has taken place so that they can take preventative actions and not just move a priest or religious sister to a different school or parish.

In pondering the logistics of this particular project, one question I have is whether the names of the abusers and their victims should be publicized on the maps. As the crowdsourced data would inevitably include sensitive information about abusers and survivors, I would argue that these identifying details should only be available to the project team and any investigators of the cases. A secondary question concerns the age of the survivors and how best to support them if they are under 18 and are telling their story for the first time. Since survivors may be uncomfortable disclosing information about themselves out of shame or fear of retaliation, safety protocols need to be in place during interviews, surveys, and other information-gathering measures to protect the person who has been traumatized by clergy abuse.

Interestingly enough, a similar project that maps pedophile networks from the oral histories of survivors is underway in Australia. I’m doubtful that such a project would be open to anyone on the web to transcribe, but graduate students in DH, psychology, sociology, and criminology, as well as professionals in these fields, would be ideal candidates to transcribe the material and analyze abuse clusters. I can also see the possible therapeutic value of inviting abuse survivors to help transcribe abuse reports for the project.

While the above projects are considered controversial and may not be suited for public digitization and transcription, a less heated project that I would love to be involved with, and which is actively looking for virtual volunteers, is TranscribeNC. Offered as an online crowdsourcing project through the State Archives of North Carolina, TranscribeNC gives students and history buffs the opportunity to use FromThePage and create transcripts of African American education documents, colonial court records, Constitutional materials, military letters, travel perspectives, and women’s history correspondence. As this DH project is closer to home and one that I plan to promote to the English and Social Studies teachers at my school, it was with joy that I read the American Association for State and Local History (AASLH) article2, which celebrates the Keene, New Hampshire program that recruited community volunteers and trained them to transcribe handwritten documents. In envisioning how I can help high school students connect with NC historical items, I’m excited about getting involved in this crowdsourced transcription project.

TranscribeNC button on NC Archives page for Researchers

Having served as a newbie transcriber in 2018 with Viola Mahoney’s interview, I, for one, appreciated the Tips on Transcribing as I made my way through the 44-minute oral history, pausing and rewinding the sound file to ensure that I was typing word-for-word in Scripto what I heard Mahoney saying. It’s also encouraging to learn that TranscribeNC volunteers are referred to transcription resources such as videos and guides to aid them in the deciphering of handwritten documents and the formatting of transcripts. Obviously, in preparation for a future crowdsourcing transcription project with teenagers, it’s essential to review the linked tips on the TranscribeNC page with them and to provide the necessary support so that they are successful in individual or group transcription practice.

Likewise, as I grow more assured of my own transcription abilities in the crowdsourced decoding of Civil War & Reconstruction Governors of Mississippi letters for Dr. Walters’ class, I can share that firsthand knowledge with my students and be an additional resource beyond the online tutorial and supplementary materials that I have at my fingertips. And hopefully, this joint escapade will only bring us closer as a learning community.


1 “The Decade in Crowdsourcing Transcription – FromThePage Blog.” 9 Jan. 2020, https://content.fromthepage.com/decade-in-crowdsourcing/. Accessed 8 Feb. 2021.

2 “Crowdsourcing Transcription and Creating Community – AASLH.” 9 Jan. 2020, https://aaslh.org/crowdsourcing-transcription-creating-community/. Accessed 8 Feb. 2021.
