A Reflection on Crowdsourcing

Crowdsourcing allows members of the public to contribute to existing digitized projects. Each project has its own type of collection, so it attracts different contributors. For my crowdsourcing comparison activity, I reviewed and compared four projects. I also took on the contributor’s role in two of them.

For the Transcribe Bentham project, its “volunteers are proof that a partnership between the general public and academia works!” (UCL). Contributors have two tasks. First, they can transcribe the manuscripts “to preserve a collection of enormous international historical and philosophical importance” (UCL). Second, they can encode transcripts because “by encoding your transcripts, you are helping to create a richer resource: researchers and students interested in Bentham’s writing process, his deletions and revisions, will be afforded the opportunity to pursue this, owing to your work” (UCL). The goals of these tasks are to broaden access to the manuscripts, to preserve the collection, and to generate scholarship for future research.

For the second project, The Building Inspector (TBI), the contributors are people who have knowledge of building history. For example, contributors can provide information pertaining to “buildings long ago destroyed, streets renamed, whole neighborhoods redrawn or redefined” (The Building Inspector). As the site explains, “Making these lost places findable via contemporary digital maps allows [The Building Inspector] to drill down through the layers of urban change and study the city in profound new ways.” There are four tasks that contributors undertake: check footprints, fix footprints, enter addresses, and check colors. To check footprints, the contributor decides whether each building footprint shown on the screen is “right, wrong, or close but in need of fixing” (The Building Inspector). To fix footprints, the contributor corrects the imperfect footprints so that the historical record is accurate. To enter addresses, the contributor helps provide “original street numbers” in order for TBI “to reference specific buildings in their historical context (and, eventually, to see who lived/worked there).” To check colors, the contributor identifies the color-coded buildings to distinguish residential from commercial ones. The contributors’ tasks serve the following goals of The Building Inspector: “It will allow our interfaces to drop pins accurately on digital maps when you search for a forgotten place. It will allow you to explore a city’s past on foot with your mobile device, ‘checking in’ to ghostly establishments. And it will allow us to link other historical documents to those places: archival records, old newspapers, business directories, photographs, restaurant menus, theater playbills etc., opening up new ways to research, learn, and discover the past” (The Building Inspector).

For the third project, “Trove brings together content from libraries, museums, archives, repositories and other research and collecting organisations big and small” (Trove). Tim Sherratt explains, “So there’s a number of ways that people can participate and contribute to Trove, the most obvious one being text-corrections, so digitized newspapers, the OCR.” Sherratt describes Trove this way: “In some ways I like to call it a collection of collections because what Trove does is it brings together collections from archives, museums, libraries, down to sort of little historical societies in little country towns.” As a content partner, a library or organization contributes records to the Trove collection. The “For Content Partners” section gives more information on how to contribute and suggests several methods. Trove describes itself as “a collaboration between the National Library, Australia’s State and Territory libraries and hundreds of cultural and research institutions around Australia, working together to create a legacy of Australia’s knowledge for now and into the future” (Trove). Also, “It’s that aggregation of the data and what that makes possible in terms of other people building new tools, new interfaces, and creating new forms of analysis to work across that material” (Sherratt). It is also a collaboration between the general public and libraries and organizations.

The Papers of the War Department (PWD) requests help from the community because “PWD’s work with community transcription is part of a larger project to make crowdsourcing possible for archivists and documentary editors with digital collections” (PWD). On the “Become a Transcription Associate” page, PWD suggests that teachers, researchers, and doctoral candidates participate in the transcription. Here are some of its examples of who should consider contributing: “a doctoral candidate working on early federal economic issues might contribute transcriptions of account records and correspondence with pay masters that she made for her dissertation work; an instructor teaching the U.S. history survey course might work with students to transcribe a set of translated Indian speeches and treaties so that they can investigate the relationship between the government and Native Americans just after the revolution; a genealogist researching a distant relative who served in the Revolutionary War might transcribe correspondence between that soldier’s widow and the War Department about his pension” (PWD).

All four projects feature different interfaces.

1. In Transcribe Bentham, “For each manuscript to be transcribed, the Transcription Desk shows a digital image of the manuscript and an online text editor to enter and edit your transcription of the text” (UCL). The contributor transcribes the manuscript online while viewing its image and can use the “edit” tool to transcribe parts of it. The text editor sits to the left of the manuscript image. The transcript can also be encoded in the same place by using the editing tool.

2. The Building Inspector uses scanned maps with open source software, Map Warper. For each task, there is a map with buttons that the contributor uses to complete the task. For example, to check colors, the contributor selects a color button to indicate which type of building is shown on the map. To check footprints, the contributor selects one of three buttons for the selected building: fix, no, or yes. Each task is interactive and user friendly. Map Warper allows the contributor to zoom in and take a closer look at the map.

3. Trove uses an OCR platform, and contributors correct the resulting text online. However, according to Sherratt, “And of course as you would expect running OCR on these historical newspapers, the results aren’t always that great.”

4. The Papers of the War Department project uses Scripto. When a document is selected for transcription, the page shows three or four tabs in plain text, with empty white space under each tab. If the transcription is completed or partially done, it appears under “View Transcription.” After selecting “Transcribe this Document,” the contributor selects “Transcribe” to transcribe the scanned document. The editing tool appears at the bottom of the transcription box or window. After the transcription is saved, the contributor views the results by selecting the “View Transcription” tab. There are also “View Discussion” and “Discuss this Page” tabs for viewing the exchange about the transcription and editing.

According to UCL’s Transcribe Bentham, “All volunteers who transcribe manuscripts will be acknowledged in the relevant volumes of The Collected Works.” Also, the website features a “Hall of Fame” page that displays “a list of all volunteer transcribers who have contributed to Transcribe Bentham since the initiative launched in September 2010” (UCL). For The Building Inspector, contributions are validated by consensus: a group of people look at the same map and perform the same task. According to The Building Inspector website, “Every time you inspect a building, you’re essentially casting a vote alongside your fellow Inspectors. We show the same footprint and task to several people and tally up those votes to decide whether they agree.” Also, “If the jury’s still out, we keep the footprint in circulation until consensus is reached, focusing our collective efforts on the buildings most in need” (TBI). For Trove, contributions are acknowledged on a web page, http://trove.nla.gov.au/system/counts, which displays an updated list of contributors from libraries and organizations. There is also the Text Correction Hall of Fame for contributors who corrected the text of the scanned newspapers and magazines. For the Papers of the War Department, contributors are acknowledged as members of the site’s community. Unfortunately, I could not find a list of contributors.

To better understand what it means to contribute to a crowdsourced project, I selected two projects: the first for transcription and the second for correction. By contributing to both, I was able to see the significance of crowdsourcing for large online projects, and I encountered some of its challenges.

For the transcription project, I selected the Papers of the War Department. The interface was adequate. I was able to view the image and change its size. There is also a “link to full size image,” which helped me view the image in a larger format. Unfortunately, I had difficulty deciphering the handwriting in most of the scanned documents I selected. I was able to transcribe numbers and a few words, but I spent most of my time reading the handwriting and trying to decipher the letters and words. Here is an example of my transcription for “The Payment of Invalids”: http://wardepartmentpapers.org/scripto/index.php?documentId=4185&pageId=23257.

After I completed my transcription, I viewed my contributions by clicking on “View Transcription.” Also, on my login page, I can see which documents were transcribed and what was transcribed. I do not think I would continue contributing to this project over time because I had so much difficulty deciphering and transcribing the handwriting. However, contributing to this project increased my interest in the importance of transcribing and preserving the PWD collection online.

This experience does fit the goals of the Papers of the War Department project designers. As mentioned on their website, their goal “is to use the best technology of the early twenty-first century to recover and make widely available this vital record of American history that was seemingly lost at the dawn of the nineteenth century” (PWD). Because the project allows the public to access and contribute to the collection, my experience with transcribing fits in with that goal. However, it takes people with certain skills (e.g. handwriting experts) to decipher the handwriting of valuable documents.

For the correction project, I selected Trove. Trove’s text editing tool was easy to use. I was able to make corrections on the left side of the screen while viewing the scanned image of the newspaper page on the right side of the screen. I could zoom in and out and move the image. After I made corrections to selected obituaries, marriages, and crimes articles, I was able to see a saved list of my corrections. Here is an example from one of my corrections: http://trove.nla.gov.au/newspaper/article/143335662. The next examples are my corrections for Stoker Henry Davis’s and Mr. John E. Hammond’s obituaries: http://trove.nla.gov.au/newspaper/article/58244432.

Trove has a “Text Correction Hall of Fame” list that shows each username and how many corrections that user has made. My username is listed under the number 33. In this project database, there are many articles waiting to be corrected. Correcting some of the newspaper pages increased my interest in the project because the work was engaging and I was acknowledged for it. I might continue making corrections after this course is completed. I saw the importance of allowing the interested public to make corrections. My correction experience fits in with the goals of the project’s designers. As mentioned in the “Text correction guidelines” on the Trove website, “the primary purpose of text correction is to improve the accuracy of search results in Trove’s newspapers and gazette zones. Text correction also allows for more accurate transcription downloads of the article text.”

Overall, I enjoyed the correction activity more than the transcribing activity because the transcription project I selected, the Papers of the War Department, was difficult for me to transcribe. Both projects allowed me to view the scanned contents; however, I had difficulty seeing the contents from PWD. The viewing and editing tool for the Trove project was easier to use, and it had more features. The PWD tools seemed as dated as the scanned material. Also, the PWD site had two features, View Transcription and Transcribe, that seemed a little confusing because a user may accidentally use either one to transcribe the material. For both projects, I was able to transcribe and correct the scanned documents independently. My individual contributions allowed me to appreciate each project more because I felt as though I had contributed to a larger effort. Both projects have valuable documents that need to be either transcribed or corrected by contributors in order to keep the collection accessible and readable to the public.

