International Journal of Internet Science
A peer reviewed open access journal for empirical findings,
methodology, and theory of social and behavioral science concerning the
Internet and its implications for individuals, social groups,
organizations, and society.
Digitizing a Large Corpus of Handwritten Documents Using Crowdsourcing and Cultural Consensus Theory
Prutha S. Deshpande, Sean Tauber, Stephanie M. Chang, Sergio Gago, & Kimberly A. Jameson
University of California, Irvine, USA
Abstract: We investigated using internet-based procedures to convert information from a large handwritten archive of ethnographic survey data into a computer-addressable database. Rather than manually transcribing the archive's estimated 23,000 pages of handwritten data, we sought to develop novel crowdsourcing task designs, and to use an innovative variation of Cultural Consensus Analysis (CCT) to objectively aggregate crowdsourced responses based on a formal process model of shared knowledge. Experiment 1 used simulated internet-based tasks conducted with human subject pool participants in a university laboratory. Experiment 2 used a similar design with the exception that it was implemented on an internet-based research platform (i.e., Amazon Mechanical Turk). Results from these investigations shed light on several uncertainties concerning the utility of CCT analyses with crowdsourced transcription data. For example, they clarify (1) whether crowdsourced tasks are practical as a method for automating the transcription of the archive's handwritten material, (2) whether responses from perceptually based tasks inherent to transcribing handwritten documents can be analyzed using CCT, and (3) whether, if CCT is appropriate as a model of the transcription challenge, the results produce accurate answer-key estimates that could serve as correct transcriptions of the archive's data. Our results address these issues and convey how CCT modeling can be modified and made appropriate for aggregating such data. Implications of these analyses and uses of CCT in large-scale crowdsourced data collection platforms are discussed.
Keywords: Crowdsourcing, cultural consensus theory, shared knowledge, handwriting transcription, individual differences
The article is published under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
Editors
Ulf-Dietrich Reips
(University of Konstanz, Germany)
Uwe Matzat
(Eindhoven University of Technology, NL)
Editorial Board
Michael Birnbaum (California State University at Fullerton, USA)
Tom Buchanan (Westminster University, UK)
Don Dillman (Washington State University, USA)
Anja Göritz (University of Freiburg, Germany)
Adam Joinson (Open University, UK)
John Krantz (Hanover College, USA)
Han Woo Park (Yeungnam University, South Korea)
Chris Snijders (Eindhoven University of Technology, NL)
Barry Wellman (University of Toronto, Canada)
Scope
The International Journal of Internet Science is an
interdisciplinary, peer reviewed journal for the publication of research
articles about empirical findings, methodology, and theory in the
field of Internet Science. It provides an outlet for articles on the
Internet as a medium of research and its implications for individuals,
social groups, organizations, and society.