The five pitfalls of document labeling - and how to avoid them

Whether you call it ‘content analysis’, ‘textual data labeling’, ‘hand-coding’, or ‘tagging’, more and more researchers and data science teams are starting annotation projects. Many want human judgments recorded as labels on text so they can train AI with supervised machine learning. Others have tried automated text analysis and found it wanting, and are now looking for ways to label text that are easier to interpret and explain.

Roundup: #text2data - new ways of reading

‘From text to data - new ways of reading’ was a two-day event organised by the National Library of Sweden, the National Archives and Swe-Clarin. The conference brought together librarians, digital collection curators, and scholars in digital humanities and computational social science to discuss the tools and challenges involved in large-scale text collection and analysis.