Note: This post originally appeared on the QSR International Blog.
By Silvana di Gregorio, QSR International Director of Research.
Calling all social scientists. How were you trained? How are you keeping up (or not) with new developments in this rapidly changing digital world? How are you training your students?
This was the subject of an event sponsored by SAGE Ocean as part of the ESRC’s 2018 Festival of Social Science. In case you are not aware, SAGE, who have been at the forefront of publishing qualitative work, have now launched SAGE Ocean – an initiative “to help social scientists to navigate vast datasets and work with new technologies”.
The main thrust of the discussion was whether social scientists need to be trained to code and develop software to support their analysis. Simon Hettrick of The Software Sustainability Institute reported that a survey of 15 Russell Group universities in the UK found that 92% of academics use research software and 56% had developed their own software, of whom (quite worryingly) 21% had no training in writing code. The Software Sustainability Institute run Carpentry workshops to train researchers in the essential computing and data skills needed to do computational or data-intensive research. The University of Manchester have also started to run a Data Carpentry workshop for social scientists with R. (Note: it is possible to analyse textual data with R.) However, Hettrick also advocated that social science research teams include Research Software Engineers (called Cyberpractitioners in the USA) in their projects. Research Software Engineers combine expertise in programming with an intrinsic understanding of research. This is a new professional group, and Hettrick’s argument is that they are needed when research groups are trying to analyse large datasets.
Ken Benoit, Head of the Department of Methodology at the London School of Economics, argued that methods training for social scientists needs to include computational methods, such as machine learning, programming skills, data structures and databases. He also argued that, in order to incorporate these skills, doctoral programmes need to be restructured: four years (which is the norm in the UK) is not enough time. But he said that prerequisites can count, which means some of this training could start during secondary education.
Andreea Moldovan, Research Fellow at the Centre for Research in Ageing and Cognitive Health, University of Exeter, gave the example of the NHS’ digital hub, which collects, processes and publishes data across the health and social care system in England. These datasets provide a great opportunity for social scientists to study a whole range of topics that could not be studied before. But social scientists have not been trained to analyse this kind of big data. However, as a sociologist, she pointed out that this data is about people, who are messy, so she is not in favour of turning social science into another STEM discipline.
So where does this leave qualitative approaches?
Tommaso Venturini, Founder & Co-ordinator of the Public Data Lab and Researcher at the Institut des Systèmes Complexes Rhône-Alpes and the École Normale Supérieure of Lyon, was the only speaker who addressed qualitative big data. He argued that while quantitative analysis reveals trends and gives an overview, qualitative analysis zooms into the details of interactions. He was dismissive of current mixed methods approaches, claiming that they are not continuous. He illustrated his quali-quant approach by showing the platform he developed to explore amendments to laws in the French parliament. He analysed the 5000 laws there have been since 2008, following each one from the time it was first discussed, through all the amendments added, until it was passed. He uses visualizations to show how long each law was discussed, along with a range of statistics. But he also showed how easy it was to visualize how each law changed and then drill down further to see how the wording had changed. It is also possible to drill down to specific amendments at a particular time. His platform illustrates how, with visualizations, quantitative and qualitative data are continuous.
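To make the final drill-down step concrete, here is a minimal sketch of comparing two versions of a clause word by word. It is not Venturini's platform or method: it uses Python's standard difflib module, the function name wording_changes is hypothetical, and the two example sentences are invented for illustration.

```python
import difflib


def wording_changes(before, after):
    """List the word-level edits that turn `before` into `after`."""
    old_words, new_words = before.split(), after.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # keep only replaced, deleted or inserted words
            changes.append(
                (tag, " ".join(old_words[i1:i2]), " ".join(new_words[j1:j2]))
            )
    return changes


# Invented example: one clause before and after an amendment.
draft = "The minister may grant an exemption within thirty days"
amended = "The minister shall grant an exemption within sixty days of the request"

changes = wording_changes(draft, amended)
for tag, old, new in changes:
    print(f"{tag}: '{old}' -> '{new}'")
```

Counting the changes per version gives the kind of statistic an overview could plot, while the individual tuples are the detail a reader drills down into.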
My view is that social scientists should have an understanding of computational methods but, rather than coding and developing their own programs, they would be more creative and productive collaborating with computer scientists. Likewise, computer scientists should have an understanding of social science methods. I have recently come across a few projects where social scientists have been working with computer scientists. Matthew Hanchard at the University of Glasgow is a sociologist who is on the team of the ‘Beyond the Multiplex’ project – a collaboration between social science, humanities and computer science. He used NVivo to inductively analyse a large dataset of about 7,000 unstructured textual items in order to develop a framework for a computational ontology. For more detail of his work, see his QSR blog post.
Dong Nguyen is a Research Fellow at the Turing Institute and a computer scientist who is interested in developing text mining methods that can help answer questions from the social sciences and the humanities. She works collaboratively with social science and humanities researchers working with massive digital datasets. One example is the work Nguyen did in collaboration with humanities researchers on the Dutch Folktale Database. This is a collection of nearly 40,000 folktales which needed relevant metadata to be assigned. Nguyen and colleagues used supervised machine learning techniques to annotate the stories. In supervised machine learning, the computer is given a training set of text manually coded by the humanities researchers and, based on those examples, builds a model to automatically code new text in a test set. The coding of the test set is then checked for accuracy. More details about this project can be accessed here.
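For readers who have not seen supervised machine learning in practice, the train-then-test cycle described above can be sketched in a few lines of Python. This is an illustration only and not the method Nguyen and colleagues used: the classifier is a simple Naive Bayes model chosen for brevity, the function names are hypothetical, and the "folktales" and their labels are invented toy data.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a text and split it on whitespace."""
    return text.lower().split()


def train(labelled_texts):
    """Build a multinomial Naive Bayes model from (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word frequencies
    label_counts = Counter()            # label -> number of training texts
    vocabulary = set()
    for text, label in labelled_texts:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocabulary.update(tokens)
    return word_counts, label_counts, vocabulary


def predict(model, text):
    """Return the most probable label for a new, uncoded text."""
    word_counts, label_counts, vocabulary = model
    total_texts = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Log prior plus log likelihoods with add-one (Laplace) smoothing,
        # so words never seen with this label do not zero out the score.
        score = math.log(label_counts[label] / total_texts)
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        for token in tokenize(text):
            score += math.log((word_counts[label][token] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, labelled_texts):
    """Share of texts whose predicted label matches the manual code."""
    correct = sum(predict(model, text) == label for text, label in labelled_texts)
    return correct / len(labelled_texts)


# Invented toy data standing in for manually coded stories.
training_set = [
    ("the princess married the prince in the castle", "fairy tale"),
    ("a witch cursed the king and his daughter", "fairy tale"),
    ("the dragon guarded the enchanted castle", "fairy tale"),
    ("a ghost haunted the old farm at night", "legend"),
    ("villagers saw a ghost near the church", "legend"),
    ("the farmer met the devil on the road at night", "legend"),
]
# Held-out stories, also manually coded, used only to check the model.
test_set = [
    ("the prince fought the dragon", "fairy tale"),
    ("a ghost appeared on the road at night", "legend"),
]

model = train(training_set)
test_accuracy = accuracy(model, test_set)
print(f"Accuracy on the test set: {test_accuracy:.2f}")
```

The point of holding back a test set is that the model is judged on stories it has not seen, which is what it will face when coding the rest of a large collection.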
With more and more large datasets either being digitized or being produced by social media, qualitative social scientists and humanities scholars need to develop methods to analyse such data. It is too important a task to be left to computer scientists alone. Some educators advocate that social scientists need to be trained in computational methods; others emphasize collaboration between social scientists and computer scientists. It is time for all qualitative social science and humanities scholars to think about where they stand in this debate.
Silvana is a sociologist and a methodologist specializing in qualitative data analysis. She writes and consults on social science qualitative data analysis research, particularly in the use of software to support the analysis. She is also QSR International's Director of Research.