By Eve Kraicer, MSc Candidate at The London School of Economics and Political Science
Here at SAGE Ocean, we’ve been collecting data on the landscape of tools for computational social science. Looking through the data, we found an incredible variety of tools, from crowdsourcing aids to text analysis and social media analysis. Despite this technical diversity, we found a persistent lack of diversity in who built these tools.
As forthcoming research from SAGE Ocean shows, the founders, CTOs and developers of these products are 90% male. They are also majority white.
There could be many reasons for this trend. It’s consistent with structural racial and gendered patterns in STEM fields, as well as in workplace leadership more generally. It may also be that our dataset is not comprehensive and is missing existing tools because search engine algorithms favour popular results. New tools, and tools that do not match what the algorithm expects, wouldn’t be as easy to find. (This could be its own full post, so for now I’ll just say that search engine algorithms like this are really good at keeping things as they are.)
Whatever the reason, it is clear that women, gender minorities and BAME groups are under-represented in our dataset. To me, this is a problem for computational social science (CSS) for two reasons.
The first is about representation. It’s important to have resources built by a group of people representative of the diversity of folks that use them, and to encourage more people to see themselves as potential users of CSS.
The second is about knowledge. There is an idea that came out of epistemology called Standpoint Theory. Standpoint suggests that people in different positions have differential access to knowledge. In other words, the way we experience society shapes the kinds of questions we have the ability to ask about its structure and function.
Although standpoint theory originally came from questions about economic position, feminist theorists like Sandra Harding and Patricia Hill Collins have applied it to gender and race, asking how these positions shape the ways we come to know and question things. Our social position informs what and how we research, and using tools built from a single perspective may limit what we think to ask and test.
Taken together, the gap in our dataset could limit both who we imagine as a computational social scientist and how we imagine computational social science should work. To begin to fix this, we want to start by highlighting six tools led by women in our list of CSS tools:
Crowdtruth is a platform that enables researchers to aggregate responses from people to determine ground truths in their research. For example, Crowdtruth can help researchers annotate images, videos or texts into different categories.
SciStarter was started by Darlene Cavalier to help people connect with and contribute to different scientific research projects. With a focus on both formal and informal research, SciStarter is investing in building a community of scientists, as well as making scientific research more accessible to the wider public.
ImageNet is a database of over 14 million images that have been tagged using WordNet classifications. This database has been used to develop and train techniques for visual recognition, as well as to study abnormal behaviour and creativity. Recently, cognitive psychologists published a study that used ImageNet to try to figure out how the human brain functions in comparison to artificial intelligence.
Prolific is a tool that allows researchers to quickly and efficiently find participants, and lets participants find the research studies they are best suited for. It can be integrated with many of the most common survey tools used in social science, allowing researchers to upload their surveys and then use demographic information to send the survey only to the desired group, all on one platform.
Kitware is a company that creates evidence-based custom software to help researchers build the tools they need to conduct rigorous social science. Their open-source software includes platforms like VTK (software to help display findings and data), ITK (software to aid in image analysis) and Resonant (software for analysing and managing datasets).
The Stanford NER (Named Entity Recognizer) is a Natural Language Processing tool that identifies words or phrases in a text that are proper nouns or, more generally, the names of things. It can then classify these words into groups like person, organisation and location, which can be used to identify nodes for social network analysis and to summarise documents based on key terms.
This list is just a start, and fixing these gaps in our dataset has to be a continual process. Do you know of any research tools led by women and minority groups we can add? Let us know in the comments or tweet us @SAGEOceanTweets, and we’ll keep working to build a more representative, inclusive list.
Eve Kraicer is an MSc Candidate at The London School of Economics and Political Science (LSE). Her research uses data science methodologies to study the intersection of gendered violence and digital media. She has previously worked as a Research Assistant at McGill University and The LSE, and as an Intern at Penguin Random House Canada. You can find her on Twitter @helaeve.