Re-valuing Openness: contested open data practices in global machine learning

25.05.2023

International Workshop "Re-valuing European Research Infrastructures – Knowledge, Innovation, and the Public Good", University of Vienna, May 25-26, 2023.

In my workshop presentation, I examine how research infrastructures for data sharing in machine learning (ML) are designed and governed. These infrastructures play a central role in data science and AI development: they provide access to the large datasets used to train ML models and to evaluate their performance. Their operation, however, is shaped by an entanglement of public and private interests.

While the open sharing of training data is often championed for fostering transparency and collaboration in ML research, what counts as valuable is contested among stakeholders. Researchers, tech companies, law enforcement agencies, and advocacy groups pursue divergent agendas that shape how these infrastructures are developed and used. This tension gives rise to counter-imaginaries: alternative visions of what these infrastructures should embody and which values they should promote.

My presentation draws on theoretical insights from science and technology studies (STS) and critical data studies, together with preliminary findings from empirical research. Through discourse analysis and stakeholder interviews, I aim to trace how "openness" is socially constructed in AI research infrastructures. By identifying key actors and exploring emerging counter-infrastructures, I shed light on the social and political dimensions of data sharing in ML.

Ultimately, my contribution aims to spark critical dialogue on the future of research infrastructures for open science, emphasizing the importance of engaging with diverse perspectives and alternative visions. By taking counter-imaginaries seriously, we can challenge dominant paradigms and work toward more inclusive and accountable practices in ML and beyond.