zeehaven – a tiny tool to convert data for social media research

Zeeschuimer (“sea foamer”) is a web browser extension from the Digital Methods Initiative in Amsterdam that enables you to collect data while you are browsing social media sites for research and analysis.

It currently works for platforms such as TikTok, Instagram, Twitter and LinkedIn and provides an ndjson file which can be imported into the open source 4CAT: Capture and Analysis Toolkit for analysis.

To make data gathered with Zeeschuimer more accessible for researchers, reporters, students, and others to work with, we’ve created zeehaven (“sea port”) – a tiny web-based tool to convert ndjson into csv format, which is easier to explore with spreadsheets as well as common data analysis and visualisation software.

Drag and drop an ndjson file into the “sea port” and the tool will prompt you to save a csv file. ✨📦✨
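For readers curious about what this kind of conversion involves, here is a minimal Python sketch of turning ndjson (one JSON object per line) into csv. It is an illustration only, not zeehaven's actual code: the function names `flatten` and `ndjson_to_csv`, the dotted column names for nested fields, and the sample field names are all assumptions made for this example.

```python
import csv
import io
import json


def flatten(obj, prefix=""):
    """Flatten nested dicts into one level, joining keys with dots (an assumed convention)."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        elif isinstance(value, list):
            # Lists are stored as JSON text so that they fit into a single cell
            flat[name] = json.dumps(value, ensure_ascii=False)
        else:
            flat[name] = value
    return flat


def ndjson_to_csv(ndjson_text):
    """Convert ndjson text (one JSON object per line) into csv text."""
    rows = [
        flatten(json.loads(line))
        for line in ndjson_text.splitlines()
        if line.strip()  # skip blank lines
    ]
    # Collect column names in order of first appearance across all rows
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical records; real Zeeschuimer exports have platform-specific fields
    sample = (
        '{"id": 1, "data": {"author": "a", "tags": ["x", "y"]}}\n'
        '{"id": 2, "data": {"author": "b"}, "source": "example"}\n'
    )
    print(ndjson_to_csv(sample))
```

Records in an ndjson file do not always share the same fields, so the sketch builds the header from every key it encounters and leaves cells empty where a record lacks a field.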

zeehaven was created as a collaboration between the Centre for Interdisciplinary Methodology, University of Warwick and the Department of Digital Humanities, King’s College London – and grew out of a series of Public Data Lab workshops to exchange digital methods teaching resources earlier this year.

You can find the tool here and the code here. All data is converted locally.

New article on GitHub and the platformisation of software development

An article on “The platformisation of software development: Connective coding and platform vernaculars on GitHub” by Liliana Bounegru has just been published in Convergence: The International Journal of Research into New Media Technologies.

The article is accompanied by a set of free tools for researching GitHub, co-developed by Liliana with the Digital Methods Initiative, which can be used to:

  • Extract the meta-data of organizations on GitHub
  • Extract the meta-data of GitHub repositories
  • Scrape GitHub for forks of projects
  • Scrape GitHub for user interactions and user to repository relations
  • Extract meta-data about users on GitHub
  • Find out which users contributed source code to GitHub repositories

The article is available open access here. The abstract is copied below.

This article contributes to recent scholarship on platform, software and media studies by critically engaging with the ‘social coding’ platform GitHub, one of the most prominent actors in the online proprietary and F/OSS (free and/or open-source software) code hosting space. It examines the platformisation of software and project development on GitHub by combining institutional and cultural analysis. The institutional analysis focuses on critically examining the platform from a material-economic perspective to understand how it configures contemporary software and project development work. It proposes the concept of ‘connective coding’ to characterise how software intermediaries such as GitHub configure, valorise and capitalise on public repositories, developer and organisation profiles. This institutional perspective is complemented by a case study analysing cultural practices mediated by the platform. The case study examines the platform vernaculars of news media and journalism initiatives highlighted by Source, a key publication in the newsroom software development space, and how GitHub modulates visibility in this space. It finds that the high-visibility platform vernacular of this news media and journalism space is dominated by a mix of established actors such as the New York Times, the Guardian and Bloomberg, as well as more recent actors and initiatives such as ProPublica and Document Cloud. This high-visibility news media and journalism platform vernacular is characterised by multiple F/OSS and F/OSS-inspired practices and styles. Finally, by contrast, low-visibility public repositories in this space may be seen as indicative of GitHub’s role in facilitating various kinds of ‘post-F/OSS’ software development cultures.

Article on COVID-19 testing situations on Twitter published in Social Media + Society

An article on “Testing and Not Testing for Coronavirus on Twitter: Surfacing Testing Situations Across Scales With Interpretative Methods” has just been published in Social Media + Society, co-authored by Noortje Marres, Gabriele Colombo, Liliana Bounegru, Jonathan W. Y. Gray, Carolin Gerlitz and James Tripp, building on a series of workshops in Warwick, Amsterdam, St Gallen and Siegen.

The article explores testing situations – moments in which it is no longer possible to go on in the usual way – across scales during the COVID-19 pandemic through interpretive querying and sub-setting of Twitter data (“data teasing”), together with situational image analysis.

The full text is available open access here. Further details and links can be found at this project page. The abstract and reference are copied below.

How was testing—and not testing—for coronavirus articulated as a testing situation on social media in the Spring of 2020? Our study examines everyday situations of Covid-19 testing by analyzing a large corpus of Twitter data collected during the first 2 months of the pandemic. Adopting a sociological definition of testing situations, as moments in which it is no longer possible to go on in the usual way, we show how social media analysis can be used to surface a range of such situations across scales, from the individual to the societal. Practicing a form of large-scale data exploration we call “interpretative querying” within the framework of situational analysis, we delineated two types of coronavirus testing situations: those involving locations of testing and those involving relations. Using lexicon analysis and composite image analysis, we then determined what composes the two types of testing situations on Twitter during the relevant period. Our analysis shows that contrary to the focus on individual responsibility in UK government discourse on Covid-19 testing, English-language Twitter reporting on coronavirus testing at the time thematized collective relations. By a variety of means, including in-memoriam portraits and infographics, this discourse rendered explicit challenges to societal relations and arrangements arising from situations of testing and not testing for Covid-19 and highlighted the multifaceted ways in which situations of corona testing amplified asymmetrical distributions of harms and benefits between different social groupings, and between citizens and state, during the first months of the pandemic.

Marres, N., Colombo, G., Bounegru, L., Gray, J. W. Y., Gerlitz, C., & Tripp, J. (2023). Testing and Not Testing for Coronavirus on Twitter: Surfacing Testing Situations Across Scales With Interpretative Methods. Social Media + Society, 9(3). https://doi.org/10.1177/20563051231196538

Working paper on “Testing ‘AI’: Do we have a situation?”

A new working paper on “Testing ‘AI’: Do We Have a Situation?” based on a conversation between Noortje Marres and Philippe Sormani has just been published as part of a working paper series from “Media of Cooperation” at the University of Siegen. The paper can be found here and further details are copied below.

The new publication »Testing ‘AI’: Do We Have a Situation?« of the Working Paper Series (No. 28, June 2023) is based on the transcription of a recent conversation between the authors Noortje Marres and Philippe Sormani regarding current instances of the real-world testing of “AI” and the “situations” they have given rise to or, as the case may be, not. The conversation took place online on the 25th of May 2022 as part of the Lecture Series “Testing Infrastructures” organized by the Collaborative Research Center (CRC) 1187 “Media of Cooperation” at the University of Siegen, Germany. This working paper is an elaborated version of this conversation.

In their conversation, Marres and Sormani discuss the social implications of AI based on three questions. First, they return to a classic critique that sociologists and anthropologists have levelled at AI, namely the claim that the ontology and epistemology underlying AI development is rationalist and individualist, and as such is marked by blind spots for the social and, in particular, situated or situational embedding of AI (Suchman, 1987, 2007; Star, 1989). Secondly, they delve into the issue of whether and how social studies of technology can account for AI testing in real-world settings in situational terms. And thirdly, they ask what this tells us about possible tensions and alignments between different “definitions of the situation” assumed in social studies, engineering and computer science in relation to AI. Finally, they discuss the ramifications for their methodological commitment to “the situation” in the social study of AI.

Noortje Marres is Professor of Science, Technology and Society at the Centre for Interdisciplinary Methodology at the University of Warwick and Guest Professor at the Media of Cooperation Collaborative Research Centre at the University of Siegen. She has published two monographs, Material Participation (2012) and Digital Sociology (2017).

Philippe Sormani is Senior Researcher and Co-Director of the Science and Technology Studies Lab at the University of Lausanne. Drawing on and developing ethnomethodology, he has published on experimentation in and across different fields of activity, ranging from experimental physics (in Respecifying Lab Ethnography, 2014) to artistic experiments (in Practicing Art/Science, 2019).

The paper »Testing ‘AI’: Do We Have a Situation?« is published as part of the Working Paper Series of the CRC 1187, which promotes inter- and transdisciplinary media research and provides an avenue for rapid publication and dissemination of ongoing research located at or associated with the CRC. The purpose of the series is to circulate in-progress research to the wider research community beyond the CRC. All Working Papers are accessible via the website.

Image caption: Ghost #8 (Memories of a mise en abîme with a bare back in front of an untamable tentacular screen), experimenting with OpenAI Dall-E, Maria Guta and Lauren Huret (Iris), 2022. (Courtesy of the artists)

forestscapes listening lab at re:publica 23, Berlin, 5-7th June

As part of the forestscapes project we’re organising a listening lab at re:publica 23, the digital society festival in Berlin, 5-7th June 2023:

How can generative soundscape composition enable different perspectives on forests in an era of planetary crisis? The forestscapes listening lab explores how sound can serve as a medium for collective inquiry into forests as living cultural landscapes.

The soundscapes are composed with folders of sound from different sources, including field recordings from researchers, sound artists and forest practitioners, as well as online sounds from the web, social media and sound archives. They are composed using custom scripts with the open source SuperCollider software as well as the open source norns device, a “sound machine for the exploration of time and space”.

The re:publica installation will include soundscapes from workshops in London and Berlin – including some new pieces from the Environmental Data, Media, and the Humanities hackathon last week.

Cross-posted from jonathangray.org.

Network exploration on the web: an interview with Gephi Lite

Following the recent release of Gephi Lite, an open-source web-based visual network exploration tool, we interviewed its developers about the background of the project, what they’ve done and future plans…

What is Gephi Lite?

Gephi Lite can actually be defined in two ways. The first definition follows the name we chose: Gephi Lite is a lighter version of the Gephi desktop software, targeting users who need to work on smaller networks with less complex operations in mind.

The second definition is more focused on the technical context: Gephi Lite is a serverless web application to drive visual network analysis. There are no more requirements than an internet connection and a modern web browser.

New paper: “Visual Models for Social Media Image Analysis: Groupings, Engagement, Trends, and Rankings”

A new article on “Visual Models for Social Media Image Analysis: Groupings, Engagement, Trends, and Rankings” co-authored by Public Data Lab researchers Gabriele Colombo, Liliana Bounegru and Jonathan Gray has just been published in the International Journal of Communication (IJOC). It is available as an open access PDF. Here’s the abstract:

With social media image analysis, one collects and interprets online images for the study of topical affairs. This analytical undertaking requires formats for displaying collections of images that enable their inspection. First, we discuss features of social media images to make a case for studying them in groups (rather than individually): multiplicity, circulation, modification, networkedness, and platform specificity. In all, these offer reasons and means for an approach to social media image research that privileges the collection of images as its analytical object. Second, taking the 2019 Amazon rainforest fires as a case study, we present four visual models for analyzing collections of social media images. Each visual model matches a distinctive spatial arrangement with a type of analysis: grouping images by theme with clusters, surfacing dominant images and their engagement with treemaps, following image trends with plots, and comparing image rankings across platforms with grids.

Article on “Engaged research-led teaching: composing collective inquiry with digital methods and data”

A new article on “Engaged research-led teaching: composing collective inquiry with digital methods and data” co-authored by Jonathan Gray, Liliana Bounegru, Richard Rogers, Tommaso Venturini, Donato Ricci, Axel Meunier, Michele Mauri, Sabine Niederer, Natalia Sánchez-Querubín, Marc Tuters, Lucy Kimbell and Anders Kristian Munk has just been published in Digital Culture & Education.

The article is available here, and the abstract is as follows:

This article examines the organisation of collaborative digital methods and data projects in the context of engaged research-led teaching in the humanities. Drawing on interviews, field notes, projects and practices from across eight research groups associated with the Public Data Lab (publicdatalab.org), it provides considerations for those interested in undertaking such projects, organised around four areas: composing (1) problems and questions; (2) collectives of inquiry; (3) learning devices and infrastructures; and (4) vernacular, boundary and experimental outputs. Informed by constructivist approaches to learning and pragmatist approaches to collective inquiry, these considerations aim to support teaching and learning through digital projects which surface and reflect on the questions, problems, formats, data, methods, materials and means through which they are produced.

Make a deal with Gephisto

Mathieu Jacomy and Anders Munk, TANT Lab & Public Data Lab

Gephisto is Gephi in one click. You give it network data, and it gives you a visualization. No settings. No skills needed. The dream! With a twist.

Gephisto produces visualizations such as the one above. It exists as a website, and you can just try it below. It includes test networks, you don’t even need one. Do it! Try it, and come back here. Then we talk about it.

https://jacomyma.github.io/gephisto/

“Algorithm Trouble” entry in A New AI Lexicon

A short piece on “Algorithm Trouble” has been published in AI Now Institute’s A New AI Lexicon, written by Axel Meunier (Goldsmiths, University of London), Jonathan Gray (King’s College London) and Donato Ricci (médialab, Sciences Po, Paris). The full piece is available here, and here’s an excerpt:

“For decades, social researchers have argued that there is much to be learned when things go wrong.¹ In this essay, we explore what can be learned about algorithms when things do not go as anticipated, and propose the concept of algorithm trouble to capture how everyday encounters with artificial intelligence might manifest, at interfaces with users, as unexpected, failing, or wrong events. The word trouble designates a problem, but also a state of confusion and distress. We see algorithm troubles as failures, computer errors, “bugs,” but also as unsettling events that may elicit, or even provoke, other perspectives on what it means to live with algorithms — including through different ways in which these troubles are experienced, as sources of suffering, injustice, humour, or aesthetic experimentation (Meunier et al., 2019). In mapping how problems are produced, the expression algorithm trouble calls attention to what is involved in algorithms beyond computational processes. It carries an affective charge that calls upon the necessity to care about relations with technology, and not only to fix them (Bellacasa, 2017).”