Introducing Memespector-GUI: A Graphical User Interface Client for Computer Vision APIs

In this post Jason Chao, PhD candidate at the University of Siegen, introduces Memespector-GUI, a tool for doing research with and about data from computer vision APIs.

In recent years, tech companies have started to offer computer vision capabilities through Application Programming Interfaces (APIs). Big names in the cloud industry have integrated computer vision services into their artificial intelligence (AI) products. These computer vision APIs are designed for software developers to integrate into their own products and services. Indeed, your images may have been processed by these APIs unbeknownst to you: the operations and outputs of computer vision APIs are not usually presented directly to end-users.

The open-source Memespector-GUI tool aims to support investigations both with and about computer vision APIs by enabling users to repurpose, incorporate, audit and/or critically examine their outputs in the context of social and cultural research.

What kinds of outputs do these computer vision APIs produce? The specifications and affordances of these APIs vary from platform to platform. As an example, here is a quick walkthrough of some of the features of Google Vision API…

Label, object and text detection

Google Vision API labels images, identifies objects and recognises text. “Labels” are descriptions that may apply to the whole image. “Objects” are discrete things found in the image. “Text” is any printed or handwritten text recognised in the image.

Image sample 1

Labels:
• Sky
• Building
• Crowd

Objects:
• Building
• Person
• Footwear

Text:
• DON’T SHOOT OUR KIDS
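
Memespector-GUI performs these calls through a graphical interface, but the same outputs can also be scripted. A minimal Python sketch using Google’s official google-cloud-vision client library might look like this (assuming your Google Cloud credentials are already configured; the image file name is illustrative):

```python
from google.cloud import vision

# Assumes the GOOGLE_APPLICATION_CREDENTIALS environment variable
# points to a Google Cloud service account key file.
client = vision.ImageAnnotatorClient()

with open("protest.jpg", "rb") as f:  # illustrative file name
    image = vision.Image(content=f.read())

# "Labels": descriptions that may apply to the whole image
for label in client.label_detection(image=image).label_annotations:
    print("Label:", label.description, round(label.score, 2))

# "Objects": things localised within the image
for obj in client.object_localization(image=image).localized_object_annotations:
    print("Object:", obj.name, round(obj.score, 2))

# "Text": printed or handwritten text recognised in the image
texts = client.text_detection(image=image).text_annotations
if texts:
    print("Text:", texts[0].description)  # the first annotation holds the full text
```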

Face detection

Google Vision API also detects faces and recognises their expressions. It currently detects four emotions: joy, sorrow, anger and surprise, alongside attributes such as exposure, blur and headwear. The likelihood of each is presented on a five-point scale (very unlikely, unlikely, possible, likely, very likely).

Image sample 2

Face
• Joy: Very likely
• Sorrow: Very unlikely
• Anger: Very unlikely
• Surprise: Very unlikely
• Under-exposed: Very unlikely
• Blurred: Very unlikely
• Headwear: Very unlikely
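
A similar sketch for face detection (same set-up assumptions as above; vision.Likelihood turns the numeric likelihood into its label, e.g. VERY_UNLIKELY):

```python
from google.cloud import vision

client = vision.ImageAnnotatorClient()
with open("face.jpg", "rb") as f:  # illustrative file name
    image = vision.Image(content=f.read())

for face in client.face_detection(image=image).face_annotations:
    print("Joy:", vision.Likelihood(face.joy_likelihood).name)
    print("Sorrow:", vision.Likelihood(face.sorrow_likelihood).name)
    print("Anger:", vision.Likelihood(face.anger_likelihood).name)
    print("Surprise:", vision.Likelihood(face.surprise_likelihood).name)
    print("Headwear:", vision.Likelihood(face.headwear_likelihood).name)
```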

Web detection

Google Vision API also attempts to provide contextual information about an image. “Web entities” are the names of individuals and events associated with an image. “Matching images” are the URLs of images that are visually similar to, fully matching or partially matching the analysed image. In particular, the domain names of these URLs may be repurposed to study how an image circulates on the web.

Image sample 3

Web entities:
• Secretary of State for Health and Social Care of the United Kingdom
• Kiss
• Closed-circuit television
• Matt Hancock
• Gina Coladangelo
• Girlfriend


Full matching image URL:
• thetimes.co.uk/…


Pages with full matching image:
• dailymail.co.uk/…
• theaustralian.com.au/…
• mirror.co.uk/…
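
As a rough sketch, web detection and the extraction of domain names from matching pages could be scripted along these lines (same assumptions as the earlier sketches):

```python
from urllib.parse import urlparse
from google.cloud import vision

client = vision.ImageAnnotatorClient()
with open("news_photo.jpg", "rb") as f:  # illustrative file name
    image = vision.Image(content=f.read())

web = client.web_detection(image=image).web_detection

for entity in web.web_entities:
    print("Entity:", entity.description)

# The domains of pages carrying a matching image hint at
# where the image circulates on the web.
for page in web.pages_with_matching_images:
    print("Domain:", urlparse(page.url).netloc)
```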

Safety detection

Google Vision API is also used for content moderation work. There are five safety flags for an image: Adult, Spoof, Medical, Violence and Racy. The likelihood of each flag is presented on the same five-point scale (very unlikely, unlikely, possible, likely, very likely).

Image sample 4

Safety:
• Adult: Very likely
• Spoof: Unlikely
• Medical: Unlikely
• Violence: Unlikely
• Racy: Very likely
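
And a corresponding sketch for the safety flags (same assumptions as above):

```python
from google.cloud import vision

client = vision.ImageAnnotatorClient()
with open("image.jpg", "rb") as f:  # illustrative file name
    image = vision.Image(content=f.read())

safe = client.safe_search_detection(image=image).safe_search_annotation

# Each flag is reported on the same five-point likelihood scale.
for flag in ("adult", "spoof", "medical", "violence", "racy"):
    print(flag.capitalize() + ":", vision.Likelihood(getattr(safe, flag)).name)
```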

Misrecognition, mislabelling, misclassification

As we know from recent research and investigations into AI and algorithms, these machine-learning-based processes of labelling, detection and classification can often go wrong, sometimes with troubling or discriminatory consequences. In the following example, the facial expression of a man in sorrow is misrecognised as “joy”.

Image sample 5: misclassification

How to repurpose computer vision APIs?

Computer vision APIs do not have official user interfaces, since they are not intended to interface with humans directly. Google Vision API has an official drag-and-drop demo to showcase its detection capabilities, but the demo’s features are very limited.

Memespector-GUI is a digital methods tool that helps researchers use and repurpose data from computer vision APIs – whether to facilitate analysis of image collections, to understand the circulation and social lives of images online or to compare and critique the operations of computer vision platforms and services. This tool currently enables users to gather image data from Google Vision API, Microsoft Azure Cognitive Services, Clarifai and other services.

Are they free?

Memespector-GUI is free and open-source. The use of commercial APIs, however, is not necessarily free of charge. That does not always mean you have to pay: when you open an account with Google Cloud or Microsoft Azure, you will usually receive free credits, which may be enough to process thousands of images.

Resources

The following resources will guide you through opening accounts and getting free credits (if applicable) with commercial vision APIs.

Once you have opened accounts with the commercial APIs, the following step-by-step guide walks you through using Memespector-GUI to enrich your image datasets.

The project was inspired by previous memespector projects from Bernhard Rieder and André Mintz, and developed with ideas, input and testing from Janna Joceli Omena.

Using ObservableHQ notebooks for gathering and transforming data in digital research

We’ve recently been experimenting with ObservableHQ notebooks for gathering and transforming data in the context of digital research. This post walks through a few examples of notebooks from recent Public Data Lab projects.

In one project we wanted to use the CrowdTangle “Links” API to fetch data about how certain web pages were shared online and across different platforms. After gaining access to the relevant endpoints, we could adopt different means of calling the API and retrieving data: using something like Postman (a general-purpose interface for calling endpoints), or writing custom scripts (for example in Python or JavaScript).

Code notebooks are a third option that lies somewhere in between. Designed for programmers, notebooks allow for iterative manipulation of and experimentation with code, whilst keeping track of the creative process by commenting on the thinking behind each step.

Notebooks allow us both to write and run custom scripts and to create simple interfaces for those who may not code. Thus we can use them to help researchers, students and external collaborators collect data, making it easier to call APIs, set parameters or perform manipulations.

ObservableHQ is one solution for writing programming notebooks: it runs in the browser and is oriented towards data and visualisation (“We believe thinking with data is an essential skill for the future”). Hence, we thought it could be a good starting point for what we wanted to do.

Screen capture of a notebook

The first notebook that we produced allows researchers to call the CrowdTangle API in a simplified way: it exposes call parameters and provides contextual explanations and warnings about how to set them. For instance, it turns the selection of platforms into checkboxes and the interval between calls into a slider (with a warning about rate limits). It also facilitates the insertion of dates and other parameters.

Examples of input fields

Data can be browsed in tabular form or downloaded as a CSV or JSON.

Examples of data
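
The notebooks themselves are written in JavaScript, but as a rough illustration, the request that the notebook wraps could be reproduced in Python along these lines (the token is a placeholder, the link is invented, and the parameters follow CrowdTangle’s documented /links endpoint):

```python
import requests

params = {
    "token": "YOUR_CROWDTANGLE_TOKEN",  # placeholder: taken from your CrowdTangle dashboard
    "link": "https://example.com/some-article",  # the page whose shares we want to trace
    "platforms": "facebook,instagram",  # exposed as checkboxes in the notebook
    "startDate": "2021-01-01",
    "endDate": "2021-06-30",
    "count": 100,
}

response = requests.get("https://api.crowdtangle.com/links", params=params)
response.raise_for_status()

for post in response.json().get("result", {}).get("posts", []):
    print(post.get("platform"), post.get("postUrl"))
```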

Notebooks can be used for many diverse tasks. For instance, we produced a notebook that extracts hashtags from a list of posts and formats the data for use in Table2Net, another that extracts URLs and domain names from texts, and a third dedicated to expanding shortened URLs.
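
To give a flavour of these transformations, here is a rough Python equivalent of the hashtag, URL and domain extraction (the sample post is invented; the notebooks do this in JavaScript):

```python
import re
from urllib.parse import urlparse

post = "Watch this https://example.com/page?x=1 #climate #disinfo"

hashtags = re.findall(r"#\w+", post)              # ['#climate', '#disinfo']
urls = re.findall(r"https?://\S+", post)          # ['https://example.com/page?x=1']
domains = [urlparse(url).netloc for url in urls]  # ['example.com']

print(hashtags, urls, domains)
```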

The last of these was a case where it was necessary to implement a back-end service: ObservableHQ notebooks run as front-end JavaScript in the browser, so certain operations are tricky or impossible. This is one of their main limitations.
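
Expanding a shortened URL means following its HTTP redirects, which front-end JavaScript generally cannot do across arbitrary domains; a small server-side sketch of the kind of logic involved (in Python, with an illustrative URL):

```python
import requests

def expand(short_url: str) -> str:
    """Follow HTTP redirects and return the final URL."""
    # HEAD keeps the request light; some servers only answer GET.
    response = requests.head(short_url, allow_redirects=True, timeout=10)
    return response.url

print(expand("https://bit.ly/example"))  # illustrative shortened URL
```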

However, there are also many advantages. Notebooks are very flexible and easy to transform and adjust: we can start gathering and exploring data and, after a couple of iterations, decide how best to structure it. We can add and remove parts of the interface almost instantly, and we can embed functions (“cells”) from other notebooks, such as an emoji loading bar. The possibility of reusing or modifying an entire notebook, or just a part of it, is very useful for building on the work of other researchers and quickly bootstrapping new tools as we need them.

Notebooks are particularly useful as part of exploratory research approaches where you are iteratively refining and adjusting research questions and seeing what is possible as you adjust various settings (e.g. the structure of the data, the parameters of the APIs).

An unusual loading bar, which can be imported into your notebook.

So far these projects have been used in the context of investigations with journalists on the Infodemic project, as well as in ongoing research and collaborations around DeSmog’s Climate Disinformation Database (including a prize-winning undergraduate thesis on this topic).

As per the working principles of the Public Data Lab, all of these notebooks are open-source (MIT license) and you are most welcome to use, transform and adjust them in your own work. If you use them for a project or piece of research, or if you’re also using code notebooks for digital research, we’d love to hear from you!

Here’s a full list of the notebooks mentioned in this post:

“Data critique and platform dependencies: How to study social media data?”, Digital Methods Winter School and Data Sprint 2022

Applications are now open for the Digital Methods Winter School and Data Sprint 2022, which is on the theme of “Data critique and platform dependencies: How to study social media data?”.

This will take place from 10th to 14th January 2022 at the University of Amsterdam.

More details and registration links are available here and an excerpt on this year’s theme and the format is copied below.

The Digital Methods Initiative (DMI), Amsterdam, is holding its annual Winter School on ‘Social media data critique’. The format is that of a (social media and web) data sprint, with tutorials as well as hands-on work for telling stories with data. There is also a programme of keynote speakers. It is intended for advanced Master’s students, PhD candidates and motivated scholars who would like to work on (and complete) a digital methods project in an intensive workshop setting. For a preview of what the event is like, you can view short video clips from previous editions of the School.

Data critique and platform dependencies: How to study social media data?

Source criticism is the scholarly activity traditionally concerned with provenance and reliability. When considering the state of social media data provision, such criticism would be aimed at what platforms allow researchers to do (such as accessing an API) and not to do (such as scraping). It would also consider whether the data returned from querying is ‘good’, meaning complete or representative. How do social media platforms fare when considered against these principles? How might one audit or otherwise scrutinise social media platforms’ data supply?

Recently Facebook has come under renewed criticism for its data supply through the publication of its ‘transparency’ report, Widely Viewed Content. It is a list of web URLs and Facebook posts that receive the greatest ‘reach’ on the platform when appearing in users’ News Feeds. Its publication comes on the heels of Facebook’s well-catalogued ‘fake news problem’, first reported in 2016, as well as a well-publicised Twitter feed that lists the most engaged-with posts on Facebook (using CrowdTangle data). In both instances those contributions, together with additional scholarly work, have shown that dubious information and extreme right-wing content are disproportionately interacted with. Facebook’s transparency report, which has been called ‘transparency theater’, purports to demonstrate that this is not the case. How to check the data? For now, “all anybody has is the company’s word for it.”

For Facebook, as well as a variety of other platforms, there are no public archives. Facebook’s data sharing model is one of an industry-academic ‘partnership’. The Social Science One project, launched when Facebook ended access to its Pages API, offers big data — “57 million URLs, more than 1.7 trillion rows, and nearly 40 trillion cell values, describing URLs shared more than 100 times publicly on Facebook (between 1/1/2017 and 2/28/2021).” Obtaining the data (if one can handle it) requires writing a research proposal and, if accepted, compliance with Facebook’s ‘onboarding’, a non-negotiable research data agreement. Ultimately, the data is accessed (not downloaded) in a Facebook research environment, “the Facebook Open Research Tool (FORT) … behind a VPN that does not have access to the Internet”. There are also “regular meetings Facebook holds with researchers”. A data access ethnography project, not so unlike the one written about trying to work with Twitter’s archive at the Library of Congress, may be a worthwhile undertaking.

Other projects would evaluate ‘repurposing’ marketing data, as Robert Putnam’s ‘Bowling Alone’ project did and as is the more general digital methods approach. Comparing multiple marketing data outputs may be of interest, as may crossing those with CrowdTangle’s outputs. Facepager, one of the last pieces of software (after Netvizz and Netlytic) to still have access to Facebook’s graph API, reports that “access permissions are under heavy reconstruction”. Its usage requires further scrutiny. There is also a difference between the user view and the developer view (and between ethnographic and computational approaches), which is also worth exploring. ‘Interface methods’ may be useful here. These and other considerations for developing social media data criticism are topics of interest for this year’s Winter School theme.

At the Winter School there are the usual social media tool tutorials (and the occasional tool requiem), but also continued attention to thinking through and proposing how to work with social media data. There are also empirical and conceptual projects that participants work on. Projects from past Summer and Winter Schools include: Detecting Conspiratorial Hermeneutics via Words & Images, Mapping the Dutchophone Fringe on Telegram, Greenwashing, in_authenticity & protest, Searching constructive/authentic posts in media comment sections: NU.nl/The Guardian, Mapping deepfakes with digital methods and visual analytics, “Go back to plebbit”: Mapping the platform antagonism between 4chan and Reddit, Profiling Bolsobots Networks, Infodemic everywhere, Post-Trump Information Ecology, Streams of Conspirational Folklore, and FilterTube: Investigating echo chambers, filter bubbles and polarization on YouTube.

Organisers: Lucia Bainotti, Richard Rogers and Guillen Torres, Media Studies, University of Amsterdam. Application information at https://www.digitalmethods.net.