Dime Web

The DIME-Web instrument is a team of one to three engineers which supports Social Science research projects in the development and use of digital methods to exploit the web as a survey medium. In particular, it is used to collect, enhance, clean up, visualise and analyse data from digital traces accessible online.

Objectives

The purpose of DIME-Web was to equip the Social Science world to exploit digital traces left on the web. The instrument served different research populations through different types of approaches and timeframes: advice on the choice and use of digital methods, identification of sources available on the web, end-to-end development of reusable, free and open source generic tools, pedagogical or academic training, methodological support, one-off support in an afternoon workshop, and support throughout a multi-year research project.

To these ends, the instrument developed and supplied a whole battery of free and open source software tools. These can be used to collect data from across the web with crawlers such as Hyphe or from targeted platforms such as Twitter with the Gazouilloire software, to clean, enhance or classify that data, and to visualise, explore and study it, notably from the perspective of network analysis. Because they are free and open source, these tools can be freely downloaded, installed and re-engineered by third parties, and many of them can also be run directly online, for example utilities such as Table2Net, ScienceScape, CatWalk or SeeAlsology.

The most ambitious of the Dime Web tools is the Hyphe web crawler, which helps users build, curate and categorize qualitative web corpora of hypertext links between entities grouped quantitatively on the web. Both highly technical and user friendly, Hyphe offers students and researchers in the social sciences an exploratory and quali-quantitative method of studying communities of interest relating to their research fields. It is currently used by researchers in several European countries and in the Americas.

Positioning

The DIME-Web team stood at the intersection of methodological research and digital methods in the social sciences. As well as supporting researchers in their methodology, it offered them its skills in network analysis and web archiving, as well as in controversy analysis inspired by Bruno Latour’s Actor-Network Theory (ANT). The members of the operational team have authored or co-authored some fifteen academic publications, including an article in PLOS ONE in 2015 about the ForceAtlas2 network spatialisation algorithm implemented in the Gephi software, as well as a number of papers introducing Hyphe, notably at the ICWSM 2016 and FOSDEM 2018 conferences.

The link with the local research and innovation environment came essentially through technology exchange. The free and open source technologies used and/or developed by DIME-Web, and more broadly by the medialab, were also developed by an ecosystem of specialists united by questions relating to networks and data visualisation. Private sector entities were also sometimes interested in training courses on these technologies.

The fields that have historically been most receptive to such work were web studies, social network analysis and controversy mapping. The instrument attracted a wide range of academic and university users in France, but also abroad, especially in a number of European countries where the medialab has formed ties in the digital methods community: Richard Rogers’ Digital Methods Initiative in Amsterdam, the Tantlab in Copenhagen, and King’s College in London. Since 2011, therefore, numerous training courses and papers, notably on Hyphe, have been delivered in different countries in Europe and beyond, and DIME-Web tools have been implemented internally in different universities in England, Denmark and the United States.

Operation

DIME-Web’s operational team relied essentially on two engineers recruited in 2011 and 2012, who pooled their working time with the Sciences Po medialab, an approach that notably provided access to more varied skills (designers, expert JavaScript developers, researchers…).

Supporting research projects encompasses different realities, which demand different levels of input. One-off requests for help, which were also often requests for guidance, were dealt with at the monthly open-doors workshop of the Sciences Po medialab. These workshops were often the occasion for initial contacts, which could then develop into a paying service. DIME-Web also gave courses to groups of researchers and PhD candidates, either in a project support role or independently, for example at summer schools. Since 2014, both members and nonmembers of the Equipex consortium have been charged a fee for access to the facility for support, training or joint research development, which helped to finance the development of the generic tools.

A specific DIME-Web Scientific and Technical Committee (STC) monitored the project selection process and the orientation of the programme, through one annual face-to-face meeting and ongoing digital interchanges. Between 2014 and 2020, selection and support for small projects that required a rapid response without costly developments (by contrast with the bigger projects) were delegated to the operational team, and the STC’s role was confined to annual oversight.

Members of the Dime Web STC:

  • Dominique Cardon,
  • Dana Diminescu,
  • Guilhem Fouetillou,
  • Dominique Goux,
  • Delphine Lagarde,
  • Raphaël Laurent,
  • Olivier Martin,
  • Clément Oury,
  • Franck Rebillard,
  • Roxane Silberman,
  • Jérôme Thièvre,
  • Tommaso Venturini.

Contributions

Dime Web has accompanied more than twenty research projects throughout their duration, carried out by teams from a wide range of backgrounds, from sociology to economics and political science. For example, in the RiscoVac and PerseVac projects, the Web instrument helped in studying the controversies surrounding vaccines online, by measuring and evaluating the presence and influence of anti-vaccine actors on the web and social networks. Similarly, with the economists of the SoWell project, the Dime Web team assisted in the collection and analysis of search engine usage data in order to model well-being in different countries. In an entirely different vein, the ComIngGen project benefited from the support of the instrument to study the extent to which discussion on the web has been able to both relay and fuel the debates around biotechnology and the public policies surrounding it.

Between 2011 and 2020, the Web instrument also led to the creation of a number of digital tools dedicated to the social sciences and freely available for anyone to use, indirectly supporting a wide range of users in France and abroad, in both the academic and pedagogical worlds, as well as in journalistic and associative circles. Work on the instrument also contributed to theoretical research on visualization algorithms and the spatialization of networks.

Dime Web tools, and Hyphe in particular, are taught in Master classes by teacher-researchers as well as in high schools through the IDEFI FORCCAST pedagogical program piloted at Sciences Po. This program contributed to financing the development of the Hyphe-Browser, a complementary and more pedagogical user interface to Hyphe that allows students to use the tool directly within a dedicated web browser.

Since most of the tools are free and open source, they are regularly used, downloaded and installed at UCLA (USA), King’s College (UK), Aalborg University and ITU (Denmark), UNIL (Switzerland), FMSH, EHESS, TGIR HumaNum, and the Universities of Lille 3, Rennes 2, Paris Nanterre, Paris Descartes and Paris-Est Marne La Vallée (France). Hyphe has also been used in several instances by researchers who had no contact with the team, and even by users outside academia. For example, a researcher presented an [analysis of the extreme right in Germany](https://www.kai-arzheimer.com/german-right-wing-internet.pdf) using Hyphe at a conference, the activists of Utopies Concrètes carried out a mapping of the associative world, and an analysis of the controversies on the war on drugs made with Hyphe has been spotted on Medium.

Sustainability

The success of the various tools developed by the Dime Web team in both academic and educational communities highlighted the need to preserve them beyond the DIME-SHS Equipex project, so that future projects could use them without necessarily benefiting from special assistance. Following the recommendation of the ANR report, the web team therefore focused its efforts during the last years of the project on the maintenance, consolidation and documentation of its catalog of open source tools and methods.

For a number of the tools developed, especially the simplest ones (which can be used directly online via a web interface), development and publication as open source made it possible to ensure this kind of sustainability simply by writing documentation on how to use them.

On the other hand, for more sophisticated tools such as Hyphe or Gazouilloire, the objective proved more challenging: although they were also open source, these tools were still difficult to install without minimal coding knowledge, which often put off users, and they quickly proved very demanding in terms of computing time and storage volume. As a result, a major effort has been made over the last few years to adapt Hyphe so that it can scale up and spread more widely, by automating its deployment in the cloud directly from the Hyphe-Browser interface, thereby making it possible for everyone to host Hyphe at a low cost for their own needs.

Similarly, the Dime Web team has thoroughly redesigned the architecture of the Gazouilloire software for the longitudinal collection of tweets, in order to make its installation and use accessible to a much wider public, and to make it simple to connect these collections to qualitative tools for selection and categorization and to quantitative tools for visual exploration.

Sciences Po continues to employ the engineers of the Web team within the médialab, which makes it possible to ensure longer-term maintenance of the various software, as well as the production of new free and open source tools that allow the implementation of digital methods in the Social Sciences and Humanities (SHS). Support for the SHS in the use of digital methods, and the development of tools to use the web as a field of investigation, continue beyond the Dime Web instrument within the Sciences Po médialab and its community of users and partners.