Dataset information
Available languages
French
Keywords
covid-19, confinement, bibliotheque-nationale-de-france, archivage-du-web
Dataset description
As part of its heritage mission of [legal deposit of the internet](https://www.bnf.fr/fr/centre-d-aide/depot-legal-des-sites-web-mode-demploi), the Bibliothèque nationale de France regularly collects a sample of the French web through broad collections and targeted collections. Targeted collections include “current” collections (covering reference sites in a given disciplinary field) and “project” collections (tied to a particular event or theme). This dataset contains the URLs of sites related to the Covid-19 outbreak that were collected as part of targeted collections between 1 February and 31 July 2020.
The dataset consists of a single CSV file gathering nearly 4600 URLs of sites, blogs, social networks and videos. This content related to the Covid-19 epidemic was collected as part of the ephemeral News collection between 1 February and 31 July 2020, from the arrival of the virus on French soil until its decline, marked by the end of the state of health emergency (10 July 2020). The CSV file also includes URLs of sites collected as part of the Videos and Instagram collections, which were carried out in June and July 2020 respectively.
These URLs serve as the starting point for the creation of internet archives, which researchers can consult in the research rooms of the various BnF sites and by remote access in the printer's legal deposit libraries (BDLI) in the regions. The Covid-19 epidemic collection, available in the Internet Labs Archives, brings together the content collected in the three collections mentioned above as well as content gathered during the Paid Press and News collections.
Each URL is accompanied by descriptive information (theme of the record used to carry out the collection, keywords provided) and technical information (frequency of collection, collection history of the URL). Note, however, that the collection frequency given in the CSV file is the last frequency associated with the URL. This column therefore does not record any frequency changes that may have occurred during the collection.
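As an illustration of how the descriptive and technical columns might be explored, here is a minimal Python sketch that counts URLs per value of a given column. The column names (`url`, `theme`, `frequency`), the `;` delimiter and the sample rows are assumptions made for the example; they are not taken from the actual file and should be replaced with the real header names.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for the real file. The column names
# ("url", "theme", "frequency") and the ";" delimiter are assumptions.
SAMPLE = """url;theme;frequency
https://example.org/a;Health;daily
https://example.org/b;Society;weekly
https://example.org/c;Health;daily
"""


def count_by_column(csv_file, column, delimiter=";"):
    """Count rows per distinct value of `column` in an open CSV file."""
    reader = csv.DictReader(csv_file, delimiter=delimiter)
    return Counter(row[column] for row in reader)


# Count how many URLs share each (last recorded) collection frequency.
frequency_counts = count_by_column(io.StringIO(SAMPLE), "frequency")
print(frequency_counts.most_common())
```

To run this against the downloaded file, pass a handle opened with `open(path, newline="", encoding="utf-8")` instead of the in-memory sample. Because the frequency column only holds the last frequency associated with each URL, such counts describe the final state of the collection, not its history.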
Given the unpredictable nature of the Covid-19 outbreak in France, this collection was not carried out as a project collection with a fixed timetable. The content was therefore initially selected through the ephemeral News collection. Fifty-two correspondents took part directly in this extensive collection. Some of them belong to the BnF's network of internal correspondents, while the others belong to the network of regional correspondents (who work in fifteen partner institutions in the regions). Two further collections, Videos and Instagram, were then carried out in June and July 2020.