Québec magazines and newspapers
“Part of the digital collection of Bibliothèque et Archives nationales du Québec (BAnQ), the subset “Revues et journaux québécois” is particularly rich. From BAnQ’s patrimonial collections, these magazines and newspapers bear witness to the daily, cultural, political, economic and scientific life of Quebec.” (Description taken from the data provider website)
Visit the data provider website for more information.
General Information
Added to the Research Data: August 2017
Last update: 2024
Update periodicity: No update plans at the moment
Available formats: JPG, PDF, TXT, TIF, XLS
Documents’ assets available? No
Documents’ metadata available? No metadata available
Data size: 18.4 TB
Number of files: 4,627,040
Copyright: The data and all related documentation are subject to copyright. Please consult the data provider website for more details.
Data Subject Area: Various subjects. Concerns mainly the geographical area of Quebec
The Data in Graphs
All content in the “BAnQ - Revues et journaux québécois” dataset is in proprietary formats, and no metadata is available.
Available formats in “BAnQ - Revues et journaux québécois”
The Data Structure
The BAnQ data is not structured uniformly: the layout varies from one journal/newspaper to another and, for a given title, may change over time. In general, the content is organized by journal/newspaper, then by year, month and issue. Below the issue level the layout is highly variable, ranging from one file per issue to one file per page.
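Because the layout is not uniform, it can help to inventory a local copy before processing it. The following is a minimal sketch, not part of the dataset's documentation: it assumes the first two path components under the root are the journal/newspaper and the year, and it labels anything shallower as "unknown". The function name `summarize_tree` and the root path are hypothetical.

```python
from collections import Counter
from pathlib import Path


def summarize_tree(root):
    """Count files per (journal, year) and per file extension under root."""
    root = Path(root)
    per_journal_year = Counter()
    per_extension = Counter()
    for path in root.rglob("*"):
        if not path.is_file():
            continue  # skip directories
        parts = path.relative_to(root).parts
        # Assumed layout: journal/year/month/issue/...; real folders may differ.
        journal = parts[0] if len(parts) > 1 else "unknown"
        year = parts[1] if len(parts) > 2 else "unknown"
        per_journal_year[(journal, year)] += 1
        # Normalize case so ".JPG" and ".jpg" are counted together.
        per_extension[path.suffix.lower()] += 1
    return per_journal_year, per_extension
```

Calling `summarize_tree("/path/to/dataset")` (a placeholder path) returns two counters that show which titles and years deviate from the expected layout and which formats each contains.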
PDF Availability
Some documents are available in PDF format. The other files are images (mostly .jpg).
Full-text availability
Files in .TXT format, obtained by optical character recognition (OCR), are available for some magazines/newspapers, allowing a full-text search.
There is a folder called “OCR_corpus_data” in the dataset, which contains .tsv files with the full text of the documents. Not all documents have corresponding full text in the .tsv files.
The structure of the tsv files is as follows:
Column 1: file - The path to the file in the dataset
Column 2: page - The page number within the document
Column 3: text - The full text of the page
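The three-column structure above can be read with Python's standard csv module. This is a minimal sketch under stated assumptions: that each .tsv file is UTF-8, tab-delimited, unquoted, and starts with a header row naming the columns file, page and text. None of these details are confirmed by the description, so check one file before relying on them. The function names and the sample rows are hypothetical.

```python
import csv
import io

# OCR text for a full page can exceed the csv module's default field limit.
csv.field_size_limit(10**9)


def iter_ocr_pages(handle):
    """Yield (file, page, text) from an open TSV handle.

    Assumes a header row with the column names file, page and text.
    The page value is kept as a string, since its format is not documented.
    """
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        yield row["file"], row["page"], row["text"]


def search_pages(handle, term):
    """Return (file, page) pairs whose text contains term, ignoring case."""
    term = term.casefold()
    return [(f, p) for f, p, t in iter_ocr_pages(handle) if term in t.casefold()]


# Illustrative in-memory sample; the paths and text are invented.
sample = io.StringIO(
    "file\tpage\ttext\n"
    "journal_a/1900/01/issue1.pdf\t1\tLe conseil municipal de Québec\n"
    "journal_a/1900/01/issue1.pdf\t2\tAnnonces diverses\n"
)
hits = search_pages(sample, "québec")
```

For a real file, open it with `open(path, encoding="utf-8", newline="")` and pass the handle to `search_pages`. Reading row by row keeps memory use low, which matters given the size of the corpus.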
Metadata Availability
No metadata is available for the “BAnQ - Revues et journaux québécois” dataset.
Document’s Bibliographic References
There are no references available for “BAnQ - Revues et journaux québécois”.