Digitising Newspaper Collections - Best Practice Tips

Digitising historical newspaper archives is an excellent way of making them more accessible to your target audience and preserving them for future generations.

However, deciding to digitise what are often very large and fragile collections can be a daunting prospect. To help archivists and collection managers, we’ve shared our best practice advice below for planning and undertaking the digitisation of your newspaper collections.

1. Selecting What to Digitise - Be Strategic

With newspapers often published daily over a long period of time, newspaper archives can be very large, whilst time and budgets are often limited. It can be overwhelming deciding which materials to digitise first and which to leave out.

To begin with, consider a smaller pilot project around a sample of the collection, perhaps digitising the most in-demand, valuable, or at-risk items first. Alternatively, you could base the pilot project around a themed subset of the archive, for example issues published during the First World War.

If you plan to digitise in-house, a pilot project can allow you to benchmark timescales and staff overheads, enabling you to budget accurately for future digitisation projects. Likewise, if outsourcing the digitisation, it will allow you to get an accurate idea of the supplier’s costs, turnaround time, and quality of the digital images produced.
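As a rough illustration of how pilot figures can feed a budget, the sketch below scales a pilot's throughput linearly to a full collection. All of the numbers (pages scanned, staff hours, hourly cost, collection size) are made-up placeholders, and the names `PilotResult` and `estimate_full_project` are hypothetical. Real projects rarely scale perfectly linearly, so treat the result as a starting point rather than a quote.

```python
from dataclasses import dataclass


@dataclass
class PilotResult:
    """Figures recorded during a pilot project (values used below are illustrative)."""
    pages_scanned: int
    staff_hours: float
    hourly_staff_cost: float  # in your local currency


def estimate_full_project(pilot: PilotResult, total_pages: int) -> dict:
    """Scale the pilot's throughput linearly to the whole collection."""
    pages_per_hour = pilot.pages_scanned / pilot.staff_hours
    hours_needed = total_pages / pages_per_hour
    return {
        "pages_per_hour": pages_per_hour,
        "hours_needed": hours_needed,
        "staff_cost": hours_needed * pilot.hourly_staff_cost,
    }


# Assumed pilot: 2,000 pages scanned in 40 staff hours at 15.00 per hour.
pilot = PilotResult(pages_scanned=2_000, staff_hours=40, hourly_staff_cost=15.0)
estimate = estimate_full_project(pilot, total_pages=100_000)
print(estimate)
```

It is usually sensible to add a contingency on top of an estimate like this, since fragile or oversized items tend to take longer than the pilot average.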

2. Survey the Newspapers Before Digitising

Checking the condition of your newspapers before starting a digitisation project is a vital initial step in the process. Very delicate items may be unsuitable for scanning and as such the items you are planning to scan should be assessed for general fragility as well as brittleness, mould and other signs of damage.

If any of these problems are found on particular items, a conservator may need to be consulted and the necessary action taken before those items are scanned.

If you are outsourcing the digitisation, any professional external supplier should always carry out a visual assessment of the collection you are planning to digitise to ensure it is safe to do so, before proceeding with a project.

3. Handling the Archival Materials During Scanning

As with handling any old and fragile archival materials, special care should be taken whilst preparing and digitising newspapers. The National Archives advise that gloves do not need to be worn when handling newspapers and other paper-based media.

Wearing gloves can make it difficult to handle paper due to the loss of sense of touch at the fingertips, which can be hazardous when handling fragile items. In addition, fibres from the gloves could catch on the delicate pages, causing damage. They do advise, though, that hands should be dry, clean and free from any moisturiser or hand cream.

When preparing publications for scanning, unfold any corners that have been turned down but do not fold them back on themselves and, of course, ensure pages are turned with care to avoid damaging already fragile paper.

4. Use Flatbed Planetary Scanning Equipment

When scanning newspapers, roll-fed scanners can appear to be a faster and more cost-effective option, which can be an appealing choice for organisations with large collections and limited budgets. However, this method is unsuitable for digitising older and deteriorating newspaper archives, as the way the material is fed through the machine can damage it.

For this reason we always recommend using flatbed planetary scanners as the best method of digitising newspaper collections. This advice is echoed by both JISC and The National Archives. With this method the newspaper can lie flat on the scanning bed, which is large enough to support the whole document, and low-pressure glass screens flatten the page with the minimum of stress on the newspaper.

However, planetary scanners can be expensive (especially when taking into account the additional cost of training or hiring a member of staff specialising in digital scanning), so if your institution doesn’t already possess this equipment it can be more cost-effective to outsource the digitisation project to a professional supplier.

5. Use Fluorescent or LED Lighting

Whilst digitising, using the correct lighting for the media is vital. Incandescent (heat-producing) lighting should not be used as it can damage paper and the dyes used in paper-based media. It can cause the paper to darken or fade and become brittle due to the breakdown of cellulose fibres in the paper.

Instead we recommend using fluorescent or LED lighting, as we do in our studio, to illuminate newspapers during scanning. This lighting method produces less heat and infrared energy, mitigating the risk of damage to the newspapers.

6. Choosing Image Capture Resolution and File Formats

Every digitisation project is different, and the appropriate image capture resolution depends on the end goal. When digitising newspapers for clients, our team of technicians recommends capturing the items at 300ppi to produce uncompressed master TIFF files as preservation images.

Lower resolution surrogate JPEGs can then be produced from these to create thumbnails or for publishing online. You may also wish to consider producing PDF copies of individual newspaper issues for your digital archive, to allow quick and easy browsing and downloading by users if you are publishing the collections online.

It is also worth noting that due to the large physical size of newspapers, generally a minimum of A2 size when opened, the file sizes of the resultant digital images produced during the digitisation will also be relatively large. This is important to bear in mind when planning where and how to store the digital images.
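To give a feel for the storage involved, here is a minimal sketch that estimates the pixel-data size of one uncompressed master image. It assumes a single A2 page (420mm x 594mm) captured at 300ppi in 24-bit RGB colour; those capture settings and the function names are assumptions for illustration, and a real TIFF file will be slightly larger once headers and embedded metadata are included.

```python
MM_PER_INCH = 25.4


def pixels(length_mm: float, ppi: int) -> int:
    """Convert a physical length in millimetres to a pixel count at the given ppi."""
    return round(length_mm / MM_PER_INCH * ppi)


def uncompressed_size_bytes(width_mm: float, height_mm: float, ppi: int = 300,
                            channels: int = 3, bits_per_channel: int = 8) -> int:
    """Approximate size of the raw pixel data for one uncompressed image."""
    width_px = pixels(width_mm, ppi)
    height_px = pixels(height_mm, ppi)
    return width_px * height_px * channels * bits_per_channel // 8


# One A2 page (420mm x 594mm) at 300ppi, 24-bit RGB.
a2_bytes = uncompressed_size_bytes(420, 594)
print(f"{a2_bytes / 1_000_000:.1f} MB per page")  # prints "104.4 MB per page"

# Assumed collection of 10,000 pages, purely for illustration.
print(f"{a2_bytes * 10_000 / 1_000_000_000_000:.2f} TB for 10,000 pages")
```

Under these assumptions each master image is a little over 100MB, so even a modest collection quickly reaches terabytes before any surrogate JPEGs, PDFs or backup copies are counted.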

7. Digitally Capture the Printed Text

Digitally capturing the written content of your newspapers is vital in getting the most out of your digitised archive. Capturing this text can enable virtually instantaneous keyword searching across the entire collection, making the valuable information within far more accessible for researchers and historians.

Optical Character Recognition (OCR) is, for most projects, the most effective way of capturing this text content. OCR works by analysing each digital image in a collection, identifying the printed text and converting it to machine-readable text.

This text is then held as metadata within the digital archive, which makes the content searchable and allows the end user to find specific dates, names, or other keywords.

If you plan to use OCR, however, additional care should be taken when initially scanning the papers, especially with thicker editions. The technician must make sure the documents are fully open and flat when the image is captured. Otherwise the text in the image may be distorted, leading to OCR errors such as words being missed or misread.

Although an additional cost to your scanning project, OCR-ing the newspaper collections is simply vital in making their content usable and accessible.
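The keyword searching described in this section can be sketched with a simple inverted index, which maps each word to the pages it appears on. This is a minimal illustration only: it assumes the OCR text has already been produced by whichever OCR tool you use, and the page identifiers, sample text and function names are all hypothetical. A collections management system would normally handle this for you, with extras such as fuzzy matching to cope with OCR errors.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lower-case the text and split it into simple alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(ocr_pages: dict[str, str]) -> dict[str, set[str]]:
    """Map every word to the set of page identifiers it appears on."""
    index: dict[str, set[str]] = defaultdict(set)
    for page_id, text in ocr_pages.items():
        for word in tokenize(text):
            index[word].add(page_id)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the pages that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical OCR output keyed by page identifier.
ocr_pages = {
    "gazette_1918-11-12_p001": "Armistice signed. Crowds gather in the market square.",
    "gazette_1918-11-12_p002": "Market prices and shipping news.",
}
index = build_index(ocr_pages)
print(search(index, "armistice"))      # pages mentioning "armistice"
print(search(index, "market square"))  # pages containing both words
```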

8. Managing the Digitised Newspaper Archive

As mentioned above, newspaper collections are often very large and therefore so are their digitised counterparts. This makes it vital to organise the digital images effectively in order to enable discovery and keep the information in the digital archive accessible.

This means including valuable descriptive metadata in the filenames of the images, such as the newspaper title and publication date, and having a logically named hierarchical folder structure with descriptive metadata within the folder names (such as issue, publication year or month).
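One possible way to apply this is to generate every path from the same few pieces of metadata, so that names stay consistent across the archive. The sketch below assumes a title/year/month/issue-date folder hierarchy and a `title_date_page` filename pattern; that layout, the example title and the function names are illustrative choices, not a standard.

```python
import re
from datetime import date
from pathlib import PurePosixPath


def slugify(title: str) -> str:
    """Turn a newspaper title into a lower-case, filesystem-friendly name."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def image_path(title: str, issue_date: date, page: int, ext: str = "tif") -> PurePosixPath:
    """Build a title/year/month/issue-date/filename path for one page image."""
    slug = slugify(title)
    issue = f"{issue_date:%Y-%m-%d}"
    filename = f"{slug}_{issue}_p{page:03d}.{ext}"
    return PurePosixPath(slug, f"{issue_date:%Y}", f"{issue_date:%m}", issue, filename)


# Hypothetical title and issue date.
path = image_path("The Example Gazette", date(1916, 7, 1), 3)
print(path)
# the-example-gazette/1916/07/1916-07-01/the-example-gazette_1916-07-01_p003.tif
```

Using ISO-style dates (year, then month, then day) and zero-padded page numbers means files sort chronologically and in page order in an ordinary file browser.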

For large digital archives we also recommend utilising a digital collections management system (such as PastView) to enable quick and easy organisation and access to the archival materials.

A note on Scanning Newspapers vs Microfilm

If your newspaper archive has been microfilmed, it is often less costly to digitise the microfilm version of the newspapers than the original newspapers themselves, though the quality of the output images depends on the quality of the original microfilm.

Are you looking to digitise your newspaper collection?

Our specialist large format digitisation equipment is able to scan all manner of oversize archive material, ranging in size from A1 (84cm x 59cm) up to approximately 1.5 x A0 (150cm x 100cm).



