Creation Basics

PPI/DPI/SPI
Pixel is short for picture element. Pixels make up an image, much as grains make up a photograph or dots make up a half-tone. Each pixel can represent a number of different shades or colors, depending on how much storage space is allocated for it.
Pixels per inch (PPI) refers to the number of pixels captured in a given inch and is used when discussing scanning resolution and on-screen display. When referring to digital capture, pixels per inch (PPI) is the preferred term, as it more accurately describes the digital image. This document will refer to pixels per inch (PPI) when discussing capture resolution. For more information see the Digitization Glossary.

Digitization guidelines and hardware manufacturers’ documentation frequently use dots per inch (DPI) when discussing optical resolution for images and hardware. Strictly speaking, DPI describes output devices: the number of dots of ink per inch that a printer places on paper, or the number of dots per inch that a monitor displays.

SPI stands for samples per inch, the number of samples or units measured per inch. SPI is considered interchangeable with the preferred term, PPI.

Modes of Capture
Most imaging equipment offers three modes for capturing a digital image: bitonal, grayscale, and color.

These three modes of capture also require some subjective decisions. For example, a black and white typed document may have annotations in red ink. Although bitonal scanning is often used for typed documents, scanning in color may be preferable in this case, depending on how the image will be used. Manuscripts, older printed matter and sheet music may be better served by capturing as continuous tone in grayscale or color to bring out the shade and condition of the paper and the marks inscribed on it. Projects interested in capturing the current condition of source materials should consider capturing in color.

Bit Depth
Bit depth measures the number of colors (or levels of gray in grayscale images) available to represent the color/gray value in the original work. A bit is the basic digital building block with a value of either 1 or 0. Every pixel sampled is assigned a value that corresponds to the color/shade it represents. Higher bit depths capture more information, increase the number of available colors/shades and, correspondingly, increase file size.
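The relationship between bit depth and the number of available values can be sketched in a few lines of Python. This is an illustrative calculation only; the function name `levels_for_bit_depth` and the sample modes listed are assumptions chosen for the example, not part of these guidelines.

```python
def levels_for_bit_depth(bits_per_channel: int, channels: int = 1) -> int:
    """Return how many distinct values one pixel can take.

    Each added bit doubles the number of values, so the total is
    2 raised to the total number of bits stored per pixel.
    """
    return 2 ** (bits_per_channel * channels)


# Hypothetical sample modes, chosen only to illustrate the arithmetic.
for label, bits, channels in [
    ("1-bit bitonal", 1, 1),
    ("8-bit grayscale", 8, 1),
    ("16-bit grayscale", 16, 1),
    ("24-bit color (8 bits x 3 channels)", 8, 3),
]:
    print(f"{label}: {levels_for_bit_depth(bits, channels):,} possible values")
```

Because every pixel stores that many bits, doubling the bit depth roughly doubles the uncompressed file size for the same pixel dimensions.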

Color Space
Setting the color space is critical to digital capture. It can be set within your photo editing software prior to digital capture, unless you are using a digital camera and capturing a RAW file. A device color space simply describes the range of colors, or gamut, that a camera can see, a printer can print or a monitor can display. Editing color spaces, on the other hand, such as Adobe RGB or sRGB, are device-independent and determine the color range you can work in. Their design allows you to edit images in a controlled, consistent manner. Choosing a wide gamut space such as Adobe RGB allows more color information about the original item to be captured and allows you to convert to a narrower space later. A narrow gamut space such as sRGB captures less information and is intended for images used on the web.

Resolution
Resolution is a major determinant of image quality. It is described either by pixel dimensions (height and width) for on-screen use or by physical size and PPI. A higher PPI samples the original more frequently and yields a more accurate representation of it. Because higher resolutions capture more information, file sizes also increase. There is no one “perfect” resolution for scanning all collection materials. Resolution should be adjusted based on the size, quality, condition and uses of the digital object. The combination of PPI and the size of the original object determines the resolution needed to accurately capture as much information about the original object as is available. See Guidelines by Source Type for specific spatial resolution targets.

There is a point at which adding more pixels per inch no longer adds content, because the original source object has a finite amount of information available based on the way that it was produced.
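The trade-off between PPI, physical size and file size described in this section can be estimated with simple arithmetic. The following Python sketch is illustrative only: the function names and the 8 x 10 inch example are assumptions, and the byte count covers raw pixel data only, ignoring file headers, metadata and compression.

```python
def pixel_dimensions(width_in: float, height_in: float, ppi: int):
    """Pixel width and height produced by scanning an original at a given PPI."""
    return round(width_in * ppi), round(height_in * ppi)


def uncompressed_bytes(width_px: int, height_px: int,
                       bits_per_channel: int = 8, channels: int = 3) -> int:
    """Approximate size of the raw pixel data (no headers, no compression)."""
    return width_px * height_px * channels * bits_per_channel // 8


# Hypothetical example: an 8 x 10 inch original captured in 24-bit color.
for ppi in (300, 600):
    w, h = pixel_dimensions(8, 10, ppi)
    size_mb = uncompressed_bytes(w, h) / 1_000_000
    print(f"{ppi} PPI -> {w} x {h} pixels, about {size_mb:.1f} MB uncompressed")
```

Doubling the PPI doubles both pixel dimensions, so the pixel count and the uncompressed size grow by a factor of four.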

Tonal Dynamic Range
One of the most significant factors affecting image quality is the tonal dynamic range: the span of tones an image occupies between pure black, represented numerically as 0, and pure white, represented as 255 in an 8-bit image. Tonal dynamic range can be displayed as a histogram in professional-level TWAIN drivers and image editing software such as Photoshop. The histogram graphically displays the number of pixels in the image at each value from black to white. Reviewing histograms at the time of capture helps ensure that all of the image’s information is being recorded.
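What a histogram reports can be illustrated with a small Python sketch. It assumes an 8-bit grayscale image supplied as a plain, non-empty list of integers from 0 to 255; the function names and the sample pixel values are hypothetical, and real scanning software reads the values from the image itself.

```python
from collections import Counter


def histogram(pixels, levels=256):
    """Count how many pixels fall at each tonal value from 0 to levels - 1."""
    counts = Counter(pixels)
    return [counts.get(value, 0) for value in range(levels)]


def tonal_range(hist):
    """Return the darkest and lightest values that actually occur."""
    used = [value for value, count in enumerate(hist) if count > 0]
    return used[0], used[-1]


def clipped_fraction(hist):
    """Share of pixels sitting at pure black or pure white.

    A large share suggests shadow or highlight detail was lost at capture.
    """
    return (hist[0] + hist[-1]) / sum(hist)


# Hypothetical pixel values standing in for a scanned image.
sample = [12, 40, 40, 128, 200, 231]
hist = histogram(sample)
print("tonal range:", tonal_range(hist))
print("clipped fraction:", clipped_fraction(hist))
```

A capture whose values pile up at 0 or 255 is likely clipping tones, which is the condition a histogram review at capture time is meant to catch.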

Compression

In lossy compression, a certain amount of information is discarded during the compression process. Although the discarded information may be invisible to the human eye, a loss of quality occurs. Lossy compression formats also introduce generational loss: each time a lossy image is edited and re-saved, the quality of the image decreases. Generational loss is one of the reasons master and service master images are not stored using lossy compression.

Converting images from a bit depth of 16 bits to a bit depth of 8 bits is also considered lossy, because color information is discarded.
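Why reducing bit depth loses information can be shown with a short Python sketch. It assumes one simple conversion method, dropping the low 8 bits of each 16-bit sample; imaging software may round or dither instead. The function names and sample values are hypothetical.

```python
def to_8bit(value16: int) -> int:
    """Reduce a 16-bit sample (0-65535) to 8 bits (0-255) by dropping the low byte."""
    return value16 >> 8


def back_to_16bit(value8: int) -> int:
    """Scale an 8-bit sample back up to the 16-bit range (255 maps to 65535)."""
    return value8 * 257


# Hypothetical 16-bit samples; 1000 and 1001 are distinct shades in the original.
samples16 = [0, 1000, 1001, 1255, 40000, 65535]
samples8 = [to_8bit(v) for v in samples16]
restored = [back_to_16bit(v) for v in samples8]

print("8-bit values:  ", samples8)
print("restored 16-bit:", restored)
print("identical to original:", restored == samples16)
```

Distinct 16-bit shades collapse into the same 8-bit value, and scaling back up cannot recover the difference.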


The Joint Photographic Experts Group (JPEG) format is most frequently used for access images requiring lossy compression.  The JPEG compression algorithm was designed for continuous tone images.

Lossless compression results in a file that, once decompressed, is identical to the original image, with no loss of information. The Tagged Image File Format (TIFF) and Portable Network Graphics (PNG) formats support lossless compression.
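The defining property of lossless compression, an exact round trip, can be checked with Python's standard `zlib` module, which implements DEFLATE, the algorithm PNG uses (TIFF's LZW option is a different lossless algorithm). This is a sketch under stated assumptions: the 64 x 64 gradient stands in for real 8-bit grayscale pixel data.

```python
import zlib

# Hypothetical 64 x 64 grayscale gradient standing in for real pixel data.
width, height = 64, 64
pixels = bytes((x + y) % 256 for y in range(height) for x in range(width))

compressed = zlib.compress(pixels, 9)   # 9 = strongest compression level
restored = zlib.decompress(compressed)

print("original bytes:  ", len(pixels))
print("compressed bytes:", len(compressed))
print("bit-for-bit identical after decompression:", restored == pixels)
```

However many times the data is compressed and decompressed this way, the restored bytes match the original exactly, which is why lossless formats do not suffer generational loss.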

File Formats
There are proprietary and non-proprietary formats for image files. The recommendation for Master file image capture is to use a non-proprietary format. See Guidelines for Creating Master Digital Images.

A RAW file is an image file that contains unprocessed data. Digital Single Lens Reflex (DSLR) cameras and some high-end scanners allow users to capture images in a RAW or native file format that is unique to each manufacturer. The proprietary nature of these files is a concern for the long-term preservation of and access to these digital files. RAW files have advantages for photographers, including smaller file size and fine processing controls. After processing or editing and before use, RAW files must be converted to an open standard format such as JPEG or TIFF. Processing RAW files adds a step to the imaging workflow and may require sophisticated photographic skills or expertise.

Tagged Image File Format (TIFF) is the format of choice for archival and master images. It is a flexible, highly portable, widely accepted, open standard image format and considered the professional image standard. TIFF files may or may not use lossless compression such as the LZW algorithm. Due to the large file size of TIFF images, they are not suitable for web delivery.

JPEG 2000 is a wavelet-based standard for the compression of still digital images. It was developed by the ISO JPEG committee to improve on the performance of JPEG while adding significant new features and capabilities to enable new imaging applications. Beyond image access and distribution, JPEG 2000 is being used increasingly as a repository and archival image format. Many repositories are storing “visually lossless” JPEG 2000 files: the compression is lossy and irreversible, but the artifacts are not noticeable and do not interfere with the performance of applications. Compared to uncompressed TIFF, visually lossless JPEG 2000 compression can reduce the amount of storage by an order of magnitude or more. Visually lossless JPEG 2000 is still a lossy compression technique, but JPEG 2000 may have potential for becoming the file format of choice for archival master images in the near future.

In October 2007, the Library of Congress announced a collaboration with Xerox Corporation to study the use of the JPEG 2000 format in large repositories of digital cultural heritage materials. This study will build on the work that led previously to the JPEG 2000 profiles for newspapers and extend it to cover prints, photographs and maps. It will pay attention to the preservation, access and performance issues associated with large image repositories and how the JPEG 2000 standard can address those issues. The work, due to be completed in 2008, is expected to lead to specifications and best practices for the use of JPEG 2000. For details see the Digital Resources document, Appendix A.

Joint Photographic Experts Group (JPEG) is a compression algorithm for condensing the size of image files. Because JPEG image files require less storage, they download into a web page more quickly, which makes online access to full-screen images practical.

In April 2008, the United Kingdom’s Digital Preservation Coalition (DPC) named Portable Document Format (PDF) as one of the best file formats to preserve electronic documents and ensure their survival for the future. This decision allows information officers to follow a standardized approach for preserving electronic documents. The DPC report suggests adopting PDF/Archival (PDF/A) as the standard for archiving electronic documents, to help preservation and retrieval in the future. It concludes that this approach works only when combined with a comprehensive records management program and formally established records procedures. For details see the Digital Resources document, Appendix A.

Proprietary Formats
Proprietary formats are controlled or owned by a particular entity that licenses the format for use by others. These formats often require special plug-ins or software for viewing. Proprietary formats are not recommended for master images because licensing requirements may prevent the long-term access and preservation of images. Examples of proprietary file formats include Photoshop (PSD) and Encapsulated PostScript (EPS).

Buckley, Robert, Ph.D. “Technology Watch Report JPEG 2000 - a Practical Digital Preservation Standard?” DPC Technology Watch Series Report 08-01 (February 2008), http://www.dpconline.org/docs/reports/dpctw08-01.pdf

Fanning, Betsy A. “Technology Watch Report Preserving the Data Explosion: Using PDF.” DPC Technology Watch Series Report 08-02 (April 2008), http://www.dpconline.org/docs/reports/dpctw08-02.pdf
