Best Practices for Image Scalability and Long-term Storage

Any discussion of sustainable digital projects must include an overview of data center infrastructure basics. While smaller institutions can carry out digital projects such as scanning and digital photography on their own, they benefit from partnering with a larger organization that can take their "work product" (digital objects and descriptive metadata), place it in a data center set up to manage data backups, and follow a plan to preserve the digital objects and metadata. For additional and more detailed information about data center basics, see Appendix E.

Network Design for Data Protection
Solid network design is critical both for providing access to and for protecting the digital objects and metadata that go into quality digital projects. An exhaustive treatise on network design is outside the scope of this document, but a general overview gives the reader the basics needed to protect their data. Most complex, well-designed networks are built by an organization's network engineers.

A basic TCP/IP network model consists of five layers: the application layer, the transport layer, the network layer, the data link layer, and the physical layer. The layers near the top are closer to the user application. Those near the bottom of the model are logically closer to the physical storage and transmission of the data. For additional information about network architecture, see Appendix F.
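The layering can be pictured with a small Python sketch. It is a toy illustration only: the bracketed labels stand in for real protocol headers, and the function name `encapsulate` is hypothetical rather than part of any networking library.

```python
# The five layers, listed from the top (closest to the user application)
# to the bottom (closest to physical transmission).
LAYERS = ["application", "transport", "network", "data link", "physical"]

def encapsulate(payload):
    """Wrap a payload in one placeholder label per layer, top to bottom.

    The physical layer is skipped because, in this toy model, it only
    transmits the finished unit and adds no label of its own.
    """
    unit = payload
    for layer in LAYERS[:-1]:
        unit = f"[{layer}]{unit}"
    return unit

# The outermost label belongs to the lowest layer that wrapped the data.
wrapped = encapsulate("image.tif")
print(wrapped)
```

Reading the printed result from left to right mirrors what a receiving machine does: it removes the lowest layer's wrapper first and works upward until the application sees the original data.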

RAID Arrays
Storing scanned images on "live" servers is now a viable alternative to archiving high-resolution TIFF images on CD-ROMs. Proper configuration of a server or workstation with RAID arrays can help sustain data without loss.

RAID is an acronym for Redundant Array of Inexpensive (or Independent) Disks. A RAID array consists of a number of drives that collectively act as a single storage system and, at most RAID levels, can tolerate the failure of a drive without losing data. These drives operate independently of each other. The number of disk drives in the server, or even in a workstation PC set up as a storage device, helps determine how the RAID array on the equipment is configured.
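One common way an array survives a drive failure is parity, the approach used by RAID 5 among others. The Python sketch below is a simplified illustration, not a real RAID implementation: the three "drives" are short byte strings invented for the example, and `xor_blocks` is a hypothetical helper.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data drives, each holding one 4-byte block.
data_drives = [b"TIFF", b"scan", b"0001"]

# The parity block is the XOR of all data blocks and lives on another drive.
parity = xor_blocks(data_drives)

# Simulate losing the second drive, then rebuild its block by XOR-ing
# the surviving data blocks with the parity block.
survivors = [data_drives[0], data_drives[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data_drives[1])
```

The rebuild works because XOR-ing a value with itself cancels it out, so everything except the missing block disappears from the result. Parity of this kind covers only one failed drive at a time, which is one reason regular backups remain necessary.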

A research group at UC Berkeley coined the term RAID, initially defining five RAID levels. Each level is a different way to spread data across multiple drives, and each represents a compromise between cost and speed. Understanding these levels is important, because each level is optimized for a different use. Even though a RAID array offers another layer of data protection, one still needs to do regular backups and have an overall plan to transition and preserve the digital objects and metadata over time. Setting up the proper RAID array on a large-scale storage device, or on a workstation PC set up as a storage device, is a basic way to ensure image and metadata sustainability.

Note: There are about 10 types of RAID configurations. New RAID configurations are constantly being presented to the standards association, the RAID Advisory Board. RAID levels 0 through 5 are the most commonly used configurations. The different RAID levels have different strong points, with some configurations better suited for speed, others for capacity, and still others for fault tolerance. For examples and descriptions of RAID 0 through 5, see Appendix G.
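The trade-off between capacity and fault tolerance can be made concrete with a little arithmetic. The Python sketch below covers only RAID 0, 1, and 5, assumes every drive is the same size, and uses the commonly cited simplified formulas. The function name `usable_capacity` and the drive sizes are made up for the example.

```python
def usable_capacity(level, drive_count, drive_size_gb):
    """Return usable space in GB for a few RAID levels (simplified formulas).

    Assumes all drives are the same size.
    """
    if level == 0:
        # Striping only: all space is usable, but there is no redundancy.
        if drive_count < 2:
            raise ValueError("RAID 0 needs at least 2 drives")
        return drive_count * drive_size_gb
    if level == 1:
        # Mirroring: every drive holds the same data, so one drive's worth is usable.
        if drive_count < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        return drive_size_gb
    if level == 5:
        # Striping with parity: one drive's worth of space is given up to parity.
        if drive_count < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drive_count - 1) * drive_size_gb
    raise ValueError("level not covered by this sketch")

# Compare three layouts built from hypothetical 500 GB drives.
for level, count in [(0, 4), (1, 2), (5, 4)]:
    print(f"RAID {level} with {count} drives: {usable_capacity(level, count, 500)} GB usable")
```

With four drives, RAID 0 offers the most space but loses everything if any drive fails, while RAID 5 gives up one drive's worth of space in exchange for surviving a single drive failure.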

Online Storage
Storage Equipment ― Large and Small Scale
Recently, a number of new developments in hardware and software applications have impacted the way images are stored, made accessible and preserved for the long term.  The amount of affordable "live storage" one can purchase has increased while the price has come down.

Network security layer options have proliferated, as have the options to secure both the metadata and digital objects. Improvements to data center workflows, in the form of automated software applications able to manage routine tasks such as backups or the movement of data and digital objects across platforms on layers of servers running different operating systems, have made it easier and less costly for data centers to conduct business. Routine machine setup or maintenance and data migrations that used to take days or hours can now be run in very short time periods on software/hardware combinations that enable cross-operating-system management and server virtualization. While running large-scale data centers still requires a highly trained technical staff, smaller organizations can take advantage of some of these image and metadata sustainability tools and practices to ensure that their hard work remains viable. For more detailed information about online storage systems, see Appendix H.
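Even a small organization can automate one of these routine checks: confirming that a backup or migrated copy of an image is identical to the original. The Python sketch below is one possible approach, assuming SHA-256 checksums are acceptable for the purpose. The function names and the file paths in the usage note are hypothetical.

```python
import hashlib

def sha256_of(path, chunk_size=1024 * 1024):
    """Compute a file's SHA-256 checksum, reading in chunks so that
    large image files do not have to fit in memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def copies_match(original, backup):
    """Return True when both files have the same checksum."""
    return sha256_of(original) == sha256_of(backup)
```

A call such as `copies_match("masters/page001.tif", "backup/page001.tif")` could be run after each backup or migration. Recording the checksums alongside the descriptive metadata also makes it possible to detect silent corruption years later.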
