Featured Collection
Visual History Archive
With nearly 52,000 video testimonies, the USC Shoah Foundation Institute’s archive is the largest visual history archive in the world. The Institute interviewed Jewish survivors, homosexual survivors, Jehovah’s Witness survivors, liberators and liberation witnesses, political prisoners, rescuers and aid providers, Roma and Sinti survivors (Gypsy), survivors of eugenics policies, and war crimes trials participants.
Other Digital Resources
About Digital Collections
Digital Collections Data Model
The University Libraries' digital collections program, for non-subscription-based electronic content, emphasizes distributed content storage and centralized metadata aggregation. Care and maintenance of digital collections and content remain, as much as possible, in the hands of the creating unit or individual.
This model is based on the assumption that an individual digital collection site on the Internet will be discovered (and used) less frequently by itself than if the same site is linked and indexed in a well-planned, larger "aggregator" database.
To maximize use of disparate and distributed digital collections across campus, the Digital Collections Unit is building an aggregator that will collect key pieces of descriptive information (metadata) about each object in distributed databases across campus. Such a gateway will be promoted as the digital content search engine for the entire campus, and will greatly enhance the odds of discovery of any object or database on campus. Once the resource is discovered through this metadata aggregator, users will be directed to its home site. The Libraries anticipate increased usage of each site that shares metadata with the aggregator, and will further expand potential audiences for digital collections by sharing metadata collections with other off-campus harvesters and Web search engines.
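The aggregator model described above can be illustrated with a minimal sketch. It is not the Digital Collections Unit's implementation: the class names, the two-field record, and the example.edu URLs are all assumptions made for illustration. The point it shows is that only metadata is collected centrally, and that a search returns the address of each object's home site.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HarvestedRecord:
    """Key descriptive metadata about one object (hypothetical minimal record)."""
    title: str
    subject: str
    home_url: str  # location of the object in its home collection


class Aggregator:
    """Central metadata index; the content itself stays with the owning unit."""

    def __init__(self) -> None:
        self._records: list[HarvestedRecord] = []

    def harvest(self, records: list[HarvestedRecord]) -> None:
        # Only metadata is collected; no digital objects are copied.
        self._records.extend(records)

    def search(self, term: str) -> list[str]:
        # Case-insensitive match on title or subject; returns home-site URLs.
        needle = term.lower()
        return [
            r.home_url
            for r in self._records
            if needle in r.title.lower() or needle in r.subject.lower()
        ]


# Two hypothetical campus collections contribute metadata.
aggregator = Aggregator()
aggregator.harvest([
    HarvestedRecord("Campus aerial view", "Aerial photographs",
                    "https://maps.example.edu/item/1"),
])
aggregator.harvest([
    HarvestedRecord("Dairy barn, 1910", "Cattle",
                    "https://archives.example.edu/item/7"),
])
print(aggregator.search("cattle"))
```

A real aggregator would also need scheduled harvesting, deduplication, and a richer record, none of which is attempted here.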
Specifications for Sharing Metadata With Us
In the first phase of this approach, the Libraries collected metadata about many distributed digital image collections on campus. This initiative was named IMAGES (Image Metadata Aggregation for Enhanced Searching). Recently, IMAGES has been mainstreamed into a larger search database of digital collections, including visual, textual, and other content.
To promote metadata sharing on campus, the University Libraries have created a core set of data elements that is used to map and exchange common pieces of information about each discrete digital object. This metadata scheme is flexible and can be used as the basis for a digital collections database, or just as a convenient common "language" for expressing information about your digital content.
Planning A Digitization Project
Numerous factors will influence the overall success of a digitization project, including staff, available expertise, equipment, and funding. The most important key to success, however, is planning and setting realistic goals, based upon knowledgeable advice. Per-item scanning costs, workflows, and many other details are bound to vary from project to project and from one original format to the next. The following links provide good advice on how to go about planning a digitization project.
Collaborative Digitization Program
Minnesota Digital Library
Moving Theory Into Practice
Scanning Photographic Collections
Visual Resource Image Quality
RLG Guidelines for Imaging
Digitization Standards
The following projects and sites provide a wealth of experience and information to anyone considering creating and managing large bodies of digital information.
Western States Imaging Best Practices
Art Libraries Society of North America
Visual Resources Association
Museum Computer Network
OCLC
RLG
CIMI (Consortium for Computer Interchange of Museum Information)
Society of American Archivists
Getty Research Institute
Digital Library Federation
Library of Congress
National Information Standards Organization
Technical Information on Digital Formats
The following sites offer specific recommended technical formats for different types of original information:
Digital Formats for Reproductions
Preserving Digital Information
Digitization Technical Standards
Library of Congress - Digital Formats for Content Reproductions
Creating and Distributing High Resolution Cartographic Images
Digital Imaging for Photographic Collections: Foundations for Technical Standards
Joint RLG and NPO Conference on Guidelines for Digital Imaging
Metadata
Metadata is not a new concept; it has existed in the computer science field for decades, and refers to information about electronic computer files. To update the concept a bit, the term "metadata" is now used to refer to information about any digital object that exists on the Internet. The need for certain types of data (such as creation date, file size, etc.) might seem obvious if one is managing a large group of digital objects merely as files. However, the Internet and World Wide Web offer great promise in terms of precision management, discovery and retrieval of digital objects such as images, e-texts, multimedia presentations, and other electronic files. Metadata may manifest itself either as an embedded, integral part of the digital object, to be retrieved and manipulated for various purposes, or it may exist externally from the digital object. Metadata often is broken into three broad categories:
- Descriptive metadata: Information that conveys some sense of intellectual content and context.
- Structural metadata: Information that describes the attributes of an object, such as size, electronic format, and digital capture process.
- Administrative metadata: Information regarding rights management, creation date of the digital resource, hardware configuration, etc.
A "descriptive" metadata record consists of a set of elements, such as title, creator, format, date of creation, and subject coverage, that are necessary for describing a particular resource. Obviously, some institutions will use more elements to suit their specific needs or the needs of the resource being described.
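As an illustration, a descriptive record of this kind might be modeled as follows. This is a sketch only: the element names come from the paragraph above, while the class name, the `extra` field for institution-specific elements, and the sample values are assumptions and do not represent the Libraries' core element set.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class DescriptiveRecord:
    """A descriptive metadata record built from the elements named above."""
    title: str
    creator: str
    format: str
    date_created: str
    subjects: list[str] = field(default_factory=list)
    # Hypothetical slot for additional elements an institution may need.
    extra: dict[str, str] = field(default_factory=dict)


# Sample values are invented for illustration.
record = DescriptiveRecord(
    title="Dairy barn, 1910",
    creator="Unknown photographer",
    format="image/tiff",
    date_created="1910",
    subjects=["Cattle"],
    extra={"donor": "Example Family"},
)
print(asdict(record))
```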
The following sites provide useful explanations of metadata concepts and implementation:
Dublin Core Metadata Initiative
Metadata for Digital Preservation
Introduction to Metadata: Pathways to Digital Information
Interoperability
Interoperability is the ability of two or more systems to exchange information and to use the information that has been exchanged. Interoperability is highly dependent upon the ability to both a) conceptually map identical or similar elements of data structures, and b) consistently and reliably extract relevant information from within data structures. Numerous strategies can promote interoperability between multiple systems, but the simplest strategy is for the owners/operators of each system to employ similar data structures and to utilize similar or identical semantics and vocabularies as information is entered into these systems. For example, consider the following:
- System One contains images and information about cattle. Its database structure includes a field named TERMS, and every record related to cows or cattle has a "Cattle" entry in this field.
- System Two also contains images and information about cattle. Its database structure includes a field named SUBJECTS, and every record related to cows or cattle has a "Cows" entry in this field.
Achieving interoperability between these two systems will require coordination on two levels:
- Establishing a common mapping between the fields each system uses to hold subject terms for each record. Both system owners could agree to simply change this field to a common name used by both, or they can alternatively decide to keep their respective data structures, and build a "map" that equates these and other similar data fields. When exchanging information, each system owner would then know what type of information would be found in this field.
- Providing some means for equating the related subjects "Cows" and "Cattle" so that a cross-system search will consistently and reliably retrieve all material on this topic. Although both subject terms are valid terms from a standardized vocabulary (Library of Congress Subject Headings), owners of both System One and System Two cannot guarantee users full retrieval of relevant content until they reach some agreement on how to address differences in descriptive metadata.
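The two levels of coordination above can be sketched in a few lines. The field names TERMS and SUBJECTS and the terms "Cattle" and "Cows" come from the example. The common field name `subject`, the choice of "cattle" as the preferred term, and the record identifiers are assumptions made for illustration.

```python
# Level 1: a map from each system's own field name to a shared field name.
FIELD_MAP = {
    "system_one": {"TERMS": "subject"},
    "system_two": {"SUBJECTS": "subject"},
}

# Level 2: a table equating related subject terms to one preferred form.
TERM_EQUIVALENTS = {"cows": "cattle", "cattle": "cattle"}


def normalize(system: str, record: dict[str, str]) -> dict[str, str]:
    """Rename mapped fields and reduce subject terms to the preferred form."""
    out = {}
    for field_name, value in record.items():
        common = FIELD_MAP[system].get(field_name, field_name)
        if common == "subject":
            value = TERM_EQUIVALENTS.get(value.lower(), value.lower())
        out[common] = value
    return out


def cross_search(term: str, records_by_system: dict) -> list[dict]:
    """Return the original records from every system that match the term."""
    wanted = TERM_EQUIVALENTS.get(term.lower(), term.lower())
    hits = []
    for system, records in records_by_system.items():
        for rec in records:
            if normalize(system, rec).get("subject") == wanted:
                hits.append(rec)
    return hits


records = {
    "system_one": [{"id": "A1", "TERMS": "Cattle"}],
    "system_two": [{"id": "B1", "SUBJECTS": "Cows"},
                   {"id": "B2", "SUBJECTS": "Horses"}],
}
hits = cross_search("Cows", records)
print([h["id"] for h in hits])
```

Neither system changes its own data structure here. Without the second table, a search for "Cows" would miss the System One record even though the fields had been mapped.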
Obviously, the second aspect of interoperability, semantic and taxonomic compatibility, is the more difficult task. Certainly, consistently utilizing standardized vocabularies such as the Library of Congress Subject Headings, the Art & Architecture Thesaurus, or those employed by other professional communities, is a good first step. Documentation of utilized vocabularies is an essential aspect of managing large information systems. As is shown in the example above, however, the potential for variation even within the same controlled vocabulary is great. Therefore, achieving fuller interoperability between Systems One and Two would require agreement by both partners on some common application guidelines for the Library of Congress Subject Headings.
In the real world, full semantic and taxonomic interoperability across diverse systems with diverse content is impossible. Factors such as differing descriptive needs, the granularity of description, and even intended use of information, dictate that every system cannot be identical or 100 percent mappable to another system. However, by documenting employed data structure and content standards, owners of any system can still promote eventual interoperability at some level with other systems.
Controlled Vocabularies
Content data for some elements, such as the subject element, may be selected from a "controlled vocabulary," a limited set of consistently used and carefully defined terms. Using terminology from a controlled vocabulary ensures consistency and can improve the quality of search results, and may also reduce the likelihood of spelling errors when recording metadata. The description of each element indicates whether content should be selected from a controlled vocabulary, if possible.
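A simple check of subject terms against a controlled vocabulary might look like the following sketch. The three-term vocabulary and the function name are hypothetical. A real vocabulary such as the Library of Congress Subject Headings is far larger and would normally be consulted through its own tools.

```python
# Hypothetical three-term vocabulary, for illustration only.
SUBJECT_VOCABULARY = {"Cattle", "Horses", "Barns"}


def check_subjects(subjects, vocabulary=SUBJECT_VOCABULARY):
    """Split entered terms into accepted (in preferred form) and rejected."""
    # Index the vocabulary by a case-folded key so that "cattle" finds "Cattle".
    by_folded = {term.casefold(): term for term in vocabulary}
    accepted, rejected = [], []
    for raw in subjects:
        match = by_folded.get(raw.strip().casefold())
        if match is not None:
            accepted.append(match)  # store the vocabulary's own spelling
        else:
            rejected.append(raw)    # likely a misspelling or uncontrolled term
    return accepted, rejected


accepted, rejected = check_subjects(["cattle", "Catle", " Barns "])
print(accepted, rejected)
```

In this sketch the misspelled "Catle" is rejected rather than stored, which is how a controlled vocabulary reduces spelling errors at the point of data entry.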
Below are links to several "controlled vocabularies" accessible online:
- Library of Congress Subject Headings
- Medical Subject Headings
- Art and Architecture Thesaurus
- Union List of Artist Names
- Thesaurus of Graphic Names
- Thesaurus of Graphic Materials I
- Thesaurus of Graphic Materials II
- ICONCLASS
- IEEE Approved Indexing List
- Astronomy Thesaurus
- NASA
- Web Thesaurus Compendium
- The Revised Nomenclature for Museum Cataloging, A Revised and Expanded Version of Robert G. Chenhall's System for Classifying Man-Made Objects by James R. Blackaby, Patricia Greeno, and the Nomenclature Committee. Published by American Association for State and Local History, 1988.
Mappings Between Metadata Standards
A "crosswalk" has been defined as a "set of transformations applied to the content of elements in a source metadata standard that results in the storage of appropriately modified content in the analogous elements of a target metadata standard." (NISO White Paper, October 1998) A fully specified crosswalk contains a semantic mapping as well as a conversion specification. See the NISO White Paper, "Issues in Crosswalking Content Metadata Standards," for further information on a definition and specification of crosswalks.
Crosswalks provide the ability to create and maintain a set of metadata customized for local needs, and to map that metadata into any number of related content metadata standards. In order to build successful crosswalks and mapping schemes, it is important to maintain consistent data formats and data quality across metadata standards.
- Dublin Core to USMARC/GILS
- Dublin Core to EAD/GILS/USMARC
- Dublin Core to UNIMARC
- TEI header to USMARC
- GILS to USMARC
- FGDC to USMARC
- USMARC to FGDC
- MARC to Dublin Core
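The two parts of a fully specified crosswalk, a semantic mapping and a conversion specification, can be sketched as a single lookup table. The source schema here (ItemName, Photographer, DateTaken, Keywords) is hypothetical, as are the conversion rules. The target names are Dublin Core element names, but this is an illustration and not one of the published crosswalks listed above.

```python
from datetime import datetime


def to_iso_date(value: str) -> str:
    """Conversion rule: MM/DD/YYYY in the source becomes YYYY-MM-DD."""
    return datetime.strptime(value, "%m/%d/%Y").date().isoformat()


def split_subjects(value: str) -> list[str]:
    """Conversion rule: a semicolon-delimited string becomes a list of terms."""
    return [s.strip() for s in value.split(";") if s.strip()]


def identity(value):
    """Conversion rule: copy the content unchanged."""
    return value


# Each source element maps to (target element, conversion function).
CROSSWALK = {
    "ItemName": ("title", identity),
    "Photographer": ("creator", identity),
    "DateTaken": ("date", to_iso_date),
    "Keywords": ("subject", split_subjects),
}


def crosswalk(source: dict) -> dict:
    """Apply the semantic mapping and conversions to one source record."""
    target = {}
    for source_field, value in source.items():
        if source_field not in CROSSWALK:
            continue  # unmapped source elements are dropped in this sketch
        target_field, convert = CROSSWALK[source_field]
        target[target_field] = convert(value)
    return target


local_record = {
    "ItemName": "Dairy barn",
    "Photographer": "Unknown",
    "DateTaken": "07/04/1910",
    "Keywords": "Cattle; Barns",
    "ShelfCode": "X9",  # no analogous target element
}
converted = crosswalk(local_record)
print(converted)
```

The dropped ShelfCode element shows why consistent data formats matter: content with no analogous element in the target standard is lost unless the crosswalk says where it should go.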
Related Links
The following projects and sites provide a wealth of experience and information to anyone considering creating and managing large bodies of digital information.
Art Libraries Society of North America
Visual Resources Association
Museum Computer Network
OCLC
RLG
Coalition for Networked Information
Society of American Archivists
Getty Research Institute
Digital Library Federation
Library of Congress
National Information Standards Organization


