Metadata is data about data: if a book is considered to be data, then the information describing the book is metadata. This information should be stored in separately identified, keyword-labeled fields so that library software can use it to describe the eBook and retailers can use it to identify the contents.
Most library management programs extract this data from the original eBook file or container so that it can be referenced separately. In some cases they obtain the data from other sources instead of the file itself.
The extracted record is the equivalent of a card in a public library's card catalog.
Metadata should include at least:
- TITLE for the title of the book
- AUTHOR for the author's name
- PUBLISHER for the publisher's name
- COPYRIGHT for copyright information or the publication date
- EISBN or ISBN for the book identifier
It may also include many other items, such as a cover image, a description (subject) of the book, the genre, and the language of the book.
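As a rough illustration of the minimum record described above, the following sketch models the fields as a small Python class and reports which required ones are still empty. The class name `BookMetadata`, the attribute names, and the `missing_fields` helper are hypothetical and chosen only for this example; they do not correspond to any particular eBook format or library program.

```python
from dataclasses import dataclass
from typing import List, Optional

# Required keywords from the list above, as hypothetical attribute names.
REQUIRED_FIELDS = ("title", "author", "publisher", "copyright", "isbn")


@dataclass
class BookMetadata:
    # Minimum fields; an empty string means "not supplied".
    title: str = ""
    author: str = ""
    publisher: str = ""
    copyright: str = ""
    isbn: str = ""
    # Optional extras mentioned in the text.
    cover_image: Optional[str] = None
    description: Optional[str] = None
    genre: Optional[str] = None
    language: Optional[str] = None

    def missing_fields(self) -> List[str]:
        """Return the names of required fields that are empty or blank."""
        return [name for name in REQUIRED_FIELDS
                if not getattr(self, name).strip()]


book = BookMetadata(title="The Title", author="Author's Name")
print(book.missing_fields())
```

A record that reports no missing fields satisfies only the minimum list given here; individual retailers and formats may require more.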
The <head> section of an HTML document can be the source of metadata for some applications. Here is a sample of the data that may appear:
- <meta name="gemstar-legacy" content="publisher 2.0" />
- <title>The Title</title>
- <meta name="Author" content="Authors Name" />
- <meta name="Description" content="Mystery, Suspense, History, Gothic, Literature, Books, Arts" />
- <meta http-equiv="Content-Type" content="text/html; charset=windows-1252" />
- <meta name="GENERATOR" content="rbmake v0.99d using the rbmake library v0.99d" />
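One way an application might collect these values is sketched below, using only the `html.parser` module from the Python standard library. The class name `HeadMetadataParser` and the function `extract_head_metadata` are hypothetical. The sketch also assumes that only `<title>` and `<meta name=... content=...>` entries are of interest, so `http-equiv` entries are skipped.

```python
from html.parser import HTMLParser


class HeadMetadataParser(HTMLParser):
    """Collect <title> text and <meta name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.metadata = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        attrs = dict(attrs)
        name = attrs.get("name")
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and name:
            # Keys are lowercased so "Author" and "author" match.
            self.metadata[name.lower()] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.metadata["title"] = self.metadata.get("title", "") + data


def extract_head_metadata(html: str) -> dict:
    parser = HeadMetadataParser()
    parser.feed(html)
    parser.close()
    return parser.metadata


sample = """<head>
<title>The Title</title>
<meta name="Author" content="Authors Name" />
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252" />
</head>"""
print(extract_head_metadata(sample))
```

Lowercasing the names is a design choice for this sketch, since the sample mixes capitalizations such as "Author" and "GENERATOR".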
Metadata on files
eBook and document files that the operating system recognizes can have metadata attached to them through a feature of the operating system. For example, on Windows you can right-click the file in an Explorer window and select Properties. A General tab and a Summary tab will appear. Select the Summary tab to fill in the data. Its fields include:
- Category (Genre)
- Keywords (used for searches)
- Source (only in advanced view)
- Revision Number (only in advanced view)
On some operating systems (Mac OS X, Linux) this information can be used to correct a title or author's name in the metadata contained in the file itself, but Windows does not store it in the file except for certain file types such as Word documents. The data is displayed when the mouse cursor is placed over the file's icon on systems that support the hover feature.
PDF files also show their metadata in a tab of the Properties dialog, but you may need to open the file itself to change the data. This data is not synchronized with the Summary tab on Windows systems, so the two can differ.
Culture data is an important component of the metadata about an eBook. The most obvious item is the language in which the eBook is written, which in turn implies the character set. Other culture items are usually subdivisions of the language. Culture data includes, but is not limited to, the sorting order of the alphabet, the conventions used in writing dates, and the formatting of numbers.
The culture names follow the RFC 1766 standard (since superseded by RFC 3066 and later by RFC 5646) in the format "<languagecode2>-<country/regioncode2>", where <languagecode2> is a lowercase two-letter code derived from ISO 639-1 and <country/regioncode2> is an uppercase two-letter code derived from ISO 3166. For example, U.S. English is "en-US". In cases where a two-letter language code is not available, the three-letter code derived from ISO 639-2 is used.
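The format described above can be checked mechanically. The following sketch splits a culture name into its language and region parts with a regular expression. The names `Culture` and `parse_culture` are hypothetical, and the sketch assumes that a bare language code with no region (such as "fr") is also acceptable. It checks only the shape of the name, not whether the codes are actually registered in ISO 639 or ISO 3166.

```python
import re
from typing import NamedTuple, Optional

# Lowercase 2- or 3-letter language, optionally "-" plus uppercase 2-letter region.
CULTURE_RE = re.compile(r"([a-z]{2,3})(?:-([A-Z]{2}))?")


class Culture(NamedTuple):
    language: str
    region: Optional[str]


def parse_culture(name: str) -> Culture:
    """Split a culture name such as "en-US" into language and region."""
    match = CULTURE_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a valid culture name: {name!r}")
    return Culture(language=match.group(1), region=match.group(2))


print(parse_culture("en-US"))
```

Because the pattern is case-sensitive, a name such as "EN-us" is rejected rather than silently normalized. A real application might prefer to normalize the case first.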
The AbiWord metadata is reached through the Properties entry of the File menu. It consists of two tabs of data plus permissions. The data includes:
- Title
- Subject -- Short description.
- Author
- Contributor(s) -- Could be illustrators, editors, etc.
- Category -- Genre for books, otherwise the type of document.
- Keywords -- Used for searching and for matching to other similar documents.
- Language(s) -- Helps to identify the character set needed.
- Description -- Free-form description. Often the abstract.
For more information
- Dublin Core contains the standards adopted by many publishers for metadata.
- Microsoft Culture Data reference
- XMP - An Adobe tool to add metadata. This data is in an open format.
- ID3.org - MP3 standard
- JPG metadata - tools and specifications
- Library of Congress Standards - US government reference.
- IPTC - Photo metadata; the core data is exchangeable with Adobe XMP.
- Document Metadata Extraction - Tools to see metadata in various formats.
- Meta Data - An article on creating AbiWord metadata from ODF metadata.