Subscription Data XML Catalog

If you would like to index the individual items (i.e. transcribed documents) in our collection, you may download the latest XML Catalog, which is updated whenever new resources are added.

If you prefer to spider the subscription data area, a standard sitemap is available to ensure you are capturing the correct pages, and to avoid unnecessary spidering of non-content pages.

XML Catalog Field Explanations

Last updated on August 9, 2011

The XML Catalog (the same format for both the Subscription Data and Free Genealogy Data areas) begins with a "databaselist" tag, which has an "updated" attribute indicating when the file was last prepared. Between the opening and closing databaselist tags are multiple "database" tags.

Aside from title and link, all other tags are optional. You should never encounter a blank value for a tag (e.g. <state></state>).

Here is a sample database listing:

<database dbid="3923" released="20080309" sequence="2">
  <title>A List of Revolutionary Soldiers from Dedham, Mass.</title>
  <link></link>
  <affiliate></affiliate>
  <published>1917</published>
  <language>eng</language>
  <country>USA</country>
  <state>MA</state>
  <desc>A List of Revolutionary Soldiers who ...</desc>
  <cover></cover>
  <surnames>Ackley, ..., Young</surnames>
  <subjects>Dedham (Mass.)--Genealogy;Revolutionary War, American, 1775-1783</subjects>
</database>
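
As a minimal sketch of reading this structure, the following Python walks the database entries with the standard library's xml.etree.ElementTree. The miniature catalog, the format of its "updated" value, the example.com link, and the function name parse_catalog are all assumptions made for illustration; they are not taken from the real file. Because every tag other than title and link is optional, the sketch records only the tags that are actually present.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature catalog. The "updated" value and the example.com
# link are placeholders invented for this illustration.
SAMPLE = """<databaselist updated="20110809">
  <database dbid="3923" released="20080309" sequence="2">
    <title>A List of Revolutionary Soldiers from Dedham, Mass.</title>
    <link>http://example.com/placeholder.html</link>
    <published>1917</published>
    <state>MA</state>
  </database>
</databaselist>"""


def parse_catalog(xml_text):
    """Return (updated, entries), where entries is a list of dicts."""
    root = ET.fromstring(xml_text)
    entries = []
    for db in root.findall("database"):
        # Start with the attributes: dbid, released, sequence.
        entry = dict(db.attrib)
        for child in db:
            # Optional tags may be absent; skip any that carry no text.
            if child.text and child.text.strip():
                entry[child.tag] = child.text.strip()
        entries.append(entry)
    return root.get("updated"), entries


updated, entries = parse_catalog(SAMPLE)
first = entries[0]
# dict.get returns None for optional tags that were not supplied.
print(updated, first["title"], first.get("county"))
```

Looking up optional fields with dict.get avoids a KeyError for entries that lack, say, a county or an OCLC number.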

dbid: This is an internal record number for the resource.

released: This is the date (YYYYMMDD) that the resource was released. We rarely re-release items, but it does happen occasionally.

sequence: This is a sequence number specific to the XML file, and is used for debugging purposes only.

title: The title of the resource.

link: A URL to a static HTML page in the subdomain. PLEASE NOTE: The URLs are based on the title, so a URL may change when a typographical error in the title is corrected or when a title needs to be disambiguated. Whenever this occurs, the original URL is preserved on the web site with a META refresh redirect.

affiliate: A dynamic tracking URL for members of our affiliate program. To use this link, simply append the "A=" parameter along with your affiliate ID.

published: The year (YYYY) that the original document (used for this transcription) was published. This field is only available when we are able to determine the original publication year.

language: The language of the document, given as a MARC Code for Languages value (see official list). [Implemented 09/Aug/2011]

country: ISO 3166-1 alpha-3 country code. This field is only available when we are able to determine the country of the original document. For U.S. documents, this field is only available when a state abbreviation is determined.

state: U.S. state abbreviation (reference: United States Postal Service). This field is only available when we are able to determine the state of the original document.

county: County name within U.S. states only. This field is only available when we are able to determine the county of the original document.

desc: A description of the original document, often the complete title, and any supplemental information.

cover: A URL to a thumbnail image of the cover (typically 210 pixels in height). Older images may only be 120 pixels in height. Some transcriptions may not have a thumbnail image available.

surnames: A comma-delimited list of the unique surnames in the transcription, presented in alphabetical order.

subjects: A semicolon-delimited list of subjects obtained via the Library of Congress Authorities system. Alternately, some subjects may be obtained via the website. [Implemented 09/Aug/2011]

OCLC: The OCLC Control Number of the original document (when we are able to determine an exact match).
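
The field formats described above can be converted into native values with a few small helpers. This is only a sketch: the function names, the example.com tracking URL, and the affiliate ID "12345" are hypothetical, and it is an assumption that rebuilding the query string is acceptable for the real affiliate URLs.

```python
from datetime import datetime
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def parse_released(value):
    """Convert a YYYYMMDD string such as "20080309" to a date object."""
    return datetime.strptime(value, "%Y%m%d").date()


def split_list(value, delimiter):
    """Split a delimited field and trim whitespace around each item."""
    return [item.strip() for item in value.split(delimiter) if item.strip()]


def with_affiliate_id(url, affiliate_id):
    """Append the A= parameter, keeping any query string already present."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("A", affiliate_id))
    return urlunsplit(parts._replace(query=urlencode(query)))


released = parse_released("20080309")
# Surnames are comma-delimited; this shortened value is illustrative.
surnames = split_list("Ackley, Young", ",")
# Subjects are semicolon-delimited and may themselves contain commas.
subjects = split_list(
    "Dedham (Mass.)--Genealogy;Revolutionary War, American, 1775-1783", ";"
)
# Hypothetical tracking URL and affiliate ID.
tracked = with_affiliate_id("http://example.com/track?db=3923", "12345")
print(released, surnames, subjects, tracked)
```

Splitting subjects only on semicolons matters because an individual subject heading, such as "Revolutionary War, American, 1775-1783", can contain commas of its own.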
