The Deposit Harvester allows members to retrieve metadata records for content that they've registered. The metadata retrieved is in our UNIXREF output format, which delivers the exact metadata submitted in a deposit, including any citations registered. Members (or their designated third parties) may only retrieve their own metadata.
The harvester uses the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) to deliver the metadata. It supports the verbs Identify, ListMetadataFormats, ListSets, ListIdentifiers, ListRecords, and GetRecord.
Ownership and retrieval restrictions
Who can retrieve records?
The Deposit Harvester will only retrieve records for the authorized owner of the metadata records. Metadata ownership is established by the DOI prefix(es) associated with a user's system login. Most publishers have one prefix and one login, but some have multiple prefixes. For example, Publisher A has been assigned system account "abcd", which is associated with prefixes 10.xxxx, 10.yyyy, and 10.zzzz. Publisher A can retrieve metadata owned by prefixes 10.xxxx, 10.yyyy, and 10.zzzz using its "abcd" login.
Ownership of DOIs and titles often moves from publisher to publisher, so a title's owning prefix will not always match the prefix of the DOIs attached to the title. Retrieval permission is granted to the current owner, not the original depositor. For example, Publisher B registers identifier 10.5555/jfo.33425. Ownership of the journal and all its identifiers is then transferred to Publisher A, whose prefix is 10.50505. The DOI is now "owned" by prefix 10.50505, and only Publisher A may harvest the metadata record for that identifier, even though the DOI itself still begins with 10.5555.
The Deposit Harvester supports a hierarchy of sets. The hierarchy is in three parts: <work-type>:<prefix>:<publication-id>. For example, the set "J:10.12345:6789" will return metadata for a journal (J), with prefix "10.12345", and publication id "6789". The set "B" will return all book metadata. The set "S:10.12345" will return all the series metadata associated with the 10.12345 prefix.
The work-type designators are:
- 'J' for journals
- 'B' for books and book-like works (reports, conference proceedings, standards, dissertations)
- 'S' for non-journal series and series-like works
If no set is specified, the set "J" is used.
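The set hierarchy above can be sketched as a small helper that assembles a set specification from its parts. This is a minimal illustration rather than part of the harvester itself: the function name build_set_spec and its validation rules are assumptions made for the example, and defaulting to "J" simply mirrors the server-side default described above.

```python
from typing import Optional

# Work-type designators listed in this document.
WORK_TYPES = {"J", "B", "S"}


def build_set_spec(
    work_type: str = "J",
    prefix: Optional[str] = None,
    publication_id: Optional[str] = None,
) -> str:
    """Join <work-type>:<prefix>:<publication-id>, omitting trailing parts.

    Hypothetical helper: the harvester only sees the resulting string.
    """
    if work_type not in WORK_TYPES:
        raise ValueError(f"unknown work type: {work_type!r}")
    if publication_id is not None and prefix is None:
        # The hierarchy is positional, so a publication id needs a prefix.
        raise ValueError("publication_id requires a prefix")

    parts = [work_type]
    if prefix is not None:
        parts.append(prefix)
    if publication_id is not None:
        parts.append(str(publication_id))
    return ":".join(parts)


if __name__ == "__main__":
    print(build_set_spec("J", "10.12345", "6789"))  # one journal title
    print(build_set_spec("S", "10.12345"))          # all series for a prefix
    print(build_set_spec("B"))                      # all book metadata
```

Keeping the parts separate until the last step makes it harder to produce a malformed set such as a publication id without a prefix.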
ListSets - retrieve list of titles owned by the prefixes assigned to your system login:
Retrieve data for a prefix:
Retrieve data for a single title:
GetRecord - retrieve data for a single DOI:
When using GetRecord, the <DOI> value should be URL encoded.
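As a brief sketch of the URL encoding step, Python's standard urllib.parse.quote can percent-encode a DOI before it is placed in a query string. The helper name encode_doi is an assumption made for this example, and the DOI is the sample identifier used earlier in this document.

```python
from urllib.parse import quote


def encode_doi(doi: str) -> str:
    """Percent-encode a DOI for use as a query-string value.

    safe="" makes quote() encode "/" as well, which it would otherwise
    leave untouched.
    """
    return quote(doi, safe="")


if __name__ == "__main__":
    print(encode_doi("10.5555/jfo.33425"))  # prints 10.5555%2Fjfo.33425
```

If the query string is instead built with urllib.parse.urlencode, the value is encoded automatically and should not be encoded a second time.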
Identify - use to check the status of the Deposit Harvester (no login needed):
ListMetadataFormats - lists available metadata formats (currently UNIXREF)
The requests above use the following values:
- work-type: J for journals, B for book or conference proceeding titles, S for series
- prefix: the owning prefix of the title being retrieved
- title ID: the title identification number assigned by the Crossref system. Title IDs are included in the ListSets response described above.
- username and password: system login for the prefix/title being retrieved
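The values listed above could be combined into a request URL along the following lines. This is a sketch under stated assumptions: the base URL is a placeholder, not the harvester's real address, and the credential parameter names ("usr" and "pwd" in the usage example) are hypothetical, so check both against the harvester's own request examples. The verb, set, and metadataPrefix argument names come from the OAI-PMH specification; the metadataPrefix value to send is whatever ListMetadataFormats reports.

```python
from typing import Dict, Optional
from urllib.parse import urlencode


def build_request_url(
    base_url: str,
    verb: str,
    set_spec: Optional[str] = None,
    metadata_prefix: Optional[str] = None,
    credentials: Optional[Dict[str, str]] = None,
) -> str:
    """Assemble an OAI-PMH request URL.

    `credentials` maps the harvester's login parameter names to values;
    the names are supplied by the caller because this sketch does not
    assume what they are.
    """
    params = {"verb": verb}
    if metadata_prefix is not None:
        params["metadataPrefix"] = metadata_prefix
    if set_spec is not None:
        params["set"] = set_spec
    if credentials:
        params.update(credentials)
    # safe=":" keeps the set hierarchy readable; everything else is encoded.
    return f"{base_url}?{urlencode(params, safe=':')}"


if __name__ == "__main__":
    url = build_request_url(
        "https://harvester.example.org/oai",          # placeholder base URL
        "ListRecords",
        set_spec="J:10.12345:6789",
        credentials={"usr": "abcd", "pwd": "secret"},  # hypothetical names
    )
    print(url)
```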
Results conform to Crossref's UNIXREF format and may contain the following root elements:
Some OAI-PMH responses are too big to be delivered in a single transaction. If a response contains a resumption token, you must make an additional request with that token to retrieve the rest of the data, and repeat until no further token is returned. You must provide the account name and password with both the initial request and every subsequent resumption request. A resumption request without authentication details will fail.
Request with resumption token:
See OAI-PMH for details.
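The resumption flow described above can be sketched as a loop that keeps requesting pages until the response no longer carries a token. This is an illustration, not the harvester's client: the function name harvest_pages is hypothetical, the fetch argument stands in for whatever HTTP call you use, and the credential parameter names are supplied by the caller because they are not assumed here. The resumptionToken element and its namespace come from the OAI-PMH specification, which also defines an empty token as the end of the list.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Iterator

# XML namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def harvest_pages(
    fetch: Callable[[Dict[str, str]], str],
    verb: str,
    first_args: Dict[str, str],
    credentials: Dict[str, str],
) -> Iterator[str]:
    """Yield each response body, following resumption tokens.

    `fetch` takes the query parameters and returns the response XML as
    text. Credentials are added to every request, including resumptions.
    """
    params = {"verb": verb, **first_args, **credentials}
    while True:
        xml_text = fetch(params)
        yield xml_text

        token_el = ET.fromstring(xml_text).find(f".//{OAI_NS}resumptionToken")
        token = (token_el.text or "").strip() if token_el is not None else ""
        if not token:
            return  # missing or empty token: the list is complete

        # In OAI-PMH the token replaces the original arguments;
        # only the verb (and here the login details) are repeated.
        params = {"verb": verb, "resumptionToken": token, **credentials}
```

Passing fetch in as a function keeps the loop independent of any particular HTTP library and makes it easy to exercise with canned responses.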