Category: Development and Engineering · No API key required

harvard-library-catalog

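Search Harvard Library's 13M+ bibliographic records through LibraryCloud and retrieve MARC/MODS data through PRESTO. Use this skill when the user wants to find books, manuscripts, finding aids, or other items in the Harvard Library catalog, verify bibliographic details (title, author, ISBN, publication date), locate digital collections, or retrieve detailed catalog records. It also triggers when the user has extracted a book title from a document and wants its complete bibliographic metadata.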

Author: jakexiaohub (GitHub)

Harvard Library API Skill

Search and retrieve bibliographic records from Harvard Library's catalog of 13M+ items.

Critical: Things Claude Won't Know Without This Skill

LibraryCloud field-based search uses query parameters, NOT Solr syntax

The Item API uses field names as query parameters — not q=field:value Solr syntax.

CORRECT:   https://api.lib.harvard.edu/v2/items.json?title=hamlet&name=shakespeare
WRONG:     https://api.lib.harvard.edu/v2/items?q=title:hamlet

The q= parameter is for keyword search across all fields. Field-specific search uses dedicated parameters like title=, name=, subject=, identifier=, etc.
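The difference can be sketched with the standard library alone. The helper name `items_url` is hypothetical and is not part of the skill's script; it only shows field names being passed as ordinary query parameters.

```python
from urllib.parse import urlencode

ITEMS_JSON = "https://api.lib.harvard.edu/v2/items.json"

def items_url(**fields):
    """Build an Item API URL; each keyword becomes its own query parameter."""
    return f"{ITEMS_JSON}?{urlencode(fields)}"

# Field-specific search: one query parameter per field.
field_url = items_url(title="hamlet", name="shakespeare")

# Keyword search across all fields: a single q= parameter.
keyword_url = items_url(q="hamlet shakespeare")

print(field_url)
print(keyword_url)
```

Note that `urlencode` encodes the space in the keyword query as `+`, which is standard form encoding.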

JSON requires .json in the URL path

Responses are XML by default. To get JSON, append .json before the query string:

JSON:        https://api.lib.harvard.edu/v2/items.json?title=hamlet
Dublin Core: https://api.lib.harvard.edu/v2/items.dc.json?title=hamlet
Default XML: https://api.lib.harvard.edu/v2/items?title=hamlet
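As a small illustration, the format choice can be reduced to a path suffix. The function `items_endpoint` and the format keys below are hypothetical names chosen for this sketch; only the three suffixes shown above are covered.

```python
BASE = "https://api.lib.harvard.edu/v2/items"

# Suffixes taken from the three example URLs; other serializations are not covered here.
SUFFIXES = {"xml": "", "json": ".json", "dc_json": ".dc.json"}

def items_endpoint(fmt="xml"):
    """Return the Items endpoint path for the requested serialization."""
    if fmt not in SUFFIXES:
        raise ValueError(f"unsupported format: {fmt!r}")
    return BASE + SUFFIXES[fmt]

print(items_endpoint("json") + "?title=hamlet")
```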

PRESTO is for direct record lookup by HOLLIS ID

PRESTO returns raw MARC, MODS, or Dublin Core for a single record by its HOLLIS number. It complements LibraryCloud when you need the original catalog record:

MARC: https://webservices.lib.harvard.edu/rest/marc/hollis/{HOLLIS_ID}
MODS: https://webservices.lib.harvard.edu/rest/mods/hollis/{HOLLIS_ID}
DC:   https://webservices.lib.harvard.edu/rest/dc/hollis/{HOLLIS_ID}

PRESTO returns XML only and does not support JSON serialization. ISBN/barcode lookups may not work on all records.
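A minimal sketch of building these URLs follows. `presto_url` is a hypothetical helper, not the script's own method, and the HOLLIS ID is the one used in the Python Script example later in this document.

```python
PRESTO_BASE = "https://webservices.lib.harvard.edu/rest"
PRESTO_FORMATS = ("marc", "mods", "dc")

def presto_url(hollis_id, fmt="marc"):
    """Build a PRESTO lookup URL for one record; the response is XML."""
    if fmt not in PRESTO_FORMATS:
        raise ValueError(f"format must be one of {PRESTO_FORMATS}, got {fmt!r}")
    return f"{PRESTO_BASE}/{fmt}/hollis/{hollis_id}"

print(presto_url("011557057", "mods"))
```

Rejecting unknown formats up front avoids spending a rate-limited request on a URL that cannot succeed.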

User-Agent header is required

LibraryCloud returns 403 without a User-Agent header. Always include one:

curl -H 'User-Agent: MyApp/1.0' 'https://api.lib.harvard.edu/v2/items.json?title=hamlet'

The Python script includes this automatically.
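For callers not using the bundled script, the same header can be set with `urllib.request`. This is a sketch only: `make_request` and `fetch_json` are hypothetical names, and the `MyApp/1.0` value is just the placeholder from the curl example.

```python
import json
from urllib.request import Request, urlopen

def make_request(url, user_agent="MyApp/1.0"):
    """Create a request that always carries a User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})

def fetch_json(url, user_agent="MyApp/1.0", timeout=30):
    """Send the request and decode the JSON body (performs network I/O)."""
    with urlopen(make_request(url, user_agent), timeout=timeout) as resp:
        return json.load(resp)

# Building the request does not touch the network.
req = make_request("https://api.lib.harvard.edu/v2/items.json?title=hamlet")
print(req.get_header("User-agent"))
```

`urllib` normalizes header names with `str.capitalize()`, which is why the lookup key is spelled `User-agent`.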

Rate limit: max 1 request/second, 300 per 5 minutes

Exceeding this triggers a 5-minute lockout. The Python script handles this automatically.
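One simple way to stay within the limit is to space requests at least one second apart, which also keeps the total at or below 300 per 5 minutes. The `Throttle` class below is an illustrative sketch and not necessarily how the bundled script implements it; the clock and sleep functions are injectable so the behavior can be checked without waiting.

```python
import time

class Throttle:
    """Ensure consecutive wait() calls return at least min_interval seconds apart."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time at which the previous call was released

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now

throttle = Throttle()
throttle.wait()  # first call returns immediately; call before every request
```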

Choosing an Access Method

| Need | Method |
|------|--------|
| Search by title, author, subject, date | LibraryCloud Item API (field params) |
| Full-text keyword search | LibraryCloud Item API (q= param) |
| Look up by ISBN, LCCN, or other identifier | LibraryCloud identifier= or q= keyword |
| Browse digital collections | LibraryCloud collectionTitle= or Collections API |
| Get raw MARC record for a known HOLLIS ID | PRESTO /rest/marc/hollis/{id} |
| Faceted browsing (by language, date, genre) | LibraryCloud facets= parameter |

Typical Workflow: Book Title to Full Bibliography

This is the primary use case — an LLM extracts a book title from a document and needs complete bibliographic data:

from scripts.harvard_api import HarvardLibraryAPI
api = HarvardLibraryAPI()

# 1. Search by title (and optionally author)
results = api.search(title="The Great Gatsby", name="Fitzgerald")

# 2. Get the first match's summary
if results:
    summary = api.summarize(results[0])
    # → title, author, publisher, date, ISBN, subjects, language, physical description

# 3. For deeper data, get the raw MARC record via PRESTO
#    (guarded so an empty result list does not raise IndexError)
hollis_id = api.get_record_id(results[0]) if results else None
if hollis_id:
    marc = api.get_presto_record(hollis_id, format="marc")

Key Search Fields

| Field | What it searches | Exact match? |
|-------|-----------------|-------------|
| q | All fields (keyword) | No |
| title | Title, subtitle, part name/number | Yes (title_exact) |
| name | All name fields (author, editor, etc.) | No |
| subject | All subject fields (topic, geographic, temporal) | Yes (subject_exact) |
| identifier | ISBN, LCCN, other system IDs | Yes |
| languageCode | ISO language code (e.g., chi, eng) | Yes |
| dateIssued | Publication date (YYYY) | Yes |
| dates.start / dates.end | Date range filter | n/a |
| genre | Genre/form (e.g., "Drawings", "Maps") | Yes (genre_exact) |
| repository | Harvard library name | Yes |
| isOnline | Has digital version (true/false) | n/a |
| recordIdentifier | HOLLIS/Alma record ID | Yes |

Combine fields freely: ?title=hamlet&name=shakespeare&languageCode=ger&dates.start=1900
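When building such a query in Python, note that `dates.start` contains a dot and so cannot be written as a keyword argument; a dict works instead. The sketch below only builds the query string with the standard library and makes no assumption about how the bundled script accepts these fields.

```python
from urllib.parse import urlencode

# A dict is used because "dates.start" is not a valid Python identifier.
params = {
    "title": "hamlet",
    "name": "shakespeare",
    "languageCode": "ger",
    "dates.start": "1900",
}

query = urlencode(params)
print("https://api.lib.harvard.edu/v2/items.json?" + query)
```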

Pagination

  • limit=N (default 10, max 250)
  • start=N for offset-based pagination (up to ~30K results)
  • cursor=* then cursor={nextCursor} for large result sets (up to 100K)
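The cursor flow can be sketched as a generator. Here `fetch_page` is a hypothetical callable supplied by the caller: it is assumed to perform one request for the given cursor and return `(records, next_cursor)`, with `next_cursor` set to `None` on the last page. Where `nextCursor` sits in the actual response is not specified above, so extracting it is left to that callable.

```python
def iter_records(fetch_page, max_records=1000):
    """Yield records page by page until the cursor runs out or max_records is reached."""
    cursor = "*"  # the first request uses cursor=*
    yielded = 0
    while cursor is not None and yielded < max_records:
        records, cursor = fetch_page(cursor)
        if not records:
            break  # an empty page is treated as the end of the result set
        for record in records:
            if yielded >= max_records:
                return
            yield record
            yielded += 1

# Usage with a stand-in fetcher that serves two fixed pages.
demo_pages = {"*": (["rec1", "rec2"], "abc"), "abc": (["rec3"], None)}
print(list(iter_records(lambda cursor: demo_pages[cursor])))
```

Keeping the HTTP call outside the generator makes it easy to add throttling inside `fetch_page` and to test the loop without network access.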

Facets

Add facets=field1,field2 to get value counts. Useful fields: name, subject, languageCode, genre, resourceType, repository, dateIssued.

?title=china&facets=languageCode,genre
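A short sketch of assembling the `facets` parameter from a list is shown below. `facet_query` is a hypothetical helper; passing `safe=","` to `urlencode` keeps the commas literal so the output matches the example above.

```python
from urllib.parse import urlencode

def facet_query(facets, **fields):
    """Build a query string with search fields plus a comma-separated facets list."""
    params = dict(fields)
    params["facets"] = ",".join(facets)
    # safe="," leaves the commas unescaped instead of encoding them as %2C.
    return urlencode(params, safe=",")

print("?" + facet_query(["languageCode", "genre"], title="china"))
```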

Python Script

Use scripts/harvard_api.py for programmatic access (zero dependencies):

from scripts.harvard_api import HarvardLibraryAPI
api = HarvardLibraryAPI()

# Keyword search
results = api.search(q="Chinese porcelain Ming dynasty")

# Field search
results = api.search(title="dream of the red chamber", languageCode="chi")

# With facets
results, facets = api.search_with_facets(subject="astronomy", facets=["genre", "dateIssued"])

# Pagination
all_results = api.search_all(title="peanuts", name="schulz", max_results=500)

# PRESTO lookup
marc_xml = api.get_presto_record("011557057", format="marc")

# Summarize a record
for r in results[:5]:
    print(api.summarize(r))

API Endpoints

| Endpoint | URL |
|----------|-----|
| LibraryCloud Items | https://api.lib.harvard.edu/v2/items |
| LibraryCloud Collections | https://api.lib.harvard.edu/v2/collections |
| PRESTO (MARC/MODS/DC) | https://webservices.lib.harvard.edu/rest/{format}/hollis/{id} |

Related Skills

  • wikidata-search: Cross-reference Harvard catalog entries with Wikidata for external identifiers (VIAF, LoC, etc.)
  • cbdb-api: Look up authors of Chinese historical texts in CBDB for biographical context

Resources

  • references/api_reference.md — Complete field reference with all searchable fields, facets, and query examples
  • scripts/harvard_api.py — Full-featured Python client with rate limiting, pagination, and record summarization