libtransform Skill
When to Use
- Converting PDF documents to HTML
- Extracting content from scanned documents
- Processing documents with LLM vision models
- Building document transformation pipelines
Key Concepts
pdfToHtml: Splits PDF into page images, sends to vision-capable LLM, and assembles HTML output with semantic structure.
Usage Patterns
Pattern 1: Convert PDF to HTML
import { pdfToHtml } from "@copilot-ld/libtransform";
const pdfBuffer = await fs.readFile("document.pdf");
const html = await pdfToHtml(pdfBuffer, {
model: "gpt-4-vision-preview",
maxPages: 50,
});
Integration
Used by libingest pipeline for document processing. Requires LLM with vision capabilities.
微信扫一扫