HTML to PPTX Converter Skill
This skill provides a pipeline for transforming web-based slide decks (HTML/CSS) into professional, native PowerPoint files. It uses a hybrid approach: Python for content extraction (BeautifulSoup) and Node.js (PptxGenJS) for layout reconstruction.
Prerequisites
# Python dependencies
pip install beautifulsoup4 lxml
# Node.js dependencies
npm install -g pptxgenjs
Workflow
1. Structure Analysis (Python)
Parse the HTML to extract semantic slide data (titles, bullet points, metrics, tables, and layouts).
from bs4 import BeautifulSoup
import json
def parse_slides(html_path):
soup = BeautifulSoup(open(html_path), 'lxml')
slides = []
for div in soup.find_all('div', class_='slide'):
# Extract based on classes like 'cover-slide', 'content-box', 'two-column'
# Return a structured JSON list
pass
return slides
2. Layout Mapping (JavaScript)
Map the extracted JSON data to PptxGenJS objects using a 10" x 5.625" coordinate system.
const pptxgen = require("pptxgenjs");
let pres = new pptxgen();
// Define a consistent color palette
const COLORS = { PRIMARY: "0F3460", ACCENT: "00D4FF", TEXT: "FFFFFF" };
// Iterate through parsed data and add slides
data.forEach(item => {
let slide = pres.addSlide();
// Use slide.addText, slide.addShape, slide.addTable based on item.type
});
pres.writeFile({ fileName: "output.pptx" });
Design Principles
- Native Elements: Always use
addShapeandaddTextinstead of screenshots to ensure the PPT is editable. - Metric Boxes: For data-heavy slides, use large font-size text boxes inside colored rectangles to highlight key numbers.
- Grid Systems: Map HTML
gridorflexlayouts to manual X/Y coordinates (e.g., Two-column = X:0.5 and X:5.2). - Branding: Sync PPT background colors with the HTML
background-colororlinear-gradientequivalent.
Pitfalls & Tips
- Hex Colors: PptxGenJS requires hex codes WITHOUT the
#prefix (e.g.,00D4FF). - Font Support: Default to "Microsoft YaHei" or "Arial" for cross-platform compatibility.
- Table Height: Tables don't auto-calculate height for the next element; manually increment
currentYbased on row count.
扫码联系在线客服