Back to skills
extension
Category: OtherNo API key required

Office Toolkit

A comprehensive toolkit for Microsoft Office documents (Word, Excel, PowerPoint) and PDF files. Supports reading, writing, format conversion, and batch proce...

personAuthor: weiwei2027hubclawhub

Office Toolkit

Comprehensive document processing toolkit for Office and PDF files.

ClawHub: https://clawhub.ai/weiwei2027/office-toolkit Install: clawhub install office-toolkit

Supported Formats

| Format | Read | Write | Convert From | Convert To | |--------|------|-------|--------------|------------| | DOCX | ✅ | ✅ | - | PDF | | PPTX | ✅ | ✅ | - | PDF | | XLSX | ✅ | ✅ | - | - | | PDF | ✅ | ✅ (from DOCX/PPTX) | DOCX, PPTX | - |

Quick Start

Read Documents

# Word
docx-read.py document.docx

# PowerPoint
pptx-read.py presentation.pptx

# Excel
xlsx-read.py spreadsheet.xlsx

# PDF
pdf-read.py document.pdf

Create Documents

# Word with content
docx-write.py output.docx --title "Report" --content "Hello World"

# PowerPoint with slides
pptx-write.py output.pptx --title "Presentation" --slides 5

# Excel with data
xlsx-write.py output.xlsx --sheet "Data" --data data.json

Convert Formats

# DOCX to PDF
convert.py document.docx --to pdf

# PPTX to PDF
convert.py presentation.pptx --to pdf

Installation

# Install all dependencies
pip install -r requirements/all.txt

# Or install only what you need
pip install -r requirements/docx.txt   # Word only
pip install -r requirements/pptx.txt   # PowerPoint only
pip install -r requirements/xlsx.txt   # Excel only
pip install -r requirements/pdf.txt    # PDF only

Directory Structure

office-toolkit/
├── SKILL.md                    # This file
├── requirements/               # Dependency files
│   ├── base.txt               # Core dependencies
│   ├── docx.txt               # python-docx
│   ├── pptx.txt               # python-pptx
│   ├── xlsx.txt               # openpyxl
│   └── pdf.txt                # pymupdf
├── scripts/                    # CLI tools
│   ├── docx-read.py
│   ├── docx-write.py
│   ├── pptx-read.py
│   ├── pptx-write.py
│   ├── xlsx-read.py
│   ├── xlsx-write.py
│   ├── pdf-read.py
│   ├── pdf-write.py
│   └── convert.py
├── lib/                        # Shared library
│   ├── __init__.py
│   ├── base.py                # Base classes
│   ├── utils.py               # Utilities
│   └── validators.py          # Input validation
└── tests/                      # Test suite
    ├── test_docx.py
    ├── test_pptx.py
    ├── test_xlsx.py
    └── test_pdf.py

Python API Usage

from lib.base import DocumentProcessor

# Process Word document
processor = DocumentProcessor('docx')
text = processor.read('document.docx')
processor.write('output.docx', content="New content")

# Convert format
processor.convert('document.docx', 'pdf')

Notes

  • DOCX: Uses python-docx library. Supports text, tables, styles, images.
  • PPTX: Uses python-pptx library. Supports slides, text, shapes, charts.
  • XLSX: Uses openpyxl library. Supports cells, formulas, charts, styling.
  • PDF: Uses pymupdf (fitz) for reading, reportlab for creation.

Roadmap

  • [x] Excel support (xlsx read/write) - ✅ Added in v1.0.1
  • [ ] PDF creation from scratch
  • [ ] Format conversion improvements
  • [ ] Batch processing
  • [ ] Template system

Changelog

v1.0.1 (2026-03-20)

  • Added Excel (.xlsx) read/write support
  • Improved error handling with helpful messages
  • Added JSON output option for read operations
  • Added PDF page selection support

v1.0.0 (2026-03-20)

  • Initial release
  • DOCX/PPTX/PDF read and write support