General OCR Struct
Use this skill to separate OCR recognition from downstream content organization.
Workflow
- Run the local OCR script on the image first.
- Return the raw OCR text before making business interpretations when accuracy matters.
- If the image is a transaction-detail screenshot, run structuring mode to group rows into fields.
- Mark uncertain fields explicitly as 待确认 ("to be confirmed"); do not guess missing content.
- Only after the user confirms recognition quality, use the result for tables, summaries, or documents.
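The ordering of the workflow can be sketched as a small planning function. This is only an illustration: the step names and the `plan_steps` helper are hypothetical and are not part of the skill's script.

```python
def plan_steps(is_transaction_screenshot, user_confirmed):
    """Return the ordered workflow steps for one image (hypothetical step names)."""
    # Raw OCR always comes first, and the raw text is shown before any interpretation.
    steps = ["run_raw_ocr", "show_raw_text"]
    # Structuring mode applies only to transaction-detail screenshots.
    if is_transaction_screenshot:
        steps.append("run_transactions_mode")
    # Tables, summaries, and documents are built only after the user confirms quality.
    if user_confirmed:
        steps.append("build_tables_or_summaries")
    else:
        steps.append("wait_for_confirmation")
    return steps
```

The point of the sketch is the gate at the end: nothing downstream is produced until recognition quality has been confirmed.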
Commands
Raw OCR
python3 scripts/general_ocr.py raw /path/to/image.jpg
Structured transaction extraction
python3 scripts/general_ocr.py transactions /path/to/image.jpg
JSON output
python3 scripts/general_ocr.py transactions /path/to/image.jpg --json
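If the commands above need to be called from another Python program, a thin wrapper along these lines may help. It is a minimal sketch: `build_command` and `run_ocr` are hypothetical helper names, and the wrapper assumes the script prints its result to standard output, which the source does not state.

```python
import subprocess

SCRIPT = "scripts/general_ocr.py"  # script path as given in the commands above
MODES = ("raw", "transactions")    # the two modes shown in the commands above


def build_command(mode, image_path, as_json=False):
    """Build the argument list for one invocation of the OCR script."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    cmd = ["python3", SCRIPT, mode, image_path]
    # The source only shows --json together with the transactions mode;
    # whether raw mode accepts the flag is not stated.
    if as_json:
        cmd.append("--json")
    return cmd


def run_ocr(mode, image_path, as_json=False):
    """Run the script and return whatever it wrote to stdout (assumed, not confirmed)."""
    result = subprocess.run(
        build_command(mode, image_path, as_json),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Keeping command construction separate from execution makes the argument list easy to inspect or log before anything runs.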
Output rules
- Prefer showing the recognition result first, then the cleaned structure.
- Preserve source wording where possible.
- For uncertain content, use 待确认 instead of inferring.
- Adapt the structure to the source image type. For statement-like screenshots, common fields are: card_last4, date, time, currency, merchant, amount.
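The rule of marking uncertain content instead of inferring it could be applied to a statement-like record roughly as follows. This is a sketch under assumptions: the input is taken to be a plain dict keyed by the field names listed above, and `normalize_record` and `uncertain_fields` are hypothetical helpers, not functions of the script.

```python
FIELDS = ("card_last4", "date", "time", "currency", "merchant", "amount")
UNCERTAIN = "待确认"  # marker for content that must be confirmed by the user


def normalize_record(parsed):
    """Return a record with every expected field, marking gaps instead of guessing."""
    record = {}
    for name in FIELDS:
        value = parsed.get(name)
        # Only surrounding whitespace is trimmed, so the source wording is preserved.
        text = "" if value is None else str(value).strip()
        record[name] = text if text else UNCERTAIN
    return record


def uncertain_fields(record):
    """List the fields that still need confirmation, in field order."""
    return [name for name in FIELDS if record[name] == UNCERTAIN]
```

For example, a record that contains only `card_last4` and `merchant` would come back with the other four fields set to 待确认, and `uncertain_fields` would list them for the user to review.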
Notes
- This skill uses RapidOCR locally.
- The first run may require installing Python packages; after setup, the skill runs offline.
- If OCR quality is weak, request a higher-resolution original screenshot before organizing the content further.