Google Drive File Processor

Codex already knows the Google APIs exist; what it needs is a self-contained replica of the OpsDashboard helpers. Copying this skill directory into any project gives you references/google_drive_processing_example.py, which contains production-ready helpers (build_drive_clients, process_drive_file, process_drive_folder, fetch_sheet_tables, etc.) that require only a service-account JSON blob.

Authentication & secrets

Store the full service account JSON (plus folder_id) under st.secrets["gdrive_secrets"] or load it from disk and pass the dict directly to build_drive_clients.
Always request the read-only scopes used here:
https://www.googleapis.com/auth/drive.readonly, https://www.googleapis.com/auth/spreadsheets.readonly, https://www.googleapis.com/auth/presentations.readonly.
Build clients with service_account.Credentials.from_service_account_info(..., scopes=SCOPES) and googleapiclient.discovery.build.

Folder listing pattern

Pull folder_id from the secret, defaulting to whole Drive when missing.
Compose the Drive query ("'<folder_id>' in parents" plus MIME filters when needed). src/show_sheet_explorer._fetch_sheet_metadata shows the paging pattern when you need more than 200 files.
Call drive.files().list(..., includeItemsFromAllDrives=True, supportsAllDrives=True) and capture id, name, and mimeType.

MIME routing matrix

Re-use the handlers implemented inside references/google_drive_processing_example.py:

| MIME | Handler | Result | | --- | --- | --- | | application/vnd.google-apps.spreadsheet | read_google_sheet + sheet_rows_to_dataframe | Dict of tab -> rows/DataFrames | | application/vnd.google-apps.presentation | read_google_slides | Returns ordered text snippets | | application/vnd.google-apps.document | export_google_file(..., target_mime='application/vnd.openxmlformats-officedocument.wordprocessingml.document', suffix='docx') | Saves .docx | | application/vnd.google-apps.form | Emits warning; Drive cannot export responses | None | | Other application/vnd.google-apps.* | export_google_file(..., target_mime='application/pdf', suffix='pdf') | Saves .pdf | | Everything else (PPTX, XLSX, PDF, etc.) | save_binary_file | Saves raw bytes |

process_drive_file(...) already contains this matrix and returns a dict describing the work performed (artifact path + metadata). Re-use it instead of re-implementing the branching logic.

Recommended workflow

Build Drive/Sheets/Slides clients once per request and pass them into helpers; cache expensive sheet reads with @st.cache_data(ttl=3600) if the UI displays them repeatedly.
Iterate files, call the right handler, and collect structured outputs (dataframes, exported files). Persist exports on disk or keep them in memory (BytesIO) before attaching to downstream tasks.
When surfacing in Streamlit, expose controls to refresh the cache, filter by sheet name, and show links using st.data_editor (see show_sheet_explorer for pattern).

Example resources

references/google_drive_processing_example.py exposes reusable helpers plus process_drive_folder(...) and fetch_sheet_tables(...). Import it directly or execute it as a module to download/export folders in other projects—no OpsDashboard dependencies remain.