Google Ads Transparency Center — Ad Extraction
advertiser ID → ad creatives list with preview URLs, formats, date ranges, targeting, and variations
Language
All process output shown to the user (progress updates, process notifications) is written in the user's language.
Objective
Extract ad creative data from Google Ads Transparency Center for a given advertiser, including creative listings with pagination, individual creative details with targeting and variation data, and advertiser profile information.
Prerequisites
- Browser is open on the advertiser's transparency page: https://adstransparency.google.com/advertiser/{advertiser_id}
- No login required (public data)
Pre-execution Checks
1. Tool Readiness
If browser-act has been confirmed available in the current session, skip this step.
Invoke browser-act via Skill tool to load usage. If installation or configuration issues arise, follow its guidance to resolve then retry.
2. XSRF Token Acquisition
Before calling any API endpoint, obtain the XSRF token:
eval "$(python scripts/get-xsrf-token.py)"
Output example:
{
"error": false,
"token": "ABCxyz123..."
}
Store the returned token value for use in all subsequent API calls via --token.
Capability Components
This Skill's operational boundary = what the user can manually do in their browser. It only reads data already displayed to the user on the page, never bypassing authentication or access controls. Its role is equivalent to copy-pasting on the user's behalf: the data is already on screen, and automation merely saves time.
JS code is encapsulated in Python files under the scripts/ directory, invoked via eval "$(python scripts/xxx.py {params})". $(...) is bash syntax; it is recommended to use the bash tool for execution.
Below are all atomic capabilities discovered and verified during the exploration phase, listed by command template with parameters. Simply invoke them as needed — no need to read scripts/*.py source code or re-verify. Only inspect scripts when execution fails for troubleshooting. Combine freely as needed during execution.
API: search creatives by advertiser
eval "$(python scripts/search-creatives.py '{advertiser_id}' --token '{token}' --page-size {count})"
Parameters:
- advertiser_id (positional, required): Advertiser ID string (e.g. AR01234567890123456)
- --token: XSRF token obtained from get-xsrf-token.py (required)
- --page-size: Number of results per page, default 40
- --cursor: Pagination cursor from the previous response's next_cursor; omit for first page
- --region: Geo target ID for region filter (e.g. 2840 for US, 2724 for Spain, 2276 for Germany)
Output example:
{
"error": false,
"count": 40,
"has_next_page": true,
"next_cursor": "CAIoAQ==",
"ads": [
{
"advertiser_id": "AR01234567890123456",
"creative_id": "CR98765432109876543",
"preview_url": "https://adstransparency.google.com/...",
"format": 3,
"format_label": "IMAGE",
"first_shown": 1700000000,
"last_shown": 1710000000,
"advertiser_name": "Example Corp",
"days_served": 45
}
]
}
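When many calls are generated from a script, the command template can be assembled programmatically so that the token and cursor are shell-quoted consistently. The sketch below is illustrative only: `build_search_command` is a hypothetical helper, not part of this Skill, and it uses only the flags documented above. `shlex.join` adds quotes only where the shell needs them, which is equivalent to the single-quoted template.

```python
import shlex
from typing import Optional


def build_search_command(
    advertiser_id: str,
    token: str,
    page_size: int = 40,
    cursor: Optional[str] = None,
    region: Optional[int] = None,
) -> str:
    """Return the eval command line for search-creatives.py (hypothetical helper)."""
    inner = [
        "python", "scripts/search-creatives.py", advertiser_id,
        "--token", token,
        "--page-size", str(page_size),
    ]
    if cursor:  # omitted for the first page
        inner += ["--cursor", cursor]
    if region is not None:  # omitted for global results
        inner += ["--region", str(region)]
    # shlex.join quotes each argument only when it contains shell-unsafe characters.
    return 'eval "$(' + shlex.join(inner) + ')"'
```

The returned string is intended to be passed to the bash tool unchanged.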
API: get creative detail
eval "$(python scripts/get-creative-detail.py '{advertiser_id}' '{creative_id}' --token '{token}')"
Parameters:
- advertiser_id (positional, required): Advertiser ID string
- creative_id (positional, required): Creative ID string (e.g. CR98765432109876543)
- --token: XSRF token (required)
Output example:
{
"error": false,
"advertiser_id": "AR01234567890123456",
"creative_id": "CR98765432109876543",
"last_shown": 1710000000,
"format": 3,
"format_label": "IMAGE",
"variations_count": 3,
"variations": [
{"preview_url": "https://adstransparency.google.com/..."}
],
"targeting": {
"demographics": null,
"geo_impressions_lower": 100000,
"impressions_upper": 500000,
"first_shown": 20240101,
"last_shown": 20240315,
"surface_stats": [
{
"impressions_lower": 50000,
"impressions_upper": 100000,
"first_shown": 20240101,
"last_shown": 20240315
}
]
}
}
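The output examples appear to encode dates in two ways: the search rows and the top-level last_shown look like Unix epoch seconds (1700000000), while the values inside targeting look like eight-digit YYYYMMDD integers (20240101). That reading is an assumption drawn from the example values, not something the scripts document. Under that assumption, a hypothetical helper such as `to_date` could normalize both forms before they are compared or stored.

```python
from datetime import date, datetime, timezone


def to_date(value: int) -> date:
    """Normalize a first_shown/last_shown value to a calendar date.

    Assumption: eight-digit values are YYYYMMDD; anything else is treated
    as Unix epoch seconds (UTC). This is a heuristic based on the examples.
    """
    if 10_000_101 <= value <= 99_991_231:
        return datetime.strptime(str(value), "%Y%m%d").date()
    return datetime.fromtimestamp(value, tz=timezone.utc).date()
```

With the example values, `to_date(1700000000)` gives 2023-11-14 and `to_date(20240315)` gives 2024-03-15.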
API: get advertiser info
eval "$(python scripts/get-advertiser.py '{advertiser_id}' --token '{token}')"
Parameters:
- advertiser_id (positional, required): Advertiser ID string
- --token: XSRF token (required)
Output example:
{
"error": false,
"advertiser_id": "AR01234567890123456",
"name": "Example Corp",
"country_code": "US"
}
API: resolve region codes to country names
python scripts/resolve-regions.py {geo_id_1} {geo_id_2} ...
Note: This script runs locally (no browser eval needed) — it's a static lookup table.
Parameters:
- geo_ids (positional, optional): Space-separated geo target IDs. If omitted, outputs the full mapping.
Output example:
{
"2840": {"code": "US", "name": "United States"},
"2724": {"code": "ES", "name": "Spain"},
"2276": {"code": "DE", "name": "Germany"}
}
AI Workflow: extract ad creative content (headline, description, CTA, click URL)
Ad creative content (headline, description, CTA button text, landing page URL, image/video assets) is rendered inside cross-origin iframes that cannot be accessed programmatically from the parent page. This component uses visual extraction via screenshot.
- navigate https://adstransparency.google.com/advertiser/{advertiser_id}/creative/{creative_id}?region=anywhere → page loaded, title contains "adstransparency.google.com"
- wait stable → wait for ad preview iframe to render
- screenshot → capture the rendered ad preview area
- Read the screenshot to extract:
  - headline: main text at top of the ad creative
  - description: body text below the headline
  - cta: button text (e.g., "Install", "Learn More", "Shop Now", "Sign Up")
  - click_url: visible URL or app store link shown in the ad
  - image_url: if IMAGE format, the rendered image (use tpc.googlesyndication.com preview URL from get-creative-detail.py variations)
- If ad shows "removed due to policy violation" or "cannot show this ad", mark content_available: false
Output example:
{
"content_available": true,
"headline": "Catch Pokémon in the Real World",
"description": "Join millions of Trainers worldwide",
"cta": "Install",
"click_url": "https://play.google.com/store/apps/details?id=com.nianticlabs.pokemongo"
}
Performance note: This workflow requires navigating to each creative's detail page and waiting for iframes to render (~3-5 seconds per ad). For bulk extraction, prefer using only search-creatives.py + get-creative-detail.py (no page navigation or screenshots). Only use this workflow for targeted analysis of specific ads.
Composite: full advertiser ad extraction
Combines search + detail lookups for comprehensive extraction:
- Get XSRF token via get-xsrf-token.py
- Get advertiser info via get-advertiser.py
- Search all creatives via search-creatives.py (paginate until has_next_page: false)
- For each creative, get metadata via get-creative-detail.py (region stats, targeting, variations)
- (Optional, slow) For creatives needing content analysis, use the AI Workflow above to extract headline/description/CTA/click_url
- Resolve region codes via resolve-regions.py for human-readable country names
This composite flow yields a complete dataset of an advertiser's public ad history including all creative variations, targeting regions, impression estimates, and optionally ad copy content.
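As a rough illustration of how the pieces might be assembled once the individual JSON responses are in hand, the sketch below merges search rows with their detail records into one dataset. `build_dataset` and the `detail_error` flag are hypothetical names introduced here; `get_detail` stands for whatever wrapper runs get-creative-detail.py and returns its parsed JSON.

```python
from typing import Callable, Iterable


def build_dataset(
    advertiser: dict,
    creatives: Iterable[dict],
    get_detail: Callable[[str], dict],
) -> dict:
    """Merge each search row with its detail record (hypothetical helper)."""
    ads = []
    for ad in creatives:
        detail = get_detail(ad["creative_id"])
        if detail.get("error") is False:
            merged = {
                **ad,
                "variations_count": detail.get("variations_count"),
                "variations": detail.get("variations", []),
                "targeting": detail.get("targeting"),  # may be None
            }
        else:
            # Keep the search row so a failed detail call does not lose the ad.
            merged = {**ad, "detail_error": True}
        ads.append(merged)
    return {"advertiser": advertiser, "ads": ads}
```

Keeping the search row when a detail call fails means a single bad creative does not invalidate the rest of the run.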
Region Codes
Region geo target IDs follow the pattern: 2000 + ISO 3166-1 numeric code
Common examples:
- US (United States): 2840
- GB (United Kingdom): 2826
- DE (Germany): 2276
- ES (Spain): 2724
- FR (France): 2250
- JP (Japan): 2392
- AU (Australia): 2036
- BR (Brazil): 2076
- IN (India): 2356
- CA (Canada): 2124
To calculate any country's code: look up the ISO 3166-1 numeric code and add 2000.
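The rule is simple arithmetic, so it can be checked against the list above with a couple of one-line helpers. The function names here are illustrative and are not part of resolve-regions.py.

```python
def geo_target_id(iso_numeric: int) -> int:
    """Return the region geo target ID for an ISO 3166-1 numeric code."""
    if not 1 <= iso_numeric <= 999:
        raise ValueError(f"not a three-digit ISO 3166-1 numeric code: {iso_numeric}")
    return 2000 + iso_numeric


def iso_numeric_from_geo(geo_id: int) -> int:
    """Inverse of geo_target_id: recover the ISO numeric code."""
    return geo_id - 2000
```

For example, the United States has ISO numeric code 840, so `geo_target_id(840)` returns 2840, and Australia (036) maps to 2036.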
Pagination
API pagination uses --cursor (a cursor string). Omit it for the first page; for each later page, pass the next_cursor value from the previous response. Stop when has_next_page is false or next_cursor is null.
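The cursor loop can be sketched as a generator. This is a minimal illustration under stated assumptions: `fetch_page` is a hypothetical callable that runs search-creatives.py with the given cursor (None for the first page) and returns the parsed JSON. The repeated-cursor guard is an extra safety measure, not something the source describes.

```python
from typing import Callable, Iterator, Optional


def iter_ads(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every ad across pages, following next_cursor until exhausted."""
    cursor: Optional[str] = None
    seen_cursors = set()
    while True:
        page = fetch_page(cursor)
        if page.get("error"):
            raise RuntimeError(f"search failed at cursor {cursor!r}")
        yield from page.get("ads", [])
        cursor = page.get("next_cursor")
        # Terminate on has_next_page: false or next_cursor: null.
        if not page.get("has_next_page") or cursor is None:
            break
        # Extra guard (assumption): stop if the API hands back a cursor twice.
        if cursor in seen_cursors:
            break
        seen_cursors.add(cursor)
```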
Success Criteria
- Token extraction: error = false AND token is non-empty string
- Creative search: error = false AND count >= 0 AND has_next_page field present
- Creative detail: error = false AND creative_id is non-null
- Advertiser info: error = false AND name is non-null
(count may be 0 for advertisers with no ads in the specified region; this is not an error)
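These criteria translate directly into predicates over the parsed JSON responses. The sketch below is one possible encoding; the function names are illustrative and not part of the Skill's scripts.

```python
def token_ok(resp: dict) -> bool:
    """error = false AND token is a non-empty string."""
    token = resp.get("token")
    return resp.get("error") is False and isinstance(token, str) and token != ""


def search_ok(resp: dict) -> bool:
    """error = false AND count >= 0 AND has_next_page field present."""
    count = resp.get("count")
    return (
        resp.get("error") is False
        and isinstance(count, int)
        and count >= 0
        and "has_next_page" in resp
    )


def detail_ok(resp: dict) -> bool:
    """error = false AND creative_id is non-null."""
    return resp.get("error") is False and resp.get("creative_id") is not None


def advertiser_ok(resp: dict) -> bool:
    """error = false AND name is non-null."""
    return resp.get("error") is False and resp.get("name") is not None
```

Note that `search_ok` accepts a count of 0, matching the remark that an empty result is not an error.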
Known Limitations
- XSRF token may expire after extended periods; re-extract if API calls return authentication errors
- The format field uses numeric codes: 1=TEXT, 2=VIDEO, 3=IMAGE; other codes may exist for newer ad formats
- Targeting and impression data may be null for some creatives (not all ads have disclosed targeting info)
- Region filter narrows results to ads shown in that specific region; omit for global results
- The internal RPC endpoint paths may change if the site is updated; if calls fail with 404, the page structure may have changed
- Rate limiting may occur with very high-frequency requests; add 1-2 second delays between requests in batch scripts
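Because the format field may carry codes beyond the three listed, downstream code is safer with a lookup that falls back to a placeholder rather than failing. A minimal sketch, using only the mapping stated above; the `UNKNOWN_<code>` label is an illustrative convention, not something the scripts emit.

```python
# Mapping as stated in this document; other codes may exist.
FORMAT_LABELS = {1: "TEXT", 2: "VIDEO", 3: "IMAGE"}


def format_label(code: int) -> str:
    """Return the label for a format code, or a placeholder for unknown codes."""
    return FORMAT_LABELS.get(code, f"UNKNOWN_{code}")
```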
Execution Efficiency
- Batch orchestration: Write a bash script to loop through the command templates serially within a single session; do not parallelize within one browser (prone to triggering rate limits). Add 1-second delays between API calls. To increase throughput, open multiple stealth browser sessions and distribute work across them
- Test before batch execution: After writing a batch script, first test with 1-2 items to verify the script runs correctly; only then run the full batch. Never skip testing and execute in batch directly
- Reduce redundant pre-operations: The XSRF token only needs to be fetched once per session; reuse it across all subsequent calls
- Error resumption: Save results item by item during batch processing; on failure, resume from the last successful cursor rather than starting over
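The serial loop, the delay, and item-by-item saving can be combined in one small driver. The sketch below is an assumption-laden illustration: `fetch_detail` is a hypothetical wrapper around get-creative-detail.py, results are appended to a JSON Lines file, and resumption works by skipping creative IDs already saved. For the detail phase this plays the same role as resuming from the last successful cursor, but it is a different mechanism.

```python
import json
import time
from pathlib import Path
from typing import Callable, Iterable, Set


def load_done_ids(out_path: str) -> Set[str]:
    """Read creative IDs already saved in the JSON Lines output file."""
    path = Path(out_path)
    if not path.exists():
        return set()
    with path.open(encoding="utf-8") as f:
        return {json.loads(line)["creative_id"] for line in f if line.strip()}


def run_batch(
    creative_ids: Iterable[str],
    fetch_detail: Callable[[str], dict],
    out_path: str,
    delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Fetch details serially, saving each result immediately; return new rows written."""
    done = load_done_ids(out_path)
    written = 0
    for cid in creative_ids:
        if cid in done:
            continue  # already saved by an earlier run
        record = fetch_detail(cid)
        with open(out_path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
        written += 1
        sleep(delay_s)  # 1-second gap between API calls
    return written
```

Running it first with one or two IDs, as recommended above, also confirms that the output file is written in the expected shape before the full batch starts.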
Experience Notes
Path: {working-directory}/browser-act-skill-forge-memories/google-ads-transparency-google-ads-transparency-ads.memory.md (working directory is determined by the Agent running the Skill, typically the project root or current working directory)
Before execution: If the file exists, read it first — it records unexpected situations encountered during past executions (e.g., a strategy has become ineffective); adjust strategy order accordingly.
After execution: If an unexpected situation is encountered (strategy became ineffective, page redesigned, token format changed, better path discovered), append a line:
{YYYY-MM-DD}: {what happened} -> {conclusion}
Normal execution does not write to the file. Do not record what advertiser IDs were queried or how many results were returned — those are task outputs, not experience.
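A small helper can keep appended notes in the exact line format given above. This is a sketch only: `format_note` and `append_note` are hypothetical names, and the caller is assumed to supply the memory file path described in this section.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def format_note(what: str, conclusion: str, on: Optional[date] = None) -> str:
    """Render one note as '{YYYY-MM-DD}: {what happened} -> {conclusion}'."""
    day = on or date.today()
    return f"{day.isoformat()}: {what} -> {conclusion}"


def append_note(memory_path: str, what: str, conclusion: str, on: Optional[date] = None) -> None:
    """Append a single note line, creating the directory and file if needed."""
    path = Path(memory_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        f.write(format_note(what, conclusion, on) + "\n")
```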