Google Ads Transparency Center — Ad Extraction

advertiser ID → ad creatives list with preview URLs, formats, date ranges, targeting, and variations

Language

All process output to user (progress updates, process notifications) follows the user's language.

Objective

Extract ad creative data from Google Ads Transparency Center for a given advertiser, including creative listings with pagination, individual creative details with targeting and variation data, and advertiser profile information.

Prerequisites

Browser is open on the advertiser's transparency page: https://adstransparency.google.com/advertiser/{advertiser_id}
No login required (public data)

Pre-execution Checks

1. Tool Readiness

If browser-act has been confirmed available in the current session, skip this step.

Invoke browser-act via the Skill tool to load its usage instructions. If installation or configuration issues arise, follow its guidance to resolve them, then retry.

2. XSRF Token Acquisition

Before calling any API endpoint, obtain the XSRF token:

eval "$(python scripts/get-xsrf-token.py)"

Output example:

{
  "error": false,
  "token": "ABCxyz123..."
}

Store the returned token value for use in all subsequent API calls via --token.

Capability Components

This Skill's operational boundary = what the user can manually do in their browser. It only reads data already displayed to the user on the page and never bypasses authentication or access controls. Its role is equivalent to copy-pasting on the user's behalf: the data is already on screen, and automation merely saves time.

JS code is encapsulated in Python files under the scripts/ directory and invoked via eval "$(python scripts/xxx.py {params})". Because $(...) is bash syntax, the bash tool is recommended for execution.

Below are all atomic capabilities discovered and verified during the exploration phase, listed by command template with parameters. Invoke and combine them freely as needed; there is no need to read the scripts/*.py source code or re-verify it. Inspect the scripts only to troubleshoot a failed execution.

API: search creatives by advertiser

eval "$(python scripts/search-creatives.py '{advertiser_id}' --token '{token}' --page-size {count})"

Parameters:

advertiser_id (positional, required): Advertiser ID string (e.g. AR01234567890123456)
--token: XSRF token obtained from get-xsrf-token.py (required)
--page-size: Number of results per page, default 40
--cursor: Pagination cursor from previous response's next_cursor, omit for first page
--region: Geo target ID for region filter (e.g. 2840 for US, 2724 for Spain, 2276 for Germany)
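
The parameters above can be assembled programmatically. The following is a minimal sketch, not part of the Skill's scripts: the helper name search_command is hypothetical, and it only builds the inner python command string that the eval "$(...)" template wraps. It uses shlex.join so that tokens and cursors containing shell-special characters are quoted safely.

```python
import shlex
from typing import Optional


def search_command(
    advertiser_id: str,
    token: str,
    page_size: int = 40,
    cursor: Optional[str] = None,
    region: Optional[int] = None,
) -> str:
    """Build the inner command for search-creatives.py (hypothetical helper)."""
    argv = [
        "python",
        "scripts/search-creatives.py",
        advertiser_id,
        "--token",
        token,
        "--page-size",
        str(page_size),
    ]
    # --cursor is omitted for the first page, as documented above.
    if cursor is not None:
        argv += ["--cursor", cursor]
    # --region is omitted for global results.
    if region is not None:
        argv += ["--region", str(region)]
    # shlex.join quotes each argument only when the shell requires it.
    return shlex.join(argv)
```

Calling search_command("AR01234567890123456", token, cursor="CAIoAQ==") would give the command for the second page of an unfiltered search.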

Output example:

{
  "error": false,
  "count": 40,
  "has_next_page": true,
  "next_cursor": "CAIoAQ==",
  "ads": [
    {
      "advertiser_id": "AR01234567890123456",
      "creative_id": "CR98765432109876543",
      "preview_url": "https://adstransparency.google.com/...",
      "format": 3,
      "format_label": "IMAGE",
      "first_shown": 1700000000,
      "last_shown": 1710000000,
      "advertiser_name": "Example Corp",
      "days_served": 45
    }
  ]
}
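
For reporting, the timestamp fields in each ad record can be converted to calendar dates. The sketch below assumes that first_shown and last_shown in the search response are Unix epoch seconds (an inference from the magnitude of the example values, not something the scripts confirm) and interprets them in UTC. The helper name summarize_ad is hypothetical.

```python
from datetime import datetime, timezone


def summarize_ad(ad: dict) -> dict:
    """Reduce one search result to a few human-readable fields."""

    def to_date(ts: int) -> str:
        # Assumption: ts is Unix epoch seconds; converted to a UTC date.
        return datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat()

    return {
        "creative_id": ad["creative_id"],
        "format_label": ad.get("format_label"),
        "first_shown": to_date(ad["first_shown"]),
        "last_shown": to_date(ad["last_shown"]),
    }
```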

API: get creative detail

eval "$(python scripts/get-creative-detail.py '{advertiser_id}' '{creative_id}' --token '{token}')"

Parameters:

advertiser_id (positional, required): Advertiser ID string
creative_id (positional, required): Creative ID string (e.g. CR98765432109876543)
--token: XSRF token (required)

Output example:

{
  "error": false,
  "advertiser_id": "AR01234567890123456",
  "creative_id": "CR98765432109876543",
  "last_shown": 1710000000,
  "format": 3,
  "format_label": "IMAGE",
  "variations_count": 3,
  "variations": [
    {"preview_url": "https://adstransparency.google.com/..."}
  ],
  "targeting": {
    "demographics": null,
    "geo_impressions_lower": 100000,
    "impressions_upper": 500000,
    "first_shown": 20240101,
    "last_shown": 20240315,
    "surface_stats": [
      {
        "impressions_lower": 50000,
        "impressions_upper": 100000,
        "first_shown": 20240101,
        "last_shown": 20240315
      }
    ]
  }
}
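
In the example above, the dates inside targeting look like YYYYMMDD integers, unlike the top-level last_shown value. A minimal sketch for reading them, assuming that encoding holds and that targeting itself may be null (the helper names are hypothetical):

```python
from datetime import date, datetime
from typing import Optional, Tuple


def parse_yyyymmdd(value: Optional[int]) -> Optional[date]:
    """Convert an integer such as 20240101 to a date; pass None through."""
    if value is None:
        return None
    # Assumption: the value is a calendar date encoded as YYYYMMDD.
    return datetime.strptime(str(value), "%Y%m%d").date()


def targeting_window(detail: dict) -> Tuple[Optional[date], Optional[date]]:
    """Return (first_shown, last_shown) from a creative detail response."""
    # Targeting data may be null for some creatives, so fall back to {}.
    targeting = detail.get("targeting") or {}
    return (
        parse_yyyymmdd(targeting.get("first_shown")),
        parse_yyyymmdd(targeting.get("last_shown")),
    )
```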

API: get advertiser info

eval "$(python scripts/get-advertiser.py '{advertiser_id}' --token '{token}')"

Parameters:

advertiser_id (positional, required): Advertiser ID string
--token: XSRF token (required)

Output example:

{
  "error": false,
  "advertiser_id": "AR01234567890123456",
  "name": "Example Corp",
  "country_code": "US"
}

API: resolve region codes to country names

python scripts/resolve-regions.py {geo_id_1} {geo_id_2} ...

Note: This script runs locally (no browser eval needed) — it's a static lookup table.

Parameters:

geo_ids (positional, optional): Space-separated geo target IDs. If omitted, outputs the full mapping.

Output example:

{
  "2840": {"code": "US", "name": "United States"},
  "2724": {"code": "ES", "name": "Spain"},
  "2276": {"code": "DE", "name": "Germany"}
}

AI Workflow: extract ad creative content (headline, description, CTA, click URL)

Ad creative content (headline, description, CTA button text, landing page URL, image/video assets) is rendered inside cross-origin iframes that cannot be accessed programmatically from the parent page. This component uses visual extraction via screenshot.

navigate https://adstransparency.google.com/advertiser/{advertiser_id}/creative/{creative_id}?region=anywhere → page loaded, title contains "adstransparency.google.com"
wait stable → wait for ad preview iframe to render
screenshot → capture the rendered ad preview area
Read the screenshot to extract:
- headline: main text at top of the ad creative
- description: body text below the headline
- cta: button text (e.g., "Install", "Learn More", "Shop Now", "Sign Up")
- click_url: visible URL or app store link shown in the ad
- image_url: if IMAGE format, the rendered image (use tpc.googlesyndication.com preview URL from get-creative-detail.py variations)
If ad shows "removed due to policy violation" or "cannot show this ad", mark content_available: false

Output example:

{
  "content_available": true,
  "headline": "Catch Pokémon in the Real World",
  "description": "Join millions of Trainers worldwide",
  "cta": "Install",
  "click_url": "https://play.google.com/store/apps/details?id=com.nianticlabs.pokemongo"
}

Performance note: This workflow requires navigating to each creative's detail page and waiting for the iframe to render (~3-5 seconds per ad). For bulk extraction, prefer using only search-creatives.py + get-creative-detail.py (no page navigation or screenshots). Only use this workflow for targeted analysis of specific ads.

Composite: full advertiser ad extraction

Combines search + detail lookups for comprehensive extraction:

1. Get XSRF token via get-xsrf-token.py
2. Get advertiser info via get-advertiser.py
3. Search all creatives via search-creatives.py (paginate until has_next_page: false)
4. For each creative, get metadata via get-creative-detail.py (region stats, targeting, variations)
5. (Optional, slow) For creatives needing content analysis, use the AI Workflow above to extract headline/description/CTA/click_url
6. Resolve region codes via resolve-regions.py for human-readable country names

This composite flow yields a complete dataset of an advertiser's public ad history including all creative variations, targeting regions, impression estimates, and optionally ad copy content.

Region Codes

Region geo target IDs follow the pattern: 2000 + ISO 3166-1 numeric code

Common examples:

US (United States): 2840
GB (United Kingdom): 2826
DE (Germany): 2276
ES (Spain): 2724
FR (France): 2250
JP (Japan): 2392
AU (Australia): 2036
BR (Brazil): 2076
IN (India): 2356
CA (Canada): 2124

To calculate any country's code: look up the ISO 3166-1 numeric code and add 2000.
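
The rule is simple enough to express directly. A minimal sketch (the function name geo_target_id is hypothetical; the ISO numeric codes are the ones behind the examples listed above):

```python
def geo_target_id(iso_numeric: int) -> int:
    """Return the region geo target ID for an ISO 3166-1 numeric code."""
    # ISO 3166-1 numeric codes are three digits, so the valid range is 1..999.
    if not 0 < iso_numeric < 1000:
        raise ValueError(f"not an ISO 3166-1 numeric code: {iso_numeric}")
    return 2000 + iso_numeric


# ISO 3166-1 numeric codes for a few of the countries listed above.
ISO_NUMERIC = {"US": 840, "GB": 826, "DE": 276, "ES": 724, "AU": 36}

REGION_IDS = {code: geo_target_id(num) for code, num in ISO_NUMERIC.items()}
```

Note that codes with a leading zero, such as Australia's 036, are plain integers here, so 36 maps to 2036.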

Pagination

API pagination uses the --cursor parameter:

Type: cursor string
Start value: omit for the first page
Next page value source: next_cursor field in the previous response
Termination: has_next_page: false or next_cursor: null
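
The cursor loop can be sketched as a generator. This is an illustration under stated assumptions, not the Skill's implementation: fetch_page is a hypothetical callable that runs search-creatives.py with the given cursor (None for the first page) and returns the parsed JSON response.

```python
from typing import Callable, Iterator, Optional


def iter_all_ads(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every ad across pages until the documented termination condition."""
    cursor: Optional[str] = None  # omitted (None) for the first page
    while True:
        page = fetch_page(cursor)
        if page.get("error"):
            raise RuntimeError(f"search failed at cursor {cursor!r}")
        yield from page.get("ads", [])
        cursor = page.get("next_cursor")
        # Stop on has_next_page: false or next_cursor: null.
        if not page.get("has_next_page") or cursor is None:
            break
```

Keeping the fetch step injectable means the loop can be tested against canned responses before it is pointed at the real script.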

Success Criteria

Token extraction: error = false AND token is non-empty string
Creative search: error = false AND count >= 0 AND has_next_page field present
Creative detail: error = false AND creative_id is non-null
Advertiser info: error = false AND name is non-null

(count may be 0 for advertisers with no ads in the specified region; this is not an error)
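
These criteria translate directly into predicates over the parsed JSON responses. A minimal sketch (the function names are hypothetical):

```python
def token_ok(resp: dict) -> bool:
    """Token extraction: error is false and token is a non-empty string."""
    token = resp.get("token")
    return resp.get("error") is False and isinstance(token, str) and token != ""


def search_ok(resp: dict) -> bool:
    """Creative search: error is false, count >= 0, has_next_page present."""
    count = resp.get("count")
    return (
        resp.get("error") is False
        and isinstance(count, int)
        and count >= 0  # zero results is valid, not an error
        and "has_next_page" in resp
    )


def detail_ok(resp: dict) -> bool:
    """Creative detail: error is false and creative_id is non-null."""
    return resp.get("error") is False and resp.get("creative_id") is not None


def advertiser_ok(resp: dict) -> bool:
    """Advertiser info: error is false and name is non-null."""
    return resp.get("error") is False and resp.get("name") is not None
```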

Known Limitations

XSRF token may expire after extended periods; re-extract if API calls return authentication errors
The format field uses numeric codes: 1=TEXT, 2=VIDEO, 3=IMAGE; other codes may exist for newer ad formats
Targeting and impression data may be null for some creatives (not all ads have disclosed targeting info)
Region filter narrows results to ads shown in that specific region; omit for global results
The internal RPC endpoint paths may change if the site is updated; if calls fail with 404, the page structure may have changed
Rate limiting may occur with very high-frequency requests; add 1-2 second delays between requests in batch scripts
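
Because other format codes may appear, downstream code should not assume the mapping is complete. A minimal sketch of a lookup that falls back gracefully (the names FORMAT_LABELS and format_label are hypothetical; the three known codes are the ones listed above):

```python
# Known numeric format codes, as listed in the limitations above.
FORMAT_LABELS = {1: "TEXT", 2: "VIDEO", 3: "IMAGE"}


def format_label(code: int) -> str:
    """Return the label for a format code, keeping unknown codes visible."""
    # Unknown codes are preserved in the label instead of raising,
    # so a newer ad format does not abort a batch run.
    return FORMAT_LABELS.get(code, f"UNKNOWN({code})")
```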

Execution Efficiency

Batch orchestration: Write a bash script to loop through the command templates serially within a single session; do not parallelize within one browser (prone to triggering rate limits). Add 1-second delays between API calls. To increase throughput, open multiple stealth browser sessions and distribute work across them
Test before batch execution: After writing a batch script, first test with 1-2 items to verify the script runs correctly; only then run the full batch. Never skip testing and execute in batch directly
Reduce redundant pre-operations: The XSRF token only needs to be fetched once per session; reuse it across all subsequent calls
Error resumption: Save results item by item during batch processing; on failure, resume from the last successful cursor rather than starting over
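
Error resumption can be sketched with two small files: an append-only results file and a checkpoint holding the next cursor. This is an illustration under assumptions, not part of the Skill: the file layout (JSON Lines for results, a one-key JSON checkpoint) and all function names are hypothetical.

```python
import json
from pathlib import Path
from typing import Optional


def append_results(path: Path, ads: list) -> None:
    """Append each ad as one JSON line so earlier pages survive a crash."""
    with path.open("a", encoding="utf-8") as fh:
        for ad in ads:
            fh.write(json.dumps(ad, ensure_ascii=False) + "\n")


def save_checkpoint(path: Path, next_cursor: Optional[str]) -> None:
    """Record the cursor of the next page to fetch after a successful page."""
    path.write_text(json.dumps({"next_cursor": next_cursor}), encoding="utf-8")


def load_checkpoint(path: Path) -> Optional[str]:
    """Return the saved cursor, or None when starting from the first page."""
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8")).get("next_cursor")
```

Writing the results before the checkpoint means a crash between the two steps can at worst repeat one page, which is easier to de-duplicate by creative_id than to recover a skipped page.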

Experience Notes

Path: {working-directory}/browser-act-skill-forge-memories/google-ads-transparency-google-ads-transparency-ads.memory.md (working directory is determined by the Agent running the Skill, typically the project root or current working directory)

Before execution: If the file exists, read it first — it records unexpected situations encountered during past executions (e.g., a strategy has become ineffective); adjust strategy order accordingly.

After execution: If an unexpected situation is encountered (strategy became ineffective, page redesigned, token format changed, better path discovered), append a line: {YYYY-MM-DD}: {what happened} -> {conclusion}

Normal execution does not write to the file. Do not record what advertiser IDs were queried or how many results were returned — those are task outputs, not experience.
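
The note format above can be produced by a small helper. A minimal sketch, assuming the memory file's parent directory may need to be created on first write (the function name append_memory_note is hypothetical):

```python
from datetime import date
from pathlib import Path
from typing import Optional


def append_memory_note(
    path: Path, what_happened: str, conclusion: str, today: Optional[date] = None
) -> str:
    """Append one '{YYYY-MM-DD}: {what happened} -> {conclusion}' line."""
    day = (today or date.today()).isoformat()
    line = f"{day}: {what_happened} -> {conclusion}\n"
    # Assumption: the directory may not exist yet on the first write.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(line)
    return line
```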