Category: Other
No API key required

Web Fetch Fallback

Fetch web content using curl as a fallback when web_fetch is blocked due to private, internal, or special-use IP address restrictions.

Author: tiantian-doubahubclawhub

This skill provides a fallback mechanism for fetching web content when the standard web_fetch tool is blocked due to security restrictions on private/internal IP addresses.

When to Use

Use this skill when:

  • web_fetch returns an error like "Blocked: resolves to private/internal/special-use IP address"
  • You need to fetch content from URLs that resolve to internal network addresses
  • The target URL is legitimate but blocked by OpenClaw's IP restrictions

Fallback Method: curl

When web_fetch is blocked, use curl via the exec tool to fetch the content:

Basic Usage

curl -sL "<URL>"

With Timeouts and Redirect Following

curl -sL --max-time 30 --connect-timeout 10 "<URL>"

Fetching with Custom Headers

curl -sL -H "User-Agent: Mozilla/5.0" -H "Accept: text/html" "<URL>"

Saving to File

curl -sL -o /tmp/fetched_content.html "<URL>"

Example: Fetch and Process Content

# Fetch content and extract text using html2text or similar
curl -sL "https://example.com" | html2text -utf8

# Or save and read
curl -sL -o /tmp/page.html "https://example.com"
cat /tmp/page.html

Reference Script

See scripts/curl_fetch.sh for a reusable curl-based fetching script with error handling and common options.

Limitations and Security Considerations

Limitations

  1. No built-in content extraction: Unlike web_fetch, curl returns raw HTML. You may need to parse/extract content manually.
  2. No automatic formatting: web_fetch returns markdown; curl returns the raw response body (typically HTML) with no conversion.
  3. Manual error handling: You must check curl exit codes and handle errors explicitly.
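To illustrate the first two limitations, here is a minimal sketch of a text-extraction filter for when html2text is not installed. The name html_to_text is hypothetical and is not part of web_fetch or curl. It is deliberately crude: it strips tags line by line, decodes only a handful of entities, and does not handle tags that span several lines or the contents of script and style elements.

```shell
# Crude HTML-to-text filter: reads HTML on stdin, writes text on stdout.
# html_to_text is a hypothetical helper name, not an existing tool.
html_to_text() {
  # 1) drop anything that looks like a tag
  # 2) decode a few common entities (&amp; last, to avoid double decoding)
  # 3) delete lines that are empty or whitespace-only
  sed -e 's/<[^>]*>//g' \
      -e 's/&nbsp;/ /g' -e 's/&lt;/</g' -e 's/&gt;/>/g' -e 's/&amp;/\&/g' \
  | sed -e '/^[[:space:]]*$/d'
}

# Usage (not executed here):
#   curl -sSL --max-time 30 "<URL>" | html_to_text
```

For anything beyond a quick look at a page, a real converter such as html2text is the safer choice.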

Security Considerations

⚠️ Important: This fallback bypasses OpenClaw's IP-based security checks. Only use when:

  1. You trust the target URL and its content
  2. The URL is from a legitimate internal service (e.g., company intranet, local development server)
  3. You have confirmed the URL is safe to access

Never use this fallback for:

  • Unknown or untrusted URLs
  • URLs from untrusted user input without validation
  • External websites that should be accessible via web_fetch (if blocked, there may be a legitimate security reason)
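One way to enforce these rules before calling curl is a host allowlist check. The sketch below is only an illustration: the function name is_allowed_url and the hosts in ALLOWED_HOSTS are assumptions, not part of this skill. It accepts only http and https URLs, and it strips any userinfo so that a URL like http://trusted-host@other-host/ is judged by its real host.

```shell
# Hypothetical allowlist of internal hosts known to be safe (space separated).
ALLOWED_HOSTS="internal.company.com localhost"

# Return 0 if the URL uses http/https and its host is on the allowlist.
is_allowed_url() {
  _url=$1
  case $_url in
    http://*|https://*) ;;
    *) return 1 ;;
  esac
  _rest=${_url#*://}            # drop the scheme
  _authority=${_rest%%[/?#]*}   # keep everything before the first /, ? or #
  _hostport=${_authority##*@}   # drop userinfo such as user:pass@
  _host=${_hostport%%:*}        # drop the port
  [ -n "$_host" ] || return 1
  for _allowed in $ALLOWED_HOSTS; do
    [ "$_host" = "$_allowed" ] && return 0
  done
  return 1
}

# Usage (not executed here):
#   is_allowed_url "$url" && curl -sSL --max-time 30 "$url"
```

The check covers only the URL as given. Because -L follows redirects, a trusted host could still redirect curl elsewhere, so dropping -L may be preferable for strict setups.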

Best Practices

  1. Always use timeouts (--max-time, --connect-timeout) to prevent hanging
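  2. Combine -s (silent, hides the progress meter) with -S (still print error messages) for cleaner output: curl -sSL ...
  3. Check exit codes: curl returns 0 on success, non-zero on failure. HTTP error statuses such as 404 or 500 still exit 0 unless -f/--fail is given.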
  4. Consider rate limiting for multiple requests
  5. Validate URLs before fetching (avoid SSRF vulnerabilities)
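The first three practices can be combined in a small wrapper. This is a sketch under assumptions: the name fetch_url and the timeout values are illustrative, and it is not the contents of scripts/curl_fetch.sh.

```shell
# fetch_url URL [OUTPUT_FILE]
# Hypothetical wrapper: timeouts, visible errors, and exit-code reporting.
fetch_url() {
  _url=$1
  _out=${2:--}   # "-" tells curl to write the body to stdout
  _status=0
  # -sS: no progress meter, but still print errors
  # -L: follow redirects; --fail: treat HTTP errors (>= 400) as failures
  curl -sSL --fail --max-time 30 --connect-timeout 10 \
       -o "$_out" "$_url" || _status=$?
  if [ "$_status" -ne 0 ]; then
    echo "fetch_url: curl exited with code $_status for $_url" >&2
  fi
  return "$_status"
}

# Usage (not executed here):
#   fetch_url "http://internal.company.com/docs" /tmp/page.html || echo "fetch failed"
```

Returning curl's own exit code lets the caller decide how to react, for example retrying on a timeout but not on a DNS failure.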

Common Exit Codes

| Exit Code | Meaning |
|-----------|---------|
| 0 | Success |
| 6 | Could not resolve host |
| 7 | Failed to connect to host |
| 28 | Operation timeout |
| 35 | SSL/TLS handshake failed |
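The codes in the table can be turned into readable messages with a small lookup. The function name describe_curl_exit is hypothetical. It covers only the codes listed above and falls back to a generic message for anything else.

```shell
# Map a curl exit code to the meaning listed in the table above.
# describe_curl_exit is a hypothetical helper name.
describe_curl_exit() {
  case $1 in
    0)  echo "Success" ;;
    6)  echo "Could not resolve host" ;;
    7)  echo "Failed to connect to host" ;;
    28) echo "Operation timeout" ;;
    35) echo "SSL/TLS handshake failed" ;;
    *)  echo "curl exit code $1 (see the curl man page)" ;;
  esac
}

# Usage (not executed here):
#   curl -sSL --max-time 30 "<URL>"; describe_curl_exit "$?"
```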

Example Workflow

1. Try web_fetch first:
   web_fetch(url="http://internal.company.com/docs")

2. If blocked with "private/internal IP" error, use curl fallback:
   exec(command='curl -sL --max-time 30 "http://internal.company.com/docs"')

3. Process the raw HTML output as needed
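Step 2 depends on recognizing the specific block error rather than falling back on every failure. A minimal sketch of that check follows. The helper name is_private_ip_block is hypothetical, and the matched text is taken from the error quoted under "When to Use", so the exact wording may differ in practice.

```shell
# Return 0 if an error message looks like the web_fetch private-IP block.
# The pattern is based on the error text quoted in this document (assumption:
# the real message keeps the "private/internal ... IP" wording).
is_private_ip_block() {
  case $1 in
    *"private/internal"*"IP"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (not executed here):
#   if is_private_ip_block "$web_fetch_error"; then
#     curl -sSL --max-time 30 "http://internal.company.com/docs"
#   fi
```

Other web_fetch failures, such as a 404 or a DNS error, should not trigger the fallback, since curl would hit the same problem.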