Fetch and extract content from URLs
Renders web pages using a real browser (including JavaScript-heavy sites) and returns clean extracted content in your preferred format. Submit up to 10 URLs, get back structured content. Per-URL failures appear in errors[] and do not fail the entire request.
Per-URL error codes (in errors[].error):
target_http_error: target server returned a non-2xx HTTP status; the raw status code is in errors[].status
target_unreachable: connection refused, TLS failure, DNS failure, or other network error
timeout: request timed out
proxy_error: proxy tunnel failure
bot_blocked: bot-challenge page detected (Cloudflare, etc.)
empty_content: page loaded but no extractable text was found
invalid_url: malformed URL or SSRF-blocked address
invalid_redirect_url: redirect target rejected before fetch
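One way a client might act on these codes is to sort each errors[] entry into "retry" or "give up". The sketch below is illustrative only: the code names and the errors[].error / errors[].status fields come from the list above, but the grouping into retryable and permanent failures, the treatment of 429 and 5xx as transient, and the function name classify_error are assumptions, not documented behavior.

```python
# Hypothetical client-side helper. The retry policy here is an assumption,
# not something the API documents.
RETRYABLE = {"target_unreachable", "timeout", "proxy_error"}
PERMANENT = {"invalid_url", "invalid_redirect_url", "empty_content", "bot_blocked"}


def classify_error(entry: dict) -> str:
    """Return "retry", "give_up", or "unknown" for one errors[] entry."""
    code = entry.get("error")
    if code == "target_http_error":
        status = entry.get("status")
        # Assumption: 429 and 5xx responses are often transient.
        if isinstance(status, int) and (status == 429 or status >= 500):
            return "retry"
        return "give_up"
    if code in RETRYABLE:
        return "retry"
    if code in PERMANENT:
        return "give_up"
    # Unrecognized code: leave the decision to the caller.
    return "unknown"
```

A caller could loop over errors[] and resubmit only the URLs whose entries classify as "retry".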
Authorizations
API key for authentication. Get your key from the API Keys page.
Body
URLs to fetch and extraction options
Array of URLs to fetch (1-10). All URLs are fetched in parallel. Each URL is processed independently — if one fails, others still return successfully. Errors are reported per-URL in the errors array.
1 - 10 elements
[
"https://example.com",
"https://example.org"
]
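Because a single request accepts at most 10 URLs, a longer list has to be split on the client side. The following is a minimal sketch of that batching; the helper name batch_urls and the constant MAX_URLS_PER_REQUEST are hypothetical, and only the limit of 10 comes from the text above.

```python
from typing import Iterator, List, Sequence

# Limit taken from the parameter description above (1-10 URLs per request).
MAX_URLS_PER_REQUEST = 10


def batch_urls(urls: Sequence[str], size: int = MAX_URLS_PER_REQUEST) -> Iterator[List[str]]:
    """Yield consecutive slices of `urls`, each holding between 1 and `size` items."""
    if not 1 <= size <= MAX_URLS_PER_REQUEST:
        raise ValueError("size must be between 1 and 10")
    for start in range(0, len(urls), size):
        yield list(urls[start:start + size])
```

Each yielded batch can be sent as its own request; an empty input yields no batches, so no request with zero URLs is ever built.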
Output format for extracted content. "markdown" (default) is ideal for LLM consumption. "html" returns cleaned semantic HTML. "json" returns a structured document tree.
Allowed values: markdown, html, json. Default: "markdown".
When true and format is "html", return a complete HTML document with an injected head, which contains curated content metadata when available.
Default: false
Extract all outbound links from each page. Useful for discovering related pages or navigating to specific content. Links are returned as absolute URLs in the links array of each result.
Default: false
Extract all image URLs from each page. Useful for finding visual content or media assets. Image links are returned as absolute URLs in the image_links array of each result.
Default: false
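When link or image extraction is enabled, each result carries its own links or image_links array. A small sketch of merging those arrays across results is shown here. It assumes the results have already been parsed into Python dicts; the function name collect_urls is hypothetical, and only the keys "links" and "image_links" come from the descriptions above.

```python
from typing import Iterable, List


def collect_urls(results: Iterable[dict], key: str) -> List[str]:
    """Merge the `key` array ("links" or "image_links") of every result,
    dropping duplicates while keeping first-seen order."""
    seen = set()
    ordered = []
    for result in results:
        # A result may lack the array if extraction was not requested.
        for url in result.get(key) or []:
            if url not in seen:
                seen.add(url)
                ordered.append(url)
    return ordered
```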
Caller freshness tolerance in seconds for the cached entry. Omit (default) for unlimited tolerance — any cached entry is acceptable. Set to 0 to prefer a live fetch; a cached entry is still served if the origin's Cache-Control: max-age covers its age, or the host is in the small allowlist of operator-pinned never-expire domains. Set to N > 0 to accept a cached entry whose age is below N; the upstream Cache-Control: max-age and the never-expire allowlist may extend (never shorten) this tolerance.
x >= 0
Example: 0
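The freshness rules above can be read as a single decision: given a cached entry's age, may it be served? The sketch below is one interpretation of that description, not the service's actual implementation. All names (may_serve_cached, tolerance_s, origin_max_age_s, host_pinned) are hypothetical, and the boundary handling (age strictly below the tolerance, age at or below max-age) is an assumption.

```python
from typing import Optional


def may_serve_cached(
    age_s: float,
    tolerance_s: Optional[float],
    origin_max_age_s: Optional[float] = None,
    host_pinned: bool = False,
) -> bool:
    """Decide whether a cached entry of age `age_s` seconds is acceptable."""
    if tolerance_s is None:
        # Parameter omitted: unlimited tolerance, any cached entry is fine.
        return True
    if tolerance_s < 0:
        raise ValueError("tolerance must be >= 0")
    if host_pinned:
        # Host is on the never-expire allowlist.
        return True
    if origin_max_age_s is not None and age_s <= origin_max_age_s:
        # The origin's Cache-Control: max-age still covers this entry.
        return True
    # Otherwise only the caller's tolerance applies; with 0 this is never true,
    # which corresponds to preferring a live fetch.
    return age_s < tolerance_s
```

Note that max-age and the allowlist only ever turn a "no" into a "yes", which matches the statement that they may extend but never shorten the caller's tolerance.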