ScrapingDuck API Docs

Extracts the main article content from a web page.

Deprecated
GET https://api.scrapingduck.com/v1/browser/article
Last modified: 2026-02-11 10:03:19
A convenience endpoint that visits the specified URL, extracts the main article text, and returns it as clean HTML.
Browser behavior can be customized using the optional query parameters.
Default Behavior:
If deviceName is omitted, a realistic, rotating browser fingerprint will be used.
If locale and timezone are omitted, the browser's locale and timezone will be matched to the proxy's IP address geolocation.
Overrides:
Providing a deviceName uses that specific device profile.
Providing a locale or timezone will disable the automatic matching based on the proxy.
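The defaults and overrides above can be sketched as a small URL builder. This is a minimal illustration, not an official client: the function name build_article_url is hypothetical, and the sample values "de-DE" and "Europe/Berlin" are assumptions about the accepted locale and timezone formats, which this page does not specify. Optional parameters are left out entirely when not supplied, so the default behavior described above applies.

```python
from urllib.parse import urlencode

# Endpoint taken from this page.
BASE_URL = "https://api.scrapingduck.com/v1/browser/article"


def build_article_url(target_url, api_key, device_name=None, locale=None, timezone=None):
    """Build the GET URL for the article endpoint (hypothetical helper).

    Optional parameters set to None are omitted, so the API falls back to
    its documented defaults (rotating fingerprint, proxy-matched locale
    and timezone).
    """
    params = {"url": target_url}
    optional = {"deviceName": device_name, "locale": locale, "timezone": timezone}
    params.update({key: value for key, value in optional.items() if value is not None})
    params["apiKey"] = api_key
    # urlencode percent-encodes the target URL so it survives as one query value.
    return f"{BASE_URL}?{urlencode(params)}"


# Defaults only: no deviceName, locale, or timezone in the query string.
default_url = build_article_url("https://example.com/news/story", "<api-key>")

# Overrides: an explicit locale and timezone (assumed formats) turn off
# the automatic matching to the proxy's geolocation.
override_url = build_article_url(
    "https://example.com/news/story",
    "<api-key>",
    locale="de-DE",
    timezone="Europe/Berlin",
)
```

Omitting a parameter rather than sending it empty keeps the request unambiguous about which defaults should apply.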
Example Request:
GET /browser/article?url=https://example.com/news/story
Possible error codes:
400 (validation_error): The url query parameter is missing or invalid.
400 (invalid_url_scheme): The url parameter is not a valid HTTP or HTTPS URL.
401 (unauthorized): The API key is missing or invalid.
404 (device_not_found): The specified deviceName does not exist. A list of available devices can be retrieved from the /devices endpoint.
500 (unhandled_error): An unexpected server-side error occurred during the execution of the request.
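A client might translate these status codes into actionable messages along the following lines. This is a sketch under assumptions: the helper names are hypothetical, the error response body is not shown on this page so only the HTTP status is inspected, and treating 500 as the only retryable status is a design choice made here, not documented API behavior.

```python
# Meanings copied from the error list above, keyed by HTTP status.
# Status 400 covers two documented codes, so both are named.
ERROR_MEANINGS = {
    400: "validation_error or invalid_url_scheme: the url parameter is missing, invalid, or not HTTP/HTTPS",
    401: "unauthorized: the API key is missing or invalid",
    404: "device_not_found: the specified deviceName does not exist (see the /devices endpoint)",
    500: "unhandled_error: an unexpected server-side error occurred",
}


def describe_status(status):
    """Return a short description for an HTTP status from this endpoint."""
    if status == 200:
        return "ok"
    return ERROR_MEANINGS.get(status, f"unexpected status {status}")


def is_retryable(status):
    """Assumption: only a 500 may succeed on retry; 4xx errors need a changed request."""
    return status == 500
```

For example, describe_status(404) points the caller at the /devices endpoint, while is_retryable(400) is False because resending the same invalid url cannot succeed.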

Request

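Authorization
API Key: add the apiKey parameter to the query string.
Example:
apiKey: ********************
Query Params
url (required), plus the optional deviceName, locale, timezone, disableResourceExclusion, useAdvancedSpoofing, and disableJavaScript.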

Responses

200 OK (application/json)
400
401
404
500
Request Example
Shell
# Optional query parameters, omitted here: deviceName, locale, timezone, disableResourceExclusion, useAdvancedSpoofing, disableJavaScript
curl --location --request GET 'https://api.scrapingduck.com/v1/browser/article?url=https://example.com/news/story&apiKey=<api-key>'
Response Example
200 - Success
{
    "executionId": "myApiKey_a1b2c3d4",
    "results": [
        {
            "type": "VisitUrlActionResult",
            "success": true,
            "data": {
                "status": 200
            }
        },
        {
            "type": "ExtractArticleActionResult",
            "success": true,
            "data": {
                "article": {
                    "uri": "https://news.ycombinator.com/item?id=40825098",
                    "title": "Extracted Article Title",
                    "byline": "John Doe",
                    "content": "<div><p>This is the main content of the article.</p></div>",
                    "textContent": "This is the main content of the article.",
                    "excerpt": "A short summary of the article content.",
                    "author": "John Doe",
                    "siteName": "Hacker News",
                    "publicationDate": "2025-01-01T12:00:00Z",
                    "featuredImage": "https://via.placeholder.com/1200x630.png",
                    "language": "en",
                    "timeToRead": "00:05:00",
                    "isReadable": true
                }
            }
        }
    ],
    "meta": {
        "actionsRequested": 2,
        "actionsSucceeded": 2,
        "startedAtUtc": "2025-01-01T12:00:00Z",
        "finishedAtUtc": "2025-01-01T12:00:02Z",
        "durationMs": 2000,
        "responseSizeBytes": 1234,
        "maxActionsAllowed": 50
    }
}
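Pulling the article out of a response shaped like the example above could look like the following. This is a minimal sketch: extract_article is a hypothetical helper, the sample body is a trimmed copy of the example response, and it assumes the article always sits under the result whose type is "ExtractArticleActionResult".

```python
import json

# Trimmed copy of the example response above, used as offline sample data.
SAMPLE_BODY = (
    '{"executionId": "myApiKey_a1b2c3d4", "results": ['
    '{"type": "VisitUrlActionResult", "success": true, "data": {"status": 200}},'
    '{"type": "ExtractArticleActionResult", "success": true, "data": {"article": {'
    '"title": "Extracted Article Title",'
    '"textContent": "This is the main content of the article."}}}],'
    '"meta": {"actionsRequested": 2, "actionsSucceeded": 2}}'
)


def extract_article(payload):
    """Return the article dict from a parsed response, or None if absent.

    Looks for the first successful result of type ExtractArticleActionResult
    instead of relying on its position in the results list.
    """
    for result in payload.get("results", []):
        if result.get("type") == "ExtractArticleActionResult" and result.get("success"):
            return result.get("data", {}).get("article")
    return None


article = extract_article(json.loads(SAMPLE_BODY))
if article is not None:
    print(article["title"])
```

Matching on the type field rather than a fixed index keeps the lookup working if the order or number of results changes.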