BrowserElf - Advanced Web Scraping API Documentation

🚀 API Endpoints

GET /health

Check the health status of the BrowserElf service. Returns service status, uptime information, and current timestamp.

Response Example

{ "status": "ok", "message": "BrowserElf is alive 🧝‍♂️", "timestamp": "2024-01-15T10:30:00.000Z" }
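A client can treat the service as healthy only when the body parses as JSON and `status` is `"ok"`. The following is a minimal sketch, assuming the response always has the shape shown in the example above; `is_healthy` is a hypothetical helper name, not part of BrowserElf.

```python
import json


def is_healthy(body: str) -> bool:
    """Return True if a /health response body reports status "ok"."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # A non-JSON body (e.g. a proxy error page) counts as unhealthy.
        return False
    # Assumption: a healthy service sets "status" to "ok", as in the example.
    return isinstance(payload, dict) and payload.get("status") == "ok"


sample = (
    '{"status": "ok", "message": "BrowserElf is alive", '
    '"timestamp": "2024-01-15T10:30:00.000Z"}'
)
print(is_healthy(sample))
```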

GET /scrape

A convenience endpoint for quick testing from a browser. Pass the target URL and any options as query parameters.

Browser URL Example

https://your-domain.com/scrape?url=https://example.com&stealth=true&format=json
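When the target URL carries its own query string, its `&` and `=` characters can be mistaken for parameters of `/scrape` unless the value is percent-encoded. The sketch below builds such a URL with the standard library. It assumes the service decodes percent-encoded query values, as HTTP frameworks normally do; `build_scrape_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def build_scrape_url(base: str, target: str, **options: str) -> str:
    """Build a GET /scrape URL with the target URL percent-encoded."""
    # urlencode escapes ":", "/", "?", "&" and "=" inside each value.
    query = urlencode({"url": target, **options})
    return f"{base.rstrip('/')}/scrape?{query}"


print(
    build_scrape_url(
        "https://your-domain.com",
        "https://example.com/search?q=elf&page=2",
        stealth="true",
        format="json",
    )
)
```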

POST /scrape

The main scraping endpoint. Extract content, take screenshots, and get structured data from any website.

Request Parameters

Parameter	Type	Required	Description
url	string	Required	The URL to scrape (must include protocol)
screenshot	boolean	Optional	Whether to take a screenshot (default: false)
screenshotOptions	object	Optional	Puppeteer screenshot configuration
format	string	Optional	Response format: "html", "json", or "raw"
selector	string	Optional	CSS selector for specific content extraction
headers	object	Optional	Custom HTTP headers to send with request
extract	array	Optional	Content types to extract: ["text", "links", "images", "metadata"]
stealth	boolean	Optional	Enable stealth mode for bypassing security (default: true)
forceProxy	boolean	Optional	Force proxy usage (bypasses smart proxy logic, default: false)
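The constraints in the table can be checked on the client before a request is sent. This is a rough sketch rather than the service's own validation: it assumes "must include protocol" means an `http://` or `https://` prefix, and `build_scrape_payload` is a hypothetical helper.

```python
ALLOWED_FORMATS = {"html", "json", "raw"}
ALLOWED_EXTRACT = {"text", "links", "images", "metadata"}


def build_scrape_payload(url, *, format=None, extract=None, **options):
    """Check the documented constraints and return a JSON-serializable dict."""
    # Assumption: only http:// and https:// count as a valid protocol.
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must include a protocol, e.g. https://")
    if format is not None and format not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {sorted(ALLOWED_FORMATS)}")
    unknown = set(extract or []) - ALLOWED_EXTRACT
    if unknown:
        raise ValueError(f"unsupported extract types: {sorted(unknown)}")

    # Remaining options (screenshot, selector, headers, ...) pass through as-is.
    payload = {"url": url, **options}
    if format is not None:
        payload["format"] = format
    if extract is not None:
        payload["extract"] = list(extract)
    return payload


print(build_scrape_payload("https://example.com", format="json", extract=["text", "links"]))
```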

Request Example

curl -X POST http://127.0.0.1:3000/scrape \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "screenshot": true,
    "extract": ["text", "links", "metadata"],
    "selector": "h1",
    "headers": {"User-Agent": "CustomBot/1.0"},
    "format": "json",
    "stealth": true
  }'

Response Example

{
  "url": "https://example.com",
  "status": 200,
  "timestamp": "2024-01-15T10:30:00.000Z",
  "loadTimeMs": 1250,
  "size": 45230,
  "html": "<!DOCTYPE html>...",
  "text": "Example Domain...",
  "links": ["https://www.iana.org/"],
  "images": ["https://example.com/image.jpg"],
  "metadata": {
    "title": "Example Domain",
    "description": "This domain is for use in examples",
    "canonical": "https://example.com/",
    "headers": ["Example Domain"],
    "favicon": "/favicon.ico"
  },
  "screenshot": "iVBORw0KGgoAAAANSUhEUgAA..."
}
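The `screenshot` value in the response begins with `iVBORw0KGgo`, which is the Base64 encoding of the PNG file signature. The sketch below decodes the field on that basis. It assumes the field is always Base64 text and is absent or empty when no screenshot was requested; other image formats may be possible through `screenshotOptions`. The helper names are hypothetical.

```python
import base64
from typing import Optional

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_screenshot(response: dict) -> Optional[bytes]:
    """Return the decoded screenshot bytes, or None if none is present."""
    encoded = response.get("screenshot")
    if not encoded:
        return None
    # Assumption: the field holds standard Base64 text.
    return base64.b64decode(encoded)


def is_png(data: bytes) -> bool:
    """Check for the 8-byte PNG file signature."""
    return data.startswith(PNG_SIGNATURE)


# Stand-in response: a real one would carry a complete image.
fake_response = {
    "screenshot": base64.b64encode(PNG_SIGNATURE + b"image data").decode("ascii")
}
image = decode_screenshot(fake_response)
print(is_png(image))
```

The returned bytes can be written to disk with `open(path, "wb")`.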

GET /logs

Retrieve the last 20 scraping requests with detailed information including status, timing, and any errors.

Response Example

{
  "logs": [
    {
      "url": "https://example.com",
      "timestamp": "2024-01-15T10:30:00.000Z",
      "options": { "screenshot": true },
      "status": "success",
      "loadTime": 1250,
      "error": null
    }
  ],
  "total": 42,
  "timestamp": "2024-01-15T10:30:00.000Z"
}
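A `/logs` response like the one above can be reduced to a few numbers for monitoring. This is a sketch under assumptions the documentation does not confirm: that a failed entry has a non-null `error` or a `status` other than `"success"`, and that `loadTime` is in milliseconds. The second log entry and `summarize_logs` are invented for illustration.

```python
from statistics import mean


def summarize_logs(payload: dict) -> dict:
    """Count returned and failed entries and average the load time."""
    entries = payload.get("logs", [])
    # Assumption: failures have an error value or a non-"success" status.
    failed = [
        e for e in entries
        if e.get("error") is not None or e.get("status") != "success"
    ]
    load_times = [
        e["loadTime"] for e in entries
        if isinstance(e.get("loadTime"), (int, float))
    ]
    return {
        "returned": len(entries),
        "failed": len(failed),
        "avg_load_time": mean(load_times) if load_times else None,
    }


sample_logs = {
    "logs": [
        {"url": "https://example.com", "status": "success", "loadTime": 1250, "error": None},
        # Hypothetical failed entry; the real error format is not documented.
        {"url": "https://example.org", "status": "error", "loadTime": None, "error": "timeout"},
    ],
    "total": 42,
}
print(summarize_logs(sample_logs))
```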

⚡ Features & Capabilities

📸 Screenshot Capture

Take full-page or viewport screenshots with customizable Puppeteer options including viewport size, format, and quality settings.

🎯 Smart Content Extraction

Extract text, links, images, and metadata. Use CSS selectors for precise content targeting with Cheerio parsing.

🔍 Metadata Parsing

Automatically extract page title, description, canonical URLs, favicons, and heading structure (H1-H3).

⚡ High Performance

Optimized for speed with timeout handling, connection pooling, and efficient memory management for large-scale scraping.

📊 Detailed Logging

Comprehensive request logging with performance metrics, error tracking, and request history (last 100 requests).

🔧 Flexible Output

Multiple response formats: JSON for structured data, HTML for raw content, or raw format with headers.

🛡️ Robust Error Handling

Comprehensive error handling for timeouts, invalid URLs, network issues, and screenshot failures with detailed error messages.

🌐 Custom Headers

Send custom HTTP headers including User-Agent strings, authentication tokens, and other request modifications.

🥷 Stealth Mode

Advanced stealth features to bypass Cloudflare and other security measures with human-like behavior patterns.

📚 Quick Start Guide

1. Basic Health Check

Verify the service is running and accessible.

Test Connection

curl http://127.0.0.1:3000/health

2. Simple Content Extraction

Extract basic content from a website.

Basic Scraping

curl -X POST http://127.0.0.1:3000/scrape \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'
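The same request can be made from a script. The sketch below uses only Python's standard library and separates building the request from sending it, so the request can be inspected without a running service. It assumes the response body is JSON, which the documentation does not confirm for the default `format`. The 60-second timeout and both function names are arbitrary choices.

```python
import json
import urllib.request


def build_scrape_request(base_url: str, payload: dict) -> urllib.request.Request:
    """Create the POST /scrape request without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/scrape",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def scrape(base_url: str, payload: dict, timeout: float = 60.0) -> dict:
    """Send the request and parse the reply (needs a running service)."""
    request = build_scrape_request(base_url, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Assumption: the body is JSON; "html" or "raw" formats may differ.
        return json.loads(response.read().decode("utf-8"))


request = build_scrape_request("http://127.0.0.1:3000", {"url": "https://example.com"})
print(request.get_method(), request.full_url)
```

With the service running locally, `scrape("http://127.0.0.1:3000", {"url": "https://example.com"})` would return the parsed response as a dictionary.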

3. Advanced Scraping with Screenshot

Extract content and capture a screenshot with custom options.

Advanced Scraping

curl -X POST http://127.0.0.1:3000/scrape \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "screenshot": true,
    "screenshotOptions": {
      "fullPage": true,
      "viewport": {"width": 1920, "height": 1080}
    },
    "extract": ["text", "links", "metadata"],
    "format": "json",
    "stealth": true
  }'

4. View Scraping History

Check the logs to see your scraping activity and performance metrics.

Check Logs

curl http://127.0.0.1:3000/logs
