Extract web page content
curl --request POST \
  --url https://api.worqhat.com/api/ai/v2/web-extract \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
    "url": "https://example.com",
    "includeHTML": false,
    "onlyMainContent": true
  }'
{
  "data": {
    "content": "<string>",
    "markdown": "<string>",
    "linksOnPage": [
      "<string>"
    ],
    "metadata": {
      "title": "<string>",
      "description": "<string>",
      "keywords": "<string>",
      "robots": "<string>",
      "ogTitle": "<string>",
      "ogDescription": "<string>",
      "ogImage": "<string>",
      "ogLocale": "<string>",
      "ogLocaleAlternate": [
        "<string>"
      ],
      "ogSiteName": "<string>",
      "screenshot": "<string>",
      "sourceURL": "<string>",
      "pageStatusCode": 123,
      "processingTime": 123,
      "processingId": "<string>",
      "processingCount": 123
    }
  }
}
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Body
url: The URL of the web page to extract content from.
includeHTML: Whether to include the HTML version of the content in the response.
onlyMainContent: Whether to return only the main content, excluding headers, footers, and similar page chrome.
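As an illustration of how the body fields map onto a request, here is a minimal Python sketch using only the standard library. The helper name build_web_extract_request and its keyword defaults are assumptions made for this example (the defaults simply mirror the curl sample); the endpoint URL, header names, and body keys are taken from the request shown above.

```python
import json
import urllib.request

WEB_EXTRACT_URL = "https://api.worqhat.com/api/ai/v2/web-extract"


def build_web_extract_request(token, page_url, include_html=False, only_main_content=True):
    """Build (but do not send) a POST request for the web-extract endpoint.

    Hypothetical helper: the name and defaults are illustrative, not part of the API.
    """
    payload = {
        "url": page_url,
        "includeHTML": include_html,
        "onlyMainContent": only_main_content,
    }
    return urllib.request.Request(
        WEB_EXTRACT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Sending the request needs network access and a valid token, so it is left commented out:
# with urllib.request.urlopen(build_web_extract_request("<token>", "https://example.com")) as resp:
#     body = json.load(resp)
```

Keeping request construction separate from sending makes it possible to inspect the headers and payload without calling the API.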
Response
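data.content: The extracted text content from the webpage.
data.markdown: The extracted content in Markdown format.
data.linksOnPage: List of links found on the page.
data.metadata.title: The title of the page.
data.metadata.description: Meta description of the page.
data.metadata.keywords: Meta keywords associated with the page.
data.metadata.robots: Robots meta tag content.
data.metadata.ogTitle: Open Graph title.
data.metadata.ogDescription: Open Graph description.
data.metadata.ogImage: URL of the Open Graph image.
data.metadata.ogLocale: Open Graph locale.
data.metadata.ogLocaleAlternate: Alternate Open Graph locales.
data.metadata.ogSiteName: Open Graph site name.
data.metadata.screenshot: URL of the page screenshot.
data.metadata.sourceURL: The original URL of the extracted page.
data.metadata.pageStatusCode: The HTTP status code of the extracted page.
data.metadata.processingTime: Processing time in milliseconds.
data.metadata.processingId: Unique identifier for the processing request.
data.metadata.processingCount: Number of processed requests.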
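A decoded response can be read as an ordinary nested dictionary. The sketch below pulls a few fields out of the structure shown in the sample response. The function name summarize_extraction is hypothetical, and because this page does not say which fields are always present, the code assumes any of them may be missing and falls back to None or an empty value.

```python
def summarize_extraction(response_body):
    """Pick a few fields out of a decoded web-extract response.

    Hypothetical helper: assumes the "data" / "metadata" nesting from the
    sample response and tolerates absent fields.
    """
    data = response_body.get("data") or {}
    metadata = data.get("metadata") or {}
    return {
        "title": metadata.get("title"),
        "source_url": metadata.get("sourceURL"),
        "page_status": metadata.get("pageStatusCode"),
        "link_count": len(data.get("linksOnPage") or []),
        "markdown": data.get("markdown") or "",
    }


# Illustrative input only; the values below are made up for the example.
sample = {
    "data": {
        "content": "Example Domain",
        "markdown": "# Example Domain",
        "linksOnPage": ["https://www.iana.org/domains/example"],
        "metadata": {
            "title": "Example Domain",
            "sourceURL": "https://example.com",
            "pageStatusCode": 200,
        },
    }
}
summary = summarize_extraction(sample)
```

Note that pageStatusCode describes the extracted page, so it is worth checking separately from the HTTP status of the API call itself.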