Web Extraction AI
Extract data, content, tables and images from any website with ease
This function extracts text from web pages. It sends a request to the Web Extraction AI Model and returns the extracted text. The model identifies key components such as headlines, paragraphs, images, and tables, and extracts them in a structured format such as JSON. The extracted data is also cleaned and normalized so that it is easier to analyze and process. Read more at: https://docs.worqhat.com/ai-models/text-extraction/web-extraction
Parameters
Parameter | Type | Description | Default |
---|---|---|---|
code_blocks | boolean | Indicates whether to extract code blocks. | true |
headline | boolean | Indicates whether to extract headlines. | true |
inline_code | boolean | Indicates whether to extract inline code. | true |
references | boolean | Indicates whether to extract references. | true |
url_path | string | The URL of the web page to extract text from. | N/A |
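
The parameters above can be assembled into a request body along the following lines. This is a minimal sketch, not the official client: `build_payload` is a hypothetical helper, and the assumption that the parameters are sent together as one flat JSON object is ours. The endpoint and authentication are not described in this section, so the sketch stops at building the body; see the linked documentation for the actual request details.

```python
import json
from urllib.parse import urlparse

# Defaults copied from the parameter table above.
DEFAULTS = {
    "code_blocks": True,
    "headline": True,
    "inline_code": True,
    "references": True,
}


def build_payload(url_path, **overrides):
    """Build a request body from the documented parameters (hypothetical helper)."""
    # url_path has no default in the table, so the caller must supply it.
    parsed = urlparse(url_path)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"url_path must be an absolute http(s) URL, got {url_path!r}")

    # Reject names that are not in the parameter table.
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise TypeError(f"unknown parameter(s): {sorted(unknown)}")

    # Every optional parameter in the table is a boolean flag.
    for name, value in overrides.items():
        if not isinstance(value, bool):
            raise TypeError(f"{name} must be a boolean, got {type(value).__name__}")

    # Assumption: all parameters travel in one flat JSON object.
    return {"url_path": url_path, **DEFAULTS, **overrides}


# Example: extract everything except references.
payload = build_payload("https://example.com/article", references=False)
print(json.dumps(payload, indent=2))
```

Any flag that is not overridden keeps the default from the table, so a caller only names the components they want to switch off.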