Text Extraction
PDF Extraction API
Extracts text, tables, and other content from any PDF and organizes it.
POST /api/ai/v2/pdf-extract
curl --request POST \
--url https://api.worqhat.com/api/ai/v2/pdf-extract \
--header 'Authorization: Bearer <token>' \
--form 'file=@/path/to/pdf.pdf'
{
"data": "This is a sample text to be converted.",
"processing_count": 12,
"processing_id": "4b139bfe-1f71-4345-b539-fa6c1700f166",
"processing_time": 888,
"status": "success"
}
Authorizations
Authorization
string (header, required)
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Body
multipart/form-data
file
string
The PDF file to be sent as input.
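Because the body is multipart form data, the PDF travels as raw bytes in a part named file, and the Content-Type header has to carry a boundary parameter that matches the body. The following is a minimal sketch of that encoding using only the Python standard library. The helper names build_multipart and build_request, the default filename, and the application/pdf part type are assumptions made for illustration; they are not part of the WorqHat API.

```python
import urllib.request
import uuid

API_URL = "https://api.worqhat.com/api/ai/v2/pdf-extract"


def build_multipart(field_name, filename, payload):
    """Encode a single file field as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex  # random delimiter between parts
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n"  # assumed part type
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def build_request(pdf_bytes, token, filename="document.pdf"):
    """Build (but do not send) the POST request for the endpoint."""
    body, content_type = build_multipart("file", filename, pdf_bytes)
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": content_type,  # includes the boundary
        },
    )


# To send (requires a real token and network access):
#   with open("/path/to/pdf.pdf", "rb") as fh:
#       request = build_request(fh.read(), "<token>")
#   with urllib.request.urlopen(request) as response:
#       print(response.read().decode("utf-8"))
```

A Content-Type header set by hand to plain multipart/form-data has no boundary, so the server cannot split the body into parts. Most HTTP clients generate the full header themselves when given a file field.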
Response
200 - application/json
data
string
The converted text.
processing_count
integer
The number of times the request has been processed. This value is used for billing. It is either the number of times the image is processed or the number of words that the server processes.
processing_id
string
A universally unique identifier for the request. This can be used to track the request in the logs.
processing_time
integer
The time taken to process the request, in milliseconds.
status
string
Status indicating the success of the PDF extraction process. If you get an error for multipart/form-data requests, try removing the Content-Type header.
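The response fields above can be read with a few lines of Python. This is a sketch only: the function name extract_text is hypothetical, and treating every status other than "success" as a failure is an assumption, since this page documents no other status values.

```python
import json


def extract_text(response_body):
    """Return the extracted text from a pdf-extract JSON response."""
    payload = json.loads(response_body)
    # Assumption: any status other than "success" means the call failed.
    if payload.get("status") != "success":
        raise RuntimeError(f"PDF extraction failed: {payload.get('status')!r}")
    return payload["data"]


# The sample response shown on this page.
sample = """{
  "data": "This is a sample text to be converted.",
  "processing_count": 12,
  "processing_id": "4b139bfe-1f71-4345-b539-fa6c1700f166",
  "processing_time": 888,
  "status": "success"
}"""

text = extract_text(sample)
print(text)
```

Logging processing_id alongside any failure makes it easier to trace the request later, since that field identifies the request in the logs.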