Extract PDF content as LlamaIndex-compatible JSON documents for use in RAG pipelines, LangChain, and other LLM frameworks.
Click to select one or more PDF files, or drag and drop them here.
Your files never leave your device.
Output Format:
Each PDF will be extracted as a JSON file containing an array of LlamaIndex Document objects with:
text - Extracted text content per page
metadata - Page number, headings, and document info
extra_info - Additional context for RAG systems
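As a rough illustration of how an exported file might be consumed, the sketch below parses a JSON array and checks that each entry carries the three fields listed above (text, metadata, extra_info). The sample data, the keys inside metadata and extra_info (page_number, headings, source), and the helper name load_documents are assumptions made for illustration; the actual output of the tool may use different keys.

```python
import json

# Hypothetical sample of one exported file. The keys inside "metadata" and
# "extra_info" are illustrative assumptions, not a confirmed schema.
SAMPLE_JSON = """
[
  {
    "text": "Introduction text for page one.",
    "metadata": {"page_number": 1, "headings": ["Introduction"]},
    "extra_info": {"source": "example.pdf"}
  },
  {
    "text": "Methods text for page two.",
    "metadata": {"page_number": 2, "headings": ["Methods"]},
    "extra_info": {"source": "example.pdf"}
  }
]
"""

# The three top-level fields named in the output format description.
REQUIRED_FIELDS = ("text", "metadata", "extra_info")


def load_documents(raw: str) -> list:
    """Parse a JSON array and verify each entry has the documented fields."""
    docs = json.loads(raw)
    if not isinstance(docs, list):
        raise ValueError("expected a JSON array of document objects")
    for index, doc in enumerate(docs):
        missing = [field for field in REQUIRED_FIELDS if field not in doc]
        if missing:
            raise ValueError(f"document {index} is missing fields: {missing}")
    return docs


documents = load_documents(SAMPLE_JSON)

# Collect the per-page text, e.g. to pass on to a chunker or embedder.
page_texts = [doc["text"] for doc in documents]
```

To read a saved file instead of the inline sample, the same helper could be called with the file contents, for example load_documents(open("example.json", encoding="utf-8").read()), where example.json is a placeholder name.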
Click or drag and drop your file to begin
Click the process button to start
Save your processed file instantly