PDF Processing
Quick start
Extract text:
from llama_cloud_services import LlamaParse

# project_id and organization_id are assumed to be defined earlier
# (your LlamaCloud project and organization identifiers).
parser = LlamaParse(
    parse_mode="parse_page_with_agent",
    model="openai-gpt-4-1-mini",
    high_res_ocr=True,
    adaptive_long_table=True,
    outlined_table_extraction=True,
    output_tables_as_HTML=True,
    result_type="markdown",
    project_id=project_id,
    organization_id=organization_id,
)

# Parse the PDF and get one Markdown document per page.
result = parser.parse("./my_file.pdf")
documents = result.get_markdown_documents(split_by_page=True)

# Concatenate the pages, separating them with a horizontal rule.
full_text = ""
for document in documents:
    full_text += document.text + "\n\n---\n\n"
For more detailed code implementations, see REFERENCE.md.
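The page-joining loop in the quick start can also be written as a small helper. The following is only a sketch: `join_pages` and `PAGE_SEPARATOR` are hypothetical names, not part of llama_cloud_services, and the helper works on plain strings so it can be tried without an API key. Unlike the loop above, it skips empty pages and does not leave a trailing separator.

```python
from typing import Iterable

# Same separator as in the quick start: a Markdown horizontal rule.
PAGE_SEPARATOR = "\n\n---\n\n"


def join_pages(page_texts: Iterable[str], separator: str = PAGE_SEPARATOR) -> str:
    """Join per-page Markdown strings, skipping pages that are empty.

    Hypothetical helper for illustration; it is not provided by the library.
    """
    cleaned = [text.strip() for text in page_texts if text and text.strip()]
    return separator.join(cleaned)


if __name__ == "__main__":
    # Stand-in page texts; in real use these would come from document.text.
    sample_pages = ["# Page 1\nIntro text", "", "# Page 2\nA table"]
    print(join_pages(sample_pages))
```

With the quick start's result, this could be called as join_pages(document.text for document in documents), assuming each document exposes a text attribute as shown above.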
Requirements
The llama_cloud_services package must be installed in your environment:
pip install llama_cloud_services
The LLAMA_CLOUD_API_KEY must also be available as an environment variable:
export LLAMA_CLOUD_API_KEY="..."
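Because a missing key would otherwise only surface once parsing is attempted, it can help to check for it up front. This is a minimal sketch using only the standard library; `require_api_key` is a hypothetical helper, not part of llama_cloud_services.

```python
import os


def require_api_key(name: str = "LLAMA_CLOUD_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; it is not provided by the library.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before parsing.")
    return value


if __name__ == "__main__":
    try:
        require_api_key()
        print("API key found.")
    except RuntimeError as error:
        print(error)
```

Calling require_api_key() before constructing the parser makes the failure explicit and keeps the key itself out of the source code.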