Intelligent AI Platform for Automobile Component Management
Link to GitHub repository: link
You can view the full presentation by clicking here.
Introduction
Challenges
Data Extraction
- PDF Processing Pipeline Documentation
- PDF Data Extraction docling
- PDF Data Extraction and Structuring Tool (Custom Scripts)
- Overview
- Primary Use Case
- Workflow Summary
- Key Features
- Prerequisites
- Function-by-Function Explanation
- Key Libraries Used
- Split_pdf(input_pdf_path)
- Summarize_table(table)
- Extract_image_caption(page, bbox, nlp)
- Extract_content(pdf_path, page_num, nlp)
- Save_txt_and_md(base_name, page_number, content)
- Process_pdf(pdf_path)
- Clean_generated_files(base_name, total_pages)
- How This Supports LLMs and RAG
- Example Pipeline: LLM + RAG
- Technologies Used
- Practice
