Technical Pipeline / Project Architecture
The overall process for implementing the project is structured in the following pipeline:
1. Global Project Architecture
1. Data Collection & Preprocessing
Extract data from existing sources such as PDF files, Excel spreadsheets, technical documents, and internal databases.
Clean, structure, and normalize textual data to make it usable by AI models.
Tools used may include PyMuPDF, Pandas, OpenRefine, etc.
2. Knowledge Base Creation
Organize the extracted data into a Knowledge Base.
Structure the information by component, requirement, test type, and associated documentation.
Possible technologies include PostgreSQL, MongoDB, and vector databases like FAISS or Weaviate.
3. Benchmarking & Integration of LLM / RAG Models
Select and experiment with different language models (LLMs) such as GPT-4, Mistral, LLaMA, etc.
Integrate a Retrieval-Augmented Generation (RAG)architecture that combines knowledge base search with text generation.
Evaluate performance, accuracy of responses, and domain adaptation for the automotive context.
4. Automatic Document Generation
Use generative AI models to automatically produce:
Technical specifications
Test plans
Validation reports
Supported formats: PDF, DOCX, JSON.
Integration options: LangChain, LlamaIndex, or HuggingFace Transformers.
5. Development of the Web Platform
- Create a user-friendly interface allowing:
Input of the component name
Submission of questions
Download of generated documents
- Proposed technologies include: React.js or Vue.js (frontend), FastAPI or Django (backend), with Docker used for deployment.
2. Key Application Features
| Feature | Description |
|---|---|
| Intelligent Search | Find requirements based on the component name. |
| AI Q&A Agent | Answer questions regarding the component's technical details. |
| Document Generation | Generate test sheets, specifications, and validation reports. |
| Requirement Explanation | Provide in-depth explanations of requirement functions, impacts, and associated tests. |
An intelligent web platform that transforms a simple component name into a rich set of technical information and documents. This will significantly reduce the time required for analysis, documentation, and validation in the automotive domain.
Technologies & Tools
| Domain | Tools / Technologies |
|---|---|
| Data Extraction | PyMuPDF, textract, Pandas |
| AI & NLP | OpenAI GPT, LangChain, HuggingFace |
| Vector Databases | FAISS, ChromaDB, Weaviate |
| Backend | FastAPI, Flask, Django |
| Frontend | React, Vue.js |
| Documentation | MkDocs, Sphinx, Notion |