📄 Document Search Examples

We provide two examples of document search: one for documents with structured metadata, and one for documents that can only be told apart by their content.

SQL-Based Document Search

When your documents have structured metadata, you can leverage SQL for efficient searching. This approach works best when documents can be distinguished by their metadata.

Example Pipeline

  1. Store documents and metadata in a database table
  2. Use an LLM to translate natural language questions into SQL queries
  3. Execute the query to retrieve relevant documents
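
The three steps above can be sketched as follows. This is a minimal illustration, not a reference implementation: the `documents` table schema, the sample rows, and the `question_to_sql` helper are all assumptions made for the example, and the helper returns a fixed query where a real pipeline would call an LLM.

```python
import sqlite3


def question_to_sql(question: str) -> str:
    """Hypothetical stand-in for the LLM call (step 2).

    A real pipeline would send the table schema and the question to an LLM.
    A fixed query is returned here so the sketch runs offline.
    """
    return (
        "SELECT doc_id, doc_name FROM documents "
        "WHERE company = 'Acme' AND year = 2023"
    )


def search_documents(conn: sqlite3.Connection, question: str) -> list:
    sql = question_to_sql(question)
    # The SQL is model-generated, so accept read-only statements only.
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("only SELECT queries are allowed")
    # Step 3: execute the query to retrieve the matching documents.
    return conn.execute(sql).fetchall()


# Step 1: store documents and their metadata in a table (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents ("
    "doc_id TEXT PRIMARY KEY, doc_name TEXT, company TEXT, year INTEGER)"
)
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?, ?)",
    [
        ("d1", "acme_annual_2023.pdf", "Acme", 2023),
        ("d2", "acme_annual_2022.pdf", "Acme", 2022),
        ("d3", "globex_annual_2023.pdf", "Globex", 2023),
    ],
)
conn.commit()

results = search_documents(conn, "Which Acme reports are from 2023?")
print(results)
```

Restricting the generated statement to `SELECT` is one simple guard; a read-only database connection would be a stricter alternative.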

Description-Based Document Search

For documents that can't be distinguished by metadata, have an LLM generate a description of each document, then select documents by matching the query against those descriptions.

Example Prompt

Below is a sample prompt for document selection:

prompt = """
You are given a list of documents with their IDs, file names, and descriptions. Your job is to select documents that may contain information relevant to answering the user query.

Query: {query}

Documents: [
    {{
        "doc_id": "xxx",
        "doc_name": "xxx",
        "doc_description": "xxx"
    }}
]

Response Format:
{{
    "thinking": "<Your reasoning for document selection>",
    "answer": <Python list of relevant doc_ids>, e.g. ['doc_id1', 'doc_id2']. Return [] if no documents are relevant.
}}

Return only the JSON structure, no additional output.
"""
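
The model's reply still has to be turned into a list of IDs. Below is a minimal sketch of that step, assuming the reply follows the response format in the prompt; `render_documents` and `parse_selection` are hypothetical helper names, not part of any library. Because the prompt's example answer uses single-quoted Python strings, which are not valid JSON, the parser falls back to `ast.literal_eval` when `json.loads` fails.

```python
import ast
import json


def render_documents(documents: list) -> str:
    """Serialize the document list for the "Documents:" part of the prompt."""
    return json.dumps(documents, indent=4)


def parse_selection(response_text: str) -> list:
    """Extract the selected doc_ids from the model's reply.

    Tries strict JSON first, then a Python literal, since the prompt's
    example answer uses single-quoted strings.
    """
    text = response_text.strip()
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Raises ValueError or SyntaxError if the reply is not a literal.
        data = ast.literal_eval(text)
    answer = data.get("answer", []) if isinstance(data, dict) else []
    return [str(doc_id) for doc_id in answer]


json_reply = '{"thinking": "Both mention revenue.", "answer": ["doc_1", "doc_2"]}'
python_reply = "{\"thinking\": \"Only one fits.\", \"answer\": ['doc_3']}"

print(parse_selection(json_reply))
print(parse_selection(python_reply))
```

A reply that is neither valid JSON nor a Python literal raises an exception here; retrying the request is one way a caller might handle that case.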

💡 Tips

For large document collections, you can split the documents into groups, run document selection on each group in parallel, and merge the selected IDs.
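
One way to do this is with a thread pool, sketched below. The group size, the worker count, and the `select_from_group` function are assumptions for illustration; `select_from_group` uses a keyword match in place of the LLM selection call so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk(items: list, size: int) -> list:
    """Split items into consecutive groups of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def select_from_group(query: str, group: list) -> list:
    """Hypothetical stand-in for one LLM selection call on one group."""
    return [
        doc["doc_id"]
        for doc in group
        if query.lower() in doc["doc_description"].lower()
    ]


def select_documents(query, documents, group_size=2, max_workers=4):
    groups = chunk(documents, group_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input groups.
        per_group = list(pool.map(lambda g: select_from_group(query, g), groups))
    # Merge the per-group selections into a single list of doc_ids.
    return [doc_id for ids in per_group for doc_id in ids]


documents = [
    {"doc_id": "d1", "doc_name": "a.pdf", "doc_description": "Annual revenue report"},
    {"doc_id": "d2", "doc_name": "b.pdf", "doc_description": "Employee handbook"},
    {"doc_id": "d3", "doc_name": "c.pdf", "doc_description": "Quarterly revenue summary"},
]

selected = select_documents("revenue", documents)
print(selected)
```

Threads suit this workload because each selection call mostly waits on a network response; the group size would in practice be bounded by how many descriptions fit in one prompt.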


📢 Stay Tuned!
We are continuously updating our documentation with new examples and best practices.
Last updated: 2025/05/01