Document Tree Search

Retrieval over a document tree comes down to selecting the nodes most relevant to a query, since those nodes form the context used to answer it. Node search does this by giving an LLM the question together with the document's tree structure and asking it to identify the nodes that are likely to contain the answer.

Basic Node Search Example

prompt = f"""
You are given a question and the tree structure of a document.
You need to find all nodes that are likely to contain the answer.

Question: {question}

Document tree structure: {structure}

Reply in the following JSON format:
{{
  "thinking": <your reasoning about which nodes are relevant>,
  "node_list": [node_id1, node_id2, ...]
}}
"""

Integrating User Preference or Expert Knowledge

To further improve node selection, you can incorporate user preferences or expert knowledge. This allows domain experts to define rules or highlight sections that are most likely to contain the correct answer.

Example Pipeline
  1. When a query is received, the system selects the most relevant user preference or expert knowledge snippet from a database or a set of domain-specific rules. This can be done using keyword matching, semantic similarity, or LLM-based relevance search.
  2. The selected preference is injected into the prompt, guiding the LLM to focus on the most relevant parts of the document.

Enhanced Node Search Prompt with User Preference Example

prompt = f"""
You are given a question and the tree structure of a document.
You need to find all nodes that are likely to contain the answer.

Question: {question}

Document tree structure: {structure}

Expert Knowledge of relevant sections: {expert_knowledge}

Reply in the following JSON format:
{{
  "thinking": <reasoning about which nodes are relevant>,
  "node_list": [node_id1, node_id2, ...]
}}
"""
Example Expert Preference

If the query mentions EBITDA adjustments, prioritize Item 7 (MD&A) and footnotes in Item 8 (Financial Statements) in 10-K reports.
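
The two pipeline steps described earlier (select a snippet, then inject it into the prompt) can be sketched with plain keyword matching, the simplest of the three selection methods mentioned. This is an illustrative sketch under stated assumptions: the `ExpertRule` type, the rule store, the second rule, and both function names are hypothetical, and a real system might use semantic similarity or an LLM-based relevance search instead.

```python
from typing import List, NamedTuple


class ExpertRule(NamedTuple):
    keywords: List[str]  # lowercase trigger phrases
    snippet: str         # text injected into the prompt


# Hypothetical rule store; in practice this could live in a database.
EXPERT_RULES = [
    ExpertRule(
        keywords=["ebitda"],
        snippet="If the query mentions EBITDA adjustments, prioritize Item 7 (MD&A) "
                "and footnotes in Item 8 (Financial Statements) in 10-K reports.",
    ),
    ExpertRule(
        keywords=["risk factor"],
        snippet="For questions about risk factors, check Item 1A (Risk Factors) first.",
    ),
]


def select_expert_knowledge(question: str, rules: List[ExpertRule] = EXPERT_RULES) -> str:
    """Step 1: return the snippet with the most keyword hits, or '' if none match."""
    text = question.lower()
    best_snippet, best_hits = "", 0
    for rule in rules:
        hits = sum(1 for kw in rule.keywords if kw in text)
        if hits > best_hits:
            best_snippet, best_hits = rule.snippet, hits
    return best_snippet


def build_prompt(question: str, structure: str, expert_knowledge: str) -> str:
    """Step 2: inject the selected snippet into the node search prompt."""
    return f"""
You are given a question and the tree structure of a document.
You need to find all nodes that are likely to contain the answer.

Question: {question}

Document tree structure: {structure}

Expert Knowledge of relevant sections: {expert_knowledge}

Reply in the following JSON format:
{{
  "thinking": <reasoning about which nodes are relevant>,
  "node_list": [node_id1, node_id2, ...]
}}
"""


question = "How were EBITDA adjustments calculated this year?"
prompt = build_prompt(question, "[...]", select_expert_knowledge(question))
```

When no rule matches, `select_expert_knowledge` returns an empty string; in that case it may be cleaner to fall back to the basic prompt without the expert knowledge line.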

Integrating user or expert preferences makes node search more targeted, because the LLM can draw on both the document structure and domain-specific insight when choosing nodes.
