Retriever Service
The Retriever is a powerful service that enables intelligent search and retrieval from knowledge graphs created by the Importer service
ArangoDB Platform
Overview
The Retriever service offers two distinct search methods:
- Global search: Analyzes entire document to identify themes and patterns, perfect for high-level insights and comprehensive summaries.
- Local search: Focuses on specific entities and their relationships, ideal for detailed queries about particular concepts.
The service supports both private (Triton Inference Server) and public (OpenAI) LLM deployments, making it flexible for various security and infrastructure requirements. With simple HTTP endpoints, you can easily query your knowledge graph and get contextually relevant responses.
Key features:
- Dual search methods for different query types
- Support for both private and public LLM deployments
- Simple REST API interface
- Integration with ArangoDB knowledge graphs
- Configurable community hierarchy levels
Search methods
The Retriever service enables intelligent search and retrieval of information from your knowledge graph. It provides two powerful search methods, global Search and local Search, that leverage the structured knowledge graph created by the Importer to deliver accurate and contextually relevant responses to your natural language queries.
Global search
Global search is designed for queries that require understanding and aggregation of information across your entire document. It’s particularly effective for questions about overall themes, patterns, or high-level insights in your data.
- Community-Based Analysis: Uses pre-generated community reports from your knowledge graph to understand the overall structure and themes of your data,
- Map-Reduce Processing:
- Map Stage: Processes community reports in parallel, generating intermediate responses with rated points.
- Reduce Stage: Aggregates the most important points to create a comprehensive final response.
Best use cases:
- “What are the main themes in the dataset?”
- “Summarize the key findings across all documents”
- “What are the most important concepts discussed?”
Local search
Local search focuses on specific entities and their relationships within your knowledge graph. It is ideal for detailed queries about particular concepts, entities, or relationships.
- Entity Identification: Identifies relevant entities from the knowledge graph based on the query.
- Context Gathering: Collects:
- Related text chunks from original documents.
- Connected entities and their strongest relationships.
- Entity descriptions and attributes.
- Context from the community each entity belongs to.
- Prioritized Response: Generates a response using the most relevant gathered information.
Best use cases:
- “What are the properties of [specific entity]?”
- “How is [entity A] related to [entity B]?”
- “What are the key details about [specific concept]?”
Installation
The Retriever service can be configured to use either the Triton Inference Server (for private LLM deployments) or OpenAI/OpenRouter (for public LLM deployments).
To start the service, use the GenAI service endpoint /v1/graphragretriever
.
Please refer to the documentation of GenAI service for more
information on how to use it.
Using Triton Inference Server (Private LLM)
The first step is to install the LLM Host service with the LLM and embedding models of your choice. The setup will the use the Triton Inference Server and MLflow at the backend. For more details, please refer to the Triton Inference Server and Mlflow documentation.
Once the llmhost
service is up-and-running, then you can start the Importer
service using the below configuration:
{
"env": {
"username": "your_username",
"db_name": "your_database_name",
"api_provider": "triton",
"triton_url": "your-arangodb-llm-host-url",
"triton_model": "mistral-nemo-instruct"
},
}
Where:
username
: ArangoDB database user with permissions to access collections.db_name
: Name of the ArangoDB database where the knowledge graph is stored.api_provider
: Specifies which LLM provider to use.triton_url
: URL of your Triton Inference Server instance. This should be the URL where yourllmhost
service is running.triton_model
: Name of the LLM model to use for text processing.
Using OpenAI (Public LLM)
{
"env": {
"openai_api_key": "your_openai_api_key",
"username": "your_username",
"db_name": "your_database_name",
"api_provider": "openai"
},
}
Where:
username
: ArangoDB database user with permissions to access collections.db_name
: Name of the ArangoDB database where the knowledge graph is stored.api_provider
: Specifies which LLM provider to use.openai_api_key
: Your OpenAI API key.
gpt-4o-mini
and text-embedding-3-small
models as LLM and
embedding model respectively.Using OpenRouter (Gemini, Anthropic, etc.)
OpenRouter makes it possible to connect to a huge array of LLM API providers, including non-OpenAI LLMs like Gemini Flash, Anthropic Claude and publicly hosted open-source models.
When using the OpenRouter option, the LLM responses are served via OpenRouter while OpenAI is used for the embedding model.
{
"env": {
"db_name": "your_database_name",
"username": "your_username",
"api_provider": "openrouter",
"openai_api_key": "your_openai_api_key",
"openrouter_api_key": "your_openrouter_api_key",
"openrouter_model": "mistralai/mistral-nemo" // Specify a model here
},
}
Where:
username
: ArangoDB database user with permissions to access collections.db_name
: Name of the ArangoDB database where the knowledge graph is stored.api_provider
: Specifies which LLM provider to use.openai_api_key
: Your OpenAI API key (for the embedding model).openrouter_api_key
: Your OpenRouter API key (for the LLM).openrouter_model
: Desired LLM (optional; default ismistral-nemo
).
mistral-nemo
for generation
(via OpenRouter) and text-embedding-3-small
for embeddings (via OpenAI).Executing queries
After the Retriever service is installed successfully, you can interact with it using the following HTTP endpoints, based on the selected search method.
curl -X POST /v1/graphrag-query \
-H "Content-Type: application/json" \
-d '{
"query": "What is the AR3 Drone?",
"query_type": 2,
"provider": 0
}'
curl -X POST /v1/graphrag-query \
-H "Content-Type: application/json" \
-d '{
"query": "What is the AR3 Drone?",
"level": 1,
"query_type": 1,
"provider": 0
}'
The request parameters are the following:
query
: Your search query text.level
: The community hierarchy level to use for the search (1
for top-level communities).query_type
: The type of search to perform.1
: Global search.2
: Local search.
provider
: The LLM provider to use0
: OpenAI (or OpenRouter)1
: Triton
Health check
You can also monitor the service health:
GET /v1/health
Verify status
You can verify the state of the retriever process via the following endpoint:
GET /gen-ai/v1/project_by_name/<your_project>
For example, the status
object found within retrieverServices
may contain the following
properties:
"status": {
"status": "service_started",
"progress": 100,
}
Best Practices
Choose the right search method:
- Use global search for broad, thematic queries.
- Use local search for specific entity or relationship queries.
Performance considerations:
- Global search may take longer due to its map-reduce process.
- Local search is typically faster for concrete queries.
API Reference
For detailed API documentation, see the GraphRAG Retrievers API Reference .