The SignalWire Agents SDK includes optional local search capabilities that can be installed separately to avoid adding large dependencies to the base installation.
The available extras are:
- search - Fast, lightweight, good for development and testing
- search-full - Comprehensive document support without NLP overhead
- search-nlp - Better search relevance with advanced query processing (requires python -m spacy download en_core_web_sm)
- search-all - Complete feature set (requires python -m spacy download en_core_web_sm)
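To check which of these optional dependency sets are importable in your environment, you can probe one representative module from each (the module names are taken from the "Includes" lists below; the helper itself is illustrative, not part of the SDK):

```python
import importlib.util

# One representative module per extra, per the "Includes" lists below.
PROBES = {
    "search": "sentence_transformers",
    "search-full": "pdfplumber",
    "search-nlp": "spacy",
    "pgvector": "pgvector",
}

def available_extras():
    """Return the extras whose representative module can be imported."""
    return [extra for extra, module in PROBES.items()
            if importlib.util.find_spec(module) is not None]

print(available_extras())
```

An empty list simply means none of the optional extras are installed; the core SDK still works.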
For vector embeddings and keyword search with minimal dependencies:
pip install signalwire-agents[search]
Size: ~500MB
Includes: sentence-transformers, scikit-learn, nltk, numpy
For comprehensive document processing including PDF, DOCX, Excel, PowerPoint:
pip install signalwire-agents[search-full]
Size: ~600MB
Includes: Basic search + pdfplumber, python-docx, openpyxl, python-pptx, markdown, striprtf, python-magic
For advanced natural language processing with spaCy:
pip install signalwire-agents[search-nlp]
Size: ~600MB
Includes: Basic search + spaCy
⚠️ Additional Setup Required: After installation, you must download the spaCy language model:
python -m spacy download en_core_web_sm
Performance Note: Advanced NLP features provide better query understanding and synonym expansion, but are significantly slower than basic search. Use the nlp_backend parameter to choose which backend to use:
# Fast NLTK processing (default)
self.add_skill("native_vector_search", {
"nlp_backend": "nltk" # or omit for default
})
# Better quality spaCy processing
self.add_skill("native_vector_search", {
"nlp_backend": "spacy" # requires spaCy model download
})
For complete search functionality:
pip install signalwire-agents[search-all]
Size: ~700MB
Includes: All search features combined + pgvector support
⚠️ Additional Setup Required: After installation, you must download the spaCy language model:
python -m spacy download en_core_web_sm
Performance Note: This includes advanced NLP features which are slower but provide better search quality.
You can control which NLP backend to use with the nlp_backend parameter:
- "nltk" (default): Fast processing
- "spacy": Better quality but slower, requires model download
For scalable vector search with PostgreSQL:
# Just pgvector support
pip install signalwire-agents[pgvector]
# Or with search features
pip install signalwire-agents[search,pgvector]
# Already included in search-all
pip install signalwire-agents[search-all]
Includes: psycopg2-binary, pgvector
Use Case: Multi-agent deployments, centralized knowledge bases, production systems
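For multi-agent deployments it helps to centralize the connection string rather than hard-coding it per agent. A minimal sketch, assuming an environment variable named SEARCH_PGVECTOR_DSN (the variable name and helper are illustrative, not an SDK convention):

```python
import os

# Hypothetical helper: read a shared pgvector DSN from the environment so
# every agent in a deployment points at the same knowledge base.
def pgvector_dsn(default="postgresql://user:pass@localhost/dbname"):
    return os.environ.get("SEARCH_PGVECTOR_DSN", default)
```

Each agent can then pass `pgvector_dsn()` wherever a connection string is expected.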
Feature | Basic | Full | NLP | All | pgvector |
---|---|---|---|---|---|
Vector embeddings | ✅ | ✅ | ✅ | ✅ | ❌ |
Keyword search | ✅ | ✅ | ✅ | ✅ | ❌ |
Text files (txt, md) | ✅ | ✅ | ✅ | ✅ | ❌ |
PDF processing | ❌ | ✅ | ❌ | ✅ | ❌ |
DOCX processing | ❌ | ✅ | ❌ | ✅ | ❌ |
Excel/PowerPoint | ❌ | ✅ | ❌ | ✅ | ❌ |
Advanced NLP | ❌ | ❌ | ✅ | ✅ | ❌ |
POS tagging | ❌ | ❌ | ✅ | ✅ | ❌ |
Named entity recognition | ❌ | ❌ | ✅ | ✅ | ❌ |
PostgreSQL support | ❌ | ❌ | ❌ | ✅ | ✅ |
Scalable vector search | ❌ | ❌ | ❌ | ✅ | ✅ |
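The matrix above can be read as a lookup from feature to the smallest extra that provides it. A sketch (the feature keys are ad-hoc labels for this example; the extras names come from this guide):

```python
# Smallest pip extra providing each feature, per the matrix above.
FEATURE_TO_EXTRA = {
    "vector_embeddings": "search",
    "keyword_search": "search",
    "pdf_processing": "search-full",
    "docx_processing": "search-full",
    "excel_powerpoint": "search-full",
    "advanced_nlp": "search-nlp",
    "postgresql_support": "pgvector",
}

def install_command(features):
    """Return one pip command covering all requested features."""
    extras = sorted({FEATURE_TO_EXTRA[f] for f in features})
    return f"pip install signalwire-agents[{','.join(extras)}]"

print(install_command(["pdf_processing", "postgresql_support"]))
```

For example, PDF processing plus PostgreSQL support resolves to `pip install signalwire-agents[pgvector,search-full]`.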
You can check if search functionality is available in your code:
try:
from signalwire_agents.search import IndexBuilder, SearchEngine
print("✅ Search functionality is available")
except ImportError as e:
print(f"❌ Search not available: {e}")
print("Install with: pip install signalwire-agents[search]")
Once installed, you can start using search functionality:
# Using the CLI tool with the comprehensive concepts guide
sw-search docs/signalwire_agents_concepts_guide.md --output concepts.swsearch
# Build from multiple sources (files and directories)
sw-search docs/signalwire_agents_concepts_guide.md examples README.md --file-types md,py,txt --output comprehensive.swsearch
# Traditional directory approach
sw-search ./docs --output knowledge.swsearch --file-types md,txt,pdf
# Build index in PostgreSQL
sw-search ./docs \
--backend pgvector \
--connection-string "postgresql://user:pass@localhost/dbname" \
--output docs_collection
# Overwrite existing collection
sw-search ./docs \
--backend pgvector \
--connection-string "postgresql://user:pass@localhost/dbname" \
--output docs_collection \
--overwrite
from signalwire_agents.search import IndexBuilder
from pathlib import Path
# SQLite backend
builder = IndexBuilder()
builder.build_index_from_sources(
sources=[Path("docs/signalwire_agents_concepts_guide.md")],
output_file="concepts.swsearch",
file_types=['md']
)
# pgvector backend
builder = IndexBuilder(
backend='pgvector',
connection_string='postgresql://user:pass@localhost/dbname'
)
builder.build_index_from_sources(
sources=[Path("docs"), Path("README.md")],
output_file="docs_collection",
file_types=['md', 'txt'],
overwrite=True # Drop existing collection first
)
# Search SQLite index
sw-search search concepts.swsearch "how to build agents"
# Search pgvector collection
sw-search search docs_collection "how to build agents" \
--backend pgvector \
--connection-string "postgresql://user:pass@localhost/dbname"
from signalwire_agents.search import SearchEngine
from signalwire_agents.search.query_processor import preprocess_query
# SQLite backend
engine = SearchEngine("concepts.swsearch")
# pgvector backend
engine = SearchEngine(
backend='pgvector',
connection_string='postgresql://user:pass@localhost/dbname',
collection_name='docs_collection'
)
# Preprocess query
enhanced = preprocess_query("How do I build agents?", vector=True)
# Search
results = engine.search(
query_vector=enhanced['vector'],
enhanced_text=enhanced['enhanced_text'],
count=5
)
for result in results:
print(f"Score: {result['score']:.2f}")
print(f"File: {result['metadata']['filename']}")
print(f"Content: {result['content'][:200]}...")
print("---")
from signalwire_agents import AgentBase
from signalwire_agents.core.function_result import SwaigFunctionResult
class SearchAgent(AgentBase):
def __init__(self):
super().__init__(name="search-agent")
# Check if search is available
try:
from signalwire_agents.search import SearchEngine
self.search_engine = SearchEngine("concepts.swsearch")
self.search_available = True
except ImportError:
self.search_available = False
@AgentBase.tool(
name="search_knowledge",
description="Search the knowledge base",
parameters={
"query": {"type": "string", "description": "Search query"}
}
)
def search_knowledge(self, args, raw_data):
if not self.search_available:
return SwaigFunctionResult(
"Search not available. Install with: pip install signalwire-agents[search]"
)
# Perform search...
return SwaigFunctionResult("Search results...")
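The "Perform search..." placeholder above would typically combine preprocess_query and engine.search as shown earlier, then format the hits into a reply string. A hedged sketch of just the formatting step, assuming the result-dict shape from the earlier SearchEngine example (this helper is not part of the SDK):

```python
def format_results(results, snippet_len=200):
    """Render SearchEngine.search() results (dicts with 'score', 'content',
    and 'metadata' keys, per the earlier example) as a plain-text reply."""
    if not results:
        return "No matching documents found."
    lines = []
    for r in results:
        lines.append(f"{r['metadata']['filename']} (score {r['score']:.2f}): "
                     f"{r['content'][:snippet_len]}")
    return "\n".join(lines)
```

The returned string can be passed straight to SwaigFunctionResult.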
ImportError: No module named 'sentence_transformers'
Install the basic search extra:
pip install signalwire-agents[search]
ImportError: No module named 'pdfplumber'
Install the full document-processing extra:
pip install signalwire-agents[search-full]
NLTK data not found
Download the required NLTK data:
import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
spaCy model not found
Download the spaCy language model:
python -m spacy download en_core_web_sm
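The NLTK downloads above can be guarded so they only run when the data is actually missing. A sketch that takes the lookup callable as a parameter (the path strings are NLTK's standard data paths; the helper itself is illustrative):

```python
def missing_nltk_packages(find):
    """Given a data-lookup callable (e.g. nltk.data.find), return which of
    the packages used above still need to be downloaded."""
    needed = {
        "punkt": "tokenizers/punkt",
        "averaged_perceptron_tagger": "taggers/averaged_perceptron_tagger",
    }
    missing = []
    for package, path in needed.items():
        try:
            find(path)
        except LookupError:
            missing.append(package)
    return missing

# With nltk installed you would call:
#   import nltk
#   for pkg in missing_nltk_packages(nltk.data.find):
#       nltk.download(pkg)
```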
If you see "spaCy model 'en_core_web_sm' not found. Falling back to NLTK", the spaCy language model wasn't installed. This model is required for search-nlp and search-all installations.
Dependency Size: Install search instead of search-all if you don't need document processing.
Model Selection: Use smaller models for faster inference:
builder = IndexBuilder(model_name='sentence-transformers/all-MiniLM-L6-v2')
Chunk Size: Adjust chunk size based on your documents:
builder = IndexBuilder(chunk_size=300, chunk_overlap=30) # Smaller chunks
File Filtering: Only index relevant file types:
builder.build_index(
source_dir="./docs",
file_types=['md', 'txt'], # Skip heavy formats like PDF
exclude_patterns=['**/test/**', '**/__pycache__/**']
)
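When tuning chunk_size and chunk_overlap, it can help to estimate how many index entries a document will produce. A rough sketch, assuming chunk_size and chunk_overlap share the same unit (e.g. words or characters; check the IndexBuilder documentation for the actual unit):

```python
import math

def estimate_chunks(doc_length, chunk_size=300, chunk_overlap=30):
    """Rough chunk-count estimate for one document: the first chunk covers
    chunk_size units, and each later chunk advances by chunk_size - chunk_overlap."""
    if doc_length <= chunk_size:
        return 1
    step = chunk_size - chunk_overlap
    return 1 + math.ceil((doc_length - chunk_size) / step)

print(estimate_chunks(1000))           # default 300/30 chunking
print(estimate_chunks(1000, 500, 50))  # larger chunks -> fewer entries
```

Smaller chunks give more precise matches but a larger index; larger chunks keep the index compact at the cost of coarser results.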
To remove search dependencies:
pip uninstall sentence-transformers scikit-learn nltk
# Add other packages as needed
The core SignalWire Agents SDK will continue to work without search functionality.
For issues with search functionality, try forcing a clean reinstall of the dependencies with pip's --force-reinstall flag:
pip install --force-reinstall signalwire-agents[search]