🗂 Source Map
- lib/segment/src/index/hnsw_index/hnsw.rs -- HNSWIndex struct
- lib/segment/src/index/hnsw_index/graph_layers.rs -- GraphLayers + search
- lib/segment/src/vector_storage/query_scorer/mod.rs -- QueryScorer trait
- lib/segment/src/types.rs -- Distance enum
- lib/segment/src/segment/mod.rs -- Segment struct
- lib/segment/src/entry/entry_point.rs -- SegmentEntry trait
- lib/collection/src/shards/local_shard/mod.rs -- LocalShard struct
- lib/segment/src/payload_storage/payload_storage_enum.rs -- PayloadStorageEnum
- lib/segment/src/index/hnsw_index/graph_links.rs -- GraphLinks storage
Getting Started
# Start Qdrant server
docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant
# Install Python client
pip install qdrant-client
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct
client = QdrantClient(url="http://localhost:6333")
# Create a collection (default HNSW settings)
client.create_collection(
collection_name="my_collection",
vectors_config=VectorParams(size=384, distance=Distance.COSINE),
)
# Upsert points with vectors and payloads
client.upsert(collection_name="my_collection", points=[
PointStruct(id=1, vector=[0.05, 0.61, 0.76, ...],
payload={"city": "Berlin", "price": 299}),
])
# Search with filtering
results = client.query_points(
collection_name="my_collection",
query=[0.2, 0.1, 0.9, ...],
query_filter={"must": [{"key": "city", "match": {"value": "Berlin"}}]},
limit=10,
)
Source Code Walkthrough
HNSWIndex -- The Core Index Structure
HNSWIndex is the primary vector index. It wraps a multi-layer graph together with the vector storage, optional quantized vectors, and the payload index used for filtering.
pub struct HNSWIndex {
id_tracker: Arc<AtomicRefCell<IdTrackerEnum>>,
vector_storage: Arc<AtomicRefCell<VectorStorageEnum>>,
quantized_vectors: Arc<AtomicRefCell<Option<QuantizedVectors>>>,
payload_index: Arc<AtomicRefCell<StructPayloadIndex>>,
config: HnswGraphConfig,
path: PathBuf,
graph: GraphLayers,
searches_telemetry: HNSWSearchesTelemetry,
is_on_disk: bool,
}
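Note the Arc<AtomicRefCell<...>> pattern: Arc provides shared ownership, and AtomicRefCell provides interior mutability with borrow rules checked at runtime. Many readers or a single writer may hold a borrow at any moment, and a conflicting borrow panics instead of blocking. The optional quantized_vectors is populated only when quantization is configured.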
GraphLayers -- The Multi-Layer HNSW Graph
pub struct GraphLayers {
pub(super) hnsw_m: HnswM,
pub(super) links: GraphLinks,
pub(super) entry_points: EntryPoints,
pub(super) visited_pool: VisitedPool,
}
// The core search method signature
fn search_on_level(
&self,
level_entry: ScoredPointOffset,
level: usize,
ef: usize,
points_scorer: &mut FilteredScorer,
is_stopped: &AtomicBool,
) -> CancellableResult<FixedLengthPriorityQueue<ScoredPointOffset>>
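FilteredScorer wraps a distance scorer with a filter check, allowing beam search to skip non-matching points during traversal. VisitedPool recycles visited-lists between searches, so each search can mark the points it has already seen and avoid re-scoring them without allocating a fresh list.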
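The traversal described above can be sketched in a few lines of Python. This is a simplified illustration, not Qdrant's implementation: the links dict, the score and passes_filter callables, and the assumption that the entry point itself passes the filter are all hypothetical stand-ins, and a plain set plays the role of the visited-list.

```python
import heapq

def search_on_level(entry, ef, links, score, passes_filter):
    """Beam search over one graph layer; a larger score means a closer point.

    links:         dict of point id -> list of linked point ids (hypothetical layout)
    score:         callable, point id -> float
    passes_filter: callable, point id -> bool
    Assumes the entry point itself passes the filter.
    """
    visited = {entry}                      # stand-in for a visited-list
    entry_score = score(entry)
    candidates = [(-entry_score, entry)]   # max-heap by score (scores negated)
    nearest = [(entry_score, entry)]       # min-heap: worst kept result on top
    while candidates:
        neg_score, current = heapq.heappop(candidates)
        # Stop once the best remaining candidate cannot improve a full result set.
        if len(nearest) >= ef and -neg_score < nearest[0][0]:
            break
        for neighbor in links.get(current, ()):
            if neighbor in visited:
                continue
            visited.add(neighbor)
            if not passes_filter(neighbor):
                continue                   # filtered out: neither scored nor expanded
            s = score(neighbor)
            if len(nearest) < ef:
                heapq.heappush(nearest, (s, neighbor))
                heapq.heappush(candidates, (-s, neighbor))
            elif s > nearest[0][0]:
                heapq.heapreplace(nearest, (s, neighbor))
                heapq.heappush(candidates, (-s, neighbor))
    return sorted(nearest, reverse=True)   # best first

# Tiny example graph: point 1 is excluded by the filter.
links = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
scores = {0: 0.1, 1: 0.5, 2: 0.4, 3: 0.9}
hits = search_on_level(0, 2, links, scores.__getitem__, lambda pid: pid != 1)
```

Because filtered-out points are never expanded, a very selective filter can disconnect parts of the graph in a sketch like this; the example graph keeps point 3 reachable through point 2.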
QueryScorer -- Distance Computation
pub trait QueryScorer {
type TVector: ?Sized;
fn score_stored(&self, idx: PointOffsetType) -> ScoreType;
fn score_stored_batch(
&self, ids: &[PointOffsetType], scores: &mut [ScoreType]
);
fn score(&self, v2: &Self::TVector) -> ScoreType;
fn score_internal(
&self, a: PointOffsetType, b: PointOffsetType
) -> ScoreType;
}
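score_stored_batch scores several stored vectors in one call, which gives the implementation room for CPU prefetching. The trait itself says nothing about the distance function: implementations are generic over a metric (the TMetric parameter), so different distance functions plug in without changing the search algorithm.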
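To make the shape of this interface concrete, here is a rough Python analogue, assuming an in-memory list of stored vectors. MetricScorer, dot_metric, and manhattan_metric are hypothetical names chosen for illustration; the method names mirror the trait above. The batch method here is a plain loop, so it shows only the calling convention, not any prefetching.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Metric = Callable[[Vector, Vector], float]

def dot_metric(a: Vector, b: Vector) -> float:
    # Dot product: larger means closer.
    return sum(x * y for x, y in zip(a, b))

def manhattan_metric(a: Vector, b: Vector) -> float:
    # L1 distance: smaller means closer.
    return sum(abs(x - y) for x, y in zip(a, b))

class MetricScorer:
    """Hypothetical analogue of a QueryScorer bound to one query and one metric."""

    def __init__(self, query: Vector, stored: List[Vector], metric: Metric):
        self.query = query
        self.stored = stored
        self.metric = metric

    def score_stored(self, idx: int) -> float:
        return self.metric(self.query, self.stored[idx])

    def score_stored_batch(self, ids: Sequence[int], scores: List[float]) -> None:
        # Writes into a caller-provided buffer, like the &mut [ScoreType] argument.
        for slot, idx in enumerate(ids):
            scores[slot] = self.score_stored(idx)

    def score(self, v2: Vector) -> float:
        return self.metric(self.query, v2)

    def score_internal(self, a: int, b: int) -> float:
        # Scores two stored vectors against each other, ignoring the query.
        return self.metric(self.stored[a], self.stored[b])

stored = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
dot_scorer = MetricScorer([1.0, 2.0], stored, dot_metric)
batch = [0.0, 0.0]
dot_scorer.score_stored_batch([2, 0], batch)

# Swapping the metric changes the scores, not the scorer or the search code.
l1_scorer = MetricScorer([1.0, 2.0], stored, manhattan_metric)
```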
Distance Enum
pub enum Distance {
Cosine,
Euclid,
Dot,
Manhattan,
}
For Cosine and Dot, a larger score means a closer match; for Euclid and Manhattan, a smaller one does. Qdrant pre-normalizes Cosine vectors at insertion, so a Cosine comparison reduces to a dot product of unit vectors.
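The equivalence is easy to check numerically. The sketch below uses plain Python lists; the helper names (normalize, cosine, rank) and the LARGER_IS_BETTER table are illustrative, not taken from Qdrant. The table only restates the score ordering described above.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    # Scale to unit length; assumes v is not the zero vector.
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def cosine(a, b):
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))

# Which direction counts as "closer" for each distance.
LARGER_IS_BETTER = {"Cosine": True, "Dot": True, "Euclid": False, "Manhattan": False}

def rank(distance, scored):
    """Order (point, score) pairs best-first for the given distance."""
    return sorted(scored, key=lambda pair: pair[1], reverse=LARGER_IS_BETTER[distance])

a, b = [3.0, 4.0], [4.0, 3.0]
unit_a, unit_b = normalize(a), normalize(b)

# Normalizing once up front turns every later cosine into a plain dot product.
same = math.isclose(cosine(a, b), dot(unit_a, unit_b))
```

Normalizing at insertion moves the square roots and division out of the search path, which is the practical benefit of the equivalence.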
Segment -- The Storage Unit
pub struct Segment {
pub uuid: Uuid,
pub version: Option<SeqNumberType>,
pub segment_path: PathBuf,
pub id_tracker: Arc<AtomicRefCell<IdTrackerEnum>>,
pub vector_data: HashMap<VectorNameBuf, VectorData>,
pub payload_index: Arc<AtomicRefCell<StructPayloadIndex>>,
pub payload_storage: Arc<AtomicRefCell<PayloadStorageEnum>>,
pub appendable_flag: bool,
pub segment_type: SegmentType,
pub segment_config: SegmentConfig,
}
The vector_data HashMap maps each vector name to a VectorData struct (index + storage + quantized representation). This is how named vectors work: every vector name gets its own independent index. appendable_flag distinguishes write-optimized segments from read-optimized ones.
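A toy model of the name-to-VectorData mapping, assuming brute-force dot-product search in place of a real index. The classes below are hypothetical stand-ins that borrow the Rust names for readability; they are not bindings to Qdrant.

```python
from dataclasses import dataclass, field

@dataclass
class VectorData:
    """Hypothetical stand-in: one storage (and, in the real struct, one index) per name."""
    dim: int
    vectors: dict = field(default_factory=dict)  # point id -> vector

    def search(self, query, limit):
        # Brute-force dot product instead of an HNSW index.
        scored = [
            (sum(q * x for q, x in zip(query, vec)), pid)
            for pid, vec in self.vectors.items()
        ]
        return sorted(scored, reverse=True)[:limit]

@dataclass
class Segment:
    vector_data: dict = field(default_factory=dict)  # vector name -> VectorData

    def upsert(self, point_id, named_vectors):
        for name, vec in named_vectors.items():
            data = self.vector_data[name]
            if len(vec) != data.dim:
                raise ValueError(f"vector '{name}' must have {data.dim} dimensions")
            data.vectors[point_id] = list(vec)

    def search(self, name, query, limit=10):
        # Each name is searched in isolation; other names are never touched.
        return self.vector_data[name].search(query, limit)

segment = Segment({"text": VectorData(dim=2), "image": VectorData(dim=3)})
segment.upsert(1, {"text": [1.0, 0.0], "image": [0.0, 0.0, 1.0]})
segment.upsert(2, {"text": [0.0, 1.0], "image": [1.0, 0.0, 0.0]})

text_hits = segment.search("text", [1.0, 0.0], limit=1)
image_hits = segment.search("image", [1.0, 0.0, 0.0], limit=1)
```

The same point id ranks differently under each name because the two stores never interact, which is the property the HashMap layout gives the real segment.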
LocalShard -- Shard Coordination
pub struct LocalShard {
collection_name: CollectionId,
pub(super) segments: LockedSegmentHolder,
pub(super) wal: RecoverableWal,
pub(super) update_handler: Arc<Mutex<UpdateHandler>>,
pub(super) path: PathBuf,
pub(super) optimizers: ArcSwap<Vec<Arc<Optimizer>>>,
}
ArcSwap for optimizers allows hot-swapping configurations without restart. The WAL (RecoverableWal) replays uncommitted entries on crash recovery.
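The recovery step can be pictured as follows. This is a minimal sketch under stated assumptions: the WAL is modeled as a list of (sequence number, operation) pairs, the persisted state as a dict, and flushed_version as the last sequence number known to be on disk. None of these shapes, nor the recover function, come from Qdrant's code.

```python
def recover(wal, state, flushed_version):
    """Replay WAL entries newer than the last flushed version.

    wal:             list of (seq, (op, point_id, payload)) in sequence order
    state:           dict of point id -> payload, as it was persisted on disk
    flushed_version: highest sequence number already reflected in state, or None
    Returns the number of entries replayed.
    """
    replayed = 0
    for seq, (op, point_id, payload) in wal:
        if flushed_version is not None and seq <= flushed_version:
            continue  # already durable, nothing to redo
        if op == "upsert":
            state[point_id] = payload
        elif op == "delete":
            state.pop(point_id, None)  # deleting a missing point is a no-op
        replayed += 1
    return replayed

wal = [
    (1, ("upsert", 1, {"city": "Berlin"})),
    (2, ("upsert", 2, {"city": "Paris"})),
    (3, ("delete", 1, None)),
]

# State as of a crash right after entry 1 was flushed.
state = {1: {"city": "Berlin"}}
replayed = recover(wal, state, flushed_version=1)
```

Both operations in this sketch are idempotent, so replaying an entry that was in fact already applied would leave the state unchanged.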