Navigable Proximity Graph-Driven Native Hybrid Queries with Structured and Unstructured Constraints

Jiongkang Ni; Lingwei Lv; Mengzhao Wang; Qiang Yue; Xiaoliang Xu; Yuxiang Wang

arxiv: 2203.13601 · v1 · pith:BLYRIDFXnew · submitted 2022-03-25 · 💻 cs.DB · cs.CV · cs.IR

keywords: hybrid, query, vector, queries, object, search, similarity, datasets

As research interest surges, vector similarity search is applied in multiple fields, including data mining, computer vision, and information retrieval. Given a set of objects (e.g., a set of images) and a query object, we can easily transform each object into a feature vector and apply vector similarity search to retrieve the most similar objects. However, the original vector similarity search does not support hybrid queries well, where users input not only an unstructured query constraint (i.e., the feature vector of the query object) but also a structured query constraint (i.e., the desired attributes of interest). Hybrid query processing aims to identify the objects whose feature vectors are similar to that of the query object and that satisfy the given attribute constraints. Recent efforts have attempted to answer a hybrid query by performing attribute filtering and vector similarity search separately and then merging the results, which limits efficiency and accuracy because these approaches are not purpose-built for hybrid queries. In this paper, we propose a native hybrid query (NHQ) framework based on proximity graph (PG), which provides specialized composite index and joint pruning modules for hybrid queries. Existing PGs can be easily deployed on this framework to process hybrid queries efficiently. Moreover, we present two novel navigable PGs (NPGs) with optimized edge selection and routing strategies, which obtain better overall performance than existing PGs. We then deploy the proposed NPGs in NHQ to form two hybrid query methods, which significantly outperform the state-of-the-art competitors on all experimental datasets (10× faster under the same Recall), including eight public datasets and one in-house real-world dataset. Our code and datasets have been released at https://github.com/AshenOn3/NHQ.
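To make the query semantics concrete, the sketch below computes the exact answer to a hybrid query by brute force and measures the Recall of a candidate answer against it. It is not the NHQ method and uses no proximity graph; it only illustrates what a hybrid query is expected to return. The function names, the dictionary layout of `objects`, exact-match attribute semantics, and Euclidean distance are all assumptions made for illustration.

```python
import math


def hybrid_query_exact(objects, query_vec, query_attrs, k):
    """Brute-force hybrid query (illustrative, not NHQ).

    Keeps the objects whose attributes match every requested attribute,
    then ranks the survivors by Euclidean distance to the query vector.
    """
    matches = [
        (math.dist(vec, query_vec), obj_id)
        for obj_id, (vec, attrs) in objects.items()
        if all(attrs.get(name) == value for name, value in query_attrs.items())
    ]
    matches.sort()  # ascending distance, ties broken by object id
    return [obj_id for _, obj_id in matches[:k]]


def recall(returned, ground_truth):
    """Fraction of the ground-truth results present in the returned results."""
    if not ground_truth:
        return 1.0
    return len(set(returned) & set(ground_truth)) / len(ground_truth)


# Hypothetical toy data: object id -> (feature vector, attributes).
objects = {
    "a": ([0.0, 0.0], {"color": "red"}),
    "b": ([0.1, 0.0], {"color": "blue"}),
    "c": ([1.0, 1.0], {"color": "red"}),
    "d": ([0.2, 0.1], {"color": "red"}),
}

truth = hybrid_query_exact(objects, [0.0, 0.0], {"color": "red"}, k=2)
approx_recall = recall(["a", "c"], truth)
```

Here object `b` is the second-closest vector overall but is excluded because its attribute does not match, which is the behaviour that a plain vector similarity search cannot express on its own.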

Forward citations

Cited by 7 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Don't Stir the Pot! Authorized Vector Data Retrieval via Access-Aware Indexing
cs.DB 2026-05 unverdicted novelty 7.0

Veda and EffVeda build access-aware lattice indexes on role-partitioned vector blocks to support authorized top-k queries with controlled duplication and pruned search.
Don't Stir the Pot! Authorized Vector Data Retrieval via Access-Aware Indexing
cs.DB 2026-05 conditional novelty 7.0

Veda and EffVeda partition vectors into disjoint role-combination blocks, apply lattice-based copy and merge operations within a storage budget, index large nodes with HNSW, and use coordinated search with distance bo...
CubeGraph: Efficient Retrieval-Augmented Generation for Spatial and Temporal Data
cs.DB 2026-04 unverdicted novelty 7.0

CubeGraph uses hierarchical spatial grids and on-the-fly stitching of per-cell vector graphs to enable single-pass nearest-neighbor search for hybrid vector-spatial queries.
HNSW with Accuracy Guarantees Using Graph Spanners -- A Technical Report
cs.DB 2026-07 unverdicted novelty 6.0

A tiered Certify-then-Rectify system for HNSW that certifies approximate results statistically and falls back to exact recovery by treating the graph as a spanner whose stretch is bounded via extreme value theory.
CLIP: Lightweight Cosine-Law-Based Inverted-List Pruning for IVF-Based Vector Search
cs.DB 2026-06 unverdicted novelty 6.0

CLIP proposes a cosine-law-based pruning method for IVF vector search enabling O(1) cluster and log-time vector pruning with guarantees, plus variants for hierarchical and dynamic settings, showing up to 78% pruning a...
RACORN-1: Adaptive Recall-Preserving Speedup for Low-Selectivity Filtered Vector Search
cs.DB 2026-07 unverdicted novelty 5.0

RACORN-1 adds adaptive search fallback to ACORN-1 to fix recall collapse at low selectivity in filtered vector search, achieving 9-26x speedups over HNSW with recovered recall on 1M-40M datasets.
Don't Stir the Pot! Authorized Vector Data Retrieval via Access-Aware Indexing
cs.DB 2026-05 unverdicted novelty 5.0

Veda and EffVeda partition vector data by role combinations, apply lattice-based copy/merge under storage budget, index large nodes with HNSW and small nodes with linear scan, then use query plans and coordinated sear...