To GPU or Not to GPU: Vector Search in Relational Engines
Pith reviewed 2026-05-19 19:01 UTC · model grok-4.3
The pith
An alternative organization of vector indexes and embeddings lets GPUs accelerate both relational queries and vector search in database engines.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
With an alternative organization of vector index and embeddings that reduces index size, both the relational and vector search components are faster on the GPU, particularly on fast interconnects, in contrast with the architecture used in existing engines.
What carries the argument
Alternative organization of vector index and embeddings that reduces the size of the index, allowing GPU execution of SQL+VS queries without the data-movement penalty of conventional designs.
If this is right
- Relational components of SQL+VS queries benefit more from GPU execution than the vector-search component itself.
- Moving existing vector indexes and embeddings to the GPU is not the best option even with fast interconnects.
- Reducing index size through reorganization makes GPU-based vector search competitive with CPU versions.
- Both relational and vector-search parts become faster on GPU than on CPU when the smaller index is used, especially over fast interconnects.
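The trade-off behind these points can be sketched as simple arithmetic: moving an index to the GPU costs roughly its size divided by the link bandwidth, and the GPU only wins if that transfer plus the GPU search time undercuts searching in place on the CPU. The sketch below is a rough model under stated assumptions, not the paper's cost model. The function names and all sizes, bandwidths, and timings are hypothetical placeholders.

```python
# Back-of-envelope model of the data-movement penalty for GPU vector search.
# All values used with these helpers are illustrative assumptions,
# not measurements reported by the paper.

def transfer_seconds(size_gb: float, bandwidth_gb_per_s: float) -> float:
    """Time to move size_gb over a link with the given sustained bandwidth."""
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return size_gb / bandwidth_gb_per_s


def end_to_end_gpu_seconds(size_gb: float, bandwidth_gb_per_s: float,
                           gpu_search_s: float) -> float:
    """Transfer time plus GPU search time for data that starts in CPU memory."""
    return transfer_seconds(size_gb, bandwidth_gb_per_s) + gpu_search_s


def gpu_pays_off(size_gb: float, bandwidth_gb_per_s: float,
                 cpu_search_s: float, gpu_search_s: float) -> bool:
    """True if shipping the index and searching on the GPU beats the CPU."""
    return end_to_end_gpu_seconds(size_gb, bandwidth_gb_per_s, gpu_search_s) < cpu_search_s


if __name__ == "__main__":
    # Hypothetical configurations: (index size in GB, link bandwidth in GB/s).
    configs = {
        "large index, slower link": (100.0, 25.0),
        "large index, faster link": (100.0, 400.0),
        "smaller index, faster link": (10.0, 400.0),
    }
    for name, (size_gb, bw) in configs.items():
        wins = gpu_pays_off(size_gb, bw, cpu_search_s=0.2, gpu_search_s=0.05)
        print(f"{name}: GPU pays off = {wins}")
```

With these placeholder numbers only the smaller index clears the bar, which mirrors the qualitative claim that index size, not just interconnect speed, decides whether GPU execution is worthwhile.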
Where Pith is reading between the lines
- Database architects may need to treat vector indexes as first-class GPU-resident structures rather than CPU-first objects that are occasionally copied.
- The same reorganization technique could be tested on other vector-search workloads outside TPC-H to check whether the size reduction generalizes.
- Future engines might expose the choice of index layout as a tunable parameter so users can trade index size for GPU acceleration.
Load-bearing premise
The modular execution engine accurately models the overheads and integration costs that would appear in a production relational database engine when adding GPU vector search support.
What would settle it
A production implementation of the optimized index inside an actual database engine that still shows higher end-to-end latency on GPU than on CPU even with NVLink.
Original abstract
Vector search (VS) is now available in most database engines. However, while vector search is a common feature in AI/ML/LLMs where the dominant computing platforms are GPUs, existing database engines operate on CPUs even when implementing vector search. This raises the question of whether integrating vector processing on GPUs as part of the engine would be a better design. In this paper, we explore this question in detail. First, we extend the TPC-H benchmark with vector data (from text and images) and propose a number of representative SQL+VS queries. Second, we develop a modular execution engine that can run SQL+VS queries across CPU and GPU. Third, we perform extensive experiments on a number of deployments: running the SQL+VS queries across CPU and/or GPU, with data residing in CPU or GPU memory, with existing indices and novel, optimized versions, as well as across different GPUs and interconnects (PCIe, NVLink). The results provide actionable and counter-intuitive insights on how to run such queries over CPUs and GPUs. For instance, the relational components benefit much more from running on the GPU than the vector search part. In addition, when the vector search involves moving data and indexes, using the GPU is not the best option, even with fast interconnects. Thus, we develop an alternative organization of vector index and embeddings that reduces the size of the index, making GPU-based vector search more competitive. With these improvements, the final result is that both the relational and vector search components are faster on the GPU, particularly on fast interconnects, in contrast with the architecture used in existing engines.
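To make the term SQL+VS concrete, here is a minimal sketch of what such a query computes: a relational predicate followed by a top-k similarity ranking. It uses an exact brute-force scan in plain Python rather than any vector index, and the schema (`row_id`, `price`, `embedding`) and the function names are hypothetical. The paper's actual queries and operators are not reproduced here.

```python
import heapq
import math
from typing import List, Sequence, Tuple

# Hypothetical row layout: (row_id, price, embedding).
Row = Tuple[int, float, Sequence[float]]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def filtered_top_k(rows: List[Row], query: Sequence[float],
                   max_price: float, k: int) -> List[int]:
    """Roughly: SELECT row_id FROM items WHERE price <= max_price
    ORDER BY similarity(embedding, query) DESC LIMIT k.
    The table and column names are illustrative assumptions."""
    # Relational part: apply the predicate first (pre-filtering).
    candidates = (row for row in rows if row[1] <= max_price)
    # Vector-search part: exact top-k by cosine similarity, no index involved.
    best = heapq.nlargest(k, candidates, key=lambda row: cosine(row[2], query))
    return [row[0] for row in best]


if __name__ == "__main__":
    items = [
        (1, 10.0, [1.0, 0.0]),
        (2, 50.0, [0.9, 0.1]),
        (3, 5.0, [0.0, 1.0]),
        (4, 8.0, [0.7, 0.7]),
    ]
    print(filtered_top_k(items, query=[1.0, 0.0], max_price=20.0, k=2))
```

Filtering before ranking is only one possible plan. An engine could also search first and filter afterwards, and choosing between such plans is part of what makes the CPU/GPU placement question non-trivial.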
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper investigates whether GPU-based vector search should be integrated into relational database engines, which currently rely on CPUs. It extends TPC-H with vector data from text and images, defines representative SQL+VS queries, builds a modular execution engine supporting CPU/GPU execution with varying data placements and interconnects (PCIe, NVLink), and evaluates existing and novel index organizations. The central claim is that an alternative vector index/embedding layout reducing index size makes both relational and vector-search components faster on GPU than CPU, especially on fast interconnects, in contrast to architectures in existing engines.
Significance. If the results hold, the work provides actionable guidance for hybrid SQL+vector workloads in AI/ML contexts by quantifying when GPU acceleration benefits relational components more than vector search itself and by demonstrating a size-reduced index organization that improves GPU competitiveness. Strengths include the broad experimental matrix across hardware, data locations, and index variants, plus direct measurements rather than model-derived claims.
major comments (2)
- [Modular execution engine description and experimental setup] The central performance claims rest on a custom modular execution engine whose fidelity to production relational engine costs is not demonstrated. Query optimizer extensions, cost-model integration, buffer-pool interactions, transaction/concurrency semantics, and data-movement consistency checks are omitted; if these costs are material, the reported GPU advantages with the reduced-size index may not hold in a real deployment such as PostgreSQL.
- [Results and index organization sections] The paper should quantify the index-size reduction achieved by the alternative organization and show its effect on data-movement volume and query plans; without these measurements it is difficult to isolate whether the reported speedups are due to the new layout or to other experimental factors.
minor comments (2)
- [Benchmark and query definitions] Clarify the exact set of SQL+VS queries used and whether they are representative of production vector workloads beyond TPC-H extensions.
- [Experimental results] Add statistical significance tests or confidence intervals for the reported performance differences across configurations.
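One way the requested confidence intervals could be produced is a percentile bootstrap over repeated runtime measurements of two configurations. The sketch below is an illustration of that general technique, not something taken from the paper. The function name, the default of 2000 resamples, and the sample runtimes are assumptions.

```python
import random
import statistics
from typing import Sequence, Tuple


def bootstrap_mean_diff_ci(a: Sequence[float], b: Sequence[float],
                           n_resamples: int = 2000,
                           confidence: float = 0.95,
                           seed: int = 0) -> Tuple[float, float]:
    """Percentile-bootstrap confidence interval for mean(a) - mean(b)."""
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    if n_resamples < 1 or not 0.0 < confidence < 1.0:
        raise ValueError("invalid n_resamples or confidence")
    rng = random.Random(seed)  # fixed seed keeps the interval reproducible
    diffs = []
    for _ in range(n_resamples):
        # Resample each configuration independently, with replacement.
        resampled_a = [rng.choice(a) for _ in a]
        resampled_b = [rng.choice(b) for _ in b]
        diffs.append(statistics.mean(resampled_a) - statistics.mean(resampled_b))
    diffs.sort()
    alpha = 1.0 - confidence
    # Clamp the percentile positions so they always stay valid indices.
    lo_idx = max(0, min(n_resamples - 1, int((alpha / 2.0) * n_resamples)))
    hi_idx = max(0, min(n_resamples - 1, int((1.0 - alpha / 2.0) * n_resamples) - 1))
    return diffs[lo_idx], diffs[hi_idx]


if __name__ == "__main__":
    # Hypothetical per-run latencies in milliseconds for two configurations.
    cpu_ms = [10.0, 10.5, 9.8, 10.2, 10.1]
    gpu_ms = [5.0, 5.2, 4.9, 5.1, 5.0]
    low, high = bootstrap_mean_diff_ci(cpu_ms, gpu_ms)
    print(f"95% CI for mean(cpu) - mean(gpu): [{low:.2f}, {high:.2f}] ms")
```

If the interval excludes zero, the difference between the two configurations is unlikely to be run-to-run noise. With only a handful of runs per configuration the interval is coarse, so more repetitions would be needed for a firm conclusion.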
Simulated Author's Rebuttal
Thank you for the opportunity to respond to the referee's comments. We address each major comment below with explanations and indicate where revisions will be made to improve the manuscript.
- Referee: The central performance claims rest on a custom modular execution engine whose fidelity to production relational engine costs is not demonstrated. Query optimizer extensions, cost-model integration, buffer-pool interactions, transaction/concurrency semantics, and data-movement consistency checks are omitted; if these costs are material, the reported GPU advantages with the reduced-size index may not hold in a real deployment such as PostgreSQL.
Authors: We thank the referee for this observation. Our modular execution engine is a research prototype constructed specifically to isolate and directly measure the execution costs of relational operators and vector search on CPU versus GPU across controlled data placements and interconnects. This design choice enables precise attribution of performance differences to hardware and layout factors without the overheads of a full production stack. We acknowledge that a complete integration into a system such as PostgreSQL would introduce additional costs from query optimization, buffer-pool management, concurrency control, and consistency mechanisms that are outside the current scope. In the revised manuscript we will expand the experimental-setup section to explicitly discuss these limitations and their possible influence on generalizability, thereby clarifying the boundaries of our claims while retaining the value of the measured trade-offs.
Revision: partial
- Referee: The paper should quantify the index-size reduction achieved by the alternative organization and show its effect on data-movement volume and query plans; without these measurements it is difficult to isolate whether the reported speedups are due to the new layout or to other experimental factors.
Authors: We agree that explicit quantification of the index-size reduction is needed to strengthen attribution of the observed speedups. The alternative organization reduces index size by co-locating compact embeddings with a pruned index structure, which directly lowers the volume of data transferred over the interconnect. In the revised version we will report concrete index sizes (in absolute terms and as percentage reduction) for both the baseline and proposed organizations, present measured data-movement volumes for representative queries, and describe how the smaller footprint alters execution plans within our modular engine. These additions will make it clearer that the performance gains stem from the reduced data movement enabled by the new layout.
Revision: yes
Circularity Check
No circularity: results are direct experimental measurements
The paper conducts an empirical study: it extends TPC-H with vector data, builds a modular execution engine, and reports measured runtimes for SQL+VS queries across CPU/GPU, memory placements, indices, and interconnects. The central claim (alternative index/embedding organization improves GPU performance) follows from these measurements rather than any equation, fitted parameter, or self-citation that reduces the outcome to its own inputs by construction. No load-bearing derivation step collapses to a prior result or definition; the work is self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: The modular execution engine accurately captures the integration and data-movement costs of a full relational database engine.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean, reality_from_one_distinction (tag: unclear)
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "We develop a modular execution engine that can run SQL+VS queries across CPU and GPU... alternative organization of vector index and embeddings that reduces the size of the index"
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "Key Insight. With current data-owning vector indexes, executing vector search on a GPU does not pay off, even with fast interconnects"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.