Milvus: A purpose-built vector data management system
7 Pith papers cite this work.
citing papers explorer
- PERMA: Benchmarking Personalized Memory Agents via Event-Driven Preference and Realistic Task Environments
  PERMA is a new benchmark using temporally ordered events, text variability, and linguistic alignment to evaluate LLM memory agents on persona consistency beyond simple retrieval.
- PipeANN-Filter: An Efficient Filtered Vector Search System on SSD
  PipeANN-Filter improves filtered vector search latency and throughput on SSD by exploring a superset of valid vectors identified via probabilistic filters and verifying attributes only after selecting top-k candidates.
- FAVOR: Efficient Filter-Agnostic Vector ANNS Based on Selectivity-Aware Exclusion Distances
  FAVOR achieves 1.3-5x higher QPS at 95% Recall@10 for arbitrary filtered ANNS by combining exclusion-distance reshaping in HNSW graphs with a selectivity-driven router that switches between brute-force and optimized search.
- TensorHub: Scalable and Elastic Weight Transfer for LLM RL Training
  TensorHub uses Reference-Oriented Storage to enable scalable weight transfer in LLM RL training by referencing replicated GPU weights, achieving up to 19x reduction in cross-datacenter stall time.
- Personalized Deep Research: A User-Centric Framework, Dataset, and Hybrid Evaluation for Knowledge Discovery
  PDR is a user-context-aware framework for LLM research agents that improves report relevance over static baselines, supported by a new dataset and hybrid evaluation.
- To GPU or Not to GPU: Vector Search in Relational Engines
  Relational engines achieve faster SQL+vector-search queries on GPU than CPU when using compact vector indexes and fast interconnects, reversing the CPU-only design in current systems.
- Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security
  This survey discusses key components and challenges for Personal LLM Agents and reviews solutions for their capability, efficiency, and security.