Space/time trade-offs in hash coding with allowable errors

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

method 1 · other 1

citation-polarity summary

unclear 1 · use method 1

representative citing papers

PipeANN-Filter: An Efficient Filtered Vector Search System on SSD

cs.OS · 2026-05-18 · unverdicted · novelty 6.0

PipeANN-Filter improves filtered vector search latency and throughput on SSD by exploring a superset of valid vectors identified via probabilistic filters and verifying attributes only after selecting top-k candidates.

StarCoder: may the source be with you!

cs.CL · 2023-05-09 · accept · novelty 5.0

StarCoderBase matches or beats OpenAI's code-cushman-001 on multi-language code benchmarks; the Python-fine-tuned StarCoder reaches 40% pass@1 on HumanEval while retaining other-language performance.

Bloom Filter Encoding for Machine Learning

cs.LG · 2025-12-23 · unverdicted · novelty 4.0

Bloom filter encodings convert data samples to bit arrays that support comparable classifier performance to raw data across text, time-series, tabular, and image datasets while delivering consistent memory savings.

citing papers explorer

Showing 4 of 4 citing papers.

  • PipeANN-Filter: An Efficient Filtered Vector Search System on SSD · cs.OS · 2026-05-18 · unverdicted · none · ref 4

    PipeANN-Filter improves filtered vector search latency and throughput on SSD by exploring a superset of valid vectors identified via probabilistic filters and verifying attributes only after selecting top-k candidates.

  • DataComp-LM: In search of the next generation of training sets for language models · cs.LG · 2024-06-17 · unverdicted · none · ref 27

    DCLM-Baseline dataset lets a 7B model reach 64% 5-shot MMLU accuracy after 2.6T tokens, beating prior open-data models by 6.6 points on MMLU with 40% less compute.

  • StarCoder: may the source be with you! · cs.CL · 2023-05-09 · accept · none · ref 261

    StarCoderBase matches or beats OpenAI's code-cushman-001 on multi-language code benchmarks; the Python-fine-tuned StarCoder reaches 40% pass@1 on HumanEval while retaining other-language performance.

  • Bloom Filter Encoding for Machine Learning · cs.LG · 2025-12-23 · unverdicted · none · ref 3

    Bloom filter encodings convert data samples to bit arrays that support comparable classifier performance to raw data across text, time-series, tabular, and image datasets while delivering consistent memory savings.