
Feature hashing for large scale multitask learning

3 Pith papers cite this work. Polarity classification is still indexing.

abstract

Empirical evidence suggests that hashing is an effective strategy for dimensionality reduction and practical nonparametric estimation. In this paper we provide exponential tail bounds for feature hashing and show that the interaction between random subspaces is negligible with high probability. We demonstrate the feasibility of this approach with experimental results for a new use case: multitask learning with hundreds of thousands of tasks.
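
As a rough illustration of the technique the abstract refers to, the Python sketch below shows a signed hashing trick: one hash maps each token to one of `num_buckets` slots, a second hash assigns it a sign of +1 or -1, and for the multitask case a task-specific copy of each token is hashed into the same vector. The function names, the MD5-based hashes, the bucket count, and the `task` prefix scheme are assumptions chosen for illustration, not details taken from the paper.

```python
import hashlib


def _hash(key: str, seed: str) -> int:
    """Deterministic 64-bit hash of a string (MD5 is an illustrative choice)."""
    digest = hashlib.md5((seed + key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "little")


def hash_features(tokens, num_buckets=1024, task=None):
    """Map a bag of tokens to a fixed-size signed count vector.

    Assumption: when `task` is given, each token contributes twice,
    once as a shared feature and once as a task-specific feature
    keyed by "<task>_<token>". Both land in the same vector.
    """
    vec = [0.0] * num_buckets
    for tok in tokens:
        keys = [tok] if task is None else [tok, f"{task}_{tok}"]
        for key in keys:
            idx = _hash(key, "index:") % num_buckets          # bucket hash
            sign = 1.0 if _hash(key, "sign:") % 2 == 0 else -1.0  # sign hash
            vec[idx] += sign
    return vec


if __name__ == "__main__":
    shared = hash_features(["free", "money", "now"], num_buckets=16)
    per_user = hash_features(["free", "money", "now"], num_buckets=16, task="user42")
    print(len(shared), len(per_user))
```

In this sketch the vector length stays fixed no matter how many tasks are hashed in, which is what makes the approach attractive for very large numbers of tasks; the cost is that shared and task-specific features can collide, and the abstract's tail bounds concern how small that interference is likely to be.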

citation-role summary

background: 1

citation-polarity summary

roles: background 1

polarities: background 1

representative citing papers

Semantic Product Search

cs.IR · 2019-07-01 · unverdicted · novelty 5.0

A neural semantic matcher for product search uses a custom loss on behavior data, n-gram pooling, and hashing to beat prior methods by 4.7% Recall@100 and 14.5% MAP.

citing papers explorer

Showing 3 of 3 citing papers.

  • Sketching the Readout of Large Language Models for Scalable Data Attribution and Valuation · cs.LG · 2026-04-17 · unverdicted · none · ref 53

    RISE applies CountSketch to dual lexical and semantic channels derived from output-layer gradient outer products, cutting data attribution storage by up to 112x and enabling retrospective and prospective influence analysis on LLMs up to 32B parameters.

  • Applying Graph Analysis for Unsupervised Fast Malware Fingerprinting · cs.CR · 2025-10-07 · conditional · none · ref 21 · internal anchor

    TrapNet applies PCA-based FloatHash vectors and graph community detection to enable unsupervised malware fingerprinting and family attribution from static analysis.

  • Semantic Product Search · cs.IR · 2019-07-01 · unverdicted · none · ref 31 · internal anchor

    A neural semantic matcher for product search uses a custom loss on behavior data, n-gram pooling, and hashing to beat prior methods by 4.7% Recall@100 and 14.5% MAP.