7 Pith papers cite this work. Polarity classification is still indexing.
citing papers explorer
- TabEmbed: Benchmarking and Learning Generalist Embeddings for Tabular Understanding
  TabEmbed is the first generalist embedding model for tabular data that unifies classification and retrieval in one space via contrastive learning and outperforms text embedding models on the new TabBench benchmark.
- PEFT-Bench: A Parameter-Efficient Fine-Tuning Methods Benchmark
  PEFT-Bench is a standardized end-to-end benchmark for 7 PEFT methods across 27 NLP datasets on autoregressive LLMs, accompanied by the PSCP metric that penalizes based on trainable parameters, inference speed, and training memory.
- PEFT-Factory: Unified Parameter-Efficient Fine-Tuning of Autoregressive Large Language Models
  PEFT-Factory supplies a ready-to-use, extensible codebase that unifies 19 PEFT methods and evaluation pipelines for fine-tuning large autoregressive language models.
- From 2:4 to 8:16 sparsity patterns in LLMs for Outliers and Weights with Variance Correction
  8:16 sparsity with variance correction and outlier handling lets compressed LLMs match or exceed dense-model accuracy under fixed memory limits, outperforming the common 2:4 pattern in flexibility.
- Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
  ModernBERT is a new bidirectional encoder model achieving SOTA performance on diverse classification and retrieval benchmarks while offering superior speed and memory efficiency for long-context inference.
- Mitigating Extrinsic Gender Bias for Bangla Classification Tasks
  Constructs gender-perturbed Bangla classification benchmarks and proposes RandSymKL debiasing that reduces extrinsic gender bias in pretrained models.
- From Curated Data to Scalable Models: Continual Pre-training of Dense and MoE Large Language Models for Tibetan
  A 72GB Tibetan corpus enables continual pre-training of Qwen2.5-7B and a 50B-A10B MoE model, with new benchmarks showing outperformance over prior Tibetan models.