A new corpus of 100,502 annotated movie reviews from Kazakhstan enables sentiment analysis research in Russian, Kazakh, and code-switched texts.
Title resolution pending
17 Pith papers cite this work. Polarity classification is still indexing.
representative citing papers
For binary classification in the NTK regime, LoRA rank r=1 suffices and is often optimal under cross-entropy loss, reducing the prior sufficient condition from r>=12.
FiBeR adds a closed-form filter-aware correction A(ω)σ_w² to the second-moment term for temporally filtered DP gradients, improving adaptive optimization performance.
Adaptive Instruction Composition uses a neural contextual bandit with RL to adaptively combine crowdsourced texts, generating more effective and diverse LLM jailbreaks than random or prior adaptive methods on Harmbench.
NodePFN pre-trains on synthetic graphs with controllable homophily and causal feature-label models to achieve 71.27 average accuracy on 23 node classification benchmarks without graph-specific training.
DeepSeek-V2 delivers top-tier open-source LLM performance using only 21B active parameters by compressing the KV cache 93.3% and cutting training costs 42.5% via MLA and DeepSeekMoE.
Computational experiments show verb learning benefits in child-directed language likely stem from spoken register properties rather than unique optimization for children.
Frozen Mamba patch-boundary readouts do not outperform mean pooling for sentence representations on SST-2, CoLA, MRPC, STS-B, and IMDb due to anisotropy (cosine similarity ~0.9999) and representational collapse (MCC=0 on CoLA).
Math reasoning gains in LLMs rarely transfer to general domains; RL tuning generalizes while SFT causes forgetting and representation drift.
Empirical analysis shows scaling inference compute via strategies like tree search can be more efficient than scaling model parameters, with 7B models plus novel search outperforming 34B models.
Fine-tuned language models store knowledge in parameters to answer questions competitively with retrieval-based open-domain QA systems.
Compares LLMs against semantic similarity for binary classification of student self-explanations in programming education.
Socio-Contrastive Learning jointly learns socio-demographic representations and textual features via contrastive objectives to predict annotator perspectives more accurately than concatenation baselines.
StarCoderBase matches or beats OpenAI's code-cushman-001 on multi-language code benchmarks; the Python-fine-tuned StarCoder reaches 40% pass@1 on HumanEval while retaining other-language performance.
DeepSeek LLM 67B exceeds LLaMA-2 70B on code, mathematics and reasoning benchmarks after pre-training on 2 trillion tokens and alignment via SFT and DPO.
A tutorial synthesizing foundations, recent models such as PALO and Maya, and low-cost methods for tri-modal multilingual AI in resource-constrained settings.
Llama 3.1 8B fine-tuned with calibrated 5% synthetic data augmentation reaches 0.6234 F1-macro on multi-class toxicity detection in gaming chat and places fourth among 35 teams.
citing papers explorer
No citing papers match the current filters.