Chip placement with deep reinforcement learning
2 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 2
citing papers explorer
- "Noisier" Noise Contrastive Estimation is (Almost) Maximum Likelihood
  Scaling noise magnitude in NCE aligns gradients with MLE, enabling a practical approximation that improves performance on CIFAR-10 and ImageNet image modeling with fewer training steps.
- From LLM to Silicon: RL-Driven ASIC Architecture Exploration for On-Device AI Inference
  An RL agent using Soft Actor-Critic with Mixture-of-Experts jointly optimizes ASIC architecture, memory hierarchy, and partitioning for AI inference, achieving 29809 tokens/s for Llama 3.1 at 3nm and under 13mW for SmolVLM across 3-28nm nodes without manual retuning.