Empowering 1000 tokens/second on-device LLM prefilling with mllm-NPU

2024 · arXiv 2407.05858

2 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

method 1

citation-polarity summary

use method 1

representative citing papers

Less is More: Lightweight Prompt Compression for Question Answering Applications on Edge Devices

cs.CL · 2026-04-27 · unverdicted · novelty 4.0

CORE is a lightweight two-stage prompt compression method for edge-device RAG QA that builds answer and clue sets via NER and semantic matching then refines them to deliver higher accuracy and lower resource costs than baselines.

Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction

cs.MM · 2024-10-28 · unverdicted · novelty 3.0

Survey proposing a taxonomy for document parsing into pipeline-based systems and VLM-driven unified models, reviewing components, metrics, benchmarks, and challenges.

citing papers explorer

Showing 2 of 2 citing papers.

Less is More: Lightweight Prompt Compression for Question Answering Applications on Edge Devices

cs.CL · 2026-04-27 · unverdicted · none · ref 39

CORE is a lightweight two-stage prompt compression method for edge-device RAG QA that builds answer and clue sets via NER and semantic matching then refines them to deliver higher accuracy and lower resource costs than baselines.

Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction

cs.MM · 2024-10-28 · unverdicted · none · ref 269

Survey proposing a taxonomy for document parsing into pipeline-based systems and VLM-driven unified models, reviewing components, metrics, benchmarks, and challenges.
