{"work":{"id":"c0bc4689-3ce8-4e3d-9442-bd74869445bb","openalex_id":null,"doi":null,"arxiv_id":"2404.06654","raw_key":null,"title":"RULER: What's the Real Context Size of Your Long-Context Language Models?","authors":null,"authors_text":"Cheng-Ping Hsieh, Simeng Sun, Samuel Kriman, Shantanu Acharya, Dima Rekesh, Fei Jia","year":2024,"venue":"cs.CL","abstract":"The needle-in-a-haystack (NIAH) test, which examines the ability to retrieve a piece of information (the \"needle\") from long distractor texts (the \"haystack\"), has been widely adopted to evaluate long-context language models (LMs). However, this simple retrieval-based test is indicative of only a superficial form of long-context understanding. To provide a more comprehensive evaluation of long-context LMs, we create a new synthetic benchmark RULER with flexible configurations for customized sequence length and task complexity. RULER expands upon the vanilla NIAH test to encompass variations with diverse types and quantities of needles. Moreover, RULER introduces new task categories multi-hop tracing and aggregation to test behaviors beyond searching from context. We evaluate 17 long-context LMs with 13 representative tasks in RULER. Despite achieving nearly perfect accuracy in the vanilla NIAH test, almost all models exhibit large performance drops as the context length increases. While these models all claim context sizes of 32K tokens or greater, only half of them can maintain satisfactory performance at the length of 32K. Our analysis of Yi-34B, which supports context length of 200K, reveals large room for improvement as we increase input length and task complexity. We open source RULER to spur comprehensive evaluation of long-context LMs.","external_url":"https://arxiv.org/abs/2404.06654","cited_by_count":null,"metadata_source":"pith","metadata_fetched_at":"2026-05-25T06:45:25.639099+00:00","pith_arxiv_id":"2404.06654","created_at":"2026-05-09T05:55:30.315201+00:00","updated_at":"2026-05-25T06:45:25.639099+00:00","title_quality_ok":true,"display_title":"RULER: What's the Real Context Size of Your Long-Context Language Models?","render_title":"RULER: What's the Real Context Size of Your Long-Context Language Models?"},"hub":{"state":{"work_id":"c0bc4689-3ce8-4e3d-9442-bd74869445bb","tier":"super_hub","tier_reason":"100+ Pith inbound or 10,000+ external citations","pith_inbound_count":118,"external_cited_by_count":null,"distinct_field_count":10,"first_pith_cited_at":"2024-06-12T05:25:15+00:00","last_pith_cited_at":"2026-05-22T02:42:41+00:00","author_build_status":"needed","summary_status":"needed","contexts_status":"needed","graph_status":"needed","ask_index_status":"needed","reader_status":"not_needed","recognition_status":"not_needed","updated_at":"2026-05-30T14:31:07.993108+00:00","tier_text":"super_hub"},"tier":"super_hub","role_counts":[{"context_role":"dataset","n":16},{"context_role":"background","n":12}],"polarity_counts":[{"context_polarity":"use_dataset","n":16},{"context_polarity":"background","n":12}],"runs":{"ask_index":{"job_type":"ask_index","status":"succeeded","result":{"title":"RULER: What's the Real Context Size of Your Long-Context Language Models?","claims":[{"claim_text":"The needle-in-a-haystack (NIAH) test, which examines the ability to retrieve a piece of information (the \"needle\") from long distractor texts (the \"haystack\"), has been widely adopted to evaluate long-context language models (LMs). However, this simple retrieval-based test is indicative of only a superficial form of long-context understanding. To provide a more comprehensive evaluation of long-context LMs, we create a new synthetic benchmark RULER with flexible configurations for customized sequence length and task complexity. RULER expands upon the vanilla NIAH test to encompass variations wi","claim_type":"abstract","evidence_strength":"source_metadata"},{"claim_text":"ing UniPrefill as a continuous batching operator [38] and extending vLLM [15]'s scheduler to natively support prefill-decode co-processing under UniPrefill's token-dropping regime. This tight integration allows UniPrefill to function as a transparent acceleration layer within produc- tion inference engines, without requiring changes to model weights or serving infrastructure. We evaluate UniPrefill on RULER [11] with multiple model architectures. Results demon- strate that UniPrefill introduces ","claim_type":"dataset","confidence":0.95,"evidence_strength":"citation_context"},{"claim_text":"0 68.2 39.4 76.1 40.9 63.8 57.9 73.2 69.7 62.9 52.0 58.9 Table 3:Comparison of different techniques across backbone models Llama-3.2-3B. which includes ARC-Challenge (ARC) [9], ARC-Easy (ARE) [9], HellaSwag (HS) [53], OpenBookQA (OB) [ 31], PIQA [4], RACE (RA) [23], and WinoGrande (WG) [ 41]. For long context evaluations we use all 13 tasks from RULER [20] benchmark. For math reasoning, we include GSM8K [10]. Baselines.We compare with hybrid model upcycling approaches including MambainLlama[ 46]","claim_type":"dataset","confidence":0.95,"evidence_strength":"citation_context"},{"claim_text":"HellaSwag [54] ! HLE [55]! HumanEval [56]! IFEval [57]! ! INCLUDE [58]! InfiniteBench [59]! LiveBench [60]! LiveCodeBench [61]! MATH [62]! ! MATH-500 [63]! MBPP EvalPlus [64]! MGSM [65]! MlogiQA [66]! MMMLU [67]! ! MMLU [67]! ! ! MMLU-Pro [68]! MMLU-Redux [69]! Multi-IF [70]! MultiPL-E [71]! Needle-in-a-Haystack [72]! Nexus [73]! Pile [74] ! PolyMath [75]! RULER [76]! TruthfulQA [77]! WinoGrande [78]! WritingBench [79]! ZebraLogic [80]! ZeroSCROLLS [81]! τ-Bench Retail [82]! Table 2: Datasets us","claim_type":"dataset","confidence":0.95,"evidence_strength":"citation_context"},{"claim_text":"ThinK [12] applies pruning at the channel level of the K cache rather than the token level (paired with SnapKV). SnapKV+ZipCache [11, 14] combines token eviction with post-hoc quantization of the retained cache. Benchmarks.We evaluate on LongBench [47] (16 tasks covering 6 different task categories), Needle- in-a-Haystack [48] (single-fact retrieval at varying depths), RULER [49] (11 retrieval and reasoning tasks from 4K to 128K), and InfiniteBench [50] (10 tasks with context lengths up to2M tok","claim_type":"dataset","confidence":0.95,"evidence_strength":"citation_context"},{"claim_text":"thoroughly validates the model's foundational capabilities with nine benchmarks. • General Knowledge: MMLU [45] (5-shot, Cot), MMLU-Pro [46] (5-shot, Cot), CMMLU [47] (5-shot, Cot). 7 •Math: GSM8K [48] (4-shot, Cot), MATH [49] (4-shot), MATH-500 [49] (4-shot). •Coding: HumanEval [50] (5-shot), LiveCodeBench [51] (v6, 2023.05-2025.04). •Long-Context: RULER [52]. To ensure fair and reproducible comparisons, we adopt a standardized evaluation pipeline. Most benchmarks are executed with OpenCompass ","claim_type":"dataset","confidence":0.95,"evidence_strength":"citation_context"},{"claim_text":"pattern is becoming increasingly common as frontier models support context windows approaching one million tokens [ 1, 17]. As increasingly large codebases and documents fit into context, the bottleneck shifts from encoding information in the context window to reliably retrieving precise, multi- level information from it. Existing long-context evaluations, including Needle in a Haystack [11], RULER [9], LongBench [2], ∞Bench [25], and Graphwalks [15], primarily focus oncontent-based retrieval, a","claim_type":"background","confidence":0.9,"evidence_strength":"citation_context"}],"why_cited":"Pith tracks RULER: What's the Real Context Size of Your Long-Context Language Models? because it crossed a citation-hub threshold. Current citing contexts most often use it as dataset evidence (16 contexts).","role_counts":[{"n":16,"context_role":"dataset"},{"n":12,"context_role":"background"}]},"error":null,"updated_at":"2026-05-20T17:52:08.089485+00:00"},"author_expand":{"job_type":"author_expand","status":"succeeded","result":{"authors_linked":[{"id":"ac6ee87a-e473-4edf-8093-eb3213732287","orcid":null,"display_name":"Cheng-Ping Hsieh"},{"id":"9ce33d49-9220-4e01-a424-4a854e156066","orcid":null,"display_name":"Simeng Sun"},{"id":"59f84804-a1b4-4be1-bee3-e329cf2e7e59","orcid":null,"display_name":"Samuel Kriman"},{"id":"e76f6972-4678-4f3b-95e3-f025317a3d20","orcid":null,"display_name":"Shantanu Acharya"},{"id":"5e329f55-0af3-4abe-a369-2ac78bb83de4","orcid":null,"display_name":"Dima Rekesh"},{"id":"1f9ff70a-08d1-4f4e-aed6-96e522541b7d","orcid":null,"display_name":"Fei Jia"}]},"error":null,"updated_at":"2026-05-20T17:52:08.375443+00:00"},"context_extract":{"job_type":"context_extract","status":"succeeded","result":{"enqueued_papers":25},"error":null,"updated_at":"2026-05-14T11:39:57.314793+00:00"},"graph_features":{"job_type":"graph_features","status":"succeeded","result":{"co_cited":[{"title":"Qwen3 Technical Report","work_id":"25a4e30c-1232-48e7-9925-02fa12ba7c9e","shared_citers":16},{"title":"The Llama 3 Herd of Models","work_id":"1549a635-88af-4ac1-acfe-51ae7bb53345","shared_citers":16},{"title":"Efficient Streaming Language Models with Attention Sinks","work_id":"a8d25452-c237-48c9-88a4-682717c3979a","shared_citers":14},{"title":"DeepSeek-V3 Technical Report","work_id":"57d2791d-2219-4c31-a077-afc04b12a75c","shared_citers":12},{"title":"Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge","work_id":"28ea1282-d657-4c61-a83c-f1249be6d6b1","shared_citers":12},{"title":"Mamba: Linear-Time Sequence Modeling with Selective State Spaces","work_id":"4ee75248-1199-492c-a52f-6661e0f4adff","shared_citers":11},{"title":"Training Verifiers to Solve Math Word Problems","work_id":"acab1aa8-b4d6-40e0-a3ee-25341701dca2","shared_citers":10},{"title":"YaRN: Efficient Context Window Extension of Large Language Models","work_id":"31f454a9-7de2-4696-86f2-9bfa1410f80d","shared_citers":10},{"title":"DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning","work_id":"e6b75ad5-2877-4168-97c8-710407094d20","shared_citers":9},{"title":"Retentive Network: A Successor to Transformer for Large Language Models","work_id":"5b0449ac-92b0-41f2-8b4f-586c2b5a08b6","shared_citers":9},{"title":"DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models","work_id":"c5006563-f3ec-438a-9e35-b7b484f34828","shared_citers":8},{"title":"FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning","work_id":"fff3953b-5efb-4753-bee4-002f59995810","shared_citers":8},{"title":"Longformer: The Long-Document Transformer","work_id":"abea7a44-6668-4de7-aab6-f53a6e5aa088","shared_citers":8},{"title":"Instruction-Following Evaluation for Large Language Models","work_id":"3aa06177-125a-4f5a-8f4a-8070c5986c26","shared_citers":7},{"title":"PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling","work_id":"6317700d-f903-4ce1-8f53-b43cb146d48b","shared_citers":7},{"title":"Qwen2.5 Technical Report","work_id":"d8432992-4980-4a81-85c7-9fa2c2b87f85","shared_citers":7},{"title":"Evaluating Large Language Models Trained on Code","work_id":"042493e9-b26f-4b4e-bbde-382072ca9b08","shared_citers":6},{"title":"Generating Long Sequences with Sparse Transformers","work_id":"c5b81688-45ee-4a9a-b095-e6290f45cb6c","shared_citers":6},{"title":"gpt-oss-120b & gpt-oss-20b Model Card","work_id":"178c1f7e-4f19-4392-a45d-45a6dfa88ead","shared_citers":6},{"title":"Kimi Linear: An Expressive, Efficient Attention Architecture","work_id":"b2f9f1cd-c39c-4dbc-8637-f575681cdc01","shared_citers":6},{"title":"LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code","work_id":"ea9e51ce-1e75-4182-92d8-4d25f70d2ee4","shared_citers":6},{"title":"LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding","work_id":"ba7831c4-9427-4e0e-a5c1-4e98511f4b53","shared_citers":6},{"title":"Mistral 7B","work_id":"eb5e1305-ad11-4875-ad8d-ad8b8f697599","shared_citers":6},{"title":"A-MEM: Agentic Memory for LLM Agents","work_id":"3b98feb2-fdb1-479a-bbe4-2c298a4592e2","shared_citers":5}],"time_series":[{"n":4,"year":2025},{"n":54,"year":2026}],"dependency_candidates":[]},"error":null,"updated_at":"2026-05-14T11:39:57.336259+00:00"},"identity_refresh":{"job_type":"identity_refresh","status":"succeeded","result":{"items":[{"title":"Qwen3 Technical Report","outcome":"unchanged","work_id":"25a4e30c-1232-48e7-9925-02fa12ba7c9e","resolver":"local_arxiv","confidence":0.98,"old_work_id":"25a4e30c-1232-48e7-9925-02fa12ba7c9e"}],"counts":{"fixed":0,"merged":0,"unchanged":1,"quarantined":0,"needs_external_resolution":0},"errors":[],"attempted":1},"error":null,"updated_at":"2026-05-14T11:39:53.149637+00:00"},"role_polarity":{"job_type":"role_polarity","status":"succeeded","result":{"title":"RULER: What's the Real Context Size of Your Long-Context Language Models?","claims":[{"claim_text":"The needle-in-a-haystack (NIAH) test, which examines the ability to retrieve a piece of information (the \"needle\") from long distractor texts (the \"haystack\"), has been widely adopted to evaluate long-context language models (LMs). However, this simple retrieval-based test is indicative of only a superficial form of long-context understanding. To provide a more comprehensive evaluation of long-context LMs, we create a new synthetic benchmark RULER with flexible configurations for customized sequence length and task complexity. RULER expands upon the vanilla NIAH test to encompass variations wi","claim_type":"abstract","evidence_strength":"source_metadata"},{"claim_text":"ing UniPrefill as a continuous batching operator [38] and extending vLLM [15]'s scheduler to natively support prefill-decode co-processing under UniPrefill's token-dropping regime. This tight integration allows UniPrefill to function as a transparent acceleration layer within produc- tion inference engines, without requiring changes to model weights or serving infrastructure. We evaluate UniPrefill on RULER [11] with multiple model architectures. Results demon- strate that UniPrefill introduces ","claim_type":"dataset","confidence":0.95,"evidence_strength":"citation_context"},{"claim_text":"0 68.2 39.4 76.1 40.9 63.8 57.9 73.2 69.7 62.9 52.0 58.9 Table 3:Comparison of different techniques across backbone models Llama-3.2-3B. which includes ARC-Challenge (ARC) [9], ARC-Easy (ARE) [9], HellaSwag (HS) [53], OpenBookQA (OB) [ 31], PIQA [4], RACE (RA) [23], and WinoGrande (WG) [ 41]. For long context evaluations we use all 13 tasks from RULER [20] benchmark. For math reasoning, we include GSM8K [10]. Baselines.We compare with hybrid model upcycling approaches including MambainLlama[ 46]","claim_type":"dataset","confidence":0.95,"evidence_strength":"citation_context"},{"claim_text":"HellaSwag [54] ! HLE [55]! HumanEval [56]! IFEval [57]! ! INCLUDE [58]! InfiniteBench [59]! LiveBench [60]! LiveCodeBench [61]! MATH [62]! ! MATH-500 [63]! MBPP EvalPlus [64]! MGSM [65]! MlogiQA [66]! MMMLU [67]! ! MMLU [67]! ! ! MMLU-Pro [68]! MMLU-Redux [69]! Multi-IF [70]! MultiPL-E [71]! Needle-in-a-Haystack [72]! Nexus [73]! Pile [74] ! PolyMath [75]! RULER [76]! TruthfulQA [77]! WinoGrande [78]! WritingBench [79]! ZebraLogic [80]! ZeroSCROLLS [81]! τ-Bench Retail [82]! Table 2: Datasets us","claim_type":"dataset","confidence":0.95,"evidence_strength":"citation_context"},{"claim_text":"ThinK [12] applies pruning at the channel level of the K cache rather than the token level (paired with SnapKV). SnapKV+ZipCache [11, 14] combines token eviction with post-hoc quantization of the retained cache. Benchmarks.We evaluate on LongBench [47] (16 tasks covering 6 different task categories), Needle- in-a-Haystack [48] (single-fact retrieval at varying depths), RULER [49] (11 retrieval and reasoning tasks from 4K to 128K), and InfiniteBench [50] (10 tasks with context lengths up to2M tok","claim_type":"dataset","confidence":0.95,"evidence_strength":"citation_context"},{"claim_text":"thoroughly validates the model's foundational capabilities with nine benchmarks. • General Knowledge: MMLU [45] (5-shot, Cot), MMLU-Pro [46] (5-shot, Cot), CMMLU [47] (5-shot, Cot). 7 •Math: GSM8K [48] (4-shot, Cot), MATH [49] (4-shot), MATH-500 [49] (4-shot). •Coding: HumanEval [50] (5-shot), LiveCodeBench [51] (v6, 2023.05-2025.04). •Long-Context: RULER [52]. To ensure fair and reproducible comparisons, we adopt a standardized evaluation pipeline. Most benchmarks are executed with OpenCompass ","claim_type":"dataset","confidence":0.95,"evidence_strength":"citation_context"},{"claim_text":"pattern is becoming increasingly common as frontier models support context windows approaching one million tokens [ 1, 17]. As increasingly large codebases and documents fit into context, the bottleneck shifts from encoding information in the context window to reliably retrieving precise, multi- level information from it. Existing long-context evaluations, including Needle in a Haystack [11], RULER [9], LongBench [2], ∞Bench [25], and Graphwalks [15], primarily focus oncontent-based retrieval, a","claim_type":"background","confidence":0.9,"evidence_strength":"citation_context"}],"why_cited":"Pith tracks RULER: What's the Real Context Size of Your Long-Context Language Models? because it crossed a citation-hub threshold. Current citing contexts most often use it as dataset evidence (16 contexts).","role_counts":[{"n":16,"context_role":"dataset"},{"n":12,"context_role":"background"}]},"error":null,"updated_at":"2026-05-20T17:52:08.378177+00:00"},"summary_claims":{"job_type":"summary_claims","status":"succeeded","result":{"title":"RULER: What's the Real Context Size of Your Long-Context Language Models?","claims":[{"claim_text":"The needle-in-a-haystack (NIAH) test, which examines the ability to retrieve a piece of information (the \"needle\") from long distractor texts (the \"haystack\"), has been widely adopted to evaluate long-context language models (LMs). However, this simple retrieval-based test is indicative of only a superficial form of long-context understanding. To provide a more comprehensive evaluation of long-context LMs, we create a new synthetic benchmark RULER with flexible configurations for customized sequence length and task complexity. RULER expands upon the vanilla NIAH test to encompass variations wi","claim_type":"abstract","evidence_strength":"source_metadata"}],"why_cited":"Pith tracks RULER: What's the Real Context Size of Your Long-Context Language Models? because it crossed a citation-hub threshold.","role_counts":[]},"error":null,"updated_at":"2026-05-14T11:39:50.840063+00:00"}},"summary":{"title":"RULER: What's the Real Context Size of Your Long-Context Language Models?","claims":[{"claim_text":"The needle-in-a-haystack (NIAH) test, which examines the ability to retrieve a piece of information (the \"needle\") from long distractor texts (the \"haystack\"), has been widely adopted to evaluate long-context language models (LMs). However, this simple retrieval-based test is indicative of only a superficial form of long-context understanding. To provide a more comprehensive evaluation of long-context LMs, we create a new synthetic benchmark RULER with flexible configurations for customized sequence length and task complexity. RULER expands upon the vanilla NIAH test to encompass variations wi","claim_type":"abstract","evidence_strength":"source_metadata"}],"why_cited":"Pith tracks RULER: What's the Real Context Size of Your Long-Context Language Models? because it crossed a citation-hub threshold.","role_counts":[]},"graph":{"co_cited":[{"title":"Qwen3 Technical Report","work_id":"25a4e30c-1232-48e7-9925-02fa12ba7c9e","shared_citers":16},{"title":"The Llama 3 Herd of Models","work_id":"1549a635-88af-4ac1-acfe-51ae7bb53345","shared_citers":16},{"title":"Efficient Streaming Language Models with Attention Sinks","work_id":"a8d25452-c237-48c9-88a4-682717c3979a","shared_citers":14},{"title":"DeepSeek-V3 Technical Report","work_id":"57d2791d-2219-4c31-a077-afc04b12a75c","shared_citers":12},{"title":"Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge","work_id":"28ea1282-d657-4c61-a83c-f1249be6d6b1","shared_citers":12},{"title":"Mamba: Linear-Time Sequence Modeling with Selective State Spaces","work_id":"4ee75248-1199-492c-a52f-6661e0f4adff","shared_citers":11},{"title":"Training Verifiers to Solve Math Word Problems","work_id":"acab1aa8-b4d6-40e0-a3ee-25341701dca2","shared_citers":10},{"title":"YaRN: Efficient Context Window Extension of Large Language Models","work_id":"31f454a9-7de2-4696-86f2-9bfa1410f80d","shared_citers":10},{"title":"DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning","work_id":"e6b75ad5-2877-4168-97c8-710407094d20","shared_citers":9},{"title":"Retentive Network: A Successor to Transformer for Large Language Models","work_id":"5b0449ac-92b0-41f2-8b4f-586c2b5a08b6","shared_citers":9},{"title":"DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models","work_id":"c5006563-f3ec-438a-9e35-b7b484f34828","shared_citers":8},{"title":"FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning","work_id":"fff3953b-5efb-4753-bee4-002f59995810","shared_citers":8},{"title":"Longformer: The Long-Document Transformer","work_id":"abea7a44-6668-4de7-aab6-f53a6e5aa088","shared_citers":8},{"title":"Instruction-Following Evaluation for Large Language Models","work_id":"3aa06177-125a-4f5a-8f4a-8070c5986c26","shared_citers":7},{"title":"PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling","work_id":"6317700d-f903-4ce1-8f53-b43cb146d48b","shared_citers":7},{"title":"Qwen2.5 Technical Report","work_id":"d8432992-4980-4a81-85c7-9fa2c2b87f85","shared_citers":7},{"title":"Evaluating Large Language Models Trained on Code","work_id":"042493e9-b26f-4b4e-bbde-382072ca9b08","shared_citers":6},{"title":"Generating Long Sequences with Sparse Transformers","work_id":"c5b81688-45ee-4a9a-b095-e6290f45cb6c","shared_citers":6},{"title":"gpt-oss-120b & gpt-oss-20b Model Card","work_id":"178c1f7e-4f19-4392-a45d-45a6dfa88ead","shared_citers":6},{"title":"Kimi Linear: An Expressive, Efficient Attention Architecture","work_id":"b2f9f1cd-c39c-4dbc-8637-f575681cdc01","shared_citers":6},{"title":"LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code","work_id":"ea9e51ce-1e75-4182-92d8-4d25f70d2ee4","shared_citers":6},{"title":"LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding","work_id":"ba7831c4-9427-4e0e-a5c1-4e98511f4b53","shared_citers":6},{"title":"Mistral 7B","work_id":"eb5e1305-ad11-4875-ad8d-ad8b8f697599","shared_citers":6},{"title":"A-MEM: Agentic Memory for LLM Agents","work_id":"3b98feb2-fdb1-479a-bbe4-2c298a4592e2","shared_citers":5}],"time_series":[{"n":4,"year":2025},{"n":54,"year":2026}],"dependency_candidates":[]},"authors":[{"id":"ac6ee87a-e473-4edf-8093-eb3213732287","orcid":null,"display_name":"Cheng-Ping Hsieh","source":"manual","import_confidence":0.72},{"id":"5e329f55-0af3-4abe-a369-2ac78bb83de4","orcid":null,"display_name":"Dima Rekesh","source":"manual","import_confidence":0.72},{"id":"1f9ff70a-08d1-4f4e-aed6-96e522541b7d","orcid":null,"display_name":"Fei Jia","source":"manual","import_confidence":0.72},{"id":"59f84804-a1b4-4be1-bee3-e329cf2e7e59","orcid":null,"display_name":"Samuel Kriman","source":"manual","import_confidence":0.72},{"id":"e76f6972-4678-4f3b-95e3-f025317a3d20","orcid":null,"display_name":"Shantanu Acharya","source":"manual","import_confidence":0.72},{"id":"9ce33d49-9220-4e01-a424-4a854e156066","orcid":null,"display_name":"Simeng Sun","source":"manual","import_confidence":0.72}]}}