pith. machine review for the scientific record.
Pith Number

pith:FCIVDTSE

pith:2025:FCIVDTSE6VC5MSH75IS7VYCXE2
not attested · not anchored · not stored · refs resolved

Lit Silicon: A Case Where Thermal Imbalance Couples Concurrent Execution in Multiple GPUs

Di Wu, Marco Kurzynski, Shaizeen Aga

Thermal imbalance across GPUs introduces stragglers that slow down the whole system when computation and communication run concurrently.

arxiv:2511.09861 v3 · 2025-11-13 · cs.DC · cs.AR

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

Thermally induced straggling coupled with concurrent computation and communication (C3) impacts performance variation, which we coin the Lit Silicon effect. More specifically, Lit Silicon describes that in a multi-GPU node, thermal imbalance across GPUs can introduce node-level straggler GPUs (hotter and slower), which in turn slow down the leader GPUs (cooler and faster).

C2 · weakest assumption

That the observed kernel-level performance variation is primarily caused by thermal imbalance interacting with C3 rather than other factors such as workload imbalance, interconnect variability, or unmeasured hardware differences.

C3 · one-line summary

Thermal imbalance in multi-GPU nodes creates hotter straggler GPUs that slow down cooler leader GPUs during overlapped computation and communication in LLM training.

References

56 extracted · 56 resolved · 7 Pith anchors

[1] A High-Performance Matrix-Multiplication Algorithm on a Distributed-Memory Parallel Computer, Using Overlapped Communication, 1994
[2] ConCCL: Optimizing ML Concurrent Computation and Communication with GPU DMA Engines, 2025
[3] Accelerating SQL database operations on a GPU with CUDA, 2010
[4] Language Models are Few-Shot Learners, 2020 · arXiv:2005.14165
[5] GPU Database Systems Characterization and Optimization, 2023
Receipt and verification
First computed 2026-05-18T03:09:33.292027Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

289151ce44f545d648ffea25fae057268f810a911bb84d17410b4f29641fd649

Aliases

arxiv: 2511.09861 · arxiv_version: 2511.09861v3 · doi: 10.48550/arxiv.2511.09861 · pith_short_12: FCIVDTSE6VC5 · pith_short_16: FCIVDTSE6VC5MSH7 · pith_short_8: FCIVDTSE
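The short aliases above are prefixes of the full identifier FCIVDTSE6VC5MSH75IS7VYCXE2. For the values on this page, that identifier also equals the first 26 characters of the RFC 4648 base32 encoding of the canonical hash. The Python sketch below checks both relationships. Treat the base32 link as an observation about this record, not a documented rule of the pith-number/v1.0 schema; the helper names `base32_prefix` and `aliases_are_prefixes` are hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "289151ce44f545d648ffea25fae057268f810a911bb84d17410b4f29641fd649"
PITH_ID = "FCIVDTSE6VC5MSH75IS7VYCXE2"
ALIASES = {
    "pith_short_8": "FCIVDTSE",
    "pith_short_12": "FCIVDTSE6VC5",
    "pith_short_16": "FCIVDTSE6VC5MSH7",
}


def base32_prefix(hex_digest: str, length: int) -> str:
    """First `length` characters of the RFC 4648 base32 encoding of a hex digest."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def aliases_are_prefixes(pith_id: str, aliases: dict) -> bool:
    """Check that each pith_short_N alias is the first N characters of the ID."""
    for name, value in aliases.items():
        expected_len = int(name.rsplit("_", 1)[1])  # "pith_short_12" -> 12
        if len(value) != expected_len or not pith_id.startswith(value):
            return False
    return True


# Assumption: the ID is derived from the hash this way (observed, not specified).
print(base32_prefix(CANONICAL_HASH, len(PITH_ID)) == PITH_ID)  # True
print(aliases_are_prefixes(PITH_ID, ALIASES))                  # True
```

If the relationship holds in general, a mirror could recompute the identifier from the canonical hash alone, but that should be confirmed against the schema before relying on it.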
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/FCIVDTSE6VC5MSH75IS7VYCXE2 \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 289151ce44f545d648ffea25fae057268f810a911bb84d17410b4f29641fd649
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "8853a9e3d2ecf7d02d6fcd4690f20a373655e8865728ec3109d1dcaa20dc187c",
    "cross_cats_sorted": [
      "cs.AR"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.DC",
    "submitted_at": "2025-11-13T01:41:47Z",
    "title_canon_sha256": "3f2f23862e808398126cf576b91c3a515d18a5f7fc16dbc02343cb89c5f5ba21"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2511.09861",
    "kind": "arxiv",
    "version": 3
  }
}
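
The verification command above can also be run offline against the canonical record JSON shown here. The sketch below applies the same canonicalization as the one-liner (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 encoding) and compares the result with the canonical hash. It assumes the JSON displayed on this page is the complete `canonical_record` returned by the API; the function name `canonical_sha256` is hypothetical.

```python
import hashlib
import json

# Canonical hash as listed on this page.
EXPECTED = "289151ce44f545d648ffea25fae057268f810a911bb84d17410b4f29641fd649"

# Canonical record copied from this page (assumed to be complete).
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "8853a9e3d2ecf7d02d6fcd4690f20a373655e8865728ec3109d1dcaa20dc187c",
    "cross_cats_sorted": ["cs.AR"],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.DC",
    "submitted_at": "2025-11-13T01:41:47Z",
    "title_canon_sha256": "3f2f23862e808398126cf576b91c3a515d18a5f7fc16dbc02343cb89c5f5ba21"
  },
  "schema_version": "1.0",
  "source": {"id": "2511.09861", "kind": "arxiv", "version": 3}
}
"""


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and no whitespace, then hash the UTF-8 bytes."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


record = json.loads(RECORD_JSON)
digest = canonical_sha256(record)
print(digest)
print("matches canonical hash:", digest == EXPECTED)
```

Because the keys are sorted and whitespace is removed before hashing, the digest does not depend on how the JSON is indented or ordered in transit, which is what lets a mirror recompute it independently.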