pith. machine review for the scientific record. sign in

arxiv: 2604.08242 · v1 · submitted 2026-04-09 · 💻 cs.DC

Recognition: no theorem link

Scheduling Coflows in Multi-Core OCS Networks with Performance Guarantee

Authors on Pith no claims yet

Pith reviewed 2026-05-10 17:06 UTC · model grok-4.3

classification 💻 cs.DC
keywords coflow schedulingmulti-core OCSapproximation algorithmcoflow completion timeoptical circuit switchingdata center networksperformance guaranteecross-core assignment
0
0 comments X

The pith

An approximation algorithm jointly assigns flows across multiple OCS cores and schedules circuits within each core to minimize total weighted coflow completion time with a performance guarantee.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops an approximation algorithm for coflow scheduling in data center networks built from several independent optical circuit switching cores that operate concurrently. It combines decisions on routing traffic across cores with the scheduling of circuits on each individual core, while respecting port exclusivity and reconfiguration delays under a not-all-stop model. A reader would care because coflows represent the communication patterns of parallel jobs, and poor scheduling in high-bandwidth multi-core fabrics can inflate job completion times. If the algorithm succeeds, it delivers a bounded approximation to the optimal weighted completion time and extends directly to multi-core electrical packet switching networks. Trace-driven simulations on Facebook workloads show reductions in both average and tail weighted completion times.

Core claim

The paper claims that jointly integrating cross-core flow assignment and per-core circuit scheduling under the not-all-stop reconfiguration model yields an approximation algorithm that minimizes the total weighted coflow completion time while establishing a provable worst-case performance guarantee.

What carries the argument

The joint optimization that pairs cross-core traffic assignment with per-core circuit scheduling to handle coupling, port exclusivity, and reconfiguration delays.

Load-bearing premise

Cross-core traffic assignment can be decided without violating per-core port exclusivity or introducing unmodeled heterogeneity among cores.

What would settle it

A coflow instance where the algorithm returns a weighted completion time that exceeds the claimed approximation ratio times the optimal value.

Figures

Figures reproduced from arXiv: 2604.08242 by Dong Wang, Hong Shen, Hui Tian, Xin Wang.

Figure 1
Figure 1. Figure 1: Coflow Abstraction in MapReduce model, where the data center network (DCN) is abstracted as a single non-blocking switch fabric with full bisection bandwidth, which simplifies the characterization of port band￾width constraints and algorithm design. However, with the continuous growth of data center communication scale, the EPS architecture is gradually showing pressure in terms of bandwidth expansion and … view at source ↗
Figure 2
Figure 2. Figure 2: Heterogeneous Multi-Core DCN Architecture [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Circuit Reconfiguration Models in an OCS Core [PITH_FULL_IMAGE:figures/full_fig_p004_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Normalized total weighted CCT and tail CCT [PITH_FULL_IMAGE:figures/full_fig_p010_4.png] view at source ↗
Figure 7
Figure 7. Figure 7: Normalized total weighted CCT versus reconfiguration delay δ for K = 5. TABLE III: Normalized total weighted CCT versus number of ports N for K=3. N RH-AS RA-AS SU-CO RA-SU Imbalanced rates: [10, 20, 30] 8 1.380 1.246 2.951 3.637 12 1.503 1.255 2.749 3.224 16 1.640 1.322 2.647 3.179 24 1.657 1.341 2.245 2.768 32 1.524 1.370 2.008 2.539 Balanced rates: [20, 20, 20] 8 1.134 1.049 3.060 3.167 12 1.115 1.044 2… view at source ↗
Figure 5
Figure 5. Figure 5: Normalized total weighted CCT versus reconfiguration delay δ for K = 3. • K = 4 ( [PITH_FULL_IMAGE:figures/full_fig_p011_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Normalized total weighted CCT versus reconfiguration delay δ for K = 4. • K = 5 ( [PITH_FULL_IMAGE:figures/full_fig_p011_6.png] view at source ↗
Figure 8
Figure 8. Figure 8: Normalized total weighted CCT versus number of [PITH_FULL_IMAGE:figures/full_fig_p011_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Normalized total weighted CCT versus number of [PITH_FULL_IMAGE:figures/full_fig_p012_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: Normalized total weighted CCT versus number of [PITH_FULL_IMAGE:figures/full_fig_p012_10.png] view at source ↗
read the original abstract

Coflow provides a key application-layer abstraction for capturing communication patterns, enabling the efficient coordination of parallel data flows to reduce job completion times in distributed systems. Modern data center networks (DCNs) are employing multiple independent optical circuit switching (OCS) cores operating concurrently to meet the massive bandwidth demands of application jobs. However, existing coflow scheduling research primarily focuses on the single-core setting, with multi-core fabrics only for EPS (electrical packet switching) networks. To address this gap, this paper studies the coflow scheduling problem in multi-core OCS networks under the not-all-stop reconfiguration model in which one circuit's reconfiguration does not interrupt other circuits. The challenges stem from two aspects: (i) cross-core coupling induced by traffic assignment across heterogeneous cores; and (ii) per-core OCS scheduling constraints, namely port exclusivity and reconfiguration delay. We propose an approximation algorithm that jointly integrates cross-core flow assignment and per-core circuit scheduling to minimize the total weighted coflow completion time (CCT) and establish a provable worst-case performance guarantee. Furthermore, our algorithm framework can be directly applied to the multi-core EPS scenario with the corresponding approximation ratio under packet-switched fabrics. Trace-driven simulations using real Facebook workloads demonstrate that our algorithm effectively reduces weighted CCT and tail CCT.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper studies coflow scheduling in multi-core OCS networks under the not-all-stop reconfiguration model. It proposes an approximation algorithm that jointly performs cross-core flow assignment and per-core circuit scheduling to minimize total weighted coflow completion time (CCT), claims a provable worst-case performance guarantee, shows the framework extends to multi-core EPS networks, and reports trace-driven improvements on Facebook workloads for both weighted and tail CCT.

Significance. If the approximation ratio is rigorously established while respecting per-core port exclusivity and the not-all-stop property, the result would meaningfully extend single-core coflow scheduling theory to realistic multi-core OCS fabrics, offering both a theoretical bound and practical simulation evidence.

major comments (3)
  1. [Algorithm and proof sections (around the joint integration and Theorem on approximation ratio)] The central claim requires that the joint algorithm assigns traffic across cores without violating any core's port-exclusivity constraint while preserving the not-all-stop property. The abstract identifies these as the two sources of difficulty, yet the guarantee is asserted to hold after integration; the proof must explicitly show that the assignment subroutine never produces infeasible per-core schedules (otherwise the ratio derivation collapses).
  2. [Proof of the performance guarantee] Under the not-all-stop model, one circuit's reconfiguration must not stall others, but cross-core assignment can couple the schedules. The derivation of the worst-case ratio needs to bound the interaction between assignment decisions and per-core reconfiguration delays; without an explicit accounting for this coupling, the claimed guarantee may only hold under additional unstated assumptions.
  3. [Extension to EPS networks] The extension to multi-core EPS is stated to carry the same framework with the corresponding ratio, but EPS lacks reconfiguration delay; the paper must derive or adjust the ratio for the packet-switched case rather than claiming direct applicability.
minor comments (2)
  1. [Abstract] The abstract asserts a 'provable worst-case performance guarantee' but does not state the numerical ratio; including the ratio (e.g., O(log n) or similar) would immediately clarify the contribution.
  2. [Trace-driven simulations] Simulation results mention reductions in weighted and tail CCT but provide limited detail on the number of cores, degree of heterogeneity, or exact baseline algorithms; adding these would strengthen the empirical section.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major comment point by point below, clarifying the existing analysis where possible and indicating revisions to improve explicitness and rigor.

read point-by-point responses
  1. Referee: [Algorithm and proof sections (around the joint integration and Theorem on approximation ratio)] The central claim requires that the joint algorithm assigns traffic across cores without violating any core's port-exclusivity constraint while preserving the not-all-stop property. The abstract identifies these as the two sources of difficulty, yet the guarantee is asserted to hold after integration; the proof must explicitly show that the assignment subroutine never produces infeasible per-core schedules (otherwise the ratio derivation collapses).

    Authors: The cross-core assignment phase allocates flows to cores while respecting the per-core port capacity constraints (i.e., the sum of assigned traffic to any port on a given core does not exceed its capacity). Consequently, each core's resulting traffic matrix is feasible by construction for the subsequent per-core scheduling step, which then enforces port exclusivity and the not-all-stop reconfiguration model independently per core. The not-all-stop property holds globally because cores operate concurrently without mutual stalling. To eliminate any potential ambiguity in the integration, we will add an explicit feasibility lemma immediately preceding the main approximation theorem in the revised manuscript. revision: yes

  2. Referee: [Proof of the performance guarantee] Under the not-all-stop model, one circuit's reconfiguration must not stall others, but cross-core assignment can couple the schedules. The derivation of the worst-case ratio needs to bound the interaction between assignment decisions and per-core reconfiguration delays; without an explicit accounting for this coupling, the claimed guarantee may only hold under additional unstated assumptions.

    Authors: The existing proof bounds the coupling by first obtaining an optimal solution to a relaxed LP formulation that captures aggregate cross-core assignment, then analyzing the additive delay from per-core reconfiguration in the worst case. The interaction term is controlled by the maximum number of cores and the bounded reconfiguration time, yielding the overall approximation ratio. We agree that this bounding step can be stated more transparently. In the revision we will insert a dedicated intermediate lemma that isolates and bounds the coupling between assignment choices and reconfiguration delays, confirming that no additional assumptions are required. revision: yes

  3. Referee: [Extension to EPS networks] The extension to multi-core EPS is stated to carry the same framework with the corresponding ratio, but EPS lacks reconfiguration delay; the paper must derive or adjust the ratio for the packet-switched case rather than claiming direct applicability.

    Authors: When the framework is instantiated for multi-core EPS networks, the per-core scheduling subproblem reduces to standard coflow scheduling without reconfiguration delays. The approximation ratio therefore remains valid and is in fact at most as large as the OCS ratio (the delay terms vanish). We will revise the extension section to explicitly derive the EPS-specific ratio from the same LP relaxation and scheduling steps, showing the direct applicability with the adjusted (non-delay) analysis. revision: yes

Circularity Check

0 steps flagged

No circularity; approximation guarantee derived independently from algorithm construction

full rationale

The abstract describes proposing a joint approximation algorithm for cross-core assignment and per-core scheduling, then establishing a worst-case performance guarantee. No equations, fitted parameters, self-citations, or ansatzes are referenced that would make the guarantee equivalent to its inputs by construction. The derivation chain (algorithm design to ratio proof) is presented as standard theoretical analysis under the stated model assumptions, with no visible self-definitional loops or renamed empirical patterns. This is the expected non-finding for a pure algorithmic paper with an explicit proof claim.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the standard coflow abstraction, the not-all-stop OCS reconfiguration model, and the existence of heterogeneous cores; no free parameters or new entities are introduced in the abstract description.

axioms (2)
  • domain assumption Not-all-stop reconfiguration model in which one circuit's reconfiguration does not interrupt other circuits.
    Explicitly stated as the operating model for the multi-core OCS fabric.
  • domain assumption Port exclusivity and reconfiguration delay are the only per-core OCS scheduling constraints.
    Listed as the per-core constraints the algorithm must respect.

pith-pipeline@v0.9.0 · 5526 in / 1309 out tokens · 41677 ms · 2026-05-10T17:06:00.451059+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. O(K)-Approximation Coflow Scheduling in K-Core Optical Circuit Switching Networks

    cs.DC 2026-04 unverdicted novelty 6.0

    An LP-guided algorithm achieves 8K-approximation for minimizing weighted coflow completion time in K-core OCS networks under asynchronous reconfiguration.

Reference graph

Works this paper leans on

37 extracted references · cited by 1 Pith paper

  1. [1]

    Mapreduce: simplified data processing on large clusters,

    J. Dean and S. Ghemawat, “Mapreduce: simplified data processing on large clusters,”Communications of the ACM, vol. 51, no. 1, pp. 107–113, 2008

  2. [2]

    Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing,

    M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. McCauly, M. J. Franklin, S. Shenker, and I. Stoica, “Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing,” in9th {USENIX}Symposium on Networked Systems Design and Implementa- tion ({NSDI}12), 2012, pp. 15–28

  3. [3]

    Dryad: distributed data-parallel programs from sequential building blocks,

    M. Isard, M. Budiu, Y . Yu, A. Birrell, and D. Fetterly, “Dryad: distributed data-parallel programs from sequential building blocks,” in Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007, 2007, pp. 59–72

  4. [4]

    Coflow: A networking abstraction for cluster applications,

    M. Chowdhury and I. Stoica, “Coflow: A networking abstraction for cluster applications,” inProceedings of the 11th ACM Workshop on Hot Topics in Networks, 2012, pp. 31–36

  5. [5]

    Manag- ing data transfers in computer clusters with orchestra,

    M. Chowdhury, M. Zaharia, J. Ma, M. I. Jordan, and I. Stoica, “Manag- ing data transfers in computer clusters with orchestra,”ACM SIGCOMM Computer Communication Review, vol. 41, no. 4, pp. 98–109, 2011

  6. [6]

    Efficient coflow scheduling with varys,

    M. Chowdhury, Y . Zhong, and I. Stoica, “Efficient coflow scheduling with varys,” inProceedings of the 2014 ACM conference on SIGCOMM, 2014, pp. 443–454

  7. [7]

    Decentralized task-aware scheduling for data center networks,

    F. R. Dogar, T. Karagiannis, H. Ballani, and A. Rowstron, “Decentralized task-aware scheduling for data center networks,”ACM SIGCOMM Computer Communication Review, vol. 44, no. 4, pp. 431–442, 2014

  8. [8]

    Minimizing average coflow completion time with decentralized scheduling,

    S. Luo, H. Yu, Y . Zhao, B. Wu, S. Wanget al., “Minimizing average coflow completion time with decentralized scheduling,” in2015 IEEE International Conference on Communications (ICC). IEEE, 2015, pp. 307–312

  9. [9]

    Efficient coflow scheduling without prior knowledge,

    M. Chowdhury and I. Stoica, “Efficient coflow scheduling without prior knowledge,”ACM SIGCOMM Computer Communication Review, vol. 45, no. 4, pp. 393–406, 2015

  10. [10]

    Coda: Toward automatically identifying and scheduling coflows in the dark,

    H. Zhang, L. Chen, B. Yi, K. Chen, M. Chowdhury, and Y . Geng, “Coda: Toward automatically identifying and scheduling coflows in the dark,” in Proceedings of the 2016 ACM SIGCOMM Conference, 2016, pp. 160– 173

  11. [11]

    Online scheduling of coflows by attention- empowered scalable deep reinforcement learning,

    X. Wang and H. Shen, “Online scheduling of coflows by attention- empowered scalable deep reinforcement learning,”Future Generation Computer Systems, vol. 146, pp. 195–206, 2023

  12. [12]

    Efficient and fair: Information-agnostic online coflow scheduling by combining limited multiplexing with drl,

    X. Wang, H. Shen, and H. Tian, “Efficient and fair: Information-agnostic online coflow scheduling by combining limited multiplexing with drl,” IEEE Transactions on Network and Service Management, vol. 20, no. 4, pp. 4572–4584, 2023

  13. [13]

    Minimizing the total weighted com- pletion time of coflows in datacenter networks,

    Z. Qiu, C. Stein, and Y . Zhong, “Minimizing the total weighted com- pletion time of coflows in datacenter networks,” inProceedings of the 27th ACM symposium on Parallelism in Algorithms and Architectures, 2015, pp. 294–303

  14. [14]

    An improved bound for minimizing the total weighted completion time of coflows in datacenters,

    M. Shafiee and J. Ghaderi, “An improved bound for minimizing the total weighted completion time of coflows in datacenters,”IEEE/ACM Transactions on Networking, vol. 26, no. 4, pp. 1674–1687, 2018

  15. [15]

    Efficient scheduling of weighted coflows in data centers,

    Z. Wang, H. Zhang, X. Shi, X. Yin, Y . Li, H. Geng, Q. Wu, and J. Liu, “Efficient scheduling of weighted coflows in data centers,”IEEE Transactions on Parallel and Distributed Systems, vol. 30, no. 9, pp. 2003–2017, 2019

  16. [16]

    Omco: Online multiple coflow scheduling in optical circuit switch,

    C. Xu, H. Tan, J. Hou, C. Zhang, and X.-Y . Li, “Omco: Online multiple coflow scheduling in optical circuit switch,” in2018 IEEE International Conference on Communications (ICC). IEEE, 2018, pp. 1–6

  17. [17]

    Reco: Efficient regularization-based coflow scheduling in optical circuit switches,

    C. Zhang, H. Tan, C. Xu, X.-Y . Li, S. Tang, and Y . Li, “Reco: Efficient regularization-based coflow scheduling in optical circuit switches,” in 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS). IEEE, 2019, pp. 111–121

  18. [18]

    Regularization- based coflow scheduling in optical circuit switches,

    H. Tan, C. Zhang, C. Xu, Y . Li, Z. Han, and X.-Y . Li, “Regularization- based coflow scheduling in optical circuit switches,”IEEE/ACM Trans- actions on Networking, vol. 29, no. 3, pp. 1280–1293, 2021

  19. [19]

    Sunflow: Efficient optical circuit scheduling for coflows,

    X. S. Huang, X. S. Sun, and T. E. Ng, “Sunflow: Efficient optical circuit scheduling for coflows,” inProceedings of the 12th International on Conference on emerging Networking EXperiments and Technologies, 2016, pp. 297–311

  20. [20]

    Minimizing coflow completion time in optical circuit switched networks,

    T. Zhang, F. Ren, J. Bao, R. Shu, and W. Cheng, “Minimizing coflow completion time in optical circuit switched networks,”IEEE Transac- tions on Parallel and Distributed Systems, vol. 32, no. 2, pp. 457–469, 2020

  21. [21]

    Scheduling coflows in hybrid optical- circuit and electrical-packet switches with performance guarantee,

    X. Wang, H. Shen, and H. Tian, “Scheduling coflows in hybrid optical- circuit and electrical-packet switches with performance guarantee,” IEEE/ACM Transactions on Networking, vol. 32, no. 3, pp. 2299–2314, 2024

  22. [22]

    Co-scheduler: Accelerating data-parallel jobs in datacenter networks with optical circuit switching,

    Z. Li and H. Shen, “Co-scheduler: Accelerating data-parallel jobs in datacenter networks with optical circuit switching,” in2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS). IEEE, 2019, pp. 186–195

  23. [23]

    Optimal partitioning of traffic demand for coflow scheduling in hybrid switches,

    X. Wang, H. Shen, and H. Tian, “Optimal partitioning of traffic demand for coflow scheduling in hybrid switches,”IEEE Transactions on Network and Service Management, 2025

  24. [24]

    Effective coflow scheduling in hybrid circuit and packet switching networks,

    R. Jiang, T. Zhang, and C. Yi, “Effective coflow scheduling in hybrid circuit and packet switching networks,” in2023 IEEE Symposium on Computers and Communications (ISCC). IEEE, 2023, pp. 1156–1161

  25. [25]

    The future is 40 gigabit ethernet,

    Cisco White Paper. (2016), “The future is 40 gigabit ethernet,” https://www.cisco.com/c/dam/en/us/products/collateral/switches/catalyst- 6500-series-switches/white-paper-c11-737238.pdf

  26. [26]

    Cisco global cloud in- dex: Forecast and methodology, 2015–2020,

    Cisco. (2016), “Cisco global cloud in- dex: Forecast and methodology, 2015–2020,” https://www.cisco.com/c/dam/en/us/solutions/collateral/service- provider/global-cloud-index-gci/white-paper-c11-738085.pdf

  27. [27]

    Weaver: Efficient coflow scheduling in heterogeneous parallel networks,

    X. S. Huang, Y . Xia, and T. E. Ng, “Weaver: Efficient coflow scheduling in heterogeneous parallel networks,” in2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS). IEEE, 2020, pp. 1071–1081

  28. [28]

    Efficient approximation algorithms for scheduling coflows with total weighted completion time in identical parallel networks,

    C.-Y . Chen, “Efficient approximation algorithms for scheduling coflows with total weighted completion time in identical parallel networks,” IEEE Transactions on Cloud Computing, vol. 12, no. 1, pp. 116–129, 2023

  29. [29]

    Jupiter evolving: transforming google’s datacenter network via optical circuit switches and software-defined networking,

    L. Poutievski, O. Mashayekhi, J. Ong, A. Singh, M. Tariq, R. Wang, J. Zhang, V . Beauregard, P. Conner, S. Gribbleet al., “Jupiter evolving: transforming google’s datacenter network via optical circuit switches and software-defined networking,” inProceedings of the ACM SIGCOMM 2022 Conference, 2022, pp. 66–85

  30. [30]

    Brief announcement: Improved approxi- mation algorithms for scheduling co-flows,

    S. Khuller and M. Purohit, “Brief announcement: Improved approxi- mation algorithms for scheduling co-flows,” inProceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures, 2016, pp. 239–240

  31. [31]

    Matroid coflow schedul- ing

    S. Im, B. Moseley, K. Pruhs, and M. Purohit, “Matroid coflow schedul- ing.” inICALP, 2019, pp. 1–13

  32. [32]

    Scheduling coflows for minimizing the total weighted completion time in heterogeneous parallel networks,

    C.-Y . Chen, “Scheduling coflows for minimizing the total weighted completion time in heterogeneous parallel networks,”Journal of Parallel and Distributed Computing, vol. 182, p. 104752, 2023

  33. [33]

    Rapier: Integrating routing and scheduling for coflow- aware data center networks,

    Y . Zhao, K. Chen, W. Bai, M. Yu, C. Tian, Y . Geng, Y . Zhang, D. Li, and S. Wang, “Rapier: Integrating routing and scheduling for coflow- aware data center networks,” in2015 IEEE Conference on Computer Communications (INFOCOM). IEEE, 2015, pp. 424–432

  34. [34]

    Scheduling coflows with dependency graph,

    M. Shafiee and J. Ghaderi, “Scheduling coflows with dependency graph,” IEEE/ACM Transactions on Networking, vol. 30, no. 1, pp. 450–463, 2021

  35. [35]

    Tres observaciones sobre el algebra lineal,

    G. Birkhoff, “Tres observaciones sobre el algebra lineal,”Univ. Nac. Tucuman, Ser. A, vol. 5, pp. 147–154, 1946

  36. [36]

    Open shop scheduling to minimize finish time,

    T. Gonzalez and S. Sahni, “Open shop scheduling to minimize finish time,”Journal of the ACM (JACM), vol. 23, no. 4, pp. 665–679, 1976

  37. [37]

    Facebooktrace,

    “Facebooktrace,” https://github.com/coflow/coflow-benchmark, 2019