FiLMMeD: Feature-wise Linear Modulation for Cross-Problem Multi-Depot Vehicle Routing
Pith reviewed 2026-05-07 07:25 UTC · model grok-4.3
The pith
Feature-wise linear modulation lets one neural model solve 24 multi-depot vehicle routing variants without retraining.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We propose FiLMMeD, a unified neural model for 24 MDVRP variants that augments the Transformer encoder with Feature-wise Linear Modulation to dynamically condition learned internal representations on the active set of constraints. The work also demonstrates preference optimization as a superior alternative to reinforcement learning in the multi-task setting and introduces targeted curriculum learning to mitigate the generalization gap from multi-depot constraints. Extensive experiments confirm that FiLMMeD consistently outperforms state-of-the-art baselines on the 24 variants, including eight novel formulations, as well as on 16 single-depot VRPs.
What carries the argument
Feature-wise Linear Modulation (FiLM) layers that scale and shift the features inside the Transformer encoder according to a conditioning vector encoding the active problem constraints.
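The scale-and-shift operation described above can be illustrated with a minimal sketch in plain Python. It assumes the usual FiLM form, out = gamma(c) * h + beta(c), with gamma and beta produced by two affine maps of the conditioning vector. The class name `FiLM`, the helper `linear`, the initialization, and all dimensions are illustrative assumptions, not the authors' implementation, and where exactly the layer sits inside the encoder is not specified in this summary.

```python
import random


def linear(weights, bias, x):
    """Affine map W x + b, with W stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


class FiLM:
    """Feature-wise linear modulation: out = gamma(c) * h + beta(c)."""

    def __init__(self, cond_dim, feat_dim, seed=0):
        rng = random.Random(seed)

        def small_matrix():
            return [[rng.gauss(0.0, 0.1) for _ in range(cond_dim)]
                    for _ in range(feat_dim)]

        self.w_gamma = small_matrix()
        self.w_beta = small_matrix()
        # Assumed initialization: gamma bias 1 and beta bias 0, so an
        # all-zero conditioning vector leaves the features unchanged.
        self.b_gamma = [1.0] * feat_dim
        self.b_beta = [0.0] * feat_dim

    def __call__(self, node_features, cond):
        gamma = linear(self.w_gamma, self.b_gamma, cond)
        beta = linear(self.w_beta, self.b_beta, cond)
        # One (gamma, beta) pair per instance modulates every node embedding.
        return [[g * h + b for g, h, b in zip(gamma, node, beta)]
                for node in node_features]


film = FiLM(cond_dim=4, feat_dim=3)
embeddings = [[0.5, -1.0, 2.0], [1.5, 0.0, -0.5]]  # two nodes, three features
modulated = film(embeddings, [1.0, 0.0, 0.0, 1.0])
```

Because the conditioning enters only through gamma and beta, the same weights serve every variant and only the conditioning vector changes between problems.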
If this is right
- A single trained network can solve any of the 24 MDVRP variants by receiving the appropriate constraint encoding at inference time.
- Preference optimization outperforms reinforcement learning when training the model across multiple problem variants simultaneously.
- The curriculum strategy reduces the performance drop that normally occurs when multi-depot constraints are added.
- The same architecture also delivers strong results on single-depot vehicle routing problems without modification.
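As a rough illustration of the first point in the list above, selecting a variant at inference time could amount to building a multi-hot vector over the active constraints. The constraint names below are placeholders borrowed from the wider multi-task VRP literature; this summary does not list the paper's actual constraints, and four binary flags do not reproduce its 24 variants.

```python
# Hypothetical constraint vocabulary; not taken from the paper.
CONSTRAINTS = ("open_route", "backhaul", "duration_limit", "time_windows")


def encode_variant(active):
    """Return a multi-hot conditioning vector for a set of active constraints."""
    unknown = set(active) - set(CONSTRAINTS)
    if unknown:
        raise ValueError(f"unknown constraints: {sorted(unknown)}")
    return [1.0 if name in active else 0.0 for name in CONSTRAINTS]


# Switching variants changes only this vector, not the model weights.
cond = encode_variant({"backhaul", "time_windows"})  # [0.0, 1.0, 0.0, 1.0]
```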
Where Pith is reading between the lines
- The same conditioning approach could be applied to other combinatorial optimization families that come in many constraint variants.
- Industry systems could switch between routing rules on the fly by updating only the conditioning input rather than deploying multiple models.
- Further work might test whether the method scales when the number of variants grows beyond 24 or when real-time data streams replace static instances.
- Because the modulation is feature-wise, the technique may transfer to other sequence-to-sequence architectures used in optimization.
Load-bearing premise
The combination of FiLM conditioning, targeted curriculum learning, and preference optimization will close the generalization gap introduced by multi-depot constraints sufficiently for the model to maintain strong performance across all 24 variants without requiring problem-specific retraining or architectural changes.
What would settle it
If FiLMMeD fails to outperform the baselines on one or more of the tested MDVRP variants, or if new constraint combinations require separate retraining or architecture changes, the claim of a truly unified solver would not hold.
Original abstract
Solving practical multi-depot vehicle routing problems (MDVRP) is a challenging optimization task central to modern logistics, increasingly driven by e-commerce. To address the MDVRP's computational complexity, neural-based combinatorial optimization methods offer a promising scalable alternative to traditional approaches. However, neural-based methods typically rely on rigid architectures and input encodings tailored to specific problem formulations. In real-world settings, heterogeneous constraints create multiple MDVRP variants, limiting the applicability of such models. While multi-task learning (MTL) has begun to accelerate the development of unified neural-based solvers, prior works focus almost exclusively on single-depot VRPs, leaving the MDVRP unaddressed. To bridge this gap, we propose Feature-wise Linear Modulation for Cross-Problem Multi-Depot Vehicle Routing (FiLMMeD), a novel unified neural-based model for 24 different MDVRP variants. We introduce three main contributions: (1) to improve the model's generalization, we augment the standard Transformer encoder with Feature-wise Linear Modulation (FiLM), which dynamically conditions learned internal representations based on the active set of constraints; (2) we provide an initial demonstration of Preference Optimization in the MTL setting, establishing it as a superior alternative to Reinforcement Learning for future MTL works; (3) to mitigate the generalization gap caused by the introduction of multi-depot constraints, we introduce a targeted curriculum learning strategy that progressively exposes the model to increasingly more complex constraint interactions. Extensive experiments on 24 MDVRP variants (including 8 novel formulations) and 16 single-depot VRPs confirm the effectiveness of FiLMMeD, which consistently outperforms state-of-the-art baselines. Our code is available at: https://github.com/AJ-Correa/FiLMMeD/tree/main
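The abstract describes the curriculum only as progressively exposing the model to more complex constraint interactions. One possible reading, sketched below, caps the number of simultaneously active constraints and raises that cap linearly over training. The constraint names, the linear schedule, and the use of "number of active constraints" as the complexity measure are all assumptions made for illustration; the sketch also leaves out the multi-depot aspect that the paper's strategy targets.

```python
import itertools
import random

# Hypothetical constraint vocabulary; not taken from the paper.
CONSTRAINTS = ("open_route", "backhaul", "duration_limit", "time_windows")


def max_active(epoch, total_epochs, n_constraints=len(CONSTRAINTS)):
    """Cap on simultaneously active constraints, rising linearly with training."""
    progress = min(1.0, epoch / max(1, total_epochs - 1))
    return round(progress * n_constraints)


def sample_variant(epoch, total_epochs, rng=random):
    """Sample a constraint subset no larger than the current cap."""
    cap = max_active(epoch, total_epochs)
    pool = [combo
            for k in range(cap + 1)
            for combo in itertools.combinations(CONSTRAINTS, k)]
    return set(rng.choice(pool))


# Early epochs see only simple variants; the last epoch can draw any subset.
schedule = [max_active(e, 5) for e in range(5)]  # [0, 1, 2, 3, 4]
```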
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes FiLMMeD, a unified Transformer-based neural solver for 24 MDVRP variants (including 8 novel formulations) and 16 single-depot VRPs. It augments the encoder with Feature-wise Linear Modulation (FiLM) layers to dynamically condition representations on the active constraint set, introduces a targeted curriculum learning strategy to progressively expose the model to multi-depot interactions, and demonstrates preference optimization as an alternative to reinforcement learning in the multi-task setting. Experiments claim that the single model consistently outperforms state-of-the-art baselines without per-variant retraining or architectural changes.
Significance. If the empirical results hold under rigorous controls, the work is significant for extending multi-task neural combinatorial optimization to heterogeneous MDVRP settings, where constraint diversity has previously limited unified solvers. The FiLM conditioning mechanism and curriculum strategy directly target the generalization gap from depot and constraint heterogeneity, while the preference optimization contribution offers a new direction for MTL in routing problems. Code release supports reproducibility and follow-up work.
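For readers unfamiliar with preference optimization as a replacement for a reinforcement-learning objective, the sketch below shows one common pairwise form, a Bradley-Terry loss on the log-probabilities of two solutions sampled for the same instance, with the cheaper solution treated as preferred. This summary does not state which loss the paper uses, so the formula, the function name `preference_loss`, and the `alpha` parameter are assumptions.

```python
import math


def preference_loss(logp_a, cost_a, logp_b, cost_b, alpha=1.0):
    """Pairwise Bradley-Terry loss for two sampled solutions of one instance.

    logp_*: log-probability of each solution under the policy.
    cost_*: route cost of each solution (lower is better).
    alpha:  hypothetical scale on the log-probability margin.
    """
    if cost_a == cost_b:
        return 0.0  # a tie carries no preference signal
    win, lose = (logp_a, logp_b) if cost_a < cost_b else (logp_b, logp_a)
    margin = alpha * (win - lose)
    # -log(sigmoid(margin)), computed as softplus(-margin) for stability.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
```

The loss depends only on which solution is cheaper, not on the size of the cost difference. That independence from reward scale is one commonly cited motivation for preference-based training when variants have very different cost ranges.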
major comments (2)
- [§4 Experiments] The central claim that FiLMMeD 'consistently outperforms state-of-the-art baselines' across all 24 MDVRP variants rests on the reported results, yet the section provides insufficient detail on how baselines were adapted or re-implemented for the 8 novel formulations, whether instance sets were held out identically, and whether statistical significance (e.g., paired t-tests or confidence intervals over multiple seeds) was assessed; without these, the cross-problem generalization advantage cannot be fully verified.
- [§3.2 FiLM Integration and §3.4 Curriculum] The weakest assumption, that FiLM conditioning plus curriculum closes the multi-depot generalization gap sufficiently for zero-shot transfer across variants, is load-bearing, but the manuscript lacks an ablation isolating the contribution of each component (e.g., FiLM-only vs. curriculum-only vs. both) on a held-out subset of the 24 variants; such an ablation is necessary to confirm the components are jointly responsible rather than one dominating.
minor comments (3)
- [Abstract and §1] The abstract and §1 Introduction should explicitly list or tabulate the 24 variants (e.g., which constraints are active in each) to make the heterogeneity concrete for readers.
- [§3.2] Notation for the constraint encoding vector fed to FiLM layers is introduced but not formalized with an equation; adding a short definition (e.g., as a one-hot or embedding of active constraints) would improve clarity.
- [§4] Table captions in the results section should include the number of instances per variant and the instance size distribution to allow direct comparison of difficulty.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and the recommendation for minor revision. We address each major comment below and will update the manuscript to incorporate the suggested clarifications and additional analyses.
- Referee: [§4 Experiments] The central claim that FiLMMeD 'consistently outperforms state-of-the-art baselines' across all 24 MDVRP variants rests on the reported results, yet the section provides insufficient detail on how baselines were adapted or re-implemented for the 8 novel formulations, whether instance sets were held out identically, and whether statistical significance (e.g., paired t-tests or confidence intervals over multiple seeds) was assessed; without these, the cross-problem generalization advantage cannot be fully verified.
  Authors: We agree that additional experimental details are needed for full verification. In the revised manuscript, we will expand §4 with: (i) explicit descriptions of how each baseline (including POMO, AM, and others) was adapted or re-implemented for the 8 novel MDVRP formulations, noting that only input encoding and constraint masking were modified while keeping the core architecture unchanged; (ii) confirmation that identical instance generation procedures, sizes, and train/test splits were used for all methods; and (iii) statistical significance results, including paired t-tests and 95% confidence intervals computed over 5 independent random seeds for all reported gaps. These additions will be placed in a new subsection on experimental protocol and will not alter the existing tables or claims.
  revision: yes
- Referee: [§3.2 FiLM Integration and §3.4 Curriculum] The weakest assumption, that FiLM conditioning plus curriculum closes the multi-depot generalization gap sufficiently for zero-shot transfer across variants, is load-bearing, but the manuscript lacks an ablation isolating the contribution of each component (e.g., FiLM-only vs. curriculum-only vs. both) on a held-out subset of the 24 variants; such an ablation is necessary to confirm the components are jointly responsible rather than one dominating.
  Authors: We acknowledge the value of isolating component contributions. In the revised version, we will add a dedicated ablation study (new subsection in §4) evaluating three model variants (FiLM only, curriculum only, and the full FiLMMeD) on a held-out subset of 6 MDVRP variants (3 seen during training, 3 unseen). Results will be reported as average optimality gaps with the same statistical controls as the main experiments. This will demonstrate that neither component alone suffices for the observed zero-shot transfer and that their combination is necessary, directly addressing the concern about the load-bearing assumption.
  revision: yes
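The statistical controls promised in the responses above (paired t-tests and 95% confidence intervals over 5 seeds) can be sketched with the standard library alone. The function name `paired_summary` and its return format are illustrative assumptions. The hard-coded critical value 2.776 is the two-sided 95% Student's t quantile for 4 degrees of freedom, so it is valid only for exactly 5 paired seeds.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t with 4 degrees of freedom (5 seeds).
T_CRIT_DF4 = 2.776


def paired_summary(gaps_model, gaps_baseline, t_crit=T_CRIT_DF4):
    """Paired t statistic and 95% CI for the mean per-seed gap difference.

    Assumes one optimality gap per seed for each method, in the same seed
    order, and that the per-seed differences are not all identical.
    """
    if len(gaps_model) != len(gaps_baseline):
        raise ValueError("need one gap per seed for both methods")
    diffs = [m - b for m, b in zip(gaps_model, gaps_baseline)]
    mean = statistics.mean(diffs)
    sem = statistics.stdev(diffs) / math.sqrt(len(diffs))  # standard error
    t_stat = mean / sem
    return {
        "mean_diff": mean,
        "t_stat": t_stat,
        "ci95": (mean - t_crit * sem, mean + t_crit * sem),
        "significant": abs(t_stat) > t_crit,
    }
```

A negative mean difference whose confidence interval excludes zero would indicate that the model's gap is reliably smaller than the baseline's across seeds. With more or fewer than 5 seeds, `t_crit` has to be replaced by the quantile for the matching degrees of freedom.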
Circularity Check
No significant circularity in derivation chain
The paper proposes an empirical neural architecture (FiLM-conditioned Transformer + curriculum + preference optimization) for 24 MDVRP variants and reports experimental outperformance. No mathematical derivations, closed-form predictions, or first-principles results appear in the provided text. Performance claims rest on training and evaluation across problem instances rather than any quantity defined in terms of itself or fitted parameters renamed as predictions. No self-citation chains, uniqueness theorems, or ansatzes are invoked to justify core components. The central claim (one model generalizes across variants) is therefore an empirical statement, not a definitional reduction.