Bagging the Network
Pith reviewed 2026-05-23 18:59 UTC · model grok-4.3
The pith
A joint method-of-moments estimator refined by one-step correction and split-network jackknife bagging produces asymptotically normal, unbiased homophily estimates that attain the Cramér-Rao bound in dyadic networks with fixed effects.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that combining a joint method-of-moments initial estimator, a Le Cam one-step refinement, and a split-network jackknife bagging step removes incidental parameter bias without inflating variance, delivering an asymptotically normal and unbiased homophily estimator that attains the Cramér-Rao lower bound for both transferable-utility and nontransferable-utility network models even when the log-likelihood is non-concave in the fixed effects.
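A minimal sketch of the Le Cam one-step idea on a toy problem, not the paper's network model. It uses the Cauchy location family because its log-likelihood is non-concave in the location parameter. The sample median stands in for the consistent initial estimator (the role played by the joint method-of-moments step), followed by a single scoring step with the expected information of 1/2 per observation. The function names `cauchy_sample` and `one_step` are hypothetical.

```python
import math
import random
import statistics


def cauchy_sample(theta, n, rng):
    """Draw n observations from Cauchy(theta, 1) via the inverse CDF."""
    return [theta + math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]


def one_step(xs, theta_init):
    """One scoring step from theta_init.

    Score of Cauchy(theta, 1): 2 (x - theta) / (1 + (x - theta)^2).
    Fisher information per observation: 1/2.
    """
    mean_score = statistics.fmean(
        2 * (x - theta_init) / (1 + (x - theta_init) ** 2) for x in xs
    )
    return theta_init + mean_score / 0.5


rng = random.Random(2)
theta0, n, reps = 1.0, 200, 1500
medians, one_steps = [], []
for _ in range(reps):
    xs = cauchy_sample(theta0, n, rng)
    med = statistics.median(xs)          # consistent initial estimator
    medians.append(med)
    one_steps.append(one_step(xs, med))  # single refinement step

var_median = statistics.variance(medians)
var_one_step = statistics.variance(one_steps)
crlb = 2.0 / n  # Cramér-Rao bound for the Cauchy location parameter
print(f"median: {var_median:.5f}  one-step: {var_one_step:.5f}  CRLB: {crlb:.5f}")
```

In this toy setting the one-step estimator should show a smaller Monte Carlo variance than the median and sit close to the bound 2/n. Nothing here speaks to the network case, where the number of fixed effects grows with the number of nodes.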
What carries the argument
The split-network jackknife bagging step that removes incidental parameter bias after a joint method-of-moments estimator and Le Cam one-step refinement.
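The generic split-sample jackknife logic behind such a step can be written out as follows. This is a sketch under stated assumptions, not the paper's derivation: it assumes the estimator has a leading bias of order 1/N with constant B, and that each half-split halves the effective number of observations per fixed effect, so the half-sample bias is 2B/N. The symbols B, \hat\theta_{N/2}^{(s)}, and \tilde\theta are introduced here for illustration.

```latex
\begin{align*}
\mathbb{E}\bigl[\hat\theta_N\bigr] &= \theta_0 + \frac{B}{N} + o(N^{-1}), \\
\mathbb{E}\bigl[\hat\theta_{N/2}^{(s)}\bigr] &= \theta_0 + \frac{2B}{N} + o(N^{-1}), \qquad s = 1, 2, \\
\tilde\theta &= 2\hat\theta_N - \tfrac{1}{2}\bigl(\hat\theta_{N/2}^{(1)} + \hat\theta_{N/2}^{(2)}\bigr), \\
\mathbb{E}\bigl[\tilde\theta\bigr] &= 2\theta_0 + \frac{2B}{N} - \theta_0 - \frac{2B}{N} + o(N^{-1}) = \theta_0 + o(N^{-1}).
\end{align*}
```

The leading bias cancels by construction. Whether the variance is left unchanged depends on the dependence between the full-sample and split estimators, which is the part the referee report asks the authors to spell out.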
If this is right
- The procedure applies to both transferable-utility and nontransferable-utility specifications under general link functions.
- It extends directly to estimation and inference for average partial effects.
- The theory is shown to be robust to link-function misspecification.
- Simulations confirm the asymptotic normality, unbiasedness, and efficiency claims hold in finite samples for both design types.
- Empirical applications to Thai village networks and the Nyakatoke risk-sharing network illustrate distinct homophily patterns under the two regimes.
Where Pith is reading between the lines
- The same bias-correction sequence could be applied to other panel or network models that feature high-dimensional incidental parameters where profiling fails.
- Analysts studying social or economic networks could obtain more reliable measures of homophily when consent requirements differ across contexts.
- The method opens the possibility of comparing its finite-sample performance against analytical bias corrections on the same networks.
- Larger-scale networks could be analyzed by combining the initial moments estimator with scalable optimization routines before bagging.
Load-bearing premise
The split-network jackknife bagging step removes incidental parameter bias without inflating the variance of the estimator.
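A toy illustration of this premise, assuming a much simpler setting than the paper's: the plug-in variance estimator, whose bias is exactly -sigma^2/n, corrected by a half-split jackknife and averaged ("bagged") over random splits. The helper names `mle_variance` and `bagged_split_jackknife` are hypothetical, and the sample size is taken to be even so the two halves match.

```python
import random
import statistics


def mle_variance(xs):
    """Plug-in (MLE) variance: biased downward by sigma^2 / n."""
    m = statistics.fmean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def bagged_split_jackknife(xs, estimator, n_splits=20, rng=None):
    """Average the half-split jackknife correction over random splits."""
    rng = rng or random.Random(0)
    full = estimator(xs)
    half = len(xs) // 2
    corrected = []
    for _ in range(n_splits):
        shuffled = xs[:]
        rng.shuffle(shuffled)
        halves = (estimator(shuffled[:half]) + estimator(shuffled[half:])) / 2
        corrected.append(2 * full - halves)  # cancels the O(1/n) bias term
    return statistics.fmean(corrected)


rng = random.Random(1)
n, reps, sigma2 = 20, 3000, 1.0
plain, bagged = [], []
for _ in range(reps):
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    plain.append(mle_variance(xs))
    bagged.append(bagged_split_jackknife(xs, mle_variance, rng=rng))

bias_plain = statistics.fmean(plain) - sigma2
bias_bagged = statistics.fmean(bagged) - sigma2
print(f"plain bias: {bias_plain:.4f}  bagged jackknife bias: {bias_bagged:.4f}")
```

Averaging over many random splits is what keeps the correction from depending on one arbitrary partition. The toy has independent observations, so it says nothing about the cross-split dependence that arises when dyads share node effects.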
What would settle it
A Monte Carlo experiment with known true parameters in which the bagged estimator either retains finite-sample bias after the jackknife step or exhibits variance strictly larger than the Cramér-Rao lower bound.
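The shape of such a check can be sketched with a small harness that reports Monte Carlo bias and the ratio of estimator variance to the Cramér-Rao bound. The function `monte_carlo_check` and its arguments are hypothetical, and the demonstration uses a stand-in problem (the mean of normal draws with known unit variance) rather than a network design.

```python
import random
import statistics


def monte_carlo_check(simulate, estimate, theta0, crlb, reps=2000, seed=0):
    """Return (bias, variance / CRLB) of `estimate` over `reps` simulated samples."""
    rng = random.Random(seed)
    draws = [estimate(simulate(rng)) for _ in range(reps)]
    bias = statistics.fmean(draws) - theta0
    ratio = statistics.variance(draws) / crlb
    return bias, ratio


# Toy stand-in (an assumption, not the paper's design): the mean of n draws
# from N(theta0, 1), whose Cramér-Rao bound is 1 / n.
n, theta0 = 50, 0.3
bias, ratio = monte_carlo_check(
    simulate=lambda rng: [rng.gauss(theta0, 1.0) for _ in range(n)],
    estimate=statistics.fmean,
    theta0=theta0,
    crlb=1.0 / n,
)
print(f"bias={bias:.4f}, variance/CRLB={ratio:.3f}")
```

A bias clearly away from zero, or a ratio clearly above one as the sample grows, would be the kind of evidence described above. For the paper's estimator, `simulate` and `estimate` would have to be replaced by a network data-generating process and the full three-step procedure.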
Original abstract
We develop a unified estimation and inference framework for dyadic network formation with individual fixed effects, covering both transferable-utility (TU) and nontransferable-utility (NTU) links under general link functions. Under NTU, bilateral consent makes the fixed effects non-additive and the log-likelihood non-concave in the high-dimensional fixed effects, so differencing and profile-likelihood methods fail. We combine a joint method-of-moments initial estimator, a Le Cam one-step refinement, and a split-network jackknife bagging step that removes the incidental parameter bias without inflating variance. The resulting homophily estimator is asymptotically normal, unbiased, and attains the Cramér-Rao lower bound without requiring the log-likelihood to be concave in the fixed effects; we extend the theory to average partial effects and establish robustness to link-function misspecification. Simulations under both TU and NTU designs confirm these predictions. Applied to Thai village networks (TU), kinship and wealth differences both increase linking; in the Nyakatoke risk-sharing network (NTU), wealth differences have no significant effect, mirroring the two regimes' distinct logics.
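For orientation, one common way to write the two regimes in the dyadic link-formation literature is sketched below. This is an assumption made for illustration, not necessarily the paper's exact specification: w_{ij} denotes dyad covariates, \beta the homophily parameter, \alpha_i the individual fixed effects, F the link function, and the NTU line further assumes the two agents' shocks are independent.

```latex
\begin{align*}
\text{TU:}\quad & \Pr(D_{ij} = 1 \mid w_{ij}, \alpha) = F\bigl(w_{ij}'\beta + \alpha_i + \alpha_j\bigr), \\
\text{NTU:}\quad & \Pr(D_{ij} = 1 \mid w_{ij}, \alpha) = F\bigl(w_{ij}'\beta + \alpha_i\bigr)\, F\bigl(w_{ij}'\beta + \alpha_j\bigr).
\end{align*}
```

Under this reading, the TU fixed effects enter through the additive index \alpha_i + \alpha_j, while the NTU probability is a product because both sides must consent. The non-link contribution \log\bigl(1 - F(\cdot)F(\cdot)\bigr) then need not be concave in (\alpha_i, \alpha_j), which is consistent with the abstract's remark about non-additivity and non-concavity.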
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript develops a unified framework for estimating homophily parameters in dyadic network formation models with individual fixed effects, applicable to both transferable utility (TU) and nontransferable utility (NTU) link functions. The approach combines a method-of-moments initial estimator, a Le Cam one-step update, and a split-network jackknife bagging procedure to correct for incidental parameter bias. The central claim is that the resulting estimator is asymptotically normal, unbiased, attains the Cramér-Rao lower bound, and does not require concavity of the log-likelihood in the fixed effects. Extensions to average partial effects and robustness to misspecification are provided, with supporting simulations and two empirical applications.
Significance. If the theoretical results hold, this represents a notable contribution to the econometrics of networks by enabling efficient estimation in NTU settings where profile likelihood and differencing methods are inapplicable due to non-concavity and non-additivity. The ability to attain the efficiency bound without variance inflation from the bagging step would be particularly valuable. The simulations under both TU and NTU designs and the applications to Thai village (TU) and Nyakatoke (NTU) networks provide concrete evidence of practical utility. The paper's strength lies in its unified treatment and the explicit handling of the incidental parameters problem in a graph setting.
major comments (3)
- [§3.3] §3.3 (split-network jackknife bagging): The claim that the bagging step removes incidental parameter bias without inflating variance is central to the efficiency result. In NTU models, where fixed effects enter non-additively, each network split must still identify the full high-dimensional fixed effects vector. The manuscript should provide a detailed argument showing that cross-split dependence does not introduce residual bias or prevent attainment of the CRLB, as standard split-panel jackknife results do not directly apply to dyadic data with shared node effects.
- [Theorem 1] Theorem 1 (asymptotic normality and efficiency): The proof of asymptotic normality, unbiasedness, and CRLB attainment relies on the combination of MOM, Le Cam one-step, and bagging. The paper would be stronger if it stated explicitly, and verified, the regularity conditions under which the one-step refinement preserves efficiency when the initial estimator is consistent but the likelihood is non-concave.
- [Section 5] Section 5 (simulations): The simulation designs confirm the predictions, but to support the NTU claims, additional results on the finite-sample bias and variance under non-concave likelihoods would be helpful to demonstrate that the bagging does not inflate variance in practice.
minor comments (2)
- [Abstract] The abstract is dense with technical terms; a slight expansion on the key innovation of the split-network jackknife could improve accessibility.
- [Notation] Ensure consistent use of notation for the fixed effects across TU and NTU sections to avoid confusion.
Simulated Author's Rebuttal
We thank the referee for the constructive comments and positive assessment of the paper's contribution to network econometrics. We address each major comment below and will revise the manuscript accordingly to strengthen the presentation.
-
Referee: [§3.3] §3.3 (split-network jackknife bagging): The claim that the bagging step removes incidental parameter bias without inflating variance is central to the efficiency result. In NTU models, where fixed effects enter non-additively, each network split must still identify the full high-dimensional fixed effects vector. The manuscript should provide a detailed argument showing that cross-split dependence does not introduce residual bias or prevent attainment of the CRLB, as standard split-panel jackknife results do not directly apply to dyadic data with shared node effects.
Authors: We agree that a more explicit treatment of cross-split dependence is useful for the NTU case. The current proof exploits the joint MOM initialization and the fact that each split retains the full node set (hence identifies the entire fixed-effects vector), with dependence across splits controlled by the dyadic structure and the Le Cam refinement. In the revision we will expand Section 3.3 and add an appendix subsection that derives the joint asymptotic distribution of the split estimators, explicitly verifying that the averaging step eliminates the O(1/N) bias term without inflating the leading variance or preventing CRLB attainment. revision: yes
-
Referee: [Theorem 1] Theorem 1 (asymptotic normality and efficiency): The proof of asymptotic normality, unbiasedness, and CRLB attainment relies on the combination of MOM, Le Cam one-step, and bagging. The paper would be stronger if it stated explicitly, and verified, the regularity conditions under which the one-step refinement preserves efficiency when the initial estimator is consistent but the likelihood is non-concave.
Authors: We will add a supporting lemma that states the precise regularity conditions. The argument relies on local asymptotic normality of the score around the true parameter (which holds under standard smoothness and moment conditions on the link function) together with root-N consistency of the initial joint MOM estimator; the Le Cam one-step then inherits the efficient influence function even when the global likelihood is non-concave. The revision will make these conditions explicit and verify that they are satisfied by the maintained assumptions on the network formation model. revision: yes
-
Referee: [Section 5] Section 5 (simulations): The simulation designs confirm the predictions, but to support the NTU claims, additional results on the finite-sample bias and variance under non-concave likelihoods would be helpful to demonstrate that the bagging does not inflate variance in practice.
Authors: We will augment the simulation section with a new set of Monte Carlo experiments that explicitly target NTU designs with non-concave likelihoods (e.g., by varying the link function to induce non-concavity while preserving identification). These will report finite-sample bias, variance, and efficiency ratios relative to the CRLB across network sizes, directly confirming that the bagging step does not inflate variance relative to the infeasible oracle estimator. revision: yes
Circularity Check
Standard MOM consistency plus jackknife bias correction on novel split; no reduction of target estimator to fitted input by construction.
The derivation combines a joint MOM initial estimator, Le Cam one-step refinement, and split-network jackknife bagging to remove incidental-parameter bias. The abstract states the resulting homophily estimator attains asymptotic normality, unbiasedness, and the CRLB without concavity, but presents these as consequences of the proposed steps rather than definitions or renamings. No quoted equations equate the final estimator to a fitted parameter or self-cited uniqueness theorem. The central claim therefore retains independent content beyond the inputs, consistent with the default expectation of no significant circularity.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Dyadic network formation with individual fixed effects under general link functions
- domain assumption: Split-network jackknife removes incidental-parameter bias without variance inflation
Forward citations
Cited by 1 Pith paper
-
Penalized Likelihood for Dyadic Network Formation Models with Degree Heterogeneity
Penalized likelihood resolves non-existence of MLE and incidental-parameter bias in network models with degree heterogeneity while allowing sparse networks and providing asymptotic guarantees.