ML, PL, QL in Markov chain models

Cristiano Varin; Nils Lid Hjort

arxiv: 2604.20978 · v1 · submitted 2026-04-22 · 📊 stat.ME


Pith reviewed 2026-05-09 23:24 UTC · model grok-4.3

classification 📊 stat.ME

keywords Markov chain models · maximum likelihood · pseudo-likelihood · quasi-likelihood · limiting normality · DNA sequence evolution · statistical inference


The pith

Quasi-likelihood matches full maximum likelihood closely while gaining robustness over pseudo-likelihood in Markov chain models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Models with complex dependencies often make full maximum likelihood computation infeasible. This work derives limiting normal distributions for maximum likelihood, pseudo-likelihood, and quasi-likelihood estimators under general Markov chain assumptions, then compares their behavior across settings. The results indicate that quasi-likelihood typically outperforms pseudo-likelihood in efficiency and robustness while staying close to full maximum likelihood performance. The approach is illustrated on DNA sequence evolution models. A reader would care because the findings offer a workable compromise for dependent data where exact likelihood methods break down.

Core claim

The paper derives limiting normality results for the maximum likelihood, pseudo-likelihood, and quasi-likelihood estimators in general Markov chain models. It shows that the quasi-likelihood strategy is typically preferable to the pseudo-likelihood, losing very little to the maximum likelihood while gaining in model robustness, and has potential as a modelling tool.

What carries the argument

Limiting normality results for the three estimators, with pseudo-likelihood and quasi-likelihood treated as maximum penalised likelihood methods.
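To make the penalised-likelihood reading concrete, here is a minimal sketch for a two-state chain with transition probabilities p01 = alpha and p10 = beta. This page does not spell out the definitions, so the sketch assumes that QL is the pairwise composite likelihood built from the stationary law of (X_i, X_{i+1}), and that PL is the Besag-style product of full conditionals of X_i given both neighbours. All function names are illustrative.

```python
import math

def transition_matrix(alpha, beta):
    """Two-state transition matrix with p01 = alpha and p10 = beta."""
    return [[1.0 - alpha, alpha], [beta, 1.0 - beta]]

def stationary(alpha, beta):
    """Stationary distribution (pi0, pi1) of the two-state chain."""
    total = alpha + beta
    return [beta / total, alpha / total]

def log_ml(x, alpha, beta):
    """Full log-likelihood, conditional on the first observation."""
    P = transition_matrix(alpha, beta)
    return sum(math.log(P[a][b]) for a, b in zip(x, x[1:]))

def log_ql(x, alpha, beta):
    """Assumed QL: sum of log stationary pair probabilities pi_a * p_ab."""
    P = transition_matrix(alpha, beta)
    pi = stationary(alpha, beta)
    return sum(math.log(pi[a] * P[a][b]) for a, b in zip(x, x[1:]))

def log_pl(x, alpha, beta):
    """Assumed PL: sum of log P(x_i | x_{i-1}, x_{i+1}) over interior sites."""
    P = transition_matrix(alpha, beta)
    # Two-step transition probabilities, P2 = P * P.
    P2 = [[sum(P[a][m] * P[m][c] for m in (0, 1)) for c in (0, 1)]
          for a in (0, 1)]
    return sum(math.log(P[a][b] * P[b][c] / P2[a][c])
               for a, b, c in zip(x, x[1:], x[2:]))

def ql_penalty(x, alpha, beta):
    """The term log_ql - log_ml: the sum of log pi over x[:-1]."""
    pi = stationary(alpha, beta)
    return sum(math.log(pi[a]) for a in x[:-1])

x = [0, 0, 1, 0, 1, 1, 0, 0, 0, 1]
print(log_ml(x, 0.3, 0.4), log_ql(x, 0.3, 0.4), log_pl(x, 0.3, 0.4))
```

Under these assumed definitions the QL objective is the ML objective plus a term that depends on the parameters only through the stationary distribution, which is one way to read the "penalised likelihood" framing.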

If this is right

Quasi-likelihood becomes a practical substitute when full maximum likelihood is computationally prohibitive due to complex dependencies.
Quasi-likelihood retains most efficiency of full maximum likelihood across the examined Markov settings.
Pseudo-likelihood shows consistent efficiency losses relative to the other two methods.
The methods apply directly to spatial-temporal and DNA sequence models.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Quasi-likelihood could be implemented as a default option in software for dependent data analysis when full likelihood is intractable.
The robustness advantage might prove useful in spatial models where the exact dependence structure is uncertain.
Finite-sample checks on non-DNA Markov chains would test whether the asymptotic preference for quasi-likelihood holds in practice.

Load-bearing premise

The limiting normality results accurately capture finite-sample behavior and the performance comparisons extend beyond the DNA sequence examples considered.

What would settle it

A simulation study on finite-length Markov chains with known parameters, comparing the three estimators' bias, variance, and robustness under model misspecification, would confirm or refute the asymptotic rankings.
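A minimal sketch of such a study, for the simplest case only: a two-state chain with p01 = alpha and p10 = beta, comparing ML and QL estimates of alpha. It assumes QL is the pairwise composite likelihood of stationary pairs (X_i, X_{i+1}); under that assumption the stationary pair law satisfies q01 = q10, and the QL maximiser pools the two off-diagonal transition counts. PL is left out because it needs numerical optimisation. Sample size, replication count, and parameter values are arbitrary choices for illustration.

```python
import random
import statistics

def simulate_chain(n, alpha, beta, rng):
    """Simulate n states of a two-state chain, started in stationarity."""
    pi1 = alpha / (alpha + beta)
    x = [1 if rng.random() < pi1 else 0]
    for _ in range(n - 1):
        p_one = alpha if x[-1] == 0 else 1.0 - beta
        x.append(1 if rng.random() < p_one else 0)
    return x

def transition_counts(x):
    """2x2 table of observed transitions a -> b."""
    counts = [[0, 0], [0, 0]]
    for a, b in zip(x, x[1:]):
        counts[a][b] += 1
    return counts

def ml_alpha(counts):
    """ML estimate of alpha = p01: the observed 0 -> 1 proportion."""
    return counts[0][1] / (counts[0][0] + counts[0][1])

def ql_alpha(counts):
    """Pairwise-QL estimate of alpha under the assumed definition.

    The two off-diagonal counts are pooled before forming the
    0 -> 1 proportion, reflecting the constraint q01 = q10.
    """
    pooled = 0.5 * (counts[0][1] + counts[1][0])
    return pooled / (counts[0][0] + pooled)

def monte_carlo(n=500, reps=200, alpha=0.3, beta=0.4, seed=1):
    rng = random.Random(seed)
    ml, ql = [], []
    for _ in range(reps):
        counts = transition_counts(simulate_chain(n, alpha, beta, rng))
        if counts[0][0] + counts[0][1] == 0:
            continue  # chain never left state 1; skip this replicate
        ml.append(ml_alpha(counts))
        ql.append(ql_alpha(counts))
    return {
        "ml_bias": statistics.mean(ml) - alpha,
        "ql_bias": statistics.mean(ql) - alpha,
        "ml_var": statistics.variance(ml),
        "ql_var": statistics.variance(ql),
    }

result = monte_carlo()
print(result)
```

In this unconstrained two-state model the two estimators nearly coincide, because the off-diagonal counts differ by at most one. A study that could separate the methods would need constrained models, longer-range dependence, or misspecification, plus a numerical PL fit.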

Figures

Figures reproduced from arXiv: 2604.20978 by Cristiano Varin, Nils Lid Hjort.

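**Figure 5.1.** Contour plots of the variance ratio for PL with respect to ML, for estimation of α (left panel) and of β (right panel), for the [0, 0.15] × [0.15] subset of the parameter space. The quantities of (2.4) and (3.3) are found to be $\begin{pmatrix} \gamma_{0,0} & \gamma_{0,1} \\ \gamma_{1,0} & \gamma_{1,1} \end{pmatrix} = \frac{1}{(\alpha+\beta)^2} \begin{pmatrix} \alpha & -\alpha \\ -\beta & \beta \end{pmatrix}$ and $\begin{pmatrix} \bar\gamma_{0,0} & \bar\gamma_{0,1} \\ \bar\gamma_{1,0} & \bar\gamma_{1,1} \end{pmatrix} = \frac{1-\alpha-\beta}{(\alpha+\beta)^2} \begin{pmatrix} \alpha & -\alpha \\ -\beta & \beta \end{pmatrix}$. Furthermore, $u_{0,0} = (-1/(1-\alpha), 0)^t$, $u_{0,1} = (1/\alpha, 0)^t$, u…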

**Figure 5.2.** Asymptotic relative efficiency for the PL method, with respect to the ML and QL method, for a three-stage equicorrelation chain, as a function of ρ. We start by assuming $p = (p_1, \ldots, p_k)^t$ known and consider estimation of the parameter ρ. We have $u_{a,b} = (\delta_{a,b} - p_b)/p_{a,b}(\rho)$ and $J = \sum_{a,b} p_a \frac{(\delta_{a,b} - p_b)^2}{p_{a,b}(\rho)} = \sum_{a,b} p_a \frac{(\delta_{a,b} - p_b)^2}{(1-\rho)\,p_b + \rho\,\delta_{a,b}}$. Since the stationary distribution p is known…

**Figure 5.3.** The random walk with two reflecting barriers: six states example. The solid line corresponds to the ARE for PL, while the dashed one corresponds to QL.

**Figure 5.4.** The random walk with two reflecting barriers: the effects of increasing the number of states on the PL estimator of p [left panel] and on the QL estimator of p [right panel]. The solid line corresponds to a 15 states chain; the dashed line to 10 states; and the dotted line to 5 states. In order to compute the matrices involved in the QL and PL computations, note that $v_1 = -\pi_1 \bigl\{ \frac{1}{pq} \sum_{i=1}^{k-2} (p/q)^{i-1}$ …

**Figure 5.5.** The random walk with two reflecting barriers: ten states chain. The curves correspond to the ARE for QL of order 2, 3, 4, 10 and 100 (growing). Again, the matrices involved in describing the large-sample behaviour of the QL and PL methods may now be computed numerically. To illustrate different aspects involved in the comparison, we varied the p parameter as well as the number k of states, and examined the la…

**Figure 7.1.** The plot shows the least false parameter values, for α and β, when the four-parameter Kimura model (6.1) is assumed, when the real mechanism is a six-parameter Kimura model, as a function of the model departure degree ε; here γ1 = γ + ε, γ2 = γ − ε, δ1 = δ + ε, δ2 = δ − ε, and values (.03, .04, .12, .14) are used for (α, β, γ, δ). The least false values are shown for the ML (solid curve), the QL (dotted …

**Figure 7.2.** The plot shows the least false parameter values, for γ and δ, when the four-parameter Kimura model (6.1) is assumed, when the real mechanism is a six-parameter Kimura model, as a function of the model departure degree ε; here γ1 = γ + ε, γ2 = γ − ε, δ1 = δ + ε, δ2 = δ − ε, and values (.03, .04, .12, .14) are used for (α, β, γ, δ). The least false values are shown for the ML (solid curve), the QL (dotted …
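The information quantity J quoted in the Figure 5.2 caption can be evaluated directly. The sketch below assumes the equicorrelation transition probabilities p_ab(ρ) = (1 − ρ) p_b + ρ δ_ab that appear in that caption's denominator, with the stationary distribution p treated as known and 0 ≤ ρ < 1. The function name is illustrative.

```python
def equicorrelation_information(p, rho):
    """J(rho) = sum_{a,b} p_a (delta_ab - p_b)^2 / ((1 - rho) p_b + rho delta_ab)."""
    k = len(p)
    total = 0.0
    for a in range(k):
        for b in range(k):
            delta = 1.0 if a == b else 0.0
            p_ab = (1.0 - rho) * p[b] + rho * delta  # transition probability a -> b
            total += p[a] * (delta - p[b]) ** 2 / p_ab
    return total

# Tabulate J over a few values of rho for a three-state example.
p = [0.2, 0.3, 0.5]
for rho in (0.0, 0.25, 0.5, 0.75):
    print(rho, equicorrelation_information(p, rho))
```

If J plays its usual role as the per-transition Fisher information, 1/(nJ) approximates the variance of the ML estimator of ρ from a chain of length n. At ρ = 0 the sum reduces to k − 1, which gives a quick sanity check.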

Original abstract

In many spatial and spatial-temporal models, and more generally in models with complex dependencies, it may be too difficult to carry out full maximum likelihood (ML) analysis. Remedies include the use of pseudo-likelihood (PL) and quasi-likelihood (QL) (also called the composite likelihood). The present article studies the ML, the PL and the QL methods for general Markov chain models, partly motivated by the desire to understand the precise behaviour of PL and QL methods in settings where this can be analysed. We present limiting normality results and compare performances in different settings. The PL and QL methods can be seen as maximum penalised likelihood methods. We find that the QL strategy is typically preferable to the PL, and that it loses very little to the ML, while earning in model robustness. It has also appeal and potential as a modelling tool. Our methods are illustrated for analysis of DNA sequence evolution type models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper derives limiting normality results for maximum likelihood (ML), pseudo-likelihood (PL), and quasi-likelihood (QL) estimators in general Markov chain models. It compares their asymptotic efficiencies and finite-sample performance on DNA sequence evolution examples, concluding that QL is typically preferable to PL (losing little to ML while gaining robustness) and has appeal as a modeling tool.

Significance. If the derivations and comparisons hold, the work provides useful theoretical grounding and practical guidance for choosing among ML, PL, and QL in dependent-data settings where full likelihood is intractable. The explicit limiting normality results and the framing of PL/QL as penalized likelihood are strengths that allow precise efficiency comparisons.

major comments (2)

[Performance comparisons and illustrations] The headline claim that QL is 'typically preferable' to PL (while close to ML) rests on the limiting normality results plus DNA-sequence illustrations. These are asymptotic; without finite-sample simulations across a range of chain lengths, orders, or transition structures, the finite-sample preference and robustness advantage do not necessarily follow (see stress-test concern).
[Robustness discussion] The robustness advantage of QL is asserted but not quantified beyond the DNA examples. A concrete measure (e.g., sensitivity to misspecification of the transition kernel or to higher-order dependence) would be needed to support the general claim that QL 'earns in model robustness.'

minor comments (2)

[Methods] Notation for the composite likelihood and the penalization terms could be clarified with an explicit equation early in the methods section.
[Abstract] The abstract states 'limiting normality results' but does not indicate whether the results cover both stationary and non-stationary chains; a brief statement would help readers.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive report. The comments highlight opportunities to strengthen the finite-sample evidence supporting our claims. We address each major point below and will incorporate revisions to provide additional simulation-based support for the performance and robustness conclusions.


Referee: [Performance comparisons and illustrations] The headline claim that QL is 'typically preferable' to PL (while close to ML) rests on the limiting normality results plus DNA-sequence illustrations. These are asymptotic; without finite-sample simulations across a range of chain lengths, orders, or transition structures, the finite-sample preference and robustness advantage do not necessarily follow (see stress-test concern).

Authors: We agree that the current finite-sample support relies on the DNA sequence illustrations in Section 5 rather than a broad Monte Carlo study. The limiting normality results (Theorems 3.1, 4.1, and 4.2) and the efficiency comparisons derived from them establish the asymptotic preference for QL over PL with minimal loss relative to ML. The DNA examples demonstrate this in a practical setting with finite lengths. To address the concern directly, we will add a new simulation subsection varying chain lengths (n = 50 to 2000), Markov orders (1 and 2), and transition probability structures, reporting empirical bias, variance, and coverage to confirm the finite-sample behavior aligns with the asymptotics. revision: yes
Referee: [Robustness discussion] The robustness advantage of QL is asserted but not quantified beyond the DNA examples. A concrete measure (e.g., sensitivity to misspecification of the transition kernel or to higher-order dependence) would be needed to support the general claim that QL 'earns in model robustness.'

Authors: The robustness claim follows from the construction of QL as a composite likelihood that depends only on the specified transition kernel (unlike full ML) and avoids the pairwise over-weighting issues of PL, as framed in Section 2. The DNA examples provide empirical illustration under potential model departures common in sequence data. We will add a targeted simulation study in the revision that generates data from a misspecified higher-order chain and fits first-order models, comparing mean squared error and robustness metrics (e.g., relative efficiency loss under misspecification) across ML, PL, and QL to quantify the advantage. revision: yes
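The kind of misspecification experiment described here can be sketched for the ML estimator alone. The example generates a binary second-order chain, fits a first-order model by ML, and compares the estimate of p01 with its "least false" limit, taken here to be the stationary probability P(X_{t+1} = 1 | X_t = 0) under the true mechanism. The table q, the chain length, and all function names are assumptions made for illustration. PL and QL fits are not included.

```python
import random

def pair_stationary(q, iters=500):
    """Stationary law of (X_{t-1}, X_t) for a binary second-order chain.

    q[a][b] is P(X_{t+1} = 1 | X_{t-1} = a, X_t = b); found by power iteration.
    """
    mu = {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}
    for _ in range(iters):
        new = {state: 0.0 for state in mu}
        for (a, b), weight in mu.items():
            new[(b, 1)] += weight * q[a][b]
            new[(b, 0)] += weight * (1.0 - q[a][b])
        mu = new
    return mu

def least_false_alpha(q):
    """Limit of the first-order ML estimate of p01 under the true chain."""
    mu = pair_stationary(q)
    numerator = sum(mu[(a, 0)] * q[a][0] for a in (0, 1))
    denominator = sum(mu[(a, 0)] for a in (0, 1))
    return numerator / denominator

def simulate_second_order(q, n, rng):
    """Simulate n states, starting from (0, 0) with no burn-in."""
    x = [0, 0]
    while len(x) < n:
        x.append(1 if rng.random() < q[x[-2]][x[-1]] else 0)
    return x

def ml_alpha(x):
    """First-order ML estimate of p01: observed 0 -> 1 proportion."""
    n00 = sum(1 for a, b in zip(x, x[1:]) if a == 0 and b == 0)
    n01 = sum(1 for a, b in zip(x, x[1:]) if a == 0 and b == 1)
    return n01 / (n00 + n01)

q = [[0.2, 0.5], [0.4, 0.7]]
x = simulate_second_order(q, 20000, random.Random(0))
print(ml_alpha(x), least_false_alpha(q))
```

Repeating the simulation over many seeds and adding the other two estimators would give the mean squared error comparison the response refers to.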

Circularity Check

0 steps flagged

No significant circularity; derivations rely on standard asymptotic theory


The paper derives limiting normality results for ML, PL and QL estimators in general Markov chain models from standard asymptotic theory for dependent processes. Efficiency comparisons and the conclusion that QL is typically preferable to PL follow directly from these limiting distributions and relative asymptotic variances, without any reduction to fitted parameters, self-definitions, or load-bearing self-citations. DNA sequence illustrations are presented as applications after the theory, not as the source of the claims. The derivation chain is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The analysis rests on standard regularity conditions for Markov chains to obtain asymptotic normality; no free parameters or invented entities are introduced in the abstract.

axioms (1)

Domain assumption: Markov chain models satisfy standard regularity conditions allowing central limit theorems for the estimators.
Invoked to derive limiting normality results for ML, PL, and QL.
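As a worked illustration of what this assumption delivers, consider the two-state chain with $p_{0,1} = \alpha$ and $p_{1,0} = \beta$. The statement below is the textbook limiting result for the ML estimator, written with score vectors $u_{a,b}$ as in the Figure 5.1 caption. The paper's own regularity conditions and its PL and QL counterparts are not reproduced on this page, so this is a sketch rather than a restatement of the paper's theorems.

```latex
\[
\sqrt{n}\,\bigl(\hat\theta_{\mathrm{ML}} - \theta\bigr) \xrightarrow{d} N\bigl(0, J^{-1}\bigr),
\qquad
J = \sum_{a,b} \pi_a\, p_{a,b}(\theta)\, u_{a,b} u_{a,b}^{t},
\qquad
u_{a,b} = \frac{\partial \log p_{a,b}(\theta)}{\partial \theta}.
\]
% Two-state chain: \theta = (\alpha, \beta)^t and \pi = (\beta, \alpha)^t / (\alpha + \beta).
\[
J = \begin{pmatrix}
  \dfrac{\pi_0}{\alpha(1-\alpha)} & 0 \\[2ex]
  0 & \dfrac{\pi_1}{\beta(1-\beta)}
\end{pmatrix},
\qquad
\operatorname{Var}(\hat\alpha_{\mathrm{ML}}) \approx \frac{\alpha(1-\alpha)}{n\,\pi_0}.
\]
```

The diagonal form follows because $u_{0,0}$ and $u_{0,1}$ involve only $\alpha$, while $u_{1,0}$ and $u_{1,1}$ involve only $\beta$.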



Reference graph

Works this paper leans on

34 extracted references · 34 canonical work pages

[1] Anderson, T.W. and Goodman, L.A. (1957). Statistical inference about Markov chains. Annals of Mathematical Statistics 28, 89–110.
[2] Barry, D. and Hartigan, J.A. (1987). Asynchronous distance between homologous DNA sequences. Biometrics 43, 261–276.
Basawa and Rao (1980). Statistical Inference for Stochastic Processes. Academic Press, London.
[3] Basharin, G.P., Langville, A.N. and Naumov, V.A. (2004). The life and work of A.A. Markov. Linear Algebra and its Applications 386, 3–26.
[4] Besag, J. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion contributions). Journal of the Royal Statistical Society B 36, 192–236.
[5] Besag, J. (1975). Statistical analysis of non-lattice data. The Statistician 24, 179–195.
[6] Besag, J. (1977). Some methods of statistical analysis for spatial data. Bulletin of the Institute of International Statistics 47, 77–92.
[7] Blaisdell, B.E. (1985). A method for estimating from two aligned present day DNA sequences their ancestral composition and subsequent rates of composition and subsequent rates of substitution, possibly different in the two lineages, corrected for multiple and parallel substitutions at the same site. Journal of Molecular Evolution 22, 69–81.
[8] Cox, D.R. and Reid, N. (2004). A note on pseudolikelihood constructed from marginal densities. Biometrika 91, 729–737.
[9] Davison, A.C. (2003). Statistical Models. Cambridge University Press, Cambridge.
[10] Durrett, R. (2002). Probability Models for DNA Sequence Evolution. Probability and Its
[11] Fearnhead, P. and Donnelly, P. (2002). Approximate likelihood methods for estimating local recombination rates. Journal of the Royal Statistical Society B 64, 657–680.
[12] Fokianos, K. and Kedem, B. (2003). Regression theory for categorical time series. Statistical Science 18, 357–375.
[13] Glasbey, C.A. (2001). Non-linear autoregressive time series with multivariate Gaussian mixtures as marginal distributions. Applied Statistics 50, 143–154.
[14] Heagerty, P.J. and Lele, S.R. (1998). A composite likelihood approach to binary spatial data. Journal of the American Statistical Association 93, 1099–1111.
[15] Henderson, R. and Shimakura, S. (2003). A serially correlated gamma frailty model for longitudinal count data. Biometrika 90, 355–366.
[16] Hjort, N.L. and Mohn, E. (1987). Topics in the statistical analysis of remotely sensed data [with discussion]. Bulletins of the International Statistical Institute 52 (Proceedings of the ISI Meeting, Tokyo), 23–44.
[17] Hjort, N.L. and Omre, H. (1994). Topics in spatial statistics (with discussion contributions). Scandinavian Journal of Statistics 21, 289–357.
[18] Hjort, N.L. and Mostad, P. (1998). A quasi-likelihood method for estimating parameters in spatial covariance functions. Manuscript.
[19] Hobolth, A. and Jensen, J.L. (2005). Statistical inference in evolutionary models of DNA sequences via the EM algorithm. Research report No. 455, Department of Theoretical
[20] Homleid, M. (1995). Statistical Methods Applied in Meteorology. Cand. scient. thesis, Department of Mathematics, University of Oslo.
[21] Karlin, S. and Taylor, H.M. (1975). A First Course in Stochastic Processes. Academic
[22] Kimura, M. (1980). A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. Journal of Molecular Evolution 16, 111–120.
[23] Kimura, M. (1981). Estimation of evolutionary distances between homologous nucleotide sequences. Proceedings of the National Academy of Sciences USA 78, 454–458.
de Leon, A.R. (2004). Pairwise likelihood approach to grouped continuous model and its extension. Technical report, Department of Mathematics & Statistics, University of Calgary.
[24] Lindsay, B. (1988). Composite likelihood methods. In Statistical Inference for Stochastic Processes (ed. N.U. Prabhu), American Mathematical Society.
[25] Markov, A.A. (1906). Rasprostranenie zakona bol'shikh chisel na velichiny, zavisyashchie drug ot druga. Izvestiya Fiziko-matematicheskogo obshchestva pri Kazanskom universitete 15 (2-ya seriya), 124–156.
[26] Markov, A.A. (1913). Primer statisticheskogo issledovaniya nad tekstom “Evgeniya Onegina”, illyustriruyushchii svyaz' ispytanii v tsep'. Izvestiya Akademii Nauk, Sankt-Peterburg 7 (6-ya seriya), 153–162.
[27] Nott, D.J. and Rydén, T. (1999). Pairwise likelihood methods for inference in image models. Biometrika 86, 661–676.
[28] Parner, E.T. (2001). A composite likelihood approach to multivariate survival data. Scandinavian Journal of Statistics 28, 295–302.
[29] Pickard, D.K. (1987). Inference for discrete Markov fields: the simplest nontrivial case. Journal of the American Statistical Association 82, 90–96.
[30] Pushkin, A.S. (1977). Eugene Onegin [translated by C.H. Johnston]. Penguin Classics, London. There are various later reprints of essentially the same translation of Pushkin’s 1833 epic.
[31] Renard, D., Molenberghs, G. and Geys, H. (2004). A pairwise likelihood approach to estimation in multilevel probit models. Computational Statistics & Data Analysis 44, 649–667.
[32] Strauss, D. and Ikeda, M. (1990). Pseudolikelihood estimation for social networks. Journal of the American Statistical Association 85, 204–212.
[33] Taylor, H.M. and Karlin, S. (1984). An Introduction to Stochastic Modeling. Academic
[34] Varin, C., Høst, G. and Skare, Ø. (2005). Pairwise likelihood inference in spatial generalized linear mixed models. Computational Statistics & Data Analysis, to appear.
