Transformed Naive Ratio and Product Based Estimators for Estimating Population Mode in Simple Random Sampling

Nirmal Tiwari; Sanjay Kumar

arxiv: 1907.00519 · v1 · pith:LZCIXKHVnew · submitted 2019-07-01 · 📊 stat.ME

Pith reviewed 2026-05-25 12:21 UTC · model grok-4.3

classification 📊 stat.ME

keywords population mode · ratio estimator · product estimator · simple random sampling · auxiliary information · mean square error · naive estimator

The pith

The transformed naïve ratio estimator for the population mode is more efficient than the naïve and naïve ratio estimators when auxiliary information is available.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes new estimators for the population mode in simple random sampling by transforming naïve ratio and product estimators with a characterizing scalar that incorporates auxiliary information. It compares their bias and mean square error to existing naïve estimators on both real and simulated data. The authors find that the transformed ratio version performs better in terms of efficiency. This matters because accurate mode estimation can improve survey analysis when the data distribution is skewed or multimodal. The work focuses on without-replacement sampling and provides expressions for bias, MSE, and confidence intervals.

Core claim

The paper introduces transformed naïve ratio and product based estimators for the population mode that use a characterizing scalar to leverage auxiliary information. Through theoretical derivations and numerical studies on natural populations and artificial data, it demonstrates that the proposed transformed naïve ratio based estimator has lower mean square error than the standard naïve estimator and the naïve ratio estimator.

What carries the argument

Transformed naïve ratio based estimator using a characterizing scalar in simple random sampling without replacement.
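
A minimal Python sketch of the three estimators being compared may help fix ideas. The paper's exact definitions are not reproduced on this page, so everything below is an assumption: `naive_mode` uses the textbook grouped-data mode formula, `ratio_mode` rescales it by the known population mode of the auxiliary variable over its sample estimate, and `transformed_ratio_mode` adds a hypothetical scalar `shift` to both auxiliary modes as one plausible reading of "characterizing scalar". All function and parameter names are illustrative.

```python
from collections import Counter


def naive_mode(values, width):
    """Grouped-data mode: bin by `width`, then interpolate inside the modal bin."""
    counts = Counter(int(v // width) for v in values)
    # Modal bin = highest count; ties go to the lowest bin index.
    k, f1 = max(counts.items(), key=lambda kv: (kv[1], -kv[0]))
    f0, f2 = counts.get(k - 1, 0), counts.get(k + 1, 0)
    denom = 2 * f1 - f0 - f2
    # Fall back to the bin midpoint when the neighbours are as full as the modal bin.
    offset = 0.5 if denom == 0 else (f1 - f0) / denom
    return (k + offset) * width


def ratio_mode(y, x, mode_x_pop, width_y, width_x):
    """Naive ratio estimator: sample mode of y scaled by M_x / (sample mode of x)."""
    return naive_mode(y, width_y) * mode_x_pop / naive_mode(x, width_x)


def transformed_ratio_mode(y, x, mode_x_pop, shift, width_y, width_x):
    """ASSUMED transformed form: both auxiliary modes are shifted by `shift`."""
    return (naive_mode(y, width_y) * (mode_x_pop + shift)
            / (naive_mode(x, width_x) + shift))
```

With `shift=0` the transformed form reduces to the naive ratio estimator, which is why the scalar can be read as a tuning knob between the naive estimator (large `shift`) and the ratio estimator (zero `shift`) under this assumed form.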

If this is right

The transformed naïve ratio estimator reduces mean square error for mode estimation compared to standard methods.
Bias and MSE expressions can be used to construct confidence intervals for the population mode.
Product based versions are also proposed but the ratio version shows better performance in the studied cases.
These estimators apply to surveys where auxiliary data is available alongside the study variable.
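
As an illustration of the confidence-interval point, a normal-approximation interval is sketched below. It assumes the estimator is approximately normal and that its bias is small relative to its standard error; neither is guaranteed for mode estimators, and the paper's own construction is not given in the abstract. The symbols $\hat{M}_y$ (any of the mode estimators) and $\widehat{\mathrm{MSE}}$ (its estimated mean square error) are placeholders.

```latex
\hat{M}_y \;\pm\; z_{1-\alpha/2}\,\sqrt{\widehat{\mathrm{MSE}}\bigl(\hat{M}_y\bigr)}
% When the bias B is not negligible, the variance is MSE - B^2, and a
% bias-corrected interval would centre on \hat{M}_y - \widehat{B} instead.
```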

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Generalization to other sampling designs like stratified sampling could be tested.
The characterizing scalar might be optimized further for different population types.
Real-world applications in fields like economics or biology could validate the efficiency gains beyond simulations.

Load-bearing premise

That superior performance on selected natural populations and artificial data sets proves the estimators are generally more efficient.

What would settle it

A data set or population where the mean square error of the transformed naïve ratio estimator exceeds that of the naïve estimator.
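
A search for such a counterexample can be sketched as a small Monte Carlo experiment. Everything here is hypothetical: the skewed gamma population, the grouped-data mode used as both estimator and target, and the transformed form with scalar `shift` are assumptions made for illustration, not the paper's setup.

```python
import random
from collections import Counter


def naive_mode(values, width):
    """Grouped-data mode: bin by `width`, then interpolate inside the modal bin."""
    counts = Counter(int(v // width) for v in values)
    k, f1 = max(counts.items(), key=lambda kv: (kv[1], -kv[0]))
    f0, f2 = counts.get(k - 1, 0), counts.get(k + 1, 0)
    denom = 2 * f1 - f0 - f2
    offset = 0.5 if denom == 0 else (f1 - f0) / denom
    return (k + offset) * width


def simulate_mse(shift, n=60, reps=2000, pop_size=1000, width=2.0, seed=1):
    """Monte Carlo MSE of three mode estimators under SRSWOR."""
    rng = random.Random(seed)
    # Hypothetical positive, right-skewed population with y correlated with x.
    xs = [rng.gammavariate(4.0, 2.0) for _ in range(pop_size)]
    ys = [1.5 * x + rng.gammavariate(2.0, 1.0) for x in xs]
    # Assumption: the grouped-data mode of the full population is the target.
    mode_y, mode_x = naive_mode(ys, width), naive_mode(xs, width)
    squared = {"naive": 0.0, "ratio": 0.0, "transformed": 0.0}
    for _ in range(reps):
        idx = rng.sample(range(pop_size), n)  # sampling without replacement
        my = naive_mode([ys[i] for i in idx], width)
        mx = naive_mode([xs[i] for i in idx], width)
        estimates = {
            "naive": my,
            "ratio": my * mode_x / mx,
            "transformed": my * (mode_x + shift) / (mx + shift),  # assumed form
        }
        for name, value in estimates.items():
            squared[name] += (value - mode_y) ** 2
    return {name: total / reps for name, total in squared.items()}


mse = simulate_mse(shift=5.0)
for name in ("ratio", "transformed"):
    # Relative efficiency above 1 means the estimator beats the naive one.
    print(name, "relative efficiency vs naive:", mse["naive"] / mse[name])
```

Sweeping `shift`, the population parameters, and `n` is the natural way to look for a configuration in which the relative efficiency drops below 1.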

Figures

Figures reproduced from arXiv: 1907.00519 by Nirmal Tiwari, Sanjay Kumar.

**Figure 2.** Scatter plot of the study variable vs. the auxiliary variable, and box plots of the study variable and auxiliary variable, for the real and generated data.

**Figure 6.** Exact values of the confidence intervals and the corresponding estimates of the different estimators for the real data.

Original abstract

In this paper, we propose a transformed naïve ratio and product based estimators using the characterizing scalar in presence of auxiliary information of the study variable for estimating the population mode following simple random sampling without replacement. The bias, mean square errors, relative efficiency, ratios of the exact values of mean square errors to the simulated mean square errors and confidence interval are studied for the performance of the proposed transformed naïve ratio type estimator with the certain natural population as well as artificially generated data sets. We have shown that proposed transformed naïve ratio based estimator is more efficient than the naïve estimator and naïve ratio estimator of the population mode.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The transformed ratio estimator shows gains in the authors' simulations on selected data, but the paper offers no general proof that it improves MSE for arbitrary populations.

The key point here is that the proposed transformed naive ratio estimator appears more efficient than the basic naive and naive ratio versions in the simulations the authors ran, but this rests entirely on empirical results for selected populations rather than any general guarantee.

What the paper does is modify the standard ratio approach by introducing a transformation involving a characterizing scalar, then works out the usual approximate bias and MSE formulas. It compares performance on some real datasets and generated ones, reports relative efficiencies, and includes some simulation checks on the MSE approximations. That part is fine as far as it goes for this kind of work. The derivations seem standard for survey sampling papers.

The limitation is the lack of a theoretical result showing the transformation reduces the leading term in the MSE for general cases. Since the mode is a non-smooth functional, the approximations may not be uniformly good, and the gains might not carry over to other populations where the auxiliary correlation differs. The abstract mentions studying ratios of exact to simulated MSE, which is good, but the evidence stays tied to the chosen data sets.

This paper is aimed at researchers in survey methodology focused on estimating the mode rather than mean or total. Someone already working on ratio estimators for non-smooth parameters might pick up the transformation idea. It deserves a serious referee to check the math and the simulation setup, even though the scope is specialized.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes transformed naive ratio and product estimators for the population mode under simple random sampling without replacement, using auxiliary information and a characterizing scalar. Approximate bias and MSE expressions are derived (likely via Taylor linearization), relative efficiencies are computed, and performance is assessed via simulations on natural populations and artificially generated datasets, with the central claim that the transformed naive ratio estimator is more efficient than the naive and naive-ratio estimators.

Significance. If the efficiency gains prove robust beyond the selected datasets, the estimators could provide practical improvements for mode estimation in survey sampling when auxiliary data is available. However, the absence of a general analytic result establishing that the transformation reduces the leading MSE term for arbitrary distributions, combined with reliance on empirical comparisons and smoothness assumptions that may not hold for the mode, limits the broader significance of the contribution.

major comments (3)

[§4] §4 (Bias and MSE derivations): the approximate MSE expressions are obtained via Taylor linearization around the characterizing scalar, but no analysis is given of the remainder term or conditions under which the approximation is valid for the mode (which lacks differentiability at the population level).
[§5] §5 (Simulation study): the reported relative efficiencies and ratios of exact to simulated MSE are tabulated for specific natural populations and artificial datasets, yet the manuscript does not state whether the characterizing scalar was chosen independently of these evaluation sets or tuned to them, raising the possibility that observed gains are not general.
[Abstract and §6] Abstract and §6 (Conclusion): the claim that the transformed estimator 'is more efficient' is supported only by the tabulated simulation results; no theorem or inequality is provided showing that the added transformation term strictly reduces the leading MSE term for arbitrary finite populations under SRSWOR.

minor comments (2)

[Abstract] The abstract refers to 'ratios of the exact values of mean square errors to the simulated mean square errors' without defining how the exact MSE is obtained or computed in the simulation section.
[§3] Notation for the characterizing scalar and the transformation function is introduced without an explicit equation number or clear definition in the proposed-estimator section.
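
One way the "exact" versus "simulated" MSE distinction in the first minor comment could be made operational is sketched below. This is an assumption about what the terms might mean, not the paper's definition: "exact" is taken as the average squared error over every possible SRSWOR sample, which is only feasible for small populations, and "simulated" as a Monte Carlo average. The toy population and the `sample_mode` estimator are hypothetical.

```python
import itertools
import random
import statistics


def exact_mse(population, n, estimator, target):
    """Average squared error over all C(N, n) equally likely SRSWOR samples."""
    errors = [(estimator(s) - target) ** 2
              for s in itertools.combinations(population, n)]
    return sum(errors) / len(errors)


def simulated_mse(population, n, estimator, target, reps=5000, seed=0):
    """Monte Carlo approximation of the same quantity."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        sample = rng.sample(population, n)  # draw without replacement
        total += (estimator(sample) - target) ** 2
    return total / reps


def sample_mode(sample):
    # Sorting makes tie-breaking independent of draw order; on Python 3.8+
    # statistics.mode returns the first of several equally common values.
    return statistics.mode(sorted(sample))


population = [2, 3, 3, 4, 4, 4, 4, 5, 5, 6, 7, 9]  # hypothetical toy population
target = statistics.mode(population)
exact = exact_mse(population, 5, sample_mode, target)
approx = simulated_mse(population, 5, sample_mode, target)
print("exact MSE:", exact, "simulated MSE:", approx)
```

A ratio of the two values close to 1 would indicate that the simulation has converged. It says nothing about whether a Taylor-approximated MSE formula is accurate, which is a separate check.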

Simulated Authors' Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We respond point-by-point to the major comments, indicating revisions where appropriate.

Referee: [§4] §4 (Bias and MSE derivations): the approximate MSE expressions are obtained via Taylor linearization around the characterizing scalar, but no analysis is given of the remainder term or conditions under which the approximation is valid for the mode (which lacks differentiability at the population level).

Authors: The bias and MSE approximations follow the standard first-order Taylor linearization approach widely used in survey sampling for ratio and product estimators. The expansion is performed with respect to the characterizing scalar to obtain the leading terms. We acknowledge that no explicit remainder-term analysis or differentiability conditions are provided in the manuscript; the mode is indeed non-differentiable, so the approximation relies on large-sample behavior and local linearity around the scalar. We will add a short paragraph in the revised §4 noting these limitations and the reliance on simulation validation. revision: partial
Referee: [§5] §5 (Simulation study): the reported relative efficiencies and ratios of exact to simulated MSE are tabulated for specific natural populations and artificial datasets, yet the manuscript does not state whether the characterizing scalar was chosen independently of these evaluation sets or tuned to them, raising the possibility that observed gains are not general.

Authors: The characterizing scalar is fixed in advance using the known auxiliary information and is not tuned to any of the simulation populations. We will revise the simulation section to state this selection procedure explicitly and confirm independence from the evaluation datasets. revision: yes
Referee: [Abstract and §6] Abstract and §6 (Conclusion): the claim that the transformed estimator 'is more efficient' is supported only by the tabulated simulation results; no theorem or inequality is provided showing that the added transformation term strictly reduces the leading MSE term for arbitrary finite populations under SRSWOR.

Authors: The claim rests on the approximate MSE expressions derived in §4, which show the conditions under which the transformation term reduces the leading MSE term relative to the naive and naive-ratio estimators, together with the simulation results on both natural and artificial populations. No general theorem establishing strict reduction for every possible finite population is provided, as such a result would require distributional assumptions not assumed in the paper; the contribution is framed as practical improvement demonstrated via the MSE formulas and empirical evidence. revision: no
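
The kind of condition at issue in the third exchange can be illustrated with a generic first-order expansion. This is a sketch under assumptions, not the paper's derivation: the transformed estimator is taken to be $T = \hat{M}_y (M_x + L)/(\hat{M}_x + L)$ with a scalar $L$, and the relative errors $e_0 = \hat{M}_y/M_y - 1$ and $e_1 = \hat{M}_x/M_x - 1$ are assumed small, with second moments $V_0 = E(e_0^2)$, $V_1 = E(e_1^2)$, $C_{01} = E(e_0 e_1)$. All of these symbols are introduced here for illustration.

```latex
\begin{align*}
T &= M_y (1 + e_0)(1 + \phi e_1)^{-1}, \qquad \phi = \frac{M_x}{M_x + L}, \\
T - M_y &\approx M_y \,(e_0 - \phi e_1), \\
\mathrm{MSE}(T) &\approx M_y^2 \left( V_0 + \phi^2 V_1 - 2 \phi\, C_{01} \right), \\
\mathrm{MSE}(\text{naive}) &\approx M_y^2 V_0, \qquad
\mathrm{MSE}(\text{ratio}) \approx M_y^2 \left( V_0 + V_1 - 2 C_{01} \right)
\quad (\phi = 1), \\
\phi^{*} &= \frac{C_{01}}{V_1}, \qquad
\mathrm{MSE}_{\min}(T) \approx M_y^2 \left( V_0 - \frac{C_{01}^2}{V_1} \right).
\end{align*}
```

Under this assumed form, $T$ beats the naive estimator to first order only when $\phi^2 V_1 < 2\phi\, C_{01}$, and it beats the ratio estimator only when $(\phi - 1)\bigl[(\phi + 1) V_1 - 2 C_{01}\bigr] < 0$. The minimizing value $\phi^{*}$ depends on population moments that are unknown in practice, which is why the question of how the scalar is chosen matters for the efficiency claim.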

Circularity Check

0 steps flagged

No significant circularity detected.

The paper proposes transformed naive ratio and product estimators for the population mode, derives approximate bias and MSE expressions (standard in sampling theory via linearization), and evaluates relative efficiency through direct computation on selected natural populations and artificially generated data sets. This empirical validation does not reduce any result to its inputs by construction, nor does it rely on self-citations, fitted parameters renamed as predictions, or uniqueness theorems imported from prior work. The central efficiency claim rests on observable performance differences across the chosen data rather than tautological re-expression of the estimator definitions themselves, making the derivation self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 0 axioms · 0 invented entities

Abstract alone provides no identifiable free parameters beyond the unnamed characterizing scalar, no explicit axioms, and no invented entities; all assessments rest on standard sampling theory assumptions not detailed here.

free parameters (1)

characterizing scalar
Mentioned as used to transform the naive ratio and product estimators; likely a scalar chosen or fitted but value and selection method unknown from abstract.

pith-pipeline@v0.9.0 · 5634 in / 1158 out tokens · 33188 ms · 2026-05-25T12:21:48.146767+00:00 · methodology

Reference graph

Works this paper leans on

16 extracted references · 16 canonical work pages

[1] Chernoff, H. (1964). Estimation of the mode. Annals of the Institute of Statistical Mathematics 16:31-41

[2] Cochran, W. G. (1940). Some properties of estimators based on sampling scheme with varying probabilities. The Australian Journal of Statistics 17:22-28

[3] Cochran, W. G. (1963). Sampling Techniques. New York: John Wiley and Sons

[4] Dalenius, T. (1965). The mode - a neglected parameter. Journal of the Royal Statistical Society 128:110-118

[5] Doodson, A. T. (1917). Relation of the mode, median and mean in frequency functions. Biometrika 11:425-429

[6] Grenander, U. (1965). Some direct estimates of the mode. Annals of Mathematical Statistics 36:131-138

[7] Gross, S. T. (1980). Median estimation in sample surveys. Proceedings of the Survey Research Methods Section (American Statistical Association), Houston, Texas, August 11-14, 1980. Pp. 181-184

[8] Kendall, M. G., Stuart, A. (1977). The Advanced Theory of Statistics. Vol. 1, 4th ed. New York: Hafner Publishing Co

[9] Kuk, A. Y. C., Mak, T. K. (1989). Median estimation in the presence of auxiliary information. Journal of the Royal Statistical Society B 51:261-269

[10] Robertson, T., Cryer, J. D. (1974). An iterative procedure for estimating the mode. Journal of the American Statistical Association 69:1012-1016

[11] Rueda, M., Arcos, A., Munoz, J. F., Singh, S. (2007). Quantile estimation in two-phase sampling. Computational Statistics and Data Analysis 51(5):2559-2572

[12] Silverman, B. W. (1986). Density estimation for statistics and data analysis. London, UK: Chapman and Hall

[13] Venter, J. H. (1967). On estimation of the mode. Annals of Mathematical Statistics 34:1446-1455

[14] Yasukawa, K. (1926). On the probable error of the mode of skew frequency distributions. Biometrika 18:263-292

[15] Khare, B. B., Kumar, S. (2009). Transformed two phase sampling ratio and product type estimators for population mean in the presence of non-response. Aligarh Journal of Statistics 29:91-106

[16] Sedory, Stephen A., Singh, Sarjinder (2014). Estimation of Mode Using Auxiliary Information. Communications in Statistics - Simulation and Computation 43:2390-2402
