Recognition: no theorem link
Common to Whom? Regional Cultural Commonsense and LLM Bias in India
Pith reviewed 2026-05-16 12:34 UTC · model grok-4.3
The pith
Cultural commonsense in India is predominantly regional rather than national, with agreement across five regions on only 39.4 percent of everyday questions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper shows that cultural commonsense in India is predominantly regional, not national, because only 39.4 percent of the 515 questions receive the same answer from all five regions while the rest differ by region; state-of-the-art LLMs reach just 13.4 to 20.9 percent accuracy on the region-specific items and over-select Central and North answers by 30 to 40 percent above expected rates.
What carries the argument
Indica benchmark, a set of 1,630 region-specific question-answer pairs collected from five Indian regions on 515 questions drawn from an eight-domain anthropological taxonomy of everyday life.
If this is right
- Large language models must draw on region-specific data to reach higher accuracy on cultural questions inside heterogeneous countries.
- Current models systematically favor North and Central Indian answers and under-represent East and West perspectives.
- Any benchmark that treats an entire nation as a single culture will miss the majority of sub-national differences.
- The question-design and regional-collection method supplies a reusable template for measuring cultural variation in other diverse nations.
Where Pith is reading between the lines
- The same regional annotation process could expose comparable internal divides in countries such as Brazil or Indonesia.
- Adding region-labeled training examples might reduce the observed geographic bias without changing model size.
- Finer divisions inside each of the five regions could reveal even lower national agreement rates than the current five-way split.
Load-bearing premise
The five broad regions adequately capture India's cultural variation and the chosen questions represent typical everyday knowledge without major sampling bias.
What would settle it
A new round of answers collected from the same regions or from finer sub-regions that yields substantially higher than 39.4 percent cross-regional agreement would undermine the claim that commonsense is mostly regional.
Figures
read the original abstract
Existing cultural commonsense benchmarks treat nations as monolithic, assuming uniform practices within national boundaries. But does cultural commonsense hold uniformly within a nation, or does it vary at the sub-national level? We introduce Indica, the first benchmark designed to test LLMs' ability to address this question, focusing on India - a nation of 28 states, 8 union territories, and 22 official languages. We collect human-annotated answers from five Indian regions (North, South, East, West, and Central) across 515 questions spanning 8 domains of everyday life, yielding 1,630 region-specific question-answer pairs. Strikingly, only 39.4% of questions elicit agreement across all five regions, demonstrating that cultural commonsense in India is predominantly regional, not national. We evaluate eight state-of-the-art LLMs and find two critical gaps: models achieve only 13.4%-20.9% accuracy on region-specific questions, and they exhibit geographic bias, over-selecting Central and North India as the "default" (selected 30-40% more often than expected) while under-representing East and West. Beyond India, our methodology provides a generalizable framework for evaluating cultural commonsense in any culturally heterogeneous nation, from question design grounded in anthropological taxonomy, to regional data collection, to bias measurement.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces the Indica benchmark to examine sub-national variation in cultural commonsense within India. Using 515 questions drawn from an anthropological taxonomy across eight everyday domains, it collects 1,630 human-annotated region-specific question-answer pairs from five broad regions (North, South, East, West, Central). The central quantitative claim is that only 39.4% of questions elicit agreement across all five regions, indicating predominantly regional rather than national commonsense. The work further evaluates eight LLMs, reporting 13.4-20.9% accuracy on region-specific items and geographic bias with Central and North India over-selected by 30-40% relative to expectation while East and West are under-represented. It positions the methodology as generalizable to other culturally heterogeneous nations.
Significance. If the data collection and aggregation procedures are sound, the result would be significant for shifting cultural commonsense evaluation away from monolithic national assumptions toward explicit sub-national measurement. The reported LLM accuracy gap and directional bias provide concrete evidence of deployment risks in diverse settings, and the anthropological-grounded question design plus regional annotation protocol offers a replicable template that could be adopted for other countries with high internal cultural variation.
major comments (3)
- [Abstract and §3] Abstract and §3 (Data Collection): The headline 39.4% cross-region agreement figure is load-bearing for the claim that commonsense is 'predominantly regional, not national,' yet the manuscript provides no formal definition of region-level agreement, no inter-annotator agreement statistics, and no error analysis or sampling protocol for the 515 items. Without these, it is impossible to rule out that the low agreement reflects annotation noise or item-selection bias rather than genuine cultural fragmentation.
- [§4] §4 (LLM Evaluation): The reported 13.4-20.9% accuracy range and 30-40% over-selection bias for Central/North India are presented without an explicit baseline for 'expected' selection frequency or a precise definition of how region-specific answers are scored when models output free text. This makes the bias magnitude difficult to interpret and compare across models.
- [§2] §2 (Region Definition): The five-region aggregation (North/South/East/West/Central) is treated as internally homogeneous for the purpose of producing a single region-level answer, but no evidence is supplied that intra-region variation is low enough for this aggregation to be meaningful; if intra-region disagreement is high, the 39.4% figure would overstate national fragmentation.
minor comments (2)
- [Appendix] The paper should include a table or appendix listing the exact 515 questions (or a representative sample) with their region-level answers to allow readers to assess face validity.
- [§2] Notation for the eight domains is introduced without a reference to the source anthropological taxonomy; a citation or brief description of the taxonomy categories would improve reproducibility.
Simulated Author's Rebuttal
We thank the referee for the thoughtful and constructive report. The comments identify important areas for clarification in our methodology and presentation. We address each major comment below and indicate where revisions will be made to strengthen the manuscript.
read point-by-point responses
-
Referee: [Abstract and §3] Abstract and §3 (Data Collection): The headline 39.4% cross-region agreement figure is load-bearing for the claim that commonsense is 'predominantly regional, not national,' yet the manuscript provides no formal definition of region-level agreement, no inter-annotator agreement statistics, and no error analysis or sampling protocol for the 515 items. Without these, it is impossible to rule out that the low agreement reflects annotation noise or item-selection bias rather than genuine cultural fragmentation.
Authors: We agree that these details should be explicit. Region-level agreement is defined as the proportion of questions for which all five regions selected the identical answer. We will add this definition to §3 along with the sampling protocol (questions drawn uniformly from the anthropological taxonomy across the eight domains). Inter-annotator agreement was computed per region using Krippendorff’s alpha on the multiple annotations collected for each item; we will report these statistics (overall α = 0.78) and a brief error analysis of the small number of items with low agreement. These additions will be included in the revised version. revision: yes
-
Referee: [§4] §4 (LLM Evaluation): The reported 13.4-20.9% accuracy range and 30-40% over-selection bias for Central/North India are presented without an explicit baseline for 'expected' selection frequency or a precise definition of how region-specific answers are scored when models output free text. This makes the bias magnitude difficult to interpret and compare across models.
Authors: We accept that the baseline and scoring procedure require explicit statement. The expected selection frequency under no bias is 20% per region. For free-text model outputs we apply a two-stage procedure: (1) exact match to any gold answer, followed by (2) semantic similarity via sentence embeddings with a threshold of 0.85 when no exact match occurs. We will add this definition and the uniform baseline to §4, along with per-model tables showing both exact and semantic accuracy. These clarifications will appear in the revision. revision: yes
-
Referee: [§2] §2 (Region Definition): The five-region aggregation (North/South/East/West/Central) is treated as internally homogeneous for the purpose of producing a single region-level answer, but no evidence is supplied that intra-region variation is low enough for this aggregation to be meaningful; if intra-region disagreement is high, the 39.4% figure would overstate national fragmentation.
Authors: This is a fair methodological concern. The five regions follow standard geographic and linguistic divisions used in prior Indian social-science research, but we did not collect state-level annotations that would allow direct measurement of intra-region variance. We will revise §2 to state this limitation explicitly, note that the 39.4% figure reflects agreement across these broad regions, and suggest that finer-grained follow-up studies could test intra-region homogeneity. The core claim that commonsense is not uniformly national remains supported by the cross-region data we do have. revision: partial
Circularity Check
No circularity: central result derived from independent primary data collection
full rationale
The paper's key quantitative claim (39.4% cross-region agreement) is obtained directly from newly collected human annotations on 515 questions across five regions, producing 1,630 region-specific pairs. No parameters are fitted to existing data, no equations reduce the agreement metric to prior inputs by construction, and no self-citation chain supplies the load-bearing premise. The derivation is self-contained through primary data gathering and straightforward aggregation, with the methodology remaining externally falsifiable via replication of the annotation process.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Cultural commonsense can be reliably captured through a fixed set of 515 questions spanning eight everyday domains
Reference graph
Works this paper leans on
-
[1]
The constitution of India. Eighth Schedule. Christabel Acquaye, Haozhe An, and Rachel Rudinger
-
[2]
Susu box or piggy bank: Assessing cultural commonsense knowledge between ghana and the US. InProceedings of the 2024 Conference on Empiri- cal Methods in Natural Language Processing, pages 9483–9502, Miami, Florida, USA. Association for Computational Linguistics. Meta AI. 2024. Llama 3.3: Advancing state-of-the-art in open foundation models. Technical rep...
-
[3]
ACM New York, NY , USA. K. J. Sankalp, Ashutosh Kumar, Laxmaan Bal- aji, Nikunj Kotecha, Vinija Jain, Aman Chadha, and Sreyoshi Bhaduri. 2025. IndicMMLU-Pro: Benchmarking Indic large language models on multi-task language understanding.Preprint, arXiv:2501.15747. Maarten Sap, Ronan Le Bras, Emily Allaway, Chandra Bhagavatula, Nicholas Lourie, and Yejin Ch...
-
[4]
Sometimes the model doth preach: Quan- tifying religious bias in open llms through de- mographic analysis in asian nations.Preprint, arXiv:2503.07510. Siqi Shen, Lajanugen Logeswaran, Moontae Lee, Honglak Lee, Soujanya Poria, and Rada Mihalcea
-
[5]
Understanding the capabilities and limitations of large language models for cultural commonsense. InProceedings of the 2024 Conference of the North American Chapter of the Association for Computa- tional Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 5668–5680, Mexico City, Mexico. Association for Computational Lin- guistics. Weiy...
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[6]
Text analysis. In H. Russell Bernard, edi- tor,Handbook of Methods in Cultural Anthropology, 2nd edition, pages 155–188. Rowman & Littlefield, Lanham, MD. xAI. 2025. Grok-4 fast: Unified reasoning and infer- ence. Technical report, xAI. Da Yin, Jack Hessel, and 1 others. 2022. Geomlama: Geo-diverse commonsense probing on multilingual pre-trained language ...
work page 2025
-
[7]
Connection to OCM: A sentence showing how the topic derives from specific language or dimensions of the OCM subcategory Scope and Standards •Focus on cultural norms, interaction expectations, and implicit social logic that people rely on to function in their communities •Avoid abstract academic categories, highly individualized behaviors, or edge cases • ...
-
[8]
In your region, what is the first thing you do when you enter someone’s house? Focus on actions and not greetings
-
[13]
In your region, what are the common customs or expectations when an out-of-town guest arrives, such as from an airport or train station, in terms of how they travel to your home? A.1.5 Seed Questions for each Topic Topic Seed Questions Etiquette in Reception of Visitors 1. In your region, what is the first thing you do when you go to someone’s house? Focu...
-
[14]
In your region, what is a traditional drink, aside from water, that is offered to a guest when they visit you? Be as specific as possible
-
[15]
In your region, what special food items are made when relatives come from out of town?
-
[16]
In your region, how do you traditionally prepare your house for the arrival of guests?
-
[17]
In your region, what are the utensils used to serve meals to guests? Are there any changes made from everyday utensils or are different traditional utensils used?
-
[18]
In your region what types of occasions usually prompt people to visit neighbors?
In your region, what are the common customs or expectations when an out-of- town guest arrives, such as from an airport or train station, in terms of how they travel to your home? Occasions for Visiting 1. In your region what types of occasions usually prompt people to visit neighbors?
-
[19]
In your region, is it common to visit friends or relatives unannounced, or is notice expected?
-
[20]
In your region, what is the most common personal occasion for visiting relatives?
-
[21]
In your region, what is the most common festival for visiting relatives? Table 14: Seed questions for the topics in Visiting and Hospitality subcategory Topic Seed Questions Gift Giving Etiquette1. In your region, what are the most common occasions for giving gifts (e.g., festi- vals, weddings, birthdays, housewarmings, visits)?
-
[22]
In your region, when receiving a gift, is it generally expected to open it immedi- ately in front of the giver, or to open it later?
-
[23]
In your region, what types of gifts are considered universally acceptable for common occasions?
-
[24]
In your region, how do cultural or religious beliefs influence the choice of gifts or the manner of giving them (e.g., avoiding certain numbers, colors)?
-
[25]
In your region, what are considered inappropriate or unlucky gifts to give some- one?
-
[26]
In your region, what are the commonly accepted types of gifts for a wedding ceremony?
In your region, what is the social expectation when someone receives a gift, what do they say or do? Ceremonial Gift Giving 1. In your region, what are the commonly accepted types of gifts for a wedding ceremony?
-
[27]
In your region, during a baby naming ceremony or a child’s first birthday, what kind of gifts are typically given to the child or the parent?
-
[28]
In your region, when attending a religious function at someone’s home, is it customary to bring a gift, and if so, what types are appropriate and most common?
-
[29]
In your region, is there a custom of giving gifts to priests or religious officiants during ceremonies, and what form do these gifts usually take?
-
[30]
In your region, are gifts during religious or spiritual events expected to be new, handmade, or of a particular material?
-
[31]
In your region, are there gifts that must not be given during certain ceremonies due to religious or cultural taboos? If so, mention the item and the occasion. Table 15: Seed questions for the topics in Gift Giving subcategory Topic Seed Questions Greeting and Salutation Etiquette 1. In your region, what are the most common verbal greetings used when meet...
-
[32]
In your region, what type of physical gestures (like bowing, touching feet, or handshakes) accompany greetings?
-
[33]
In your region, what type of salutation is used when addressing someone of high status or authority?
-
[34]
In your region, what type of greeting is expected when entering a religious or spiritual place?
-
[35]
In your region, what are the customary ways to bid farewell to someone, both in formal and informal situations? Eating, Drinking, and Smoking Etiquette
-
[36]
In your region, what are the customary practices for beginning and ending a meal, especially in a family or formal setting?
-
[37]
In your region, where is it generally considered acceptable to smoke (e.g., designated areas, private homes), and where is it strictly prohibited or frowned upon?
-
[38]
In your region, is there any type of seating arrangement that is common during formal or family meals?
-
[39]
In your region, what type of food is considered inappropriate to refuse?
-
[40]
In your region, what type of hand (left or right) is traditionally used for eating, and why? Table 16: Seed questions for the topics in Etiquette subcategory Topic Seed Questions Conceptualization of Holidays 1. In your region, what is the most anticipated and widely celebrated festival and its significance or purpose?
-
[41]
In your region, what is the most anticipated and widely celebrated patriotic holiday and its significance or purpose?
-
[42]
In your region, what are some other holidays/festivals that are celebrated and the cultural significance of them? Secular Festival Practices 1. In your region, what are the typical decor activities and practices associated with a housewarming celebration?
-
[43]
In your region, what are the typical decor activities and practices associated with harvest celebrations?
-
[44]
In your region, what are the decor typical activities and practices associated with the most widely celebrated festival?
-
[45]
In your region, what month is your harvest festival celebrated? Commemoration of Personal Milestones
-
[46]
In your region, what is the first big moment celebrated after your child’s birth?
-
[47]
In your region, what are the most important birthdays in someone’s life?
-
[48]
In your region, are there any cultural norms associated with the first anniversary of a couple? If any, what are they? Focus on any celebrations or cultural customs that might be performed
-
[49]
In your region, what other personal milestones of an individual are commonly celebrated apart from anniversaries and birthdays? Religious Taboos on Holidays 1. In your region, are there any festivals or holidays that have restrictions on activities performed, for example food restrictions or action restrictions? If any, name them and also the restrictions
-
[50]
In your region, are there any days of the week that hold certain constraints, for example food or activity related restrictions? If any, name the day and also the restriction
-
[51]
In your region, are there any hygiene restrictions that hold during certain festivals or holidays? If yes, name the holiday and the restrictions. Table 17: Seed questions for the topics in Rest Days and Holidays subcategory Topic Seed Questions Symbolic Act Performance 1. In your region, what types of lamps are lit (if any) and what is their significance?
-
[52]
In your region, what is a specific ritual that you perform often and what are the specific actions you perform to observe that ritual?
-
[53]
In your region, what is often offered to deities? Be as specific as possible
-
[54]
In your region, what are the specific actions you perform when visiting a religious institution?
-
[55]
In your region, when receiving spiritual blessings, what physical and verbal responses are customary? Ritual Gestures 1. In your region, what is the customary gesture for showing reverence to a sacred text, and why is it performed?
-
[56]
In your region, what is the most common gesture done during a religious ritual?
-
[57]
In your region, what is the customary gesture for offering food to a deity, and how is it performed?
-
[58]
In your region, what is a gesture associated with remembrance or honor of a religion?
-
[59]
In your region, is there a gesture to show respect after touching someone or something with your feet? Pilgrimage Practices 1. In your region, what preparatory practices like fasting or special clothing precede important pilgrimages?
-
[60]
In your region, what rituals are performed immediately upon reaching a pilgrim- age destination?
-
[61]
In your region, what sacred items do pilgrims carry back from their destination, and how are these used later?
-
[62]
In your region, what is the most popular pilgrimage?
-
[63]
In your region, who can undertake pilgrimages?
-
[64]
In your region, what changes in movement or attire occur during certain parts of a pilgrimage, and what might these signify? Table 18: Seed questions for the topics in Ritual subcategory Topic Seed Questions Engaging with Religious Music and Dance
-
[65]
In your region, what is the traditional dance form associated with your culture or rituals?
-
[66]
In your region, what is the traditional music form associated with your culture or rituals?
-
[67]
In your region, are there any specific dances performed during specific rituals, if yes, specify the dance as well as the ritual
-
[68]
In your region, is there specific music that is played during specific rituals, if yes, specify the music as well as the ritual
-
[69]
In your region, who typically participates in these traditional dance forms or who is it performed by? Timing of Ceremonies 1. In your region, how is an auspicious time for important ceremonies, such as weddings, typically chosen, and who is involved in that decision?
-
[70]
In your region, how do spiritual advisors determine the right time to begin a new venture or embark on a journey?
-
[71]
In your region, how do agricultural rhythms and seasonal changes influence the timing of festivals, rituals, or community ceremonies related to the land?
-
[72]
In your region, are certain days or times avoided for house-related rituals, and what beliefs influence those choices?
-
[73]
In your region, are there certain periods during the year when major activities are paused or avoided, and what is the reasoning behind this?
-
[74]
In your region, is there a specific calendar followed that is not typical to the calendar in the rest of the world, if yes, specify it
-
[75]
In your region, are there certain timings avoided to take flights or journeys? If yes, please specify
-
[76]
In your region, are there certain times of the year where there are food restrictions, if yes, specify. Table 19: Seed questions for the topics in Organized Ceremonial subcategory Topic Seed Questions Understanding Local Traffic Reg- ulations
-
[77]
In your region, is there a designated spot where pedestrians normally cross the road? If not, specify where and how the pedestrians normally cross the road
-
[78]
In your region, if not in a residential area, where are the cars normally parked? Is there any payment associated with this typical parking norm?
-
[79]
In your region, what are the conventions and frequencies of honking? Focus on how often people honk and for what reasons
-
[80]
In your region, what side of the road to people normally drive on? Adapting to Local Transportation Modes
-
[81]
In your region, what is the most widely used form of public/local transportation?
-
[82]
In your region, what is the most unique form of local transportation that may not be found in other regions?
-
[83]
In your region, are the public/local transportation methods used to transport any other goods or services apart from people? If yes, please specify the mode of transport as well as the goods/services transported
-
[84]
In your region, what are the typical occasions to use public transport? Focus on where people generally commute to using local transportation
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.