{"work":{"id":"1b66d0a5-f6ae-4332-8025-c662dc64b238","openalex_id":null,"doi":null,"arxiv_id":"2201.08239","raw_key":null,"title":"LaMDA: Language Models for Dialog Applications","authors":null,"authors_text":"Romal Thoppilan, Daniel De Freitas, Jamie Hall, Noam Shazeer, Apoorv Kulshreshtha, Heng-Tze Cheng","year":2022,"venue":"cs.CL","abstract":"We present LaMDA: Language Models for Dialog Applications. LaMDA is a family of Transformer-based neural language models specialized for dialog, which have up to 137B parameters and are pre-trained on 1.56T words of public dialog data and web text. While model scaling alone can improve quality, it shows less improvements on safety and factual grounding. We demonstrate that fine-tuning with annotated data and enabling the model to consult external knowledge sources can lead to significant improvements towards the two key challenges of safety and factual grounding. The first challenge, safety, involves ensuring that the model's responses are consistent with a set of human values, such as preventing harmful suggestions and unfair bias. We quantify safety using a metric based on an illustrative set of human values, and we find that filtering candidate responses using a LaMDA classifier fine-tuned with a small amount of crowdworker-annotated data offers a promising approach to improving model safety. The second challenge, factual grounding, involves enabling the model to consult external knowledge sources, such as an information retrieval system, a language translator, and a calculator. We quantify factuality using a groundedness metric, and we find that our approach enables the model to generate responses grounded in known sources, rather than responses that merely sound plausible. Finally, we explore the use of LaMDA in the domains of education and content recommendations, and analyze their helpfulness and role consistency.","external_url":"https://arxiv.org/abs/2201.08239","cited_by_count":null,"metadata_source":"pith","metadata_fetched_at":"2026-07-01T15:15:47.553765+00:00","pith_arxiv_id":"2201.08239","created_at":"2026-05-08T18:44:01.588717+00:00","updated_at":"2026-07-01T15:15:47.553765+00:00","title_quality_ok":true,"display_title":"LaMDA: Language Models for Dialog Applications","render_title":"LaMDA: Language Models for Dialog Applications"},"hub":{"state":{"work_id":"1b66d0a5-f6ae-4332-8025-c662dc64b238","tier":"hub","tier_reason":"10+ Pith inbound or 1,000+ external citations","pith_inbound_count":79,"external_cited_by_count":null,"distinct_field_count":11,"first_pith_cited_at":"2022-01-28T02:33:07+00:00","last_pith_cited_at":"2026-06-29T18:11:17+00:00","author_build_status":"not_needed","summary_status":"needed","contexts_status":"needed","graph_status":"needed","ask_index_status":"not_needed","reader_status":"not_needed","recognition_status":"not_needed","updated_at":"2026-07-01T21:31:31.228148+00:00","tier_text":"hub"},"tier":"hub","role_counts":[{"context_role":"background","n":27},{"context_role":"other","n":2},{"context_role":"baseline","n":1},{"context_role":"method","n":1}],"polarity_counts":[{"context_polarity":"background","n":24},{"context_polarity":"unclear","n":4},{"context_polarity":"baseline","n":1},{"context_polarity":"support","n":1},{"context_polarity":"use_method","n":1}],"runs":{"context_extract":{"job_type":"context_extract","status":"succeeded","result":{"enqueued_papers":25},"error":null,"updated_at":"2026-05-14T17:48:48.593952+00:00"},"graph_features":{"job_type":"graph_features","status":"succeeded","result":{"co_cited":[{"title":"PaLM: Scaling Language Modeling with Pathways","work_id":"a94f3ef7-2c49-4445-93fe-6ec16aafd966","shared_citers":22},{"title":"Scaling Language Models: Methods, Analysis & Insights from Training Gopher","work_id":"47ce8be9-e500-407d-af41-ac2d132215eb","shared_citers":21},{"title":"Scaling Laws for Neural Language Models","work_id":"b7dd8749-9c45-4977-ab9b-64478dce1ae8","shared_citers":21},{"title":"On the Opportunities and Risks of Foundation Models","work_id":"a18039e9-928d-47c9-a836-32656a71bf71","shared_citers":16},{"title":"Training Compute-Optimal Large Language Models","work_id":"b2faf28d-86b7-429c-bc42-469458efc246","shared_citers":16},{"title":"Training Verifiers to Solve Math Word Problems","work_id":"acab1aa8-b4d6-40e0-a3ee-25341701dca2","shared_citers":16},{"title":"Evaluating Large Language Models Trained on Code","work_id":"042493e9-b26f-4b4e-bbde-382072ca9b08","shared_citers":13},{"title":"Scaling Instruction-Finetuned Language Models","work_id":"8405abb1-7558-4fdf-af24-f4c52fa77a06","shared_citers":12},{"title":"Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback","work_id":"a1f2574b-a899-4713-be60-c87ba332656c","shared_citers":12},{"title":"Chain-of-Thought Prompting Elicits Reasoning in Large Language Models","work_id":"d1cf6693-a082-403c-ada9-dac7b96341f9","shared_citers":11},{"title":"Ethical and social risks of harm from Language Models","work_id":"b4ce1c45-ef69-445a-a872-dbb785b485e9","shared_citers":11},{"title":"Finetuned Language Models Are Zero-Shot Learners","work_id":"7ed6cdaa-ed67-4db4-aceb-b7e1b0e6e7c4","shared_citers":11},{"title":"Large Language Models are Zero-Shot Reasoners","work_id":"d9b7eb1a-7165-46ff-9f06-d2f0b9d6f95d","shared_citers":11},{"title":"LLaMA: Open and Efficient Foundation Language Models","work_id":"c018fc23-6f3f-4035-9d02-28a2173b2b9d","shared_citers":11},{"title":"Training language models to follow instructions with human feedback","work_id":"52aff42f-4fa9-4fcf-bdb3-1459b9bebf65","shared_citers":11},{"title":"OPT: Open Pre-trained Transformer Language Models","work_id":"d7ff3b21-1fff-4cf4-952a-4714e3ef2307","shared_citers":10},{"title":"WebGPT: Browser-assisted question-answering with human feedback","work_id":"e25ef3e1-4848-4cb9-bf28-67a420591165","shared_citers":10},{"title":"Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models","work_id":"bb63abb3-0d50-4362-b97c-b5e725b03b39","shared_citers":9},{"title":"Red Teaming Language Models with Language Models","work_id":"d1274c54-508f-42f9-aeb3-91db13f3a622","shared_citers":9},{"title":"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding","work_id":"ed240a10-5b19-406c-baa5-30803f465785","shared_citers":8},{"title":"BLOOM: A 176B-Parameter Open-Access Multilingual Language Model","work_id":"337ba690-f35d-4154-9450-8edf4bc9f488","shared_citers":8},{"title":"Improving alignment of dialogue agents via targeted human judgements","work_id":"6ad5970e-7550-4ae8-a158-7084dec7e3cc","shared_citers":8},{"title":"RoBERTa: A Robustly Optimized BERT Pretraining Approach","work_id":"41fe12c4-e538-4890-a244-480650ed3078","shared_citers":8},{"title":"Show Your Work: Scratchpads for Intermediate Computation with Language Models","work_id":"a05b1e60-8e76-4f26-9bea-28927a5f8620","shared_citers":8}],"time_series":[{"n":14,"year":2022},{"n":13,"year":2023},{"n":3,"year":2024},{"n":1,"year":2025},{"n":11,"year":2026}],"dependency_candidates":[]},"error":null,"updated_at":"2026-05-14T17:48:39.560221+00:00"},"identity_refresh":{"job_type":"identity_refresh","status":"succeeded","result":{"items":[{"title":"Qwen3 Technical Report","outcome":"unchanged","work_id":"25a4e30c-1232-48e7-9925-02fa12ba7c9e","resolver":"local_arxiv","confidence":0.98,"old_work_id":"25a4e30c-1232-48e7-9925-02fa12ba7c9e"}],"counts":{"fixed":0,"merged":0,"unchanged":1,"quarantined":0,"needs_external_resolution":0},"errors":[],"attempted":1},"error":null,"updated_at":"2026-05-14T17:48:36.269598+00:00"},"summary_claims":{"job_type":"summary_claims","status":"succeeded","result":{"title":"LaMDA: Language Models for Dialog Applications","claims":[{"claim_text":"We present LaMDA: Language Models for Dialog Applications. LaMDA is a family of Transformer-based neural language models specialized for dialog, which have up to 137B parameters and are pre-trained on 1.56T words of public dialog data and web text. While model scaling alone can improve quality, it shows less improvements on safety and factual grounding. We demonstrate that fine-tuning with annotated data and enabling the model to consult external knowledge sources can lead to significant improvements towards the two key challenges of safety and factual grounding. The first challenge, safety, i","claim_type":"abstract","evidence_strength":"source_metadata"}],"why_cited":"Pith tracks LaMDA: Language Models for Dialog Applications because it crossed a citation-hub threshold.","role_counts":[]},"error":null,"updated_at":"2026-05-14T17:48:36.274572+00:00"}},"summary":{"title":"LaMDA: Language Models for Dialog Applications","claims":[{"claim_text":"We present LaMDA: Language Models for Dialog Applications. LaMDA is a family of Transformer-based neural language models specialized for dialog, which have up to 137B parameters and are pre-trained on 1.56T words of public dialog data and web text. While model scaling alone can improve quality, it shows less improvements on safety and factual grounding. We demonstrate that fine-tuning with annotated data and enabling the model to consult external knowledge sources can lead to significant improvements towards the two key challenges of safety and factual grounding. The first challenge, safety, i","claim_type":"abstract","evidence_strength":"source_metadata"}],"why_cited":"Pith tracks LaMDA: Language Models for Dialog Applications because it crossed a citation-hub threshold.","role_counts":[]},"graph":{"co_cited":[{"title":"PaLM: Scaling Language Modeling with Pathways","work_id":"a94f3ef7-2c49-4445-93fe-6ec16aafd966","shared_citers":22},{"title":"Scaling Language Models: Methods, Analysis & Insights from Training Gopher","work_id":"47ce8be9-e500-407d-af41-ac2d132215eb","shared_citers":21},{"title":"Scaling Laws for Neural Language Models","work_id":"b7dd8749-9c45-4977-ab9b-64478dce1ae8","shared_citers":21},{"title":"On the Opportunities and Risks of Foundation Models","work_id":"a18039e9-928d-47c9-a836-32656a71bf71","shared_citers":16},{"title":"Training Compute-Optimal Large Language Models","work_id":"b2faf28d-86b7-429c-bc42-469458efc246","shared_citers":16},{"title":"Training Verifiers to Solve Math Word Problems","work_id":"acab1aa8-b4d6-40e0-a3ee-25341701dca2","shared_citers":16},{"title":"Evaluating Large Language Models Trained on Code","work_id":"042493e9-b26f-4b4e-bbde-382072ca9b08","shared_citers":13},{"title":"Scaling Instruction-Finetuned Language Models","work_id":"8405abb1-7558-4fdf-af24-f4c52fa77a06","shared_citers":12},{"title":"Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback","work_id":"a1f2574b-a899-4713-be60-c87ba332656c","shared_citers":12},{"title":"Chain-of-Thought Prompting Elicits Reasoning in Large Language Models","work_id":"d1cf6693-a082-403c-ada9-dac7b96341f9","shared_citers":11},{"title":"Ethical and social risks of harm from Language Models","work_id":"b4ce1c45-ef69-445a-a872-dbb785b485e9","shared_citers":11},{"title":"Finetuned Language Models Are Zero-Shot Learners","work_id":"7ed6cdaa-ed67-4db4-aceb-b7e1b0e6e7c4","shared_citers":11},{"title":"Large Language Models are Zero-Shot Reasoners","work_id":"d9b7eb1a-7165-46ff-9f06-d2f0b9d6f95d","shared_citers":11},{"title":"LLaMA: Open and Efficient Foundation Language Models","work_id":"c018fc23-6f3f-4035-9d02-28a2173b2b9d","shared_citers":11},{"title":"Training language models to follow instructions with human feedback","work_id":"52aff42f-4fa9-4fcf-bdb3-1459b9bebf65","shared_citers":11},{"title":"OPT: Open Pre-trained Transformer Language Models","work_id":"d7ff3b21-1fff-4cf4-952a-4714e3ef2307","shared_citers":10},{"title":"WebGPT: Browser-assisted question-answering with human feedback","work_id":"e25ef3e1-4848-4cb9-bf28-67a420591165","shared_citers":10},{"title":"Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models","work_id":"bb63abb3-0d50-4362-b97c-b5e725b03b39","shared_citers":9},{"title":"Red Teaming Language Models with Language Models","work_id":"d1274c54-508f-42f9-aeb3-91db13f3a622","shared_citers":9},{"title":"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding","work_id":"ed240a10-5b19-406c-baa5-30803f465785","shared_citers":8},{"title":"BLOOM: A 176B-Parameter Open-Access Multilingual Language Model","work_id":"337ba690-f35d-4154-9450-8edf4bc9f488","shared_citers":8},{"title":"Improving alignment of dialogue agents via targeted human judgements","work_id":"6ad5970e-7550-4ae8-a158-7084dec7e3cc","shared_citers":8},{"title":"RoBERTa: A Robustly Optimized BERT Pretraining Approach","work_id":"41fe12c4-e538-4890-a244-480650ed3078","shared_citers":8},{"title":"Show Your Work: Scratchpads for Intermediate Computation with Language Models","work_id":"a05b1e60-8e76-4f26-9bea-28927a5f8620","shared_citers":8}],"time_series":[{"n":14,"year":2022},{"n":13,"year":2023},{"n":3,"year":2024},{"n":1,"year":2025},{"n":11,"year":2026}],"dependency_candidates":[]},"authors":[]}}