{"paper":{"title":"Language Model Goal Selection Differs from Humans' in a Self-Directed Learning Task","license":"http://creativecommons.org/licenses/by/4.0/","headline":"Language models diverge from humans by exploiting single solutions rather than gradually exploring goals in self-directed learning tasks.","cross_cats":["cs.AI","cs.CY"],"primary_cat":"cs.CL","authors_text":"Anne G. E. Collins, Danielle Perszyk, Dave August, Gaia Molinaro","submitted_at":"2026-02-06T15:39:54Z","abstract_excerpt":"Whether in agentic workflows, social studies, or chat settings, large language models (LLMs) are increasingly being asked to replace humans in choosing which goals to pursue, rather than completing predefined tasks. However, the assumption that LLMs accurately reflect human preferences for goal setting remains largely untested. We assess the validity of LLMs as proxies for human goal selection in a controlled, self-directed learning task borrowed from cognitive science. Across five models (GPT-5, Gemini 2.5 Pro, Claude Sonnet 4.5, Qwen3 32B, and Centaur), we find substantial divergence from hu"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"Across five models (GPT-5, Gemini 2.5 Pro, Claude Sonnet 4.5, Qwen3 32B, and Centaur), we find substantial divergence from human behavior. While people gradually explore and learn to achieve goals with diversity across individuals, most models exploit a single identified solution or show surprisingly low performance, with distinct patterns across models and little variability across instances of the same model.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That the borrowed cognitive science self-directed learning task validly measures the kind of goal selection preferences that LLMs are being asked to replace in real-world agentic, social, or chat settings.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"LLMs diverge from human goal selection in self-directed learning by exploiting single solutions with low variability across instances.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Language models diverge from humans by exploiting single solutions rather than gradually exploring goals in self-directed learning tasks.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"ec4f2dd5b6352d508a0765d2d5ba8c82229dddf744c522e7d5c702c9756d2b1b"},"source":{"id":"2603.03295","kind":"arxiv","version":2},"verdict":{"id":"5fcf2b8a-2db2-49f6-a941-36c6674fa104","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-16T06:46:21.288365Z","strongest_claim":"Across five models (GPT-5, Gemini 2.5 Pro, Claude Sonnet 4.5, Qwen3 32B, and Centaur), we find substantial divergence from human behavior. While people gradually explore and learn to achieve goals with diversity across individuals, most models exploit a single identified solution or show surprisingly low performance, with distinct patterns across models and little variability across instances of the same model.","one_line_summary":"LLMs diverge from human goal selection in self-directed learning by exploiting single solutions with low variability across instances.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That the borrowed cognitive science self-directed learning task validly measures the kind of goal selection preferences that LLMs are being asked to replace in real-world agentic, social, or chat settings.","pith_extraction_headline":"Language models diverge from humans by exploiting single solutions rather than gradually exploring goals in self-directed learning tasks."},"references":{"count":24,"sample":[{"doi":"","year":null,"title":"V ., Arriaga, R","work_id":"371ac3b9-e11c-4fa8-9534-9480633f2fcd","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"Concrete Problems in AI Safety","work_id":"c8d14fbe-6eab-464a-95b3-778aabd82fa3","ref_index":2,"cited_arxiv_id":"1606.06565","is_internal_anchor":true},{"doi":"","year":null,"title":"Large-scale study of curiosity-driven learning","work_id":"d0d290bb-0df6-42f3-97e4-0ed0fe5da7e0","ref_index":3,"cited_arxiv_id":"1808.04355","is_internal_anchor":true},{"doi":"","year":null,"title":"Language models trained on media diets can predict public opinion","work_id":"b838ab79-0750-4666-8029-655e49abadcb","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"X., and Schulz, E","work_id":"2c382c35-483d-45e7-83eb-d51483a625d2","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":24,"snapshot_sha256":"195423887f64a4e45ce04d81ad4a51d8065ff8c48a8bc130aec880c4fb7b960a","internal_anchors":9},"formal_canon":{"evidence_count":2,"snapshot_sha256":"b0c3d04efb9316c7692bf51f9be56a846d03e7d71111093142ee2049f987baa2"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}