{"work":{"id":"bdc944db-4be2-44f7-950b-eaef12fab00e","openalex_id":null,"doi":null,"arxiv_id":"1610.01644","raw_key":null,"title":"Understanding intermediate layers using linear classifier probes","authors":null,"authors_text":"Guillaume Alain, Yoshua Bengio","year":2016,"venue":"stat.ML","abstract":"Neural network models have a reputation for being black boxes. We propose to monitor the features at every layer of a model and measure how suitable they are for classification. We use linear classifiers, which we refer to as \"probes\", trained entirely independently of the model itself.\n  This helps us better understand the roles and dynamics of the intermediate layers. We demonstrate how this can be used to develop a better intuition about models and to diagnose potential problems.\n  We apply this technique to the popular models Inception v3 and Resnet-50. Among other things, we observe experimentally that the linear separability of features increase monotonically along the depth of the model.","external_url":"https://arxiv.org/abs/1610.01644","cited_by_count":null,"metadata_source":"pith","metadata_fetched_at":"2026-05-25T06:06:42.776471+00:00","pith_arxiv_id":"1610.01644","created_at":"2026-05-09T05:45:21.514056+00:00","updated_at":"2026-05-25T06:06:42.776471+00:00","title_quality_ok":true,"display_title":"Understanding intermediate layers using linear classifier probes","render_title":"Understanding intermediate layers using linear classifier probes"},"hub":{"state":{"work_id":"bdc944db-4be2-44f7-950b-eaef12fab00e","tier":"hub","tier_reason":"10+ Pith inbound or 1,000+ external 
citations","pith_inbound_count":85,"external_cited_by_count":null,"distinct_field_count":13,"first_pith_cited_at":"2016-10-05T20:59:01+00:00","last_pith_cited_at":"2026-05-21T20:58:42+00:00","author_build_status":"not_needed","summary_status":"needed","contexts_status":"needed","graph_status":"needed","ask_index_status":"not_needed","reader_status":"not_needed","recognition_status":"not_needed","updated_at":"2026-05-26T08:36:20.699527+00:00","tier_text":"hub"},"tier":"hub","role_counts":[{"context_role":"method","n":10},{"context_role":"background","n":8}],"polarity_counts":[{"context_polarity":"use_method","n":10},{"context_polarity":"background","n":8}],"runs":{"context_extract":{"job_type":"context_extract","status":"succeeded","result":{"enqueued_papers":25},"error":null,"updated_at":"2026-05-14T13:21:10.860415+00:00"},"graph_features":{"job_type":"graph_features","status":"succeeded","result":{"co_cited":[{"title":"Representation Engineering: A Top-Down Approach to AI Transparency","work_id":"45b326e2-e962-41a5-a542-2559e103a19b","shared_citers":9},{"title":"Steering Language Models With Activation Engineering","work_id":"d525fe06-5560-4e97-86fc-7a0e551f5b17","shared_citers":8},{"title":"An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale","work_id":"e96730e3-129b-4db6-b981-15ab7932e297","shared_citers":6},{"title":"Eliciting Latent Predictions from Transformers with the Tuned Lens","work_id":"a127314f-7424-488f-b6d7-8214650c420f","shared_citers":6},{"title":"Qwen2.5 Technical Report","work_id":"d8432992-4980-4a81-85c7-9fa2c2b87f85","shared_citers":6},{"title":"The Llama 3 Herd of Models","work_id":"1549a635-88af-4ac1-acfe-51ae7bb53345","shared_citers":6},{"title":"In-context Learning and Induction Heads","work_id":"db2b0911-2758-4a2a-99dc-15b14b91bd5e","shared_citers":5},{"title":"Qwen3 Technical Report","work_id":"25a4e30c-1232-48e7-9925-02fa12ba7c9e","shared_citers":5},{"title":"Sparse Autoencoders Find Highly Interpretable Features 
in Language Models","work_id":"51960d72-c69f-4db8-8efd-e90e8b4d9524","shared_citers":5},{"title":"Training Verifiers to Solve Math Word Problems","work_id":"acab1aa8-b4d6-40e0-a3ee-25341701dca2","shared_citers":5},{"title":"Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small","work_id":"d1167c73-3f2a-472b-8bf5-0ec282d7988a","shared_citers":4},{"title":"Probing classifiers: Promises, shortcomings, and advances","work_id":"3eabab74-ac71-4292-86ce-b0469cd4e6cf","shared_citers":4},{"title":"Scaling Laws for Neural Language Models","work_id":"b7dd8749-9c45-4977-ab9b-64478dce1ae8","shared_citers":4},{"title":"The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets","work_id":"400e017f-8643-4166-b6da-a75d4446da80","shared_citers":4},{"title":"Decoupled Weight Decay Regularization","work_id":"07ef7360-d385-4033-83f7-8384a6325204","shared_citers":3},{"title":"Designing and interpreting probes with control tasks","work_id":"a10c87f8-f3ea-4b0f-ad42-26762e3ec8ec","shared_citers":3},{"title":"Distilling the Knowledge in a Neural Network","work_id":"d927ab1f-17b8-4002-9d09-c3d55764fbad","shared_citers":3},{"title":"Gaussian Error Linear Units (GELUs)","work_id":"0466fd22-03a1-4a61-af0a-a900e77bb023","shared_citers":3},{"title":"Language Models (Mostly) Know What They Know","work_id":"8ca58a10-da41-4f70-baae-7e449512e345","shared_citers":3},{"title":"Llama 2: Open Foundation and Fine-Tuned Chat Models","work_id":"68a5177f-d644-44c1-bd4f-4e5278c22f5d","shared_citers":3},{"title":"LLaMA: Open and Efficient Foundation Language Models","work_id":"c018fc23-6f3f-4035-9d02-28a2173b2b9d","shared_citers":3},{"title":"Localizing model behavior with path patching","work_id":"fdd07939-8060-4e9e-bd06-ff125eae86ef","shared_citers":3},{"title":"LoRA: Low-Rank Adaptation of Large Language Models","work_id":"0426219a-789e-4964-adc8-a04538510818","shared_citers":3},
{"title":"Mass-editing memory in a transformer","work_id":"03332427-98ba-4287-876a-bb315cbddb1d","shared_citers":3}],"time_series":[{"n":1,"year":2016},{"n":1,"year":2023},{"n":48,"year":2026}],"dependency_candidates":[]},"error":null,"updated_at":"2026-05-14T13:21:14.483535+00:00"},"identity_refresh":{"job_type":"identity_refresh","status":"succeeded","result":{"items":[{"title":"Qwen3 Technical Report","outcome":"unchanged","work_id":"25a4e30c-1232-48e7-9925-02fa12ba7c9e","resolver":"local_arxiv","confidence":0.98,"old_work_id":"25a4e30c-1232-48e7-9925-02fa12ba7c9e"}],"counts":{"fixed":0,"merged":0,"unchanged":1,"quarantined":0,"needs_external_resolution":0},"errors":[],"attempted":1},"error":null,"updated_at":"2026-05-14T13:21:07.380825+00:00"},"summary_claims":{"job_type":"summary_claims","status":"succeeded","result":{"title":"Understanding intermediate layers using linear classifier probes","claims":[{"claim_text":"Neural network models have a reputation for being black boxes. We propose to monitor the features at every layer of a model and measure how suitable they are for classification. We use linear classifiers, which we refer to as \"probes\", trained entirely independently of the model itself.\n  This helps us better understand the roles and dynamics of the intermediate layers. We demonstrate how this can be used to develop a better intuition about models and to diagnose potential problems.\n  We apply this technique to the popular models Inception v3 and Resnet-50. Among other things, we observe exper","claim_type":"abstract","evidence_strength":"source_metadata"}],"why_cited":"Pith tracks Understanding intermediate layers using linear classifier probes because it crossed a citation-hub threshold.","role_counts":[]},"error":null,"updated_at":"2026-05-14T13:21:10.864265+00:00"}},
"summary":{"title":"Understanding intermediate layers using linear classifier probes","claims":[{"claim_text":"Neural network models have a reputation for being black boxes. We propose to monitor the features at every layer of a model and measure how suitable they are for classification. We use linear classifiers, which we refer to as \"probes\", trained entirely independently of the model itself.\n  This helps us better understand the roles and dynamics of the intermediate layers. We demonstrate how this can be used to develop a better intuition about models and to diagnose potential problems.\n  We apply this technique to the popular models Inception v3 and Resnet-50. Among other things, we observe exper","claim_type":"abstract","evidence_strength":"source_metadata"}],"why_cited":"Pith tracks Understanding intermediate layers using linear classifier probes because it crossed a citation-hub threshold.","role_counts":[]},"graph":{"co_cited":[{"title":"Representation Engineering: A Top-Down Approach to AI Transparency","work_id":"45b326e2-e962-41a5-a542-2559e103a19b","shared_citers":9},{"title":"Steering Language Models With Activation Engineering","work_id":"d525fe06-5560-4e97-86fc-7a0e551f5b17","shared_citers":8},{"title":"An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale","work_id":"e96730e3-129b-4db6-b981-15ab7932e297","shared_citers":6},{"title":"Eliciting Latent Predictions from Transformers with the Tuned Lens","work_id":"a127314f-7424-488f-b6d7-8214650c420f","shared_citers":6},{"title":"Qwen2.5 Technical Report","work_id":"d8432992-4980-4a81-85c7-9fa2c2b87f85","shared_citers":6},{"title":"The Llama 3 Herd of Models","work_id":"1549a635-88af-4ac1-acfe-51ae7bb53345","shared_citers":6},{"title":"In-context Learning and Induction Heads","work_id":"db2b0911-2758-4a2a-99dc-15b14b91bd5e","shared_citers":5},{"title":"Qwen3 Technical Report","work_id":"25a4e30c-1232-48e7-9925-02fa12ba7c9e","shared_citers":5},{"title":"Sparse Autoencoders Find Highly Interpretable Features in Language Models","work_id":"51960d72-c69f-4db8-8efd-e90e8b4d9524","shared_citers":5},
{"title":"Training Verifiers to Solve Math Word Problems","work_id":"acab1aa8-b4d6-40e0-a3ee-25341701dca2","shared_citers":5},{"title":"Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small","work_id":"d1167c73-3f2a-472b-8bf5-0ec282d7988a","shared_citers":4},{"title":"Probing classifiers: Promises, shortcomings, and advances","work_id":"3eabab74-ac71-4292-86ce-b0469cd4e6cf","shared_citers":4},{"title":"Scaling Laws for Neural Language Models","work_id":"b7dd8749-9c45-4977-ab9b-64478dce1ae8","shared_citers":4},{"title":"The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets","work_id":"400e017f-8643-4166-b6da-a75d4446da80","shared_citers":4},{"title":"Decoupled Weight Decay Regularization","work_id":"07ef7360-d385-4033-83f7-8384a6325204","shared_citers":3},{"title":"Designing and interpreting probes with control tasks","work_id":"a10c87f8-f3ea-4b0f-ad42-26762e3ec8ec","shared_citers":3},{"title":"Distilling the Knowledge in a Neural Network","work_id":"d927ab1f-17b8-4002-9d09-c3d55764fbad","shared_citers":3},{"title":"Gaussian Error Linear Units (GELUs)","work_id":"0466fd22-03a1-4a61-af0a-a900e77bb023","shared_citers":3},{"title":"Language Models (Mostly) Know What They Know","work_id":"8ca58a10-da41-4f70-baae-7e449512e345","shared_citers":3},{"title":"Llama 2: Open Foundation and Fine-Tuned Chat Models","work_id":"68a5177f-d644-44c1-bd4f-4e5278c22f5d","shared_citers":3},{"title":"LLaMA: Open and Efficient Foundation Language Models","work_id":"c018fc23-6f3f-4035-9d02-28a2173b2b9d","shared_citers":3},{"title":"Localizing model behavior with path patching","work_id":"fdd07939-8060-4e9e-bd06-ff125eae86ef","shared_citers":3},{"title":"LoRA: Low-Rank Adaptation of Large Language Models","work_id":"0426219a-789e-4964-adc8-a04538510818","shared_citers":3},
{"title":"Mass-editing memory in a transformer","work_id":"03332427-98ba-4287-876a-bb315cbddb1d","shared_citers":3}],"time_series":[{"n":1,"year":2016},{"n":1,"year":2023},{"n":48,"year":2026}],"dependency_candidates":[]},"authors":[]}}