Turning Trust to Transactions: Tracking Affiliate Marketing and FTC Compliance in YouTube's Influencer Economy

Chen Sun; Rishab Nithyanand; Yash Vekaria; Zubair Shafiq

arxiv: 2603.04383 · v2 · pith:MUERST36new · submitted 2026-03-04 · cs.CY · cs.CR · cs.IR · cs.LG · cs.SI

Pith reviewed 2026-05-22 11:36 UTC · model grok-4.3

classification cs.CY · cs.CR · cs.IR · cs.LG · cs.SI

keywords affiliate marketing · YouTube · FTC compliance · disclosure · influencer economy · web measurement · transparency · NLP

The pith

Affiliate links appear often on YouTube but most videos fail to meet FTC disclosure standards.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines how creators monetize recommendations through affiliate marketing on YouTube and checks whether they follow federal disclosure rules. It builds automated tools from web measurement and natural language processing to scan a decade of videos for links and disclosures. The analysis shows these links are common yet proper disclosures remain rare, with the platform's own standardized features linked to higher compliance. This matters because undisclosed promotions can mislead viewers about why a product is recommended. The findings point to closer work between regulators, affiliate networks, and the platform to raise transparency.

Core claim

Using tools developed from web measurement and NLP research, the study analyzes a ten-year dataset of two million videos from nearly 540,000 creators and finds affiliate marketing widespread yet disclosure compliance low, with most videos failing to meet FTC standards. The platform's standardized disclosure features are strongly associated with improved compliance.

What carries the argument

Automated detection tools that combine web measurement techniques with NLP to locate affiliate links and classify whether disclosure statements meet FTC requirements in video descriptions.
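
As a rough illustration of the simplest form such a pipeline could take, the sketch below scans a video description for URLs that look like affiliate links and then checks for disclosure wording. The host list, query-parameter names, and disclosure phrases are assumptions made up for this example, not the rules used in the paper, and a keyword match is far cruder than an NLP classifier of whether a disclosure meets FTC requirements.

```python
import re
from urllib.parse import urlparse, parse_qs

# Hypothetical patterns for illustration only; not the paper's rule set.
AFFILIATE_HOSTS = {"amzn.to"}
AFFILIATE_QUERY_PARAMS = {"tag", "ref", "aff_id", "affiliate"}
DISCLOSURE_PHRASES = ("affiliate link", "commission")

# Match http(s) URLs up to whitespace or a closing bracket.
URL_RE = re.compile(r"https?://[^\s)>\]]+")


def find_affiliate_links(description):
    """Return URLs whose host or query parameters match the assumed patterns."""
    links = []
    for url in URL_RE.findall(description):
        url = url.rstrip(".,;")  # drop trailing sentence punctuation
        parsed = urlparse(url)
        host = parsed.netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        params = set(parse_qs(parsed.query).keys())
        if host in AFFILIATE_HOSTS or params & AFFILIATE_QUERY_PARAMS:
            links.append(url)
    return links


def has_disclosure(description):
    """Crude keyword check for disclosure wording."""
    text = description.lower()
    return any(phrase in text for phrase in DISCLOSURE_PHRASES)


def classify(description):
    """Label a description by link presence and disclosure wording."""
    if not find_affiliate_links(description):
        return "no_affiliate_links"
    return "disclosed" if has_disclosure(description) else "undisclosed"
```

A fixed pattern list like this would miss shortened or redirected URLs and ambiguous phrasing, which is why the accuracy of the real tools matters so much to the headline numbers.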

If this is right

Standardized platform disclosure features can raise compliance rates without requiring changes from every creator.
Regulators and affiliate partners should collaborate directly with platforms to strengthen transparency rules.
Low compliance rates may continue to expose viewers to undisclosed commercial promotions.
Platform-level interventions appear more effective than relying solely on individual creator behavior.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same measurement approach could test compliance patterns on other short-form video platforms to compare results.
If platform features drive most gains, regulators might focus policy on mandating similar tools elsewhere.
Tracking the same creators over additional years could show whether compliance improves steadily or stalls.

Load-bearing premise

The automated tools accurately detect affiliate links and classify disclosures without large numbers of errors or missed cases.

What would settle it

A manual audit of several hundred videos labeled non-compliant that finds most actually contain proper disclosures would undermine the claim of widespread non-compliance.
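
One way to read such an audit quantitatively is to treat it as a random sample and put a confidence interval on the share of "non-compliant" labels that turn out to be wrong. The sketch below uses a Wilson score interval; the audit size and error count are invented numbers for illustration, not results from the paper.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Hypothetical audit: 300 videos labeled non-compliant by the tool,
# of which 24 were found by a human to contain a proper disclosure.
low, high = wilson_interval(24, 300)
print(f"Estimated mislabel rate: between {low:.3f} and {high:.3f}")
```

If the upper end of such an interval stayed well below one half, the audit would not overturn the claim; if the lower end exceeded it, the claim of widespread non-compliance would be in trouble.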

Original abstract

YouTube has evolved into a powerful platform where creators monetize their influence through affiliate marketing, raising concerns about transparency and ethics, especially when creators fail to disclose their affiliate relationships. Although regulatory agencies like the US Federal Trade Commission (FTC) have issued guidelines to address these issues, non-compliance and consumer harm persist, and the extent of these problems remains unclear. In this paper, we introduce tools, developed with insights from recent advances in Web measurement and NLP research, to examine the state of the affiliate marketing ecosystem on YouTube. We apply these tools to a 10-year dataset of 2 million videos from nearly 540,000 creators, analyzing the prevalence of affiliate marketing on YouTube and the rates of non-compliant behavior. Our findings reveal that affiliate links are widespread, yet disclosure compliance remains low, with most videos failing to meet FTC standards. Furthermore, we analyze the effects of different stakeholders in improving disclosure behavior. Our study suggests that the platform is highly associated with improved compliance through standardized disclosure features. We recommend that regulators and affiliate partners collaborate with platforms to enhance transparency, accountability, and trust in the influencer economy.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper develops automated tools drawing on web measurement and NLP research to detect affiliate links and evaluate FTC disclosure compliance in YouTube videos. These tools are applied to a 10-year dataset of 2 million videos from nearly 540,000 creators. The central findings are that affiliate links are widespread, disclosure compliance is low with most videos failing to meet FTC standards, and platform-provided standardized disclosure features are strongly associated with improved compliance rates. The authors recommend greater collaboration among regulators, affiliate partners, and platforms to enhance transparency.

Significance. If the detection tools are shown to be reliable, the work supplies one of the largest-scale empirical measurements of affiliate marketing prevalence and compliance on YouTube, offering concrete data that could inform FTC enforcement priorities and platform design choices. The longitudinal scope and multi-stakeholder analysis are clear strengths that would make the results useful for both academic and policy audiences.

major comments (2)

[Methods / Tool Development] The affiliate-link detection and disclosure-classification pipelines are described at a high level, but no quantitative validation against human-annotated ground truth is reported: no held-out test set, no precision/recall/F1 figures, and no error analysis for edge cases such as shortened URLs, sponsored segments without explicit links, or ambiguous disclosure phrasing. Because the headline statistics (prevalence of links, fraction of videos failing FTC standards, and platform-association effects) rest directly on the outputs of these pipelines, the missing validation leaves the central empirical claims unsupported.
[Results, platform effects] The claim that 'the platform is highly associated with improved compliance' is presented as a key finding, yet the manuscript does not detail the statistical controls or matching procedure used to isolate platform features from confounding factors such as creator size or video category. Without these details it is difficult to assess whether the reported association supports the causal language used in the abstract and conclusion.
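
For concreteness, the validation metrics requested in the first comment could be computed from a held-out, human-annotated sample roughly as follows. The label lists are placeholder data and the function name is hypothetical; nothing here reflects the authors' actual evaluation.

```python
def precision_recall_f1(y_true, y_pred, positive=True):
    """Compute precision, recall, and F1 for one class from paired labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Placeholder labels: True means "video contains a compliant disclosure".
human_labels = [True, True, True, False, False, True, False, False]
tool_labels = [True, True, False, False, True, True, False, False]

precision, recall, f1 = precision_recall_f1(human_labels, tool_labels)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Reporting these figures separately for the link detector and the disclosure classifier, and again on the edge cases the comment lists, would show where errors concentrate.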

minor comments (2)

[Abstract] The abstract states that tools were 'developed with insights from recent advances in Web measurement and NLP research' but provides no citations to the specific prior work; adding 1–2 key references would improve traceability.
[Figures] Figure captions and axis labels in the compliance-over-time plots should explicitly state the exact operational definition of 'compliant' used for each bar or line.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments on our manuscript. We address each major point below and indicate the revisions we will incorporate to strengthen the work.

Referee: [Methods / Tool Development] The affiliate-link detection and disclosure-classification pipelines are described at a high level but no quantitative validation against human-annotated ground truth is reported—no held-out test set, no precision/recall/F1 figures, and no error analysis for edge cases such as shortened URLs, sponsored segments without explicit links, or ambiguous disclosure phrasing.

Authors: We acknowledge that the manuscript currently presents the pipelines at a high level without reporting quantitative validation metrics. In the revised version we will add a dedicated validation subsection to the Methods. This subsection will describe a held-out human-annotated test set and report precision, recall, and F1 scores, together with an error analysis that explicitly addresses edge cases including shortened URLs, sponsored segments lacking explicit links, and ambiguous disclosure phrasing. These additions will directly support the reliability of the headline statistics.
revision: yes

Referee: [Results] The claim that 'the platform is highly associated with improved compliance' is presented as a key finding, yet the manuscript does not detail the statistical controls or matching procedure used to isolate platform features from confounding factors such as creator size or video category.

Authors: We agree that greater methodological transparency is needed. We will expand the Results section to describe the statistical controls and any matching or regression procedures used to account for potential confounders such as creator size and video category. We will also review the abstract and conclusion to confirm that language remains consistent with the observational associations reported, avoiding any implication of causality.
revision: yes
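
To make the confounding concern concrete, a simple check is to compare compliance rates within strata of a potential confounder rather than in aggregate. The sketch below stratifies by a hypothetical creator-size bucket; the records and field layout are invented for illustration and do not come from the paper, whose authors may rely on regression or matching instead.

```python
from collections import defaultdict


def stratified_rates(records):
    """Compliance rate per stratum, split by use of the platform feature.

    Each record is (stratum, uses_platform_feature, is_compliant).
    Returns {stratum: {True: rate_or_None, False: rate_or_None}}.
    """
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for stratum, uses_feature, compliant in records:
        cell = counts[stratum][bool(uses_feature)]
        cell[0] += int(bool(compliant))  # compliant videos
        cell[1] += 1                     # all videos in the cell
    return {
        stratum: {
            flag: (hits / total if total else None)
            for flag, (hits, total) in cells.items()
        }
        for stratum, cells in counts.items()
    }


# Invented records: (creator-size bucket, uses platform feature, compliant)
records = [
    ("small", True, True), ("small", True, False),
    ("small", False, False), ("small", False, False),
    ("large", True, True),
    ("large", False, True), ("large", False, False),
]

for stratum, rates in stratified_rates(records).items():
    print(stratum, "with feature:", rates[True], "without:", rates[False])
```

If the gap between the two rates persists inside every stratum, the association is less likely to be an artifact of creator size alone, though it still does not establish causation.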

Circularity Check

0 steps flagged

No circularity: empirical measurement study with independent data analysis

This is a large-scale empirical measurement paper that crawls and classifies 2 million YouTube videos using custom web-measurement and NLP pipelines to count affiliate links and disclosure compliance. No equations, fitted parameters, or first-principles derivations are present, so none of the enumerated circularity patterns (self-definitional, fitted-input-called-prediction, self-citation load-bearing, etc.) apply. The central statistics are direct outputs of the measurement pipeline applied to external platform data; they are not forced by construction or reduced to prior self-citations. The study is therefore self-contained against external benchmarks and receives a non-finding.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claims rest on the accuracy of custom detection tools and the representativeness of the collected video sample; no free parameters or invented entities are introduced.

axioms (1)

Domain assumption: Automated tools correctly identify affiliate links and disclosures at scale.
Invoked to support prevalence and compliance statistics; location implied in the tool-development description.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "We introduce tools, developed with insights from recent advances in Web measurement and NLP research, to examine the state of the affiliate marketing ecosystem on YouTube. We apply these tools to a 10-year dataset of 2 million videos..."

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "Our findings reveal that affiliate links are widespread, yet disclosure compliance remains low..."

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.