Authorship Attribution in the Era of LLMs: Problems, Methodologies, and Challenges

TLDR: This survey paper systematically categorizes authorship attribution in the era of LLMs into four problems: attributing unknown texts to human authors, detecting LLM-generated texts, identifying specific LLMs or human authors, and classifying texts as human-authored, machine-generated, or co-authored by both, while also highlighting key challenges and open problems.


Emory University
ACM SIGKDD Explorations 2024
Representative Problems in Authorship Attribution:
  1. Human-written Text Attribution (attributing unknown texts to human authors)
  2. LLM-generated Text Detection (detecting if texts are generated by LLMs)
  3. LLM-generated Text Attribution (identifying the specific LLM or human responsible for a text)
  4. Human-LLM Co-authored Text Attribution (classifying texts as human-written, machine-generated, or a combination of both)
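
As a concrete illustration of Problem 1 (not a method from the survey), the sketch below attributes an unknown text with a classical stylometric baseline: character n-gram TF-IDF features and logistic regression. The toy corpus and author labels are placeholders.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Placeholder training texts with known authors (the Problem 1 setup).
texts = [
    "I dare say you have often observed it in children.",
    "It was a bright cold day in April, and the clocks were striking thirteen.",
    "The studio was filled with the rich odour of roses.",
    "She was a woman of mean understanding and uncertain temper.",
]
authors = ["austen", "orwell", "wilde", "austen"]

# Character n-grams capture punctuation and sub-word habits, a stylometric
# signal that is more robust to topic shifts than content words.
model = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
model.fit(texts, authors)

unknown = "I dare say the clocks in the studio were striking."
print(model.predict([unknown])[0])  # predicted author label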


Abstract

Accurate attribution of authorship is crucial for maintaining the integrity of digital content, improving forensic investigations, and mitigating the risks of misinformation and plagiarism. Addressing the imperative need for proper authorship attribution is essential to uphold the credibility and accountability of authentic authorship. The rapid advancements of Large Language Models (LLMs) have blurred the lines between human and machine authorship, posing significant challenges for traditional methods. We present a comprehensive literature review that examines the latest research on authorship attribution in the era of LLMs. This survey systematically explores the landscape of this field by categorizing four representative problems: (1) Human-written Text Attribution; (2) LLM-generated Text Detection; (3) LLM-generated Text Attribution; and (4) Human-LLM Co-authored Text Attribution. We also discuss the challenges related to ensuring the generalization and explainability of authorship attribution methods. Generalization requires the ability to generalize across various domains, while explainability emphasizes providing transparent and understandable insights into the decisions made by these models. By evaluating the strengths and limitations of existing methods and benchmarks, we identify key open problems and future research directions in this field. This literature review serves as a roadmap for researchers and practitioners interested in understanding the state of the art in this rapidly evolving field.

Contributions


  • We provide a timely overview of the challenges and opportunities presented by LLMs in the field of authorship attribution. By systematically categorizing authorship attribution into four problems and balancing problem complexity with practicality, we reveal insights into the evolving field of authorship attribution in the era of LLMs.
  • We offer a comprehensive comparison of state-of-the-art methodologies, datasets, benchmarks, and commercial tools used in authorship attribution. This analysis not only improves the understanding of authorship attribution but also serves as a practical guide for researchers and practitioners entering this area.
  • We discuss open issues and provide future directions by considering crucial aspects such as generalization, explainability, and interdisciplinary perspectives. We also discuss the broader implications of authorship attribution in real-world applications. This holistic approach ensures that authorship attribution not only yields accurate results but also provides insights that are explainable and socially relevant.

    Benchmarks

    The table below summarizes authorship attribution datasets and benchmarks that include LLM-generated text. Size is reported as the total number of LLM-generated and human-written texts, with the percentage of human-written texts in parentheses. Languages are given as two-letter ISO 639-1 codes. Columns P2, P3, and P4 indicate whether a dataset supports the tasks described in Problems 2, 3, and 4, respectively.

    Name | Domain | Size | Length | Language | Model | P2 | P3 | P4
    --- | --- | --- | --- | --- | --- | --- | --- | ---
    TuringBench | News | 168,612 (5.2%) | 100 to 400 words | en | GPT-1/2/3, GROVER, CTRL, XLM, XLNET, FAIR, TRANSFORMER-XL, PPLM | | |
    TweepFake | Social media | 25,572 (50.0%) | less than 280 characters | en | GPT-2, RNN, Markov, LSTM, CharRNN | | |
    ArguGPT | Academic essays | 8,153 (49.5%) | 300 words on average | en | GPT2-XL, text-babbage-001, text-curie-001, davinci-001/002/003, GPT-3.5-Turbo | | |
    AuTexTification | Tweets, reviews, news, legal, and how-to articles | 163,306 (42.5%) | 20 to 100 tokens | en, es | BLOOM, GPT-3 | | |
    CHEAT | Academic paper abstracts | 50,699 (30.4%) | 163.9 words on average | en | ChatGPT | | |
    GPABench2 | Academic paper abstracts | 2.385M (6.3%) | 70 to 350 words | en | ChatGPT | | |
    Ghostbuster | News, student essays, creative writing | 23,091 (87.0%) | 77 to 559 words (median per document) | en | ChatGPT, Claude | | |
    HC3 | Reddit, Wikipedia, medicine, finance | 125,230 (64.5%) | 25 to 254 words | en, zh | ChatGPT | | |
    HC3 Plus | News, social media | 214,498 | N/A | en, zh | ChatGPT | | |
    HC-Var | News, reviews, essays, QA | 144k (68.8%) | 50 to 200 words | en | ChatGPT | | |
    HANSEN | Speech transcripts (spoken text), statements (written text) | 535k (96.1%) | less than 1k tokens | en | ChatGPT, PaLM2, Vicuna-13B | | |
    M4 | Wikipedia, WikiHow, Reddit, QA, news, paper abstracts, peer reviews | 147,895 (24.2%) | more than 1k characters | ar, bg, en, id, ru, ur, zh | davinci-003, ChatGPT, GPT-4, Cohere, Dolly2, BLOOMz | | |
    MGTBench | News, student essays, creative writing | 21k (14.3%) | 1 to 500 words | en | ChatGPT, ChatGLM, Dolly, GPT4All, StableLM, Claude | | |
    MULTITuDE | News | 74,081 (10.8%) | 200 to 512 tokens | ar, ca, cs, de, en, es, nl, pt, ru, uk, zh | GPT-3/4, ChatGPT, Llama-65B, Alpaca-LoRa-30B, Vicuna-13B, OPT-66B, OPT-IML-Max-1.3B | | |
    OpenGPTText | OpenWebText | 58,790 (50.0%) | less than 2k words | en | ChatGPT | | |
    OpenLLMText | OpenWebText | 344,530 (20.0%) | 512 tokens | en | ChatGPT, PaLM, Llama, GPT2-XL | | |
    Scientific Paper | Scientific papers | 29k (55.2%) | 900 tokens on average | en | SCIgen, GPT-2/3, ChatGPT, Galactica | | |
    RAID | News, Wikipedia, paper abstracts, recipes, Reddit, poems, book summaries, movie reviews | 523,985 (2.9%) | 323 tokens on average | cs, de, en | GPT-2/3/4, ChatGPT, Mistral-7B, MPT-30B, Llama2-70B, Cohere command and chat | | |
    M4GT-Bench | Wikipedia, WikiHow, Reddit, arXiv abstracts, academic paper reviews, student essays | 5,368,998 (96.6%) | more than 50 characters | ar, bg, de, en, id, it, ru, ur, zh | ChatGPT, davinci-003, GPT-4, Cohere, Dolly-v2, BLOOMz | | |
    MAGE | Reddit, reviews, news, QA, story writing, Wikipedia, academic paper abstracts | 448,459 (34.4%) | 263 words on average | en | GPT, Llama, GLM-130B, FLAN-T5, OPT, T0, BLOOM-7B1, GPT-J-6B, GPT-NeoX-20B | | |
    MIXSET | Email, news, game reviews, academic paper abstracts, speeches, blogs | 3.6k (16.7%) | 50 to 250 words | en | GPT-4, Llama2 | | |
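
    Of these, HC3 is straightforward to obtain programmatically. Below is a minimal sketch, assuming the Hugging Face dataset id Hello-SimpleAI/HC3 with its "all" configuration and its human_answers/chatgpt_answers fields, of flattening it into a binary detection set for Problem 2; recent versions of the datasets library may additionally require trust_remote_code=True.

    from datasets import load_dataset

    # Assumed dataset id and config; HC3 ships one or more human-written and
    # ChatGPT answers per question. May need trust_remote_code=True on newer
    # versions of the `datasets` library.
    hc3 = load_dataset("Hello-SimpleAI/HC3", "all", split="train")

    texts, labels = [], []
    for row in hc3:
        for answer in row["human_answers"]:
            texts.append(answer)
            labels.append(0)  # 0 = human-written
        for answer in row["chatgpt_answers"]:
            texts.append(answer)
            labels.append(1)  # 1 = LLM-generated

    print(len(texts), "examples,", sum(labels), "of them LLM-generated")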

    Detectors

    The table below presents an overview of LLM-generated text detectors, their pricing, and API availability.

    Detector | Price | API | Website
    --- | --- | --- | ---
    GPTZero | 150k words at $10/month; 10k words free per month | Yes | https://gptzero.me/
    ZeroGPT | 100k characters for $9.99; 15k characters free | Yes | https://www.zerogpt.com/
    Sapling | 50k characters for $25; 2k characters free | Yes | https://sapling.ai/ai-content-detector
    Originality.AI | 200k words at $14.95/month | Yes | https://originality.ai/
    CopyLeaks | 300k words at $7.99/month | Yes | https://copyleaks.com/ai-content-detector
    Winston | 80k words at $12/month | Yes | https://gowinston.ai/
    GPT Radar | $0.02 per 100 tokens | N/A | https://gptradar.com/
    Turnitin’s AI detector | License required | N/A | https://www.turnitin.com/solutions/topics/ai-writing/ai-detector/
    GPT-2 Output Detector | Free | N/A | https://github.com/openai/gpt-2-output-dataset/tree/master/detector
    Crossplag | Free | N/A | https://crossplag.com/ai-content-detector/
    CatchGPT | Free | N/A | https://www.catchgpt.ai/
    Quill.org | Free | N/A | https://aiwritingcheck.org/
    Scribbr | Free | N/A | https://www.scribbr.com/ai-detector/
    Draft Goal | Free | N/A | https://detector.dng.ai/
    Writefull | Free | Yes | https://x.writefull.com/gpt-detector
    Phrasly | Free | Yes | https://phrasly.ai/ai-detector
    Writer | Free | Yes | https://writer.com/ai-content-detector
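
    Among the free options, the GPT-2 Output Detector is open source: a RoBERTa classifier fine-tuned on GPT-2 output. The sketch below runs its Hugging Face port, assuming the model id openai-community/roberta-base-openai-detector; since it was trained on GPT-2 text, expect it to transfer poorly to output from newer LLMs.

    from transformers import pipeline

    # Assumed Hugging Face port of the GPT-2 output detector (RoBERTa-based).
    detector = pipeline(
        "text-classification",
        model="openai-community/roberta-base-openai-detector",
    )

    result = detector("The mitochondria is the powerhouse of the cell.")[0]
    # Labels are "Real" (human) and "Fake" (machine), with a confidence score.
    print(result["label"], round(result["score"], 3))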

    BibTeX

    @article{huang2024aa_llm,
        title   = {Authorship Attribution in the Era of LLMs: Problems, Methodologies, and Challenges},
        author  = {Baixiang Huang and Canyu Chen and Kai Shu},
        year    = {2024},
        journal = {arXiv preprint arXiv:2408.08946},
        url     = {https://arxiv.org/abs/2408.08946}, 
    }

    Can Large Language Models Identify Authorship?

    TLDR: We find that LLMs can perform authorship verification and attribution in a zero-shot manner, surpassing state-of-the-art supervised models, while providing explanations grounded in linguistic features.


    Emory University
    EMNLP 2024 Findings
    A comparison between Linguistically Informed Prompting (LIP) and other prompting strategies for authorship verification. "Analysis" and "Answer" are the outputs of prompting GPT-4. Only the LIP strategy correctly identifies that the two given texts belong to the same author. Text in orange highlights the differences compared to vanilla prompting with no guidance. Text in blue indicates the linguistically informed reasoning process. Underlined text represents passages referenced from the original documents.


    Abstract

    The ability to accurately identify authorship is crucial for verifying content authenticity and mitigating misinformation. Large Language Models (LLMs) have demonstrated exceptional capacity for reasoning and problem-solving. However, their potential in authorship analysis remains under-explored. Traditional studies have depended on hand-crafted stylistic features, whereas state-of-the-art approaches leverage text embeddings from pre-trained language models. These methods, which typically require fine-tuning on labeled data, often suffer from performance degradation in cross-domain applications and provide limited explainability. This work seeks to address three research questions: (1) Can LLMs perform zero-shot, end-to-end authorship verification effectively? (2) Are LLMs capable of accurately attributing authorship among multiple candidate authors (e.g., 10 and 20)? (3) How can LLMs provide explainability in authorship analysis, particularly through the role of linguistic features? Moreover, we investigate the integration of explicit linguistic features to guide LLMs in their reasoning processes. Our assessment demonstrates LLMs' proficiency in both tasks without the need for domain-specific fine-tuning, providing insights into their decision-making via a detailed analysis of linguistic features. This establishes a new benchmark for future research on LLM-based authorship analysis.

    Contributions


  • We conduct a comprehensive evaluation of LLMs in authorship attribution and verification tasks. Our results demonstrate that LLMs outperform existing BERT-based models in a zero-shot setting, showcasing their inherent stylometric knowledge essential for distinguishing authorship. This enables them to excel in authorship attribution and verification across low-resource domains without the need for domain-specific fine-tuning.
  • We develop a pipeline for authorship analysis with LLMs, encompassing dataset preparation, baseline implementation, and evaluation. Our novel Linguistically Informed Prompting (LIP) technique guides LLMs to leverage linguistic features for accurate authorship analysis, enhancing their reasoning capabilities; a rough sketch of such a prompt follows this list.
  • Our end-to-end approach improves the explainability of authorship analysis. It elucidates the reasoning and evidence behind authorship predictions, shedding light on how various linguistic features influence these predictions. This contributes to a deeper understanding of the mechanisms behind LLM-based authorship identification.
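
    As a rough illustration of LIP (the feature list and wording below are paraphrased for illustration, not the paper's exact prompt), a verification prompt can enumerate linguistic feature families and ask the model to quote evidence before giving a verdict:

    from openai import OpenAI  # requires OPENAI_API_KEY in the environment

    client = OpenAI()

    # Paraphrased LIP-style prompt: name the linguistic features the model
    # should attend to, require quoted evidence, then ask for a verdict.
    LIP_TEMPLATE = """Verify whether the two texts below were written by the same author.
    Analyze linguistic features such as phrasal verbs, modal verbs, punctuation,
    rare words, affixes, quantities, humor, sarcasm, typographical errors, and
    misspellings. Quote evidence from the texts, reason step by step, and end
    with "same author" or "different authors".

    Text 1: {t1}

    Text 2: {t2}"""

    def verify(text1: str, text2: str) -> str:
        response = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user",
                       "content": LIP_TEMPLATE.format(t1=text1, t2=text2)}],
        )
        return response.choices[0].message.content

    print(verify("I reckon the matter is settled, then.",
                 "Well, I reckon we have settled that, have we not?"))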

    BibTeX

    @article{huang2024authorship,
        title   = {Can Large Language Models Identify Authorship?},
        author  = {Baixiang Huang and Canyu Chen and Kai Shu},
        year    = {2024},
        journal = {arXiv preprint arXiv:2403.08213},
        url     = {https://arxiv.org/abs/2403.08213},
    }