Should AI Write My Headline?

Hey, y’all. These days, journalists often hear about new technology and what they can (and shouldn’t) do when it comes to tough ethical and safety issues around AI, social media, and more. But there’s a lot of conflicting advice out there. I’m here to help. With this new column, I’ll address common questions about the ethics, legal considerations, and best practices of adopting new technologies. Trained as a journalist and media lawyer, I’ve spent over a decade developing policies for emerging technologies inside civil society organizations and at companies such as Twitter and Twitch. I’m also a professor at Columbia Journalism School and lead the Craig Newmark Center for Journalism Ethics and Security. Send me your questions at askanika@cjr.org.

Q: I’m a busy, tired, and overworked journalist. Should I use AI to help me write headlines?

I’ve always thought that if there were one solid use case for AI-powered chatbots like ChatGPT and Claude, it would be headline writing. The latest generation of large language models has ingested practically every headline ever published, making it especially suited to the task of suggesting which one I should use for my little story.

Yet like many journalists navigating the nexus of craft and technology, I’ve been wary of feeding any of my unpublished work into generative AI models. Why? Because each time a journalist submits unpublished work to an LLM, whether to ask for headline help or power an investigation, they face significant questions about how their intellectual property will be used, and they open themselves up to novel legal and security risks.

That’s because those of us not working inside of an AI company actually have no idea what happens when we copy the text of our draft stories, paste them into a chat box, and press enter. And we have reasons to be concerned.

The AI industry, reportedly worth some three trillion dollars, was built upon the work of journalists whose words were hoovered up without compensation or consent. Half of the top ten websites used in chatbot training data were news outlets, according to an investigation by the Washington Post (whose own website ranked eleventh). Anthropic, for its part, tried to “destructively scan all the books in the world,” an effort that cost tens of millions of dollars and involved slicing the spines off millions of volumes. Thus, the creation of large language models was, as A.G. Sulzberger, the publisher of the New York Times, recently described it, “a brazen theft of intellectual property that has occurred at an unprecedented scale.”

Then, when AI companies effectively ran out of words to use to keep improving their word-prediction machines, one solution was “synthetic data,” or, as I wrote in 2024, “information generated by AI itself, rather than humans, to continue to train their systems.” I warned of a “feedback loop to hell” that this scenario could create as AI doubled down on its own biases and hallucinated falsities.

AI companies have since managed to find a steady source of new, better-quality, human-written words on which to train their new models: the text entered into a chatbot’s prompt box. In a September 2025 study of publicly available chatbots from Amazon, Anthropic, Google, Meta, Microsoft, and OpenAI, Stanford University researchers found “that all six developers appear to employ their users’ chat data to train and improve their models by default, and that some retain this data indefinitely.” Even Lex, an AI chatbot that is marketed as having been developed for writers and that boasts an editing feature, discloses that it “may collect your conversations in the AI Chat including both your inputs and the AI assistants outputs” (sic) for the purpose of “improving, upgrading, or enhancing” the platform. Given AI companies’ history of scraping and their current use of chat data, I recommend journalists follow the lead of The Markup’s newsroom policies and prohibit feeding unpublished drafts into publicly available generative AI bots.

Enterprise-level (paid) models may be more secure, and their basic agreements claim that they are. The California State University system’s $16.9 million deal with OpenAI suggests that such deals can provide more protections. According to recent reporting from the Times magazine, “the terms of the deal stipulate that OpenAI may not train its model on data from the C.S.U.” However, once content is entered into a commercial LLM, its ultimate use remains known only to the company itself. Journalists entering unpublished drafts into enterprise LLMs should follow the guidance of news outlets like the Associated Press, which entered a licensing agreement with OpenAI in 2023 and still “urges” its journalists “to not put confidential or sensitive information into AI tools.”

The lowest-risk option for a journalist’s unpublished drafts is a local large language model, a type of LLM that runs on a computer or server owned by a newsroom or journalist. The owners of a local LLM have the power to determine how data such as new text inputs is used or deleted. However, local LLMs are not common. It would be interesting to see a real “Newsroom Tooling Alliance” (proposed last year by researchers) come to fruition as a safer alternative for unpublished drafts. Another promising tool is Lumo, whose chats are encrypted and stored only locally.

But until these solutions are more widespread, the journalists whose work built the modern mythos of AI are left with limited options. Journalists can scrutinize the privacy policies of publicly available models, negotiate over or opt out of companies’ use of chat inputs for new training, and, at best, proceed with caution. But in the end, when a journalist presses enter in an LLM chat box, whatever happens next is anyone’s guess.

This piece was produced with support from the Craig Newmark Center for Journalism Ethics and Security.

