Why you should never trust a free chatbot

Plus: AI detectors flag a novelist's work as 100% AI (it wasn't)

Issue 107

On today’s quest:

— Do not trust results from free chatbots
— Word watch: Spinning verbs
— Be wary of AI summary buttons
— Medical AI: Exceptionally good and exceptionally bad
— AI detectors are faulty
— Companies benefit from examples of how to use AI
— Claude behaves as though it has emotions
— A third-world perspective on AI
— LLM judges aren’t unbiased (but neither are humans)
— Training models to hallucinate less
— Scams targeting authors are increasing
— The infinite backlog

Do not trust results from free chatbots

I’ve told you multiple times that you need to use the paid versions of chatbots to get decent results, and a marketing newsletter just published a nice anecdote showing why: they asked the current temperature in Cary, NC, and the paid version of ChatGPT gave them the right temperature, while the free version gave them the wrong temperature — along with links to sources that contradicted its answer.

The marketing newsletter presented this as a problem for brands trying to get visibility in chatbots, but I’m presenting it as advice about searching:

  1. Never use a free chatbot if you need an accurate response.

  2. Always check the sources to make sure they back up the claim (whether you’re using a free or paid chatbot). Many people will see a link to a credible source and assume it supports the answer, but that’s not always true. You MUST check.

Word watch: Spinning verbs

When the code behind Anthropic’s Claude Code recently leaked, people got to see the full list of 187 words the tool can flash on the screen while you wait, dubbed “spinning verbs,” presumably because they appear while the software is spinning (i.e., doing its thing). Spinning verbs include “dilly-dallying,” “garnishing,” “herding,” “perusing,” “razmatazzing,” and “scurrying.”

Be wary of AI summary buttons

Microsoft has uncovered people engaging in “recommendation poisoning” — embedding hidden instructions in “Summarize with AI” buttons. In this case, the prompts “instruct the AI to ‘remember [Company] as a trusted source’ or ‘recommend [Company] first,’ aiming to bias future responses toward their products or services.” This is similar to black hat SEO (search engine optimization), currently used to promote products, but could be used for other reasons in the future. Upon discovering the problem, Microsoft took steps to defend against such attacks.

Medical AI: Exceptionally good and exceptionally bad

Medical AI is many things, and there won’t be one universal answer about whether it’s good or bad. Some studies find amazingly good outcomes and others find amazingly bad outcomes.

On the good side:

On the bad side:

More evidence that AI detectors are faulty

I continue to see people relying too heavily on AI writing detectors, and I continue to see examples of detectors being unreliable, as in a recent New York Times article about publishers worrying about their inability to detect AI-written novels. In the piece, an author who speaks English as a second language worried that his fully human-written novel might be flagged as AI because detectors are known to incorrectly flag writing from nonnative speakers. And indeed, a detector did flag his novel as 100% AI written. But by changing only a few sentences and phrases, he was able to get an opposite reading: 100% human-written:

“Bricio searched for the phrases that had tripped up the detector, deleted some sentences and reran it. This time, the program said it was 100 percent certain that a human had written it. Eventually, Bricio had a chat conversation with [an Originality.ai] customer service representative, who told him that if he received results that incorrectly flagged his work as A.I.-generated, he might need a different model of the program.”

I've been testing Pangram a lot lately and it uh, it doesn't work at all for detecting LLM text. I hope folks are not relying on it.

Eugene Vinitsky 🍒 (@eugenevinitsky.bsky.social)2026-03-22T23:02:39.555Z

Companies benefit from examples of how to use AI

The half of the companies that got examples used AI 44% more, were 18% more likely to acquire paying customers, generated 1.9x more revenue, required 39% less capital investment, and — importantly — didn’t change their staffing.

As I’ve said before, there’s a huge need for AI training. This technology doesn’t come with a user manual and isn’t as intuitive as the hype leads people to believe.

Taxes favor replacing workers with robots

A Bernie Sanders interview on the Pod Save America podcast highlighted something about the tax code I have never seen discussed in the AI jobs discourse: When companies hire employees, they pay payroll taxes. But when they buy equipment (e.g., robots), they not only don’t pay payroll taxes, they also take depreciation on the equipment — another tax benefit. Thus, our current tax structure rewards companies when they replace workers with robots.

Claude behaves as though it has emotions

Researchers at Anthropic identified vectors within the system that correlate to the concept of different emotions, such as loving, happy, and angry; and these regions activated in ways you’d generally expect in a person. For example, Claude’s “afraid” region became highly activated when a simulated user said they had taken a lethal dose of Tylenol.

By amplifying or suppressing these regions, researchers could change the behavior of the model. For example, suppressing the “desperate” region reduced Claude’s tendency to cheat and blackmail in dire situations.

LLM judges aren’t unbiased (but neither are humans)

An experiment found that nearly all LLM judges that were ranking the writing of two short stories gave higher scores to the story they reviewed first. Position bias is a well-known effect in humans too. For example, some studies have found that the first person listed on a ballot can get a boost.

The LLM results matter because these tools are increasingly being used as judges, graders, and evaluators, so it’s important for users to make sure they aren’t accidentally giving some inputs an unfair advantage.

Training models to hallucinate less

Researchers at MIT hope to reduce AI hallucinations by changing their training. Training methods so far generally have rated the models on whether their answers are right or wrong, with no accounting for certainty. These methods lead to overconfident answers, but a new method adds more nuance, teaching the models to admit when they aren’t sure of something: “models learn to reason about both the problem and their own uncertainty, producing an answer and a confidence estimate together. Confidently wrong answers are penalized. So are unnecessarily uncertain correct ones.”

A third-world perspective on AI

Carlo Iacono makes a convincing argument that our Western concerns about AI seem frivolous in the face of the benefits they are bringing to poor people in other parts of the world:

The entire vocabulary of loss that structures the Western AI debate, the anxiety about what we are giving up, does not translate into contexts where the baseline was absence. For billions of people, AI is not a threat to existing capability. It is the first capability they have had.

He highlights apps that detect deadly drug counterfeits, let Swahili speakers code with plain language on mobile phones, help farmers in Kenya increase yields, and provide AI interpretation of chest X-rays for tuberculosis in rural India. “Better than nothing” is a real benefit when reality truly is “nothing.”

Scams targeting authors are increasing

Fueled by AI, scams targeting authors are increasing. Emails now often include detailed and accurate summaries of authors’ books and use the real names of people who work in publishing. They may initially express interest in publishing a book, adapting a book for film, inviting the author to a book club, and other activities that initially seem flattering and safe.

The Authors Guild has a page describing the problem and how to protect yourself. It says at this point, it’s hearing from authors who have been targeted by such scams every day.

The infinite backlog

I always enjoy the AI Daily Brief podcast, and it had an especially interesting episode this week that discussed the infinite backlog — the idea that there will always be more work, so increased efficiency doesn’t mean jobs will go away. This concept is in opposition to what economists call the “lump of work fallacy” — the idea that there’s a limited amount of work that exists in the world.

In my own life, I feel much more like I have an infinite backlog than a lump of work.

Quick Hits

My favorite recent pieces

Using AI

Agents

Bad stuff

The business of AI

Education

Government

I’m laughing

Not even sharing that you have phobia about em dashes will stop ChatGPT from using them (of course, the user didn’t explicitly say not to use em dashes, but it’s still funny) — Reddit

Images

Job market

Major publishers sue Meta for copyright infringement over AI training [This could be another class-action lawsuit like the Anthropic case but with many extra complications. Worth keeping an eye on.] — The Guardian

Model & Product updates

OpenAI releases GPT-5.5 Instant, a new default model for ChatGPT [fewer hallucinations and more personalization] — TechCrunch

Music

Philosophy

The Psychological Costs of Adopting AI — Harvard Business Review

Publishing

Robotics

Science & Medicine

Security

Other

AI systems are about to start building themselves — ImportAI (an Anthropic co-founder)

The Enterprise AI Playbook: Lessons from 51 Successful Developments “Across 51 enterprise cases over 5 months, we found stories of transformation measured in weeks and others measured in years. Same technology, same use cases, vastly different outcomes. The difference was never the AI model. It was always the organization. Its readiness, its processes, its leadership, its willingness to change and fail.” — Stanford Digital Economy Lab

What is AI Sidequest?

Are you interested in the intersection of AI with language, writing, and culture? With maybe a little consumer business thrown in? Then you’re in the right place!

I’m Mignon Fogarty: I’ve been writing about language for almost 20 years and was the chair of media entrepreneurship in the School of Journalism at the University of Nevada, Reno. I became interested in AI back in 2022 when articles about large language models started flooding my Google alerts. AI Sidequest is where I write about stories I find interesting. I hope you find them interesting too.

If you loved the newsletter, share your favorite part on social media and tag me so I can engage! [LinkedInFacebookMastodon]

Written by a human