Anthropic's new model was amazing. Maybe it will come back?

Plus, an AI phone call and how AI detection works

Issue 111

On today’s quest:

— Fable was amazing, and it’s gone for now
— An AI phone call
— How AI detection works

A quick note: I feel a little weird sending this newsletter because my mother died suddenly on Thursday. I’m taking some time off from Grammar Girl (I was ahead, so you won’t notice much of a disruption), but my mom enjoyed this newsletter and always nudged me when she thought I had gone too long since writing one, and I also already had it mostly written, so I think she’d have wanted me to finish it and send it.

Fable was amazing, and it’s gone for now

When Anthropic’s security-locked-down version of the much-hyped Mythos model, Fable, came out Tuesday, I was in a great position to test how much better it was than the current model because I had been working on a massive data organization project that Opus was helping with but struggling to complete. Opus was getting there in the end, and I wouldn’t have even tried such a big project without it, but I had to do a lot of cleanup.

My project involved working through 20 years of data year by year, and when I put the identical prompt into Fable for the next year’s data, it completed the task perfectly in one shot. What had previously taken a few rounds of prompts with Opus and ~30 minutes of cleanup now took one prompt and 5 minutes of checking and finding that everything was right.

I tried it with a couple more years of data, and it worked just as well, with only minor, understandable problems. For example, it couldn’t match two spreadsheet cells that were related but framed so differently that anyone other than me likely would have missed the connection too: it didn’t realize that pieces published in different years titled “-ed adjectives” and “pronouncing ‘wicked’” were the same piece.

Fable was also expensive. Even though Anthropic temporarily doubled the credits people got, I was burning through my allotments. Since the company was going to take Fable off the monthly plans and make it pay-as-you-go June 22, I used it as much as I could for a couple of days, and I was similarly impressed with everything I tried. I would have paid more to keep using it for hard tasks after June 22 because of the very real time savings it was giving me, and I was starting to think about other, bigger projects I might be able to try.

A feat I considered especially impressive was that it figured out that two spreadsheet entries called “Capitalizing Titles” could actually be different pieces and that it should dig deeper to check. It did, and they were: one was about capitalizing titles like headlines, and the other was about capitalizing job titles.

There’s no way Opus or even most humans I’ve worked with would have flagged that to check. Heck, I wrote the two pieces, and I had to double-check Fable’s call to be sure they were different because of my sloppy record-keeping.

[A sidenote for the many writers and editors who read this newsletter — I never tried Fable for writing or editing, but I saw one review on YouTube that said it falls short for writing, perhaps even being worse than previous models.]

Then, on Friday night, Fable suddenly disappeared. Amazon workers discovered that they could jailbreak some of the safety features and reported it to the U.S. government, which put stringent export controls on the product (and its more powerful sibling, Mythos, which was in very limited distribution to give organizations time to work with it to prepare for security problems). Many of the details are contested, but the only way for Anthropic to comply with the order was to shut down the two products for everyone, including its own employees.

I hope it all works out and we get access again. I’m not working much right now anyway, but I imagine people who had used it are weighing whether it’s worth going back to using Opus, or if it makes more sense to just wait for Fable. If someone takes away your power saw, but you think you might get it back, how long do you wait before you give up and go back to using a handsaw?

If you’re interested in other people’s reviews, here are a few people who had early access: Ethan Mollick from One Useful Thing (text); Dan Shipper from Every (video) making advanced websites and a game, and doing data analysis with reports and clearing a GitHub backlog; and Pat Simmons (video) making an e-commerce website and a game.

An AI phone call

Last week, my doctors needed to move an appointment, and I got a phone call from an AI to do the rescheduling. A couple of interesting things:

  1. It pretended to be a human, with speech disfluencies like “uh” and false starts on sentences, and it simulated typing sounds in the background while it was “looking things up.”

  2. It didn’t seem to be able to do much outside of rescheduling. For example, it couldn’t tell me the office address, instead offering to transfer me to someone who could help.

And this was fun … it couldn’t hang up! I remembered that the AI agents from the Shell Game podcast couldn’t hang up, so when the call was finished I just stayed on the line and nothing happened. After about 10 seconds, I asked if it could hang up, and it said it couldn’t and that I should just go ahead and end the call if I was ready.

But honestly, it was so good I wouldn’t have been 100% certain it was an AI if it hadn’t bungled the address question, and interacting with it was a pleasant and efficient experience.

How AI detection works

I’m going to keep talking about how flawed AI detectors are because I’m becoming more and more disturbed by all the possibly wrong high-stakes accusations I’m seeing in the world.

Last week, Christopher Penn had a great YouTube video and newsletter about how AI detectors work, and he said that in his tests they inaccurately label things as AI writing in about 1 in 7 cases.

[As an aside, I used the “Ask Gemini” button in YouTube for the first time because I wanted to confirm my memory that Chris said the rate was “about 1 in 7” without rewatching the whole video, and it worked perfectly, taking me to the exact point in the video where he said it. I’m generally not a big fan of built-in AI, but in this case, it was good.]

Also this week, I saw the following post on Bluesky where a prominent writer was complaining about his work being flagged as AI by Pangram, supposedly the best detector on the market:

I think Christopher Penn summed it up well in his video, saying “AI detectors should never be used for anything punitive.”

Quick Hits

My favorite recent pieces

Threatening us with a no-good very bad time [A thoughtful piece on the messaging behind AI and why people keep getting booed for talking about AI in commencement speeches.] - Neal Serven on LinkedIn

arXiv tightens policy on hallucinated references: what researchers should know about using AI search engines [If you’re interested in the problems with using AI for references (and possible solutions), this is a detailed article.] — Singapore Management University

Using AI

AI Can Fact-Check Other AI to Fix Hallucinations [A more detailed article about AI fact checking than the personal stories I’ve shared with you in past newsletters.] — Wall Street Journal

Agents

Meet Microsoft Scout, Your AI Coworker That Never Logs Off. Microsoft’s OpenClaw-style agent appears in Teams, just like a human colleague, and automates your dull office tasks. — Wired

Testing Google’s Gemini Spark AI agent: it’s incredible, and creepy [Google’s access to so much of our data means it can be a useful and personal agent, but by revealing how much the company actually knows, it’s also creepy.] — The Verge

Bad stuff

Companies Are Using Reddit to Manipulate ChatGPT and Google AI Search. Peptide companies have been doing AI-engine optimization by spamming the biohackers subreddit to manipulate ChatGPT and Google. — 404 Media

Job market

Law professors prefer AI over peer answers [Law professors rated AI answers to common questions better than answers from other law professors ~75% of the time in a blind study.] — Stanford Law Server (via Ethan Mollick)

Philosophy

We must not grant AI agents legal personhood. What kind of sanctions could keep a non-human corporation in check? — Financial Times

Robotics

The first-ever robotic rescue at sea is a milestone. An American drone recovers downed pilots in the Strait of Hormuz — The Economist

Science & Medicine

Ping, You've Got Whale. A new artificial intelligence-powered detection system is giving ship captains real-time alerts when a whale is in their path. — bioGraphic

Microsoft rolls out Copilot AI tools to over half a million NHS England staff. Says earlier pilot saved employees 43 minutes per day. — TechRadar

Other

What’s Worth More Than Cash in San Francisco Real Estate? Anthropic Stock. Several real estate listings in the San Francisco Bay Area are offering to exchange a home for a piece of the AI startup. — Wired

When AI builds itself [On recursive self-improvement] — Anthropic

What is AI Sidequest?

Are you interested in the intersection of AI with language, writing, and culture? With maybe a little consumer business thrown in? Then you’re in the right place!

I’m Mignon Fogarty: I’ve been writing about language for almost 20 years and was the chair of media entrepreneurship in the School of Journalism at the University of Nevada, Reno. I became interested in AI back in 2022 when articles about large language models started flooding my Google alerts. AI Sidequest is where I write about stories I find interesting. I hope you find them interesting too.

If you loved the newsletter, share your favorite part on social media and tag me so I can engage! [LinkedIn, Facebook, Mastodon]

Written by a human