Paul Graham Flagged For AI Use

Let me short-circuit the flames. He wasn’t using AI, but my attempts to rid my feed reader of AI slop flagged him as the worst offender.

Is he? No. It just points out how hard this is.

So just yesterday, I posted a neat article to Hacker News. It was from Kagi Small Web, which I’ve been using in my feed reader a ton because hell yeah I want to support the small blogs out there like the one here. I’m sick of all the usual garbage. And here’s an article with some interesting bits I’d never heard before, even though I’ve been in the YC circle since late 2005. Like, Airbnb almost went into corporate housing to make ends meet!?

Immediately the submission jumped to the top of Hacker News. Lots of upvotes coming in. But then I started seeing “slop”, “AI;DR”, and then someone pointed out the author’s x handle doesn’t even exist.

Crap. I got fooled. This article is ridiculously AI-generated. The whole site is: siliconopera.com. Of the authors I’ve clicked on, all their articles follow templates and regenerate the same themes day after day. They point to x.com accounts that either don’t exist or clearly aren’t them.

That sucked. My submission was [flagged] as it should be.

Of course I’m embarrassed. But I can at least try to use that as some fuel to fix the problem. I asked:

Does anyone use a decent “ai detection” algo/service/on device model that they are happy with?

No takers. I could just ask Claude Opus. When I put the Silicon Opera article through it, it confidently called it AI spam, but it also did the work of following links, digging into the fake author’s masthead, etc. I can’t do that for every single article before I read it.

So I tried to get Claude to make something locally for me to test. Is there something that can run cheaply on my device that quickly goes through a whole feed?


Here’s what I tried.


| What I ran (all local) | Fake article | Paul Graham | My blog post |
|---|---|---|---|
| Apple on-device model: “AI score” 0-100, higher = more AI | 75 | 90 | 70 |
| Apple on-device model: “how well sourced” 0-100, higher = better | 40 | 20 | 20 |
| GPT-2 perplexity, lower = more AI in theory | 274 | 90 | 169 |
| RoBERTa trained on ChatGPT: % AI | 0% | 0% | 0% |

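Fack. Paul’s essay (from 2013!!!) reads as the spammiest AI slop to 3 of the 4: it scores worst on the AI score and on perplexity, and ties for worst on sourcing. The RoBERTa thing gave everything 0%, so it’s clearly useless.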
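For anyone who wants to check the “3 of the 4” count, here is a rough sketch that tallies which article each detector rates as most AI-like, using the numbers from the table above. The names (`DETECTORS`, `worst_offenders`, `tally`) are made up for illustration, and treating a low “how well sourced” score as more suspicious is an assumption.

```python
# Scores copied from the table above; column order is
# (fake article, Paul Graham, my blog post).
ARTICLES = ["Fake article", "Paul Graham", "My blog post"]

# Each detector: (scores, direction). "high" means a higher score looks more
# AI-like, "low" means a lower score does. Reading a low "how well sourced"
# score as more suspicious is an assumption.
DETECTORS = {
    "Apple AI score": ([75, 90, 70], "high"),
    "Apple how well sourced": ([40, 20, 20], "low"),
    "GPT-2 perplexity": ([274, 90, 169], "low"),
    "RoBERTa % AI": ([0, 0, 0], "high"),
}


def worst_offenders(scores, direction):
    """Return the articles a detector rates most AI-like (ties included).

    A detector that gives every article the same score has no opinion,
    so it returns an empty list.
    """
    if len(set(scores)) == 1:
        return []
    extreme = max(scores) if direction == "high" else min(scores)
    return [name for name, s in zip(ARTICLES, scores) if s == extreme]


def tally():
    """Count how many detectors put each article in the worst spot."""
    counts = {name: 0 for name in ARTICLES}
    for scores, direction in DETECTORS.values():
        for name in worst_offenders(scores, direction):
            counts[name] += 1
    return counts


if __name__ == "__main__":
    for name, count in tally().items():
        print(f"{name}: worst (or tied) on {count} of {len(DETECTORS)}")
```

Under those assumptions the actual fake article never lands in the worst spot on any detector, which is the whole problem.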

But these detectors basically just ended up measuring polished writing. Paul’s a great writer. Must be a robot.
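To make the perplexity row concrete: perplexity is the exponential of the average negative log-probability a language model assigns to each token, so text the model finds predictable scores low. Below is a minimal sketch of that calculation. It is not what I actually ran; `UnigramModel` is a hypothetical toy stand-in for GPT-2 that only knows word frequencies, and the training sentence is made up.

```python
import math
from collections import Counter


def perplexity(log_probs):
    """exp of the average negative log-probability per token."""
    return math.exp(-sum(log_probs) / len(log_probs))


class UnigramModel:
    """Toy stand-in for GPT-2: word frequencies with add-one smoothing."""

    def __init__(self, training_text):
        words = training_text.lower().split()
        self.counts = Counter(words)
        self.total = len(words)
        self.vocab = len(self.counts) + 1  # +1 slot for unseen words

    def log_prob(self, word):
        # Add-one smoothing so unseen words get a small, non-zero probability.
        return math.log((self.counts[word] + 1) / (self.total + self.vocab))

    def score(self, text):
        return perplexity([self.log_prob(w) for w in text.lower().split()])


# Made-up example text, purely for illustration.
model = UnigramModel("the cat sat on the mat and the dog sat on the rug")
familiar = "the cat sat on the mat"
unusual = "zebras juggle quantum marmalade"

if __name__ == "__main__":
    print(f"familiar: {model.score(familiar):.1f}")
    print(f"unusual:  {model.score(unusual):.1f}")
```

The familiar sentence scores lower than the unusual one because the model has seen those words before. A real detector swaps in a far bigger model, but the logic is the same: predictable text gets a low score, and clean, conventional prose is predictable whether or not a human wrote it.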

Anyways, just wanted to apologize a bit for missing this. I hate causing even more noise in the community I lean on every day. I’m trying to make this stuff better. And clearly failing in some dimensions.

What does seem to work is digging into the article with humans or a very expensive agent. And I was hoping the Kagi Small Web list had been human-verified too. But this is tough at scale. At least I can contribute a PR to clean up the Small Web list:

https://github.com/kagisearch/smallweb/pull/817

And I’d be an idiot not to mention that the feed reader I talked about above is something I’ve been working on myself: PageForth. It’s been an awesome tool for summarizing news so I can vet whether the subject matter is even interesting to me before diving in. Ironically, it has an AI detector built into it. It’s an improved version of one of the approaches I tested above. But clearly that part isn’t working yet :)

 