The Worst AI Metric
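Summary: The article argues that asking an AI to count the letters in a word (such as the r's in "strawberry") is a poor test, because humans also cannot analyze the details of their content while they are creating it. AI should be evaluated on the quality of what it generates, not on trivia. 2025-08-08 15:00:00 Author: danielmiessler.com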

Why the "r's in strawberry" test is a horrible benchmark for AI

August 8, 2025

Strawberry R Test

The "how many r's in strawberry" test for AI intelligence is super dumb. Here's why.

Ask a writer to write a quality sentence for the book they're working on, and as they're writing—or typing—suddenly scream at them mid-sentence:

(Screaming) HOW MANY VOWELS IN THAT?!?

First, they'll be very annoyed. But more importantly, you will have stopped them from creating their sentence.

Humans can't produce output while simultaneously thinking about how they're producing it.

Ask them, in the middle of a sentence, how many of the words they're using have an even number of characters, how many rhyme with "cow", or how many r's the sentence contains, and they'll have no idea whatsoever. And you'll have ruined what they were saying.

So the question is: Do you want a sentence, or do you want information about a sentence? You need to pick one.

When we hire a writer, or a speaker, or an AI, we're hiring them for the content they produce, not for trivia about that content.

So let's not judge AIs too harshly for something we somehow forgot humans can't do either.


Source: https://danielmiessler.com/blog/the-worst-ai-metric?utm_source=rss&utm_medium=feed&utm_campaign=website