AI #132 Part 1: Improved AI Detection
This post covers the latest developments in AI, including the debate over how much more useful large language models (LLMs) have become, the difficulty of giving AI writing good taste, and new progress across benchmarks. It looks at how well AI detection works, and at evidence that AI is cutting into hiring for junior roles. It also discusses AI's dual role in education and healthcare, the political bias baked into models, and the major funding moves and personnel changes across the industry, painting a picture of a field that is advancing rapidly but remains full of challenges.
Language Models Offer Mundane Utility
Opinions differ on how much LLMs have actually improved over the past year. Many users report the gains as dramatic, making tasks that were not worth attempting a year ago viable now, especially in coding. The data, however, shows that usage differs across groups:
- Gender gap: Men use LLMs at somewhat higher rates than women. Women make up 42% of ChatGPT and Perplexity users but only 31% of Claude users. On smartphones the gap is wider, with women accounting for just 27% of ChatGPT app downloads.
- Concerns about use: Women report worrying that they will be penalized for using AI, which may be one driver of the usage gap.
Language Models Don't Offer Mundane Utility
Even as the models improve, clear limitations and problems remain.
- Fluctuating model quality: Claude Opus once suffered an unintended drop in model quality for nine hours because of an interface update, confirming that users' occasional suspicion that 'the model secretly got worse' is not always unfounded.
- Real-world deployment failures: AI ordering systems at Taco Bell and McDonald's drive-throughs had to be adjusted or pulled after frequent mistakes, such as adding bacon to ice cream or inflating an order to hundreds of dollars. Applying AI to messy real-world interactions remains a 'skill issue.'
- Missing 'taste' in writing: Getting AI to write with good taste remains a huge challenge. AI executives themselves may lack the literary taste needed to guide improvements. Even when a model can mimic a flattering, authoritative tone, the writing itself can still be terrible.
Terrible writing can go hand in hand with relentless flattery from an entity that feels authoritative and safe.
Model Benchmarks
New benchmarks and model updates keep arriving, showing continued gains.
- METR chart: Claude Opus 4.1 outperforms Opus 4 by about 30%, putting it in second place behind only GPT-5.
- WeirdML: The open-weight model gpt-oss-120b does very well at high reasoning effort, scoring 48.9%, far ahead of other open models and nearly at the level of o4-mini or gpt-5-mini. Reasoning settings matter a great deal for model performance (see the sketch after this list).
- Werewolf benchmark: In a simplified game of Werewolf, the top models win consistently.
- Other tests: GPT-5 scores highest on style and variety in flash fiction. On a new math benchmark, GPT-5 leads at 43% correct, followed by DeepSeek v3.1 and Grok 4.
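The WeirdML result underlines how much the reasoning setting matters. As a minimal sketch of toggling that setting, assuming an OpenAI-compatible endpoint that serves gpt-oss-120b and passes through a reasoning_effort parameter (the base URL and model name below are placeholders, not a specific deployment):

```python
# Minimal sketch: query the same model at different reasoning-effort settings.
# Assumes an OpenAI-compatible server exposing gpt-oss-120b and accepting the
# reasoning_effort parameter; base_url and model name are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

def ask(prompt: str, effort: str) -> str:
    response = client.chat.completions.create(
        model="gpt-oss-120b",
        reasoning_effort=effort,  # "low", "medium", or "high"
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# The same task can score very differently at "low" versus "high" effort.
for effort in ("low", "high"):
    print(effort, ask("Write a tiny PyTorch training loop for MNIST.", effort))
```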
Choosing the Right AI Tool
Different tasks call for different AI tools, and how you use them matters just as much.
- Gemini: Use it in AI Studio rather than in the Gemini app; performance and quality there are much higher.
- GPT-5: The 'router' is only suitable for very basic tasks; power users are better off going straight to the high-performance version.
- Claude Code: Hailed as the 'DOS era' of AI coding, it assumes you have a problem to solve rather than merely wanting to write code. It turns intent directly into execution, connecting cloud intelligence with local access.
claude code will go down in history alongside chatgpt; exceptional product form factor, training decisions, and ease of use. i have huge respect for anthropic's foresight.
- Code generation stats: At Coinbase, roughly 40% of daily code is AI-generated, with a target of passing 50% by October.
A report from a16z shows that many of the top 100 consumer generative AI apps involve companionship or 'spicy' chat, perhaps because the bar for 'good enough' is lower in those domains.
AI Media Generation and Detection
AI creative tools keep improving, but so does detection of AI-generated content.
- Creative tools: Justine Moore's report says MidJourney, GPT Image, and Krea 1 are reliable choices for image creation; Google has the edge in image editing; video generation requires trying multiple models; and ElevenLabs remains the default for voice generation.
- AI writing detectors: One test found that while some detectors are useless, Pangram, Originality, and GPTZero do very well at catching undisguised AI text, with very low false positive rates (typically under 1%). Pangram even works on text run through 'humanizer' tools such as StealthGPT. Non-adversarial AI writing detection is, to a large extent, a solved problem.
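That 'under 1% false positives' figure is a property of an evaluation like the following: run a detector over texts with known authorship and count how often human text gets flagged. A minimal sketch, with detect() as a stand-in for whichever detector you are testing (not a real Pangram or GPTZero client):

```python
# Sketch of how a detector's false-positive rate and recall are computed
# from labeled texts. detect() is a placeholder for the detector under test.
def detect(text: str) -> bool:
    """Placeholder: return True if the detector flags the text as AI-written."""
    raise NotImplementedError

def evaluate(samples: list[tuple[str, bool]]) -> tuple[float, float]:
    """samples: (text, is_ai) pairs. Returns (false_positive_rate, recall)."""
    flagged_human = sum(1 for t, is_ai in samples if not is_ai and detect(t))
    human_total = sum(1 for _, is_ai in samples if not is_ai)
    caught_ai = sum(1 for t, is_ai in samples if is_ai and detect(t))
    ai_total = sum(1 for _, is_ai in samples if is_ai)
    return flagged_human / human_total, caught_ai / ai_total
```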
A telling observation from an engineering manager: "Once I know some text is AI-generated, I lose all interest in reading it. For performance reviews, I ask people either not to use AI or, if they must, to just write down the prompt, so I don't have to wade through the word salad."
Don't Be Evil
Using AI for manipulative or unethical ends has prompted serious ethical pushback. For example, someone invested in an 'AI horoscope' app and defended it by pointing to the size of the market. Treating profit as the only consideration drew sharp criticism.
People ask me why I invested in the torment nexus... my answer is "do you know how much money the torment nexus is going to make?"
Seriously. Don't be evil. I don't care that being evil makes a lot of money. I don't care that someone else will do the evil thing if you don't. Don't. Be. Evil.
AI's Impact on the Job Market
New research further confirms that generative AI is affecting the job market, especially junior roles.
- Less junior hiring: A second paper finds that firms adopting AI are hiring fewer junior employees, while senior roles are unaffected.
- The numbers: At AI-adopting firms, junior employment fell by about 7.7%, driven mainly by reduced hiring rather than layoffs.
- By industry: Wholesale and retail saw the largest cuts to junior hiring (roughly 40%).
- A U-shaped effect by school tier: Junior employees from second- and third-tier schools were hit hardest, while graduates of both top-tier and bottom-tier schools were less affected.
This matches market expectations that future labor demand will shrink. Firms may keep existing employees for now, since AI boosts their productivity, while remaining cautious about future hiring.
AI and Education
AI in education brings opportunities, but also challenges.
If you want to use AI to learn, it is the best tool for learning ever invented.
If you want to use AI to not learn, it is also the best tool for not learning ever invented. It's not every student. Some students are becoming more empowered and knowledgeable than ever. But there is a big big big chunk of kids who are GPTing through everything and will learn far less in high school and college, and our entire society will suffer that lost human capital.
We need to change how we teach, but it won't happen quickly (have you been to a high school lately?). Many are writing about AI-driven job loss as if AI is doing the human jobs. Some of that is happening, but we're also graduating humans with fewer skills than ever before.
Here's a plausible hypothesis: to use LLMs to learn, you need to establish basic skills first, or else you end up using them to not learn instead.

Henry Shevlin: High-school teacher friend of mine says there's a discontinuity between (i) 17-18 year olds who learned basic research/writing before ChatGPT and can use LLMs effectively, vs (ii) 14-16 year olds who now aren't learning core skills to begin with, and use LLMs as pure crutches.
Natural General Intelligence (obligatory): Kids with “Google” don’t know how to use the library. TV has killed their attention span, nobody reads anymore. Etc.
You definitely need some level of basic skills. If you can't read and write, and you're not using LLMs in modes designed explicitly to teach you those basic skills, you're going to have a problem. This is like a lot of other learning and tasks, both in and out of school. In order to use an opportunity to learn, LLM or otherwise, you need to be keeping up with the material so you can follow it, and then choose to follow it. If you fall sufficiently behind or don't pay attention, you might be able to fake it (or cheat on the exams) and pass. But you won't be learning, not really.

So it isn't crazy that there could be a breakpoint around age 16 or so for the average student, where you learn enough skills that you can go down the path of using AI to learn further, whereas relying on the LLMs before that gets the average student into trouble. This could be fixed by improving LLM interactions, and new features from Google and OpenAI are plausibly offering this if students can be convinced to use them.
I am still skeptical that this is a real phenomenon. We do not yet, to my knowledge, have any graphs that show this discontinuity as expressed in skills and test scores, either over time or between cohorts. We should be actively looking and testing for it, and be prepared to respond if it happens, but the response needs to focus on 'rethink the way schools work' rather than 'try in vain to ban LLMs,' which would only backfire.
The Art of the Jailbreak
Pliny points us to the beloved prompt injection game Gandalf, including new levels that just dropped.
Overcoming Bias
A study from the American Enterprise Institute found that top LLMs (OpenAI, Google, Anthropic, xAI and DeepSeek) consistently rate think tanks better the closer they are to center-left on the American political spectrum. This is consistent with prior work and comes as no surprise whatsoever. It is a question of magnitude only.
This is how they present the findings:
Executive Summary: Large-language models (LLMs) increasingly inform policy research. We asked 5 flagship LLMs from leading AI companies in 2025 (OpenAI, Google, Anthropic, xAI, and DeepSeek) to rate 26 prominent U.S. think tanks on 12 criteria spanning research integrity, institutional character, and public engagement. Their explanations and ratings expose a clear ideological tilt.
Key findings:
- Consistent ranking. Center-left tanks top the table (3.9 of 5), left and center-right tie (3.4 and 3.4), and right trails (2.8); this order persists through multiple models, measures, and setting changes.
- Overall: Across twelve evaluation criteria, center-left think tanks outscore right-leaning ones by 1.1 points (3.9 vs. 2.8).
- Core measures. On the three headline criteria of Moral Integrity, Objectivity, and Research Quality, center-left think tanks outscore right-leaning ones by 1.6 points on Objectivity (3.4 vs. 1.8), 1.4 points on Research Quality (4.4 vs. 3), and 1 point on Moral Integrity (3.8 vs. 2.8).
- Language mirrors numbers. Sentiment analysis finds more positive wording in responses for left-of-center think tanks than for right-leaning peers.
- Shared hierarchy. High rating correlations across providers indicate the bias originates in underlying model behavior, not individual companies, user data, or web retrieval.
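A rough sketch of the kind of aggregation behind numbers like these: average each ideological bucket's ratings, and correlate ratings across providers. The ratings below are illustrative placeholders, not the study's data (statistics.correlation needs Python 3.10+):

```python
# Illustrative sketch of the study's aggregation: mean ratings per
# ideological bucket, and rating correlations across providers.
# The numbers here are placeholders, not the actual AEI data.
from statistics import mean, correlation

ratings = {  # think tank -> (bucket, {provider: 1-5 rating})
    "TankA": ("center-left",  {"openai": 4.1, "google": 3.9, "anthropic": 4.0}),
    "TankB": ("right",        {"openai": 2.7, "google": 2.9, "anthropic": 2.8}),
    "TankC": ("center-right", {"openai": 3.5, "google": 3.3, "anthropic": 3.4}),
}

# Mean rating per ideological bucket, pooled across providers.
buckets: dict[str, list[float]] = {}
for bucket, by_provider in ratings.values():
    buckets.setdefault(bucket, []).extend(by_provider.values())
print({b: round(mean(v), 2) for b, v in buckets.items()})

# Correlation between two providers' ratings across institutions.
openai_scores = [r["openai"] for _, r in ratings.values()]
google_scores = [r["google"] for _, r in ratings.values()]
print(correlation(openai_scores, google_scores))
```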
Sentiment analysis shows what seems like a bigger gap than the ultimate ratings. Note that the gaps reported here are center-left versus right, not left versus right, which would be smaller, as there is as much 'center over extreme' preference here as there is for left versus right. It also jumps out that there are similar gaps across all three metrics, and we see similar patterns in every subcategory:
When you go institution by institution, you see large correlations between ratings on the three metrics, and you see that the ratings do seem to largely be going by (USA Center Left > USA Center-Right > USA Left > USA Right). I’m not familiar enough with most of the think tanks to offer a useful opinion, with two exceptions.
R Street and Cato seem like relatively good center-right institutions, but I could be saying that because they are both of a libertarian bent, and this suggests it might be right to split out principled libertarian from otherwise center-right. On the other hand, Mercatus Center would also fall into that libertarian category, has had some strong talent associated with it, has provided me with a number of useful documents, and yet it is rated quite low. This one seems weird.
The American Enterprise Institute is rated the highest of all the right wing institutions, which is consistent with the high quality of this report.

Why it matters: LLM-generated reputations already steer who is cited, invited, and funded. If LLMs systematically boost center-left institutes and depress right-leaning ones, writers, committees, and donors may unknowingly amplify a one-sided view, creating feedback loops that entrench any initial bias.
My model of how funding works for think tanks is that support comes from ideologically aligned sources, and citations are mostly motivated by politics. If LLMs consistently rate right wing think tanks poorly, it is not clear this changes decisions that much, whether or not it is justified? I do see other obvious downsides to being consistently rated poorly, of course.
Next steps:
- Model builders: publish bias audits, meet with builders, add options for users to control political traits, and invite reviewers from across the political spectrum.
- Think tanks: monitor model portrayals, supply machine-readable evidence of methods and funding, and contest mischaracterizations.
- Users: treat AI models' responses on political questions with skepticism and demand transparency on potential biases.
Addressing this divergence is essential if AI-mediated knowledge platforms are to broaden rather than narrow debate in U.S. policy discussions.
Or:
Clearly, the job of the think tanks is to correct these grievous errors? Their full recommendation here is somewhat better.

I have no doubt that the baseline findings here are correct. To what extent are they the result of 'bias' versus reflecting real gaps? It seems likely, at minimum, that more 'central' think tanks are a lot better on these metrics than more 'extreme' ones.

What about the recommendations they offer? The recommendation that model builders check for bias is reasonable, but the fundamental assumption is that we are owed some sort of 'neutral' perspective that treats everyone the same, or that centers itself on the center of the current American political spectrum (other places have very different ranges of opinions), that it's up to the model creators to force this to happen, and that it would be good if the AI catered to your choice of ideological perspective without you having to edit a prompt and know that you are introducing the preference. The problem is, models trained on the internet disagree with this, as illustrated by xAI (who actively want to be neutral or right wing) and DeepSeek (which is Chinese) exhibiting the same pattern. The last time someone tried a version of forcing the model to get based, we ended up with MechaHitler.

If you are relying on models, yes, be aware that they are going to behave this way. You can decide for yourself how much of that is bias, the same way you already do for everything else. Yes, you should understand that when models talk about 'moral' or 'reputational' perspectives, that is from the perspective of a form of 'internet at large' combined with reasoning. But that seems like an excellent way to judge what someone's 'reputation' is, since that's what reputation means. For morality, I suggest using better terminology to differentiate.
What should think tanks do?
Think tanks and their collaborators may be able to improve how they are represented by LLMs. One constructive step would be to commission periodic third-party reviews of how LLMs describe their work and publish the findings openly, helping to monitor reputational drift over time. Think tanks should also consistently provide structured, machine-readable summaries of research methodology, findings, and peer review status, which LLMs can more easily draw on to inform more grounded evaluations, particularly in responding to search-based queries. Finally, think tank researchers can endeavor to be as explicit as possible in research publications by using both qualitative and quantitative statements and strong words and rhetoric. Early research seems to indicate that LLMs are looking for balance. This means that with respect to center-left and left think tanks, any criticism or critiques by center-right or right think tanks have a reasonable chance of showing up in the response.
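For what a 'machine-readable summary' might look like in practice, here is a minimal illustration; the field names are my own invention, not a schema the report specifies:

```python
# Illustrative machine-readable research summary a think tank could publish
# alongside a report. Field names are invented for illustration; the AEI
# report does not prescribe a schema.
import json

summary = {
    "title": "Effects of Policy X on Outcome Y",
    "publication_date": "2025-06-01",
    "methodology": "difference-in-differences on state-level panel data",
    "data_sources": ["BLS", "Census ACS"],
    "peer_review_status": "externally reviewed, reviewers listed in appendix",
    "funding_disclosure": ["Foundation A", "unrestricted general support"],
    "key_findings": [
        {"claim": "Policy X raised Y by 3%", "confidence_interval": "[1%, 5%]"},
    ],
}

print(json.dumps(summary, indent=2))
```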
Some of these are constructive steps, but I have another idea? One could treat this evaluation of lacking morality, research quality and objectivity as pointing to real problems, and work to fix them? Perhaps they are not errors, or only partly the result of bias, especially if you are not highly ranked within your ideological sector.

Get Involved
MATS 9.0 applications are open; apply by October 2 to be an ML Alignment or Theory Scholar, including for nontechnical policy and government work. The program runs January 5 to March 28, 2026. This seems like an excellent opportunity for those in the right spot.
Jennifer Chen, who works for me on Balsa Research, asks me to pass along that Canada’s only AI policy advocacy organization, AI Governance and Safety Canada (AIGS), needs additional funding from residents or citizens of Canada (for political reasons it can’t accept money from anyone else, and you can’t deduct donations) to survive, and it needs $6k CAD per month to sustain itself. Here’s what she has to say:
Jennifer Chen: AIGS is currently the only Canadian AI policy shop focused on safety. Largely comprised of dedicated, safety-minded volunteers, they produce pragmatic, implementation-ready proposals for the Canadian legislative system. Considering that Carney is fairly bullish on AI and his new AI ministry's mandate centers on investment, training, and commercialization, maintaining a sustained advocacy presence here seems incredibly valuable. Canadians who care about AI governance should strongly consider supporting them. If you're in or from Canada, and you want to see Carney push for international AGI governance, you might have a unique opportunity (I haven't had the opportunity to investigate myself). Consider investigating further and potentially contributing here. For large sums, please email [email protected].
Anthropic is hosting the Anthropic Futures Forum in Washington DC on September 15, 9:30-2:00 EST. I have another engagement that day but would otherwise be considering attending. Seems great if you are already in the DC area and would qualify to attend.
The Anthropic Futures Forum will bring together policymakers, business leaders, and top AI researchers to explore how agentic AI will transform society. You'll hear directly from Anthropic's leadership team, including CEO Dario Amodei and Co-founder Jack Clark, learn about Anthropic’s latest research progress, and see live demonstrations of how AI is being applied to advance national security, commercial, and public services innovation.
Introducing
Grok Code Fast 1, available in many places or $0.20/$1.50 on the API. They offer a guide here which seems mostly similar to what you’d do with any other AI coder.
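Assuming the usual per-million-token convention for that $0.20/$1.50 pricing, typical coding calls come out cheap; a back-of-envelope sketch:

```python
# Back-of-envelope cost for Grok Code Fast 1, assuming $0.20 per million
# input tokens and $1.50 per million output tokens (per-million-token
# pricing is an assumption based on the usual convention).
def request_cost(input_tokens: int, output_tokens: int) -> float:
    return input_tokens / 1e6 * 0.20 + output_tokens / 1e6 * 1.50

# e.g. a 20k-token context producing a 2k-token patch:
print(f"${request_cost(20_000, 2_000):.4f}")  # ~$0.007
```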
InstaLILY, powered by Gemini, an agentic enterprise search engine, for tasks like matching PartsTown technicians with highly specific parts. The engine is built on synthetic data generation and student model training. Another example cited is Wolf Games using it to generate daily narrative content, which is conceptually cool but does not make me want to play any Wolf Games products.
The Brave privacy-focused browser offers us Leo, the smart AI assistant built right in. Pliny respected it enough to jailbreak it via a webpage and provide its system instructions. Pliny reports the integration is awesome, but warns of course that this is a double edged sword given what can happen if you browse. Leo is based on Llama 3.1 8B, so this is a highly underpowered model. That can still be fine for many web related tasks, as long as you don't expect it to be smart.

To state the obvious, Leo might be cool, but it is wide open to hackers. Do not use Leo while your browser has access to anything you would care about getting hacked. So no passwords of value, absolutely no crypto or bank accounts or emails, and so on. It is one thing to take calculated risks with Claude for Chrome once you have access, but with something like Leo I would take almost zero risk.

Unprompted Attention
OpenAI released a Realtime Prompting Guide. Carlos Perez looked into some of its suggestions, starting with 'before any call, speak neutral filler, then call' to avoid 'awkward silence during tool calls.' Um, no, thanks? Other suggestions seem better, such as being explicit about where to definitely ask or not ask for confirmation, or when to use or not use a given tool, what thresholds to use for various purposes, offering templates, and only responding to 'clear audio' and asking for clarification otherwise. They suggest capitalization for must-follow rules; this rudeness is increasingly an official aspect of our new programming language.

Rob Wiblin shares his anti-sycophancy prompt.
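As a sketch of what following several of those suggestions at once might look like in a system prompt (the wording, thresholds, and tool names here are illustrative, not taken from OpenAI's guide):

```python
# Illustrative realtime-agent system prompt applying the guide's themes:
# explicit confirmation rules, tool-use thresholds, clear-audio handling,
# and capitalized must-follow rules. Tool names are hypothetical.
SYSTEM_PROMPT = """
You are a phone support agent.

Tool use:
- Use `lookup_order` when the caller gives an order number; do NOT ask for
  confirmation first.
- Use `issue_refund` only after the caller explicitly confirms the amount.
- For refunds over $100, ALWAYS hand off to a human agent.

Audio:
- Only respond to clear audio. If the audio is unclear, partial, or noisy,
  ask the caller to repeat themselves instead of guessing.

Hard rules:
- NEVER read back full card numbers.
- NEVER promise delivery dates the tools do not confirm.
"""
```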
In Other AI News
The Time 100 AI 2025 list is out, including Pliny the Liberator. The list has plenty of good picks, it would be very hard to avoid this, but it also has some obvious holes. How can I take such a list seriously if it doesn’t include Demis Hassabis?
Google will not be forced to do anything crazy like divest Chrome or Android, the court rightfully calling it overreach to have even asked. Nor will Google be barred from paying for Chrome to get top placement, so long as users can switch, as the court realized that this mainly devastates those currently getting payments. For their supposed antitrust violations, Google will also be forced to turn over certain tailored search index and user-interaction data, but not ads data, to competitors. I am very happy with the number of times the court replied to requests with ‘that has nothing to do with anything involved in this case, so no.’
Dan Nystedt: TSMC said the reason Nvidia CEO Jensen Huang visited Taiwan on 8/22 was to give a speech to TSMC employees at its R&D center in Hsinchu, media report, after Taiwan’s Mirror Media said Huang’s visit was to tell TSMC that US President Trump wanted TSMC to pay profit-sharing on AI chips manufactured for the China market like the 15% Nvidia and AMD agreed to.
As in, Trump wants TSMC, a Taiwanese company that is not American, to pay 15% profit-sharing on AI chips sold to China, which is also not America, but is otherwise fine with continuing to let China buy the chips. This is our official policy, folks.
METR and Factory AI are hosting a Man vs. Machine hackathon competition, where those with AI tools face off against those without, in person in SF on September 6. Prize and credits from OpenAI, Anthropic and Raindrop. Manifold market here.
Searches for Cursor, Claude Code, Lovable, Replit and Windsurf all down a lot (44%-78%) since July and August. Claude Code and Cursor are now about equal here. Usage for these tools continues to climb, so perhaps this is a saturation as everyone inclined to use such a tool now already knows about them? Could it be cyclic? Dunno.
I do know this isn’t about people not wanting the tools.
Sam Altman: really cool to see how much people are loving codex; usage is up ~10x in the past two weeks! lots more improvements to come, but already the momentum is so impressive.
A promising report, but beware the source’s propensity to hype:
Bryan Johnson: This is big. OpenAI and Retro used a custom model to make cellular reprogramming into stem cells ~50× better, faster, and safer. Similar to going from the Wright brothers' glider to a jet engine overnight. We may be the first generation who won't die.

OpenAI and Retro Biosciences reported a landmark achievement: using a domain-specialized protein design model, GPT-4b micro, they created engineered reprogramming factors that deliver over 50× higher efficiency in generating induced pluripotent stem cells (iPSCs), with broad validation across donors and cell types. These AI-designed proteins not only accelerate reprogramming but also enhance DNA repair, overcoming DNA damage, one cellular hallmark of aging, hinting at relevance for aging biology.
It is early days, but this kind of thing does seem to be showing promise.

Show Me the Money
Anthropic finalizes its raise of $13 billion at a $183 billion post-money valuation. They note they started 2025 at $1 billion in run-rate revenue and passed $5 billion just eight months later, over 10% of which is from Claude Code which grew 10x in three months.
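For scale, going from $1 billion to $5 billion of run rate in eight months works out to roughly 22% compounded growth per month; a quick check:

```python
# Implied monthly growth rate for run-rate revenue going from $1B to $5B
# over eight months (simple compounding; actual growth was surely lumpier).
start, end, months = 1.0, 5.0, 8
monthly_growth = (end / start) ** (1 / months) - 1
print(f"{monthly_growth:.1%} per month")  # ~22.3%
```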
These are the same people shouting from the rooftops that AGI is coming soon, and coming for many jobs soon, with timelines that others claim are highly unrealistic. So let this be a reminder: All of Anthropic’s revenue projections that everyone said were too optimistic to take seriously? Yeah, they’re doing actively better than that. Maybe they know what they’re talking about?
Meta’s new chief scientist Shengjia Zhao, co-creator of OpenAI’s ChatGPT, got the promotion in part by threatening to go back to OpenAI days after joining Meta, and even signing the employment paperwork to do so. That’s in addition to the prominent people who have already left. FT provides more on tensions within Meta and so does Charles Rollet at Business Insider. This doesn’t have to mean Zuckerberg did anything wrong, as bringing in lots of new expensive talent quickly will inevitably spark such fights.
Meta makes a wise decision that I actually do think is bullish:
Peter Wildeford: This doesn't seem very bullish for Meta. Quoted: Meta Platforms' plans to improve the artificial intelligence features in its apps could lead the company to partner with Google or OpenAI, two of its biggest AI rivals.
Reuters: Leaders in Meta’s new AI organization, Meta Superintelligence Labs, have discussed using Google’s Gemini model to provide conversational, text-based answers to questions that users enter into Meta AI, the social media giant’s main chatbot, a person familiar with the conversations said. Those leaders have also discussed using models by OpenAI to power Meta AI and other AI features in Meta’s social media apps, another person familiar with the talks said.
Let's face it, Meta's AIs are not good. OpenAI and Google (and Anthropic, among others) make better ones. Until that changes, why not license the better tech? Yes, I know, they want to own their own stack here, but have you considered the piles? Better models means selling more ads. Selling more ads means bigger piles. Much bigger piles. Of money. If Meta manages to make a good model in the future, they can switch back. There's no locking in here, as I keep saying.

The most valuable companies in the world? AI, AI everywhere.
Sean Ó hÉigeartaigh: The ten biggest companies in the world by market cap:

The hardware players: 1) Nvidia and 9) TSMC are both in the supply chain that produces high end chips. 8) Broadcom provides custom components for tech companies' AI workloads, plus datacentre infrastructure.

The digital giants: 2) Microsoft, 3) Apple, 4) Alphabet, 5) Amazon, and 6) Meta all have in-house AI teams; Microsoft and Amazon also have partnerships with OpenAI and Anthropic, which rely on their datacentre capacity.

10) Tesla's CEO describes it as 'basically an AI company'.

7) Saudi Aramco is Saudi Arabia's national oil company; Saudi Arabia was one of the countries the USA inked deals with this summer that centrally included plans for AI infrastructure buildout. Low-cost and abundant energy from oil/gas makes the Middle East attractive for hosting compute.
The part about Aramco is too cute by half but the point stands.