Newest stories appear at the top. Short summaries are included, with links for more details.
A new benchmark tested large language models in the social deduction game Werewolf. Could they lead, bluff, and resist manipulation in live adversarial play? Across 210 full games, GPT-5 emerged in a league of its own with a 96.7% win rate as both villager and wolf, far ahead of Gemini 2.5 Pro (63.3%). Even more striking: GPT-5 sustained a 93% manipulation success rate into Day 2, when Seer and Witch information usually dismantles wolf cover. Turns out, GPT-5 is the ultimate werewolf strategist. Read more →
Anthropic has launched Claude Opus 4.1, its strongest model so far. It improves reasoning, reduces hallucinations, and handles very long documents. Read more →
OpenAI has expanded Pro access to GPT-5-Pro and GPT-5-Thinking. Plus subscribers now receive up to 3,000 GPT-5-Thinking messages weekly, while Pro users have unlimited use. Read more →