“Grok predicted the future accurately,” Musk claims after Feb. 28 call matches Iran strikes

Elon Musk’s Grok chatbot is taking a victory lap online after a Jerusalem Post experiment, published days before the U.S. and Israel struck Iran, ended up matching the calendar: Grok picked Feb. 28, and the strikes began Feb. 28.

Musk reposted the claim Saturday and wrote, “Prediction of the future is the best measure of intelligence,” amplifying posts arguing Grok “predicted the future accurately.”

What happened and what was actually “predicted”

The Jerusalem Post said it ran a “stress test” on four AI systems — Anthropic’s Claude, Google’s Gemini, OpenAI’s ChatGPT, and xAI’s Grok — asking them to name the exact day the U.S. would strike Iran. The Post reported that after repeatedly pressing the systems to stop hedging and choose one date, Grok gave the clearest single-day answer: Saturday, Feb. 28, tying it to the outcome of talks in Geneva.

The other models, according to the Post’s publicly shared summary of results, gave early-March windows or shifted their picks when pushed for specificity, rather than landing on Feb. 28.

Then, on Feb. 28, Israel said it launched a “pre-emptive” strike against Iran, accompanied by U.S. action, according to Reuters. Reuters also reported that an Israeli defense official said the operation had been planned for months and the launch date chosen weeks in advance, meaning the date existed well before the AI exercise, even if it wasn’t public.

Why Grok hit the date while the other AIs didn’t

1) Grok gave a single-day answer — the others mostly didn’t.

The biggest difference isn’t that Grok had secret information. It’s that, in the reported test, Grok was the one most willing to commit to a specific day when pressed. If three systems respond with ranges (or refuse to “predict”), and one system names an exact date, the only one that can “hit” an exact date is the one that actually chooses one.
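The base-rate arithmetic behind this point can be sketched in a few lines. This is a toy illustration, not the Post’s methodology: the two-week window and the number of picks are assumptions chosen only to show that a forced exact-date guess is unlikely per attempt but far from astonishing across several models pressed repeatedly.

```python
# Toy model: a chatbot forced to name one day from a plausible N-day
# window has a 1/N chance of an exact hit per pick, assuming each pick
# is uniform over the window. Across many models and repeated prompts,
# the chance that *someone* hits the true date grows quickly.
from fractions import Fraction

def hit_probability(window_days: int, independent_picks: int) -> Fraction:
    """P(at least one forced pick lands on the true date)."""
    miss = Fraction(window_days - 1, window_days)
    return 1 - miss ** independent_picks

# One model, one forced pick from a hypothetical two-week window:
print(hit_probability(14, 1))   # 1/14, about 7%

# Four models, each pressed a few times (say 3 picks apiece):
print(float(hit_probability(14, 12)))  # roughly 0.59
```

Under these assumed numbers, a single exact hit somewhere in the exercise is closer to a coin flip than a miracle; the models that answered with ranges simply opted out of the lottery.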

2) The Post tied Grok’s date to a “decision window” narrative.

The Jerusalem Post said Grok anchored its Feb. 28 call to Geneva talks and the idea that failed diplomacy could trigger near-term military action. That’s an important detail, because it suggests Grok wasn’t conjuring a date from nowhere; it was using a common analyst approach — look for deadlines, talks, travel schedules, force posture, and public warnings — and then making a forced choice.

3) The other systems reportedly leaned into hedging and safety behavior.

Major chatbots are typically tuned to avoid making confident, date-certain claims about violent events. In the Post’s framing, the models became more specific only after repeated pushing — but they still largely landed in early March or shifted their picks. That’s consistent with how many AI systems behave under pressure: they’ll offer probabilistic windows, caveats, or refusals rather than a clean “Feb. 28.”
