CoinFeed
Time 03:27

OpenClaw agent task evaluation: Gemini 3 Flash leads with a 95.1% success rate, GPT-4o at 85.2%

March 8, 2026
CoinFeed News

CoinFeed reported on March 8 that 23pads, CISO of SlowMist, posted on X that the PinchBench benchmark evaluates how well large language models perform on OpenClaw agent tasks. According to the results, Gemini 3 Flash leads with a 95.1% success rate, followed by minimax-m2.1 at 93.6% and kimi-k2.5 at 93.4%. Claude Sonnet 4.5 scored 92.7%, and GPT-4o scored 85.2%.
