One of Meta's newest AI models, Llama 4 Maverick, ranks below rivals on a popular chat benchmark. Meta didn't originally ...
Meta accused of using unreleased Llama 4 variants to boost AI benchmark rankings, prompting LM Arena policy changes.
Earlier this week, Meta landed in hot water for using an experimental, unreleased version of its Llama 4 Maverick model to achieve a high score on a crowdsourced benchmark, LM Arena. The incident ...
But they said things like that. And one of the benchmarks that really got people’s attention was LM Arena. You know LM Arena? So it’s this really interesting project. It is a very small ...