The Chinese AI startup DeepSeek's release of its powerful large language model R1 has sent ripples through Silicon Valley and the U.S. stock market, sparking widespread discussion and debate.
In a reasoning test using Arena-Hard, Qwen 2.5-Max achieved 89.4% accuracy, a higher result than DeepSeek R1. When tested on other benchmarks of coding and scientific reasoning, Qwen 2.5 ...