Alibaba dominates Hugging Face's LLM leaderboard, #ChineseAIleadership

Hugging Face has released the second version of its LLM leaderboard, which ranks open language models across a range of tasks. Alibaba’s Qwen models dominate the rankings, taking three spots in the top ten, with a Qwen model in first place. The leaderboard uses six benchmarks to test models on knowledge, reasoning, math, and instruction following. Other models in the rankings include Meta’s Llama3-70B. The tests are run on Hugging Face’s own computers, powered by 300 Nvidia H100 GPUs. The leaderboard is open for submissions, with a new voting system that prioritizes popular new entries for testing.

Hugging Face’s first leaderboard became popular among developers aiming for high ranks, but as models improved, its results became less meaningful, which led to the creation of the second leaderboard. Some models scored worse on the new leaderboard because they had been over-trained on the first one’s benchmarks. This regression suggests that tuning a model on narrow, benchmark-specific data can make it worse at other tasks, and it highlights how much training data matters to LLM performance. The limitations of current language models also show that true artificial intelligence remains a distant goal. Hugging Face’s collaborative approach and commitment to transparency make it a trusted source in the LLM space.

Source link: https://www.tomshardware.com/tech-industry/artificial-intelligence/chinese-llms-storm-hugging-faces-chatbot-benchmark-leaderboard-alibaba-runs-the-board-as-major-us-competitors-have-worsened
