Claude 3.5 Sonnet vs. the AI coding tests ChatGPT aced: a creative failure

The article discusses the release of Claude 3.5 Sonnet by the AI company Anthropic, which claims the model is more intelligent than its previous models, especially for code generation tasks. The author runs Claude 3.5 Sonnet through a set of coding challenges and compares the results with those of other AI models, including Microsoft Copilot, Meta AI, Google Gemini Advanced, and ChatGPT.

In the first test, writing a WordPress plugin, Claude 3.5 Sonnet created a clean interface but failed on functionality because of a serious security flaw. In the second test, rewriting a string function, Claude's version did not accept an input consisting only of a decimal value, so it failed. In the third test, finding an annoying bug, Claude succeeded: it identified and corrected the bug, demonstrating its platform knowledge.

The final test involved writing a script using specialized programming tools such as AppleScript and Keyboard Maestro. Claude 3.5 Sonnet failed this test by generating code that would cause a runtime error, revealing a lack of knowledge of specialized programming tools.

Overall, the author was disappointed with Claude 3.5 Sonnet’s performance on programming tasks: it passed only one of the four tests, while ChatGPT's solutions were accurate and functional. The article concludes by recommending ChatGPT for programming assistance, since it outperformed Claude 3.5 Sonnet in the tests conducted.

Source link: https://www.zdnet.com/article/i-pitted-claude-3-5-sonnet-against-ai-coding-tests-chatgpt-aced-and-it-failed-creatively/