Japanese Researchers Release Fugaku-LLM, Trained on the Fugaku Supercomputer

A team of researchers in Japan has developed Fugaku-LLM, a large language model with enhanced Japanese language capabilities, using the RIKEN supercomputer Fugaku. The model has 13 billion parameters and outperforms other models developed in Japan, scoring 5.5 on the Japanese MT-Bench. It was trained on proprietary Japanese data collected by CyberAgent, along with English data, and is available on GitHub and Hugging Face for both research and commercial use. The team optimized training methods and communication performance on Fugaku, showcasing the potential of using CPUs instead of GPUs for training large language models.

The development of Fugaku-LLM involved collaboration between Tokyo Institute of Technology, Tohoku University, Fujitsu, RIKEN, Nagoya University, CyberAgent, and Kotoba Technologies. The model was trained from scratch using the team’s own data, ensuring transparency and safety. Future applications of Fugaku-LLM include natural dialogue in Japanese and innovative research and business applications.

The research was supported under a Fugaku policy-supporting proposal and aims to enhance Japan’s competitiveness in the field of AI. The results are publicly available for further development of large language models, and Fugaku-LLM will also be offered to users via the Fujitsu Research Portal. Overall, the work demonstrates the potential of Fugaku-LLM to advance AI research and applications.

Source link: https://aithority.com/machine-learning/japanese-researchers-release-fugaku-llm-trained-on-the-fugaku-supercomputer/
