Large Language Models (LLMs) have shown significant progress on Information Extraction (IE) tasks in Natural Language Processing (NLP), especially when combined with instruction tuning. However, LLMs struggle in low-resource languages due to a lack of training data. To address this, researchers from the Georgia Institute of Technology introduced the TransFusion framework, which translates low-resource language data into English for model training. The framework involves translating input at inference time, fusing the English annotations with the original data, and constructing a TransFusion reasoning chain.
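The three steps above can be sketched roughly as follows. This is a minimal illustrative outline, not the paper's actual implementation: the function names, the toy dictionary "translator", and the stubbed NER step are all hypothetical placeholders standing in for real MT and LLM calls.

```python
# Illustrative sketch of the TransFusion inference flow (all names hypothetical).

# Toy stand-in for a machine-translation system.
TOY_TRANSLATIONS = {
    "ologba Georgia Tech": "the Georgia Tech team",  # fabricated example input
}

def translate_to_english(text: str) -> str:
    # Step 1 (inference-time translation): a real system would call an MT model.
    return TOY_TRANSLATIONS.get(text, text)

def annotate_english(english_text: str) -> list[tuple[str, str]]:
    # Step 2 (annotate in English): a real system would prompt an
    # instruction-tuned LLM; here we stub a single NER prediction.
    spans = []
    if "Georgia Tech" in english_text:
        spans.append(("Georgia Tech", "ORG"))
    return spans

def transfusion(original: str) -> list[tuple[str, str]]:
    # Step 3 (fusion / reasoning chain): combine the English annotations with
    # the original input so the final prediction is grounded in both views.
    english = translate_to_english(original)
    english_spans = annotate_english(english)
    # Keep only spans that can also be located verbatim in the original text.
    return [(span, label) for span, label in english_spans if span in original]

entities = transfusion("ologba Georgia Tech")
```

In a real pipeline, the fusion step would be far richer (e.g. projecting span offsets across the translation rather than exact string matching), but the sketch shows how English-side annotations are carried back to the low-resource input.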
Additionally, the team introduced GoLLIE-TF, an instruction-tuned LLM tailored for Information Extraction (IE) tasks in low-resource languages. Experiments on multilingual IE datasets covering fifty languages demonstrated the effectiveness of GoLLIE-TF in zero-shot cross-lingual transfer. Applying TransFusion to models like GPT-4 also improved performance on low-resource language named entity recognition.
Overall, TransFusion and GoLLIE-TF offer a powerful solution for enhancing IE in low-resource languages by leveraging English translations and fine-tuned models. The framework aims to bridge the performance gap between high- and low-resource languages, as evidenced by improved results across various models and datasets. The combination of TransFusion and GoLLIE-TF shows promise in boosting LLM effectiveness on low-resource languages.
Source link: https://www.marktechpost.com/2024/06/30/transfusion-an-artificial-intelligence-ai-framework-to-boost-a-large-language-models-multilingual-instruction-following-information-extraction-capability/?amp