INDUS: NASA and IBM Researchers' Advanced Scientific Language Models #ResearchTech

Large Language Models (LLMs) trained on extensive data sets have shown impressive abilities in natural language generation and understanding. However, these models often struggle in specialized domains due to a distributional shift in vocabulary and context. To address this issue, a team of researchers from NASA and IBM collaborated to develop INDUS, a set of encoder-based LLMs specialized in Earth sciences, astronomy, physics, astrophysics, heliophysics, planetary sciences, and biology.

INDUS includes various models tailored to different needs, such as an encoder model for natural language understanding, a contrastive-learning-based general text embedding model for information retrieval tasks, and smaller model versions for lower latency or limited computational resources. The team also created three new scientific benchmark datasets to advance research in interdisciplinary fields.
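To make the retrieval use case concrete, here is a minimal sketch of how a text embedding model is typically used for information retrieval: the query and each passage are mapped to vectors, and passages are ranked by cosine similarity to the query. The vectors below are made-up toy values, and the helper names `cosine_similarity` and `rank_passages` are hypothetical; nothing here is taken from the INDUS code, which may score or index passages differently.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero vector as unrelated to everything
    return dot / (norm_u * norm_v)


def rank_passages(query_vec, passage_vecs):
    """Return passage indices ordered from most to least similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), idx)
              for idx, vec in enumerate(passage_vecs)]
    scored.sort(key=lambda pair: -pair[0])  # highest similarity first
    return [idx for _, idx in scored]


# Toy embeddings standing in for the output of an embedding model (assumption).
query = [1.0, 0.0]
passages = [[0.0, 1.0], [1.0, 0.1], [-1.0, 0.0]]
ranking = rank_passages(query, passages)
```

In a real system the vectors would come from the embedding model itself, and a nearest-neighbor index would usually replace the exhaustive loop once the collection grows large.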

The team used byte-pair encoding (BPE) to build INDUSBPE, a specialized tokenizer that improves the models' handling of domain-specific vocabulary. They pretrained encoder-only models and then fine-tuned them with a contrastive learning objective to obtain general-purpose ("universal") sentence-embedding models. Smaller versions of these models were trained with knowledge distillation so that they retain much of the larger models' performance in resource-constrained settings.
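For readers unfamiliar with BPE, the sketch below shows the core idea on a toy corpus: start from single characters and repeatedly merge the most frequent adjacent pair into a new vocabulary symbol. It is a simplified, character-level illustration, not the INDUSBPE implementation; the function names (`merge_pair`, `learn_bpe_merges`, `tokenize`) and the toy corpus are assumptions made for this example, and production tokenizers usually operate on bytes and add pre-tokenization rules.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Fuse every non-overlapping occurrence of `pair` in a symbol sequence."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return merged


def learn_bpe_merges(corpus, num_merges):
    """Learn up to `num_merges` merge rules from a list of words."""
    # Each distinct word is a tuple of symbols, weighted by its frequency.
    vocab = Counter(tuple(word) for word in corpus)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs across the whole corpus.
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # every word is already a single symbol
        # Most frequent pair wins; ties are broken by the pair itself.
        best = max(pair_counts, key=lambda p: (pair_counts[p], p))
        merges.append(best)
        # Rewrite the corpus with the chosen pair fused into one symbol.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[tuple(merge_pair(list(symbols), best))] += freq
        vocab = new_vocab
    return merges


def tokenize(word, merges):
    """Split a word into subword tokens by replaying the learned merges in order."""
    symbols = list(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols


# Toy corpus (assumption): a real tokenizer is trained on a large text collection.
toy_corpus = ["abab", "abab", "abc"]
toy_merges = learn_bpe_merges(toy_corpus, num_merges=2)
toy_tokens = tokenize("abc", toy_merges)
```

Because merges are learned from corpus statistics, training on scientific text tends to keep frequent domain terms as whole tokens or a few long pieces, rather than splitting them into many short fragments as a general-purpose vocabulary might.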

Experiments showed that these models outperformed domain-specific encoders such as SciBERT and general-purpose models such as RoBERTa on both the newly created benchmark tasks and existing domain-specific benchmarks. Overall, INDUS gives professionals and researchers in these scientific domains a capable tool for natural language processing tasks in their fields.

Source link: https://www.marktechpost.com/2024/07/04/nasa-and-ibm-researchers-introduce-indus-a-suite-of-domain-specific-large-language-models-llms-for-advanced-scientific-research/?amp