NASA and IBM Researchers Introduce INDUS: A Suite of Domain-Specific Large Language Models (LLMs) for Advanced Scientific Research

Large Language Models (LLMs) trained on extensive datasets have shown impressive abilities in natural language generation and understanding. However, these models often struggle in specialized domains, where the vocabulary and context differ from their general-purpose training data (a distributional shift). To address this issue, researchers from NASA and IBM collaborated to develop INDUS, a set of encoder-based LLMs specialized in Earth sciences, astronomy, physics, astrophysics, heliophysics, planetary sciences, and biology.

INDUS includes various models tailored to different needs, such as an encoder model for natural language understanding, a contrastive-learning-based general text embedding model for information retrieval tasks, and smaller model versions for lower latency or limited computational resources. The team also created three new scientific benchmark datasets to advance research in interdisciplinary fields.
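
To make the information retrieval use case concrete, here is a minimal sketch of how a text embedding model is typically used for retrieval: the query and each document are mapped to vectors, and documents are ranked by cosine similarity to the query. The vectors below are hand-written toy values, not outputs of any INDUS model, and the function names `cosine_similarity` and `rank_documents` are assumptions made for this illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar to the query."""
    scores = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy embeddings (hypothetical values standing in for real model outputs).
query = [0.9, 0.1, 0.0]
documents = {
    "solar_wind": [0.8, 0.2, 0.1],
    "cell_biology": [0.0, 0.1, 0.9],
}

# "solar_wind" ranks first because its vector points in nearly the same direction as the query.
for doc_id, score in rank_documents(query, documents):
    print(doc_id, round(score, 3))
```

In a real system the vectors would come from the embedding model, and the ranking step would usually run against a vector index rather than a Python dictionary.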

The team used byte-pair encoding (BPE) to create INDUSBPE, a specialized tokenizer that improves the model’s comprehension of domain-specific language. By pretraining encoder-only LLMs and then fine-tuning them with a contrastive learning objective, the team produced general-purpose ("universal") sentence-embedding models. Smaller versions of these models were trained using knowledge distillation so that performance holds up in latency-sensitive or resource-constrained scenarios.
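
As a rough illustration of the byte-pair encoding idea mentioned above, the sketch below learns merge rules from a tiny word list by repeatedly merging the most frequent adjacent symbol pair. It is a character-level toy, not the INDUSBPE training code: production tokenizers typically work at the byte level on a large corpus, and the function names and the sample words here are assumptions chosen for the example.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    merged = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_bpe_merges(corpus, num_merges):
    """Learn up to `num_merges` merge rules from a list of words."""
    # Each word starts as a tuple of single characters, weighted by its frequency.
    vocab = Counter(tuple(word) for word in corpus)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs across the whole (weighted) vocabulary.
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # every word is already a single symbol
        # Most frequent pair wins; ties are broken by the pair itself for determinism.
        best = max(pair_counts, key=lambda p: (pair_counts[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def tokenize(word, merges):
    """Split a word into subword tokens by applying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Hypothetical domain word list; a real tokenizer would be trained on a large corpus.
corpus = ["heliophysics", "heliosphere", "heliopause", "helium"]
merges = learn_bpe_merges(corpus, 10)
print(tokenize("heliosheath", merges))
```

Because the merges are learned from domain text, frequent scientific stems tend to become single tokens instead of being split into many short pieces, which is the motivation for training a domain-specific tokenizer.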

Experimental findings showed that these models outperformed domain-specific encoders like SciBERT and general-purpose models like RoBERTa on both the newly introduced benchmark tasks and existing domain-specific benchmarks. Overall, INDUS gives professionals and researchers in scientific domains a capable tool for accurate and effective natural language processing.

Source link: https://www.marktechpost.com/2024/07/04/nasa-and-ibm-researchers-introduce-indus-a-suite-of-domain-specific-large-language-models-llms-for-advanced-scientific-research/?amp
