
Microsoft’s Small Language Model: Tiny Stories in 9 Words #TinyStories

TinyStories. The Small Language Model from Microsoft… | by Cobus Greyling | Jun, 2024

TinyStories is a synthetic dataset from Microsoft Research, built to capture the diverse, qualitative elements of natural language in a form that very small models can learn from. The stories were generated with GPT-3.5 and GPT-4, with the generation process deliberately constrained to avoid repetitive, near-identical training examples. Despite being far smaller and more restricted, the small language models trained on TinyStories exhibited behaviors similar to those of much larger language models, and this line of work fed into Microsoft's later small models such as Phi-3. The creators focused on high-quality data, starting with a basic vocabulary of 3,000 words split evenly among nouns, verbs, and adjectives. For each story, they asked a large language model to write a children's story using one randomly chosen word from each category, generating millions of distinct tiny stories. This approach allowed a model to be trained in less than a day on a single GPU. The study highlights the value of a systematic framework for generating synthetic training data for language models, showing that it can lead to effective training and diverse outputs.
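The heart of the approach is a simple prompt template: sample one noun, one verb, and one adjective, then ask a large model to write a short children's story that uses all three. Here is a minimal sketch of that generation loop, assuming the OpenAI Python client; the word lists and model name are placeholder assumptions for illustration, not the exact setup from the paper.

```python
import random
from openai import OpenAI  # assumes the openai Python package is installed

# Placeholder word lists; the real TinyStories vocabulary is a curated set
# of basic words a young child would know, split into these categories.
NOUNS = ["dog", "ball", "tree", "cake", "boat"]
VERBS = ["jump", "find", "share", "build", "sing"]
ADJECTIVES = ["happy", "tiny", "brave", "shiny", "sleepy"]

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_tiny_story() -> str:
    """Sample one word per category and request a story using all three."""
    noun = random.choice(NOUNS)
    verb = random.choice(VERBS)
    adjective = random.choice(ADJECTIVES)
    prompt = (
        "Write a short story for 3-4 year olds using simple words. "
        f"The story must use the verb '{verb}', the noun '{noun}', "
        f"and the adjective '{adjective}'."
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # assumption; the paper used GPT-3.5 and GPT-4
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    # Repeating this loop millions of times with random word triples is what
    # gives the dataset its diversity despite the tiny vocabulary.
    print(generate_tiny_story())
```

Because the three-word constraint changes on every call, the sampled stories stay varied even though each one draws from the same small vocabulary.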

Source link: https://cobusgreyling.medium.com/tinystories-4ce620e569a4?source=rss——large_language_models-5

