Databricks open-sources its Dolly large language AI model

In an attempt to open up its technology to a wider audience, enterprise software company Databricks has released Dolly, a large language model, along with its associated training code, under an open-source licence. Despite being based on a much smaller underlying model, the company says Dolly offers ChatGPT-like functionality and can be run “in-house”.

The move was inspired by the success of OpenAI’s natural language platform ChatGPT, which became one of the fastest-growing consumer apps within a couple of months of its release in November last year. It has since caused some of the world’s largest companies including Microsoft and Google to pivot and release generative and natural language AI tools.

“We show that anyone can take a dated off-the-shelf open source LLM and give it magical ChatGPT-like instruction-following ability by training it in 30 minutes on one machine, using high-quality training data,” Databricks wrote in a blog post explaining the decision.

Databricks found that the type of instruction-following used in ChatGPT “does not seem to require the latest or largest models”, and claims that with just six billion parameters, compared with 175 billion in GPT-3 and many more in GPT-4 or Google’s PaLM, it was able to recreate the functionality of ChatGPT.

“We believe models like Dolly will help democratise LLMs, transforming them from something very few companies can afford into a commodity every company can own and customise to improve their products,” the company stated.

From LLaMA to Alpaca to Dolly

Developers such as OpenAI, Anthropic and AI21 Labs, as well as Microsoft, Google and IBM, charge end-users for access to their large language models through API calls. This can become expensive very quickly if you need to make a lot of calls on a regular basis. Alternatively, training those same models from scratch is an expensive endeavour, requiring hundreds of GPU hours and datasets containing trillions of words.

Then Meta released the weights for its high-quality language model, LLaMA, to researchers. LLaMA had been trained using more than 80,000 GPU hours. Stanford University then built Alpaca on top of LLaMA, fine-tuning it on a subset of 50,000 human-like questions and answers, which led to it exhibiting ChatGPT-like functionality despite the relatively small training dataset.

Dolly, from Databricks, can deliver what the company describes as a “surprising degree of instruction-following capabilities”, but from a much smaller model. Where the Alpaca team demonstrated that a state-of-the-art model could be used as a chatbot engine, Databricks says even years-old models can exhibit those same kinds of behaviour if fine-tuned on a small corpus of instruction training data.
