Meta has a new machine learning language model to remind you it does AI too

Illustration: Nick Barclay / The Verge

The buzz in tech these last few weeks has been focused squarely on the language models developed and deployed by the likes of Microsoft, Google, and OpenAI. But Meta, Facebook’s parent company, continues to do significant work in this field and is releasing a new AI language generator named LLaMA today.
LLaMA isn’t like ChatGPT or Bing; it’s not a system that anyone can talk to. Rather, it’s a research tool that Meta says it’s sharing in the hope of “democratizing access in this important, fast-changing field.” In other words: to help experts tease out the problems of AI language models, from bias and toxicity to their tendency to simply make up information.
To this end, Meta is releasing LLaMA (which is not actually a single system but a quartet of different-sized models) under “a noncommercial license focused on research use cases,” with access granted to groups like universities, NGOs, and industry labs.
“We believe that the entire AI community — academic researchers, civil society, policymakers, and industry — must work together to develop clear guidelines around responsible AI in general and responsible large language models in particular,” the company wrote in its post. “We look forward to seeing what the community can learn — and eventually build — using LLaMA.”

In a research paper, Meta claims that the second-smallest version of the LLaMA model, LLaMA-13B, performs better than OpenAI’s popular GPT-3 model “on most benchmarks,” while the largest, LLaMA-65B, is “competitive with the best models,” like DeepMind’s Chinchilla70B and Google’s PaLM 540B. (The numbers in these names refer to the billions of parameters in each model — a measure of the system’s size and a rough approximation of its sophistication, though these two qualities do not necessarily scale in lockstep.)
Once trained, LLaMA-13B can also run on a single data center-grade Nvidia Tesla V100 GPU. That’ll be welcome news for smaller institutions wanting to run tests on these system but doesn’t mean much for lone researchers for whom such equipment is out of reach.
Meta’s release is also notable partly because it’s missed out on some of the buzz surrounding AI chatbots. (That might not be a bad thing, though, given the criticism Microsoft has received for rushing the launch of Bing and the nosedive taken by Google’s stock price after its own chatbot made an error in a demo.)
Meta has actually released its own accessible AI chatbots in the past, but the reception has been less than stellar. One, named BlenderBot, was criticized for being simply… not very good, while another, named Galactica, which was designed to write scientific papers, was pulled offline after only three days after it kept producing scientific nonsense.
With the LLaMA quartet, Meta is presumably hoping for a kinder reception.
“Today we’re releasing a new state-of-the-art AI large language model called LLaMA designed to help researchers advance their work,” CEO Mark Zuckerberg said in a Facebook post. “LLMs have shown a lot of promise in generating text, having conversations, summarizing written material, and more complicated tasks like solving math theorems or predicting protein structures. Meta is committed to this open model of research and we’ll make our new model available to the AI research community.”

Illustration: Nick Barclay / The Verge

LLaMA isn’t like ChatGPT or Bing; it’s not a system that anyone can talk to. Rather, it’s a research tool that Meta says it’s sharing in the hope of “democratizing access in this important, fast-changing field.” In other words: to help experts tease out the problems of AI language models, from bias and toxicity to their tendency to simply make up information.

To this end, Meta is releasing LLaMA (which is not actually a single system but a quartet of different-sized models) under “a noncommercial license focused on research use cases,” with access granted to groups like universities, NGOs, and industry labs.

“We believe that the entire AI community — academic researchers, civil society, policymakers, and industry — must work together to develop clear guidelines around responsible AI in general and responsible large language models in particular,” the company wrote in its post. “We look forward to seeing what the community can learn — and eventually build — using LLaMA.”

Once trained, LLaMA-13B can also run on a single data center-grade Nvidia Tesla V100 GPU. That’ll be welcome news for smaller institutions wanting to run tests on these system but doesn’t mean much for lone researchers for whom such equipment is out of reach.

Meta’s release is also notable partly because it’s missed out on some of the buzz surrounding AI chatbots. (That might not be a bad thing, though, given the criticism Microsoft has received for rushing the launch of Bing and the nosedive taken by Google’s stock price after its own chatbot made an error in a demo.)

Meta has actually released its own accessible AI chatbots in the past, but the reception has been less than stellar. One, named BlenderBot, was criticized for being simply… not very good, while another, named Galactica, which was designed to write scientific papers, was pulled offline after only three days after it kept producing scientific nonsense.

With the LLaMA quartet, Meta is presumably hoping for a kinder reception.

“Today we’re releasing a new state-of-the-art AI large language model called LLaMA designed to help researchers advance their work,” CEO Mark Zuckerberg said in a Facebook post. “LLMs have shown a lot of promise in generating text, having conversations, summarizing written material, and more complicated tasks like solving math theorems or predicting protein structures. Meta is committed to this open model of research and we’ll make our new model available to the AI research community.”

verge-rss

Recent Posts

Recent Comments

Forget all the fancy writing apps: I wrote a novel in Google Docs and it had everything I needed

MiB working set on a 64 GiB machine

You can now rent Google’s most powerful AI chip: Trillium TPU underpins Gemini 2.0 and will put AMD and Nvidia on high alert

Categories

Archives

Recent Posts

Recent Comments

Forget all the fancy writing apps: I wrote a novel in Google Docs and it had everything I needed

MiB working set on a 64 GiB machine

You can now rent Google’s most powerful AI chip: Trillium TPU underpins Gemini 2.0 and will put AMD and Nvidia on high alert

Categories

Archives

Meta has a new machine learning language model to remind you it does AI too

Leave a Reply Cancel reply

Archives

Categories