AI21 and Databricks show open source can radically slim down AI

Image: AI21 Labs, Databricks

As the forces of open-source generative AI try to counter the closed-source AI giants like OpenAI and Anthropic, one of their key weapons will be the efficiency gains from running smaller models that take less time to train, less energy, fewer computing resources, and, as a result, less money.

In that vein, last week brought two new open-source large language models that compete with the best closed-source models from OpenAI and Anthropic. AI startup AI21 Labs and database technology vendor Databricks separately demonstrated how smaller neural networks can match much bigger models, at least on benchmark tests.

AI21's Jamba is a remarkable combination of two different approaches to language models: a Transformer, the key technology on which most language models are based, including OpenAI's GPT-4, and a second neural network called a "state space model," or SSM.

Scholars at Carnegie Mellon University and Princeton University improved the SSM to make a more efficient solution called "Mamba." AI21 scholars Opher Lieber, Barak Lenz and team then combined Mamba with the Transformer to produce "Joint Attention and Mamba," or Jamba. As described in AI21's blog post, "Jamba outperforms or matches other state-of-the-art models in its size class on a wide range of benchmarks."

Figure: Jamba combines a form of recurrent neural network called a state space model with a typical Transformer, a novel hybrid that combines the virtues of each. (Image: AI21 Labs)

In a number of tables, Lieber, Lenz and team show how Jamba performs on reasoning and other tasks. "Noticeably, Jamba performs comparably to the leading publicly available models of similar or larger size, including Llama-2 70B and Mixtral," they write.

Figure: Jamba meets or beats top open-source models despite radically slimming down the memory requirements of neural nets. (Image: AI21 Labs)

Jamba slims down the memory usage of a large language model. At 12 billion active "parameters," or neural weights, it is in one sense comparable to Meta's open-source Llama 2 7-billion-parameter model. However, while Llama 2 7B uses 128GB of DRAM to store the "keys and values" that make the Transformer's attention function work over a 256K-token context, Jamba requires only 4GB.

As the team put it, "Trading off attention layers for Mamba layers reduces the total size of the KV cache" (the key-value memory database). The result of slimming down the memory is that "we end up with a powerful model that fits in a single 80GB GPU" (one of Nvidia's older A100 GPUs).
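The two memory figures can be roughly reproduced with back-of-the-envelope arithmetic. The sketch below is not AI21's own calculation: the function name `kv_cache_bytes` is made up for illustration, and the layer and head counts are assumptions taken from the models' publicly posted configurations rather than from this article (Llama 2 7B with 32 attention layers and 32 key-value heads of size 128; Jamba with only 4 attention layers and 8 key-value heads of size 128, the remaining layers being Mamba). It also assumes 16-bit values and a 256K-token context.

```python
def kv_cache_bytes(attn_layers, kv_heads, head_dim, context_tokens, bytes_per_value=2):
    """Bytes needed to cache keys and values for a single sequence."""
    # Factor of 2: one key vector and one value vector are stored
    # per token, per KV head, per attention layer.
    return 2 * attn_layers * kv_heads * head_dim * context_tokens * bytes_per_value


GIB = 1024 ** 3
CONTEXT = 256 * 1024  # 256K tokens

# Assumed Llama 2 7B shape: every one of its 32 layers uses attention.
llama2_7b = kv_cache_bytes(attn_layers=32, kv_heads=32, head_dim=128,
                           context_tokens=CONTEXT)

# Assumed Jamba shape: only 4 layers use attention; Mamba layers keep no KV cache.
jamba = kv_cache_bytes(attn_layers=4, kv_heads=8, head_dim=128,
                       context_tokens=CONTEXT)

print(f"Llama 2 7B KV cache: {llama2_7b / GIB:.0f} GiB")
print(f"Jamba KV cache:      {jamba / GIB:.0f} GiB")
```

Under these assumptions, the gap comes from two multipliers: fewer layers that need a cache at all, and fewer key-value heads in the layers that do.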

Despite the slimmer size, Jamba hits a new high mark: it can take in more characters or words at once than any other open-source model. "Our model supports a context length of 256K tokens - the longest supported context length for production-grade publicly available models."

Jamba's code is available on Hugging Face under the Apache open-source license.

The second striking innovation last week was Databricks' DBRX. Databricks' internal AI team, MosaicML, which the company acquired in 2023, built DBRX using what's called a "mixture of experts," a large language model approach that shuts off some of the neural weights for any given prediction to reduce computing and memory needs. "MoE," as it's often known, is among the tools that Google used for its recent Gemini large language model.

As Databricks explains in its blog post, "MoEs essentially let you train bigger models and serve them at faster throughput." Because DBRX can shut down some parameters, it uses only 36 billion out of its 132 billion neural weights to make predictions.
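To make the idea of "shutting down" parameters concrete, here is a toy sketch of top-k expert routing in plain Python. It is an illustration only, not Databricks' implementation: the function names are hypothetical, the "experts" are trivial scaling functions, the router scores are random rather than learned, and the choice of 16 experts with 4 active per input is an assumption that does not come from this article.

```python
import math
import random


def top_k_route(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    top = ranked[:k]
    # Softmax over the selected experts only, so their weights sum to 1.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(x, experts, router_logits, k):
    """Blend the outputs of the k chosen experts; the rest are never called."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))


NUM_EXPERTS, TOP_K = 16, 4  # assumed values, for illustration only

# Toy experts: expert number s simply multiplies its input by s.
experts = [lambda x, s=s: s * x for s in range(1, NUM_EXPERTS + 1)]

rng = random.Random(0)  # stands in for a learned router
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

routes = top_k_route(logits, TOP_K)
y = moe_forward(2.0, experts, logits, TOP_K)
print("experts used:", sorted(i for i, _ in routes))
print("output:", y)

# Share of weights in use per prediction, from the figures quoted above.
active_share = 36 / 132
print(f"active share of DBRX weights: {active_share:.0%}")
```

In a real model, routing of this kind is applied per token inside each MoE layer, and weights outside the expert blocks are always in use, so the active share of parameters need not equal the ratio of chosen experts to total experts.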

MoE lets DBRX do more with less. Among its remarkable achievements, "DBRX beats GPT-3.5 on most benchmarks," the MosaicML team wrote, including tests of language understanding and computer coding ability, even though GPT-3.5 has 175 billion parameters (roughly five times the 36 billion that DBRX uses for any one prediction).

Figure: DBRX outperforms OpenAI's GPT-3.5 despite being far smaller in terms of the number of parameters. (Image: Databricks)

What's more, when used through the prompt as a chatbot, "DBRX generation speed is significantly faster than LLaMA2-70B," even though Llama 2 70B puts roughly twice as many parameters to work on each prediction as DBRX's 36 billion.

Databricks aims to drive the adoption of open-source models in enterprises. "Open-source LLMs will continue gaining momentum," the team declared. "In particular, we think they provide an exciting opportunity for organizations to customize open-source LLMs that can become their IP, which they use to be competitive in their industry. Towards that, we designed DBRX to be easily customizable so that enterprises can improve the quality of their AI applications. Starting today on the Databricks Platform, enterprises can interact with DBRX, leverage its long context abilities in RAG systems, and build custom DBRX models on their own private data."

DBRX's code is offered on GitHub and Hugging Face through Databricks' open-source license.

As significant as both achievements are, the one overarching shortcoming of these models is that they are not "multimodal": they deal only with text, not with images and video the way that GPT-4, Gemini, and other models can.
