
Databricks' $1.3 billion buy of AI startup MosaicML is a battle for the database's future

June 26, 2023

Naveen Rao (left), MosaicML co-founder and CEO, and Hanlin Tang, co-founder and CTO. The company's training technologies are being applied to "building experts," using large language models more efficiently to handle corporate data.

MosaicML

On Monday, Databricks, a ten-year-old software maker based in San Francisco, announced it would acquire MosaicML, a three-year-old San Francisco-based startup focused on taking AI beyond the lab, for $1.3 billion.

The deal is a sign not only of the fervor for assets in the white-hot generative artificial intelligence market, but also of the changing nature of the modern cloud database market.

Also: What is ChatGPT and why does it matter? Here's what you need to know

MosaicML, staffed with semiconductor veterans, has built a program called Composer that makes it easy and affordable to take a standard AI program, such as OpenAI's GPT, and dramatically speed up its initial development phase, known as training a neural network.
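To make that concrete, here is a minimal sketch of what a Composer-style training run might look like. The import paths, class names, and the LabelSmoothing algorithm shown are assumptions based on MosaicML's public documentation and may differ across Composer versions; the toy model and synthetic data are stand-ins for a real workload.

```python
# A minimal sketch of a Composer-style training run (assumed API; see note above).
import torch
from torch.utils.data import DataLoader, TensorDataset
from composer import Trainer                    # assumed import path
from composer.models import ComposerClassifier  # assumed wrapper for torch modules
from composer.algorithms import LabelSmoothing  # assumed speed-up "recipe"

# Toy classifier and synthetic data stand in for a real model and dataset.
module = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
data = TensorDataset(torch.randn(256, 1, 28, 28), torch.randint(0, 10, (256,)))
loader = DataLoader(data, batch_size=32)

# The Trainer layers efficiency algorithms onto an otherwise standard loop;
# adding or removing recipes is a one-line change to the `algorithms` list.
trainer = Trainer(
    model=ComposerClassifier(module),
    train_dataloader=loader,
    max_duration="2ep",                         # train for two epochs
    algorithms=[LabelSmoothing(smoothing=0.1)],
)
trainer.fit()
```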

The company this year introduced cloud-based commercial services where businesses can, for a fee, both train a neural network and perform inference, the rendering of predictions in response to user queries.

However, the more profound implication of MosaicML's approach is that whole areas of working with data -- such as the traditional relational database -- could be completely reinvented.

"Neural network models can actually be thought of almost as a database of sorts, especially when we're talking about generative models," Naveen Rao, co-founder and CEO of MosaicML, told in an interview prior to the deal. 

"At a very high level, what a database is, is a set of endpoints that are typically very structured, so typically rows and columns of some sort of data, and then, based upon that data, there is a schema on which you organize it," explained Rao.

Unlike a traditional relational database, such as Oracle, or a document database, such as MongoDB, said Rao, where the schema is preordained, with a large language model, "the schema is discovered from [the data], it produces a latent representation based upon the data, it's flexible." And the query is also flexible, unlike the fixed lookups of a query language such as SQL, which dominates traditional databases.

Also: Serving Generative AI just got a lot easier with OctoML's OctoAI

"So, basically," added Rao, "You took the database, loosened up the constraints on its inputs, schema, and its outputs, but it is a database." In the form of a large language model, such a database, moreover, can handle large blobs of data that have eluded traditional structured data stores.

"I can ingest a whole bunch of books by an author, and I can query ideas and relationships within those books, which is something you can't do with just text," said Rao. 

With clever prompting of an LLM, the prompt's context gives flexible ways to query the database. "When you prompt it the right way, you'll get it to produce something because of the context created by the prompt," explained Rao. "And, so, you can query aspects of the original data from that, which is a pretty big concept that can apply to many things, and I think that's actually why these technologies are very important."
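As a rough illustration of that "loosened-up database" idea, the sketch below treats a pile of unstructured documents as the store and a prompt as the query. The `generate` function is a hypothetical placeholder for any LLM completion call; nothing here is MosaicML's own API.

```python
# Sketch: querying unstructured text the way Rao describes, with a prompt
# standing in for SQL. `generate` is a hypothetical placeholder, not a real API.
from typing import List

def generate(prompt: str) -> str:
    """Stand-in for a call to a hosted or local large language model."""
    raise NotImplementedError

def query_corpus(documents: List[str], question: str) -> str:
    # No rows, columns, or fixed schema: the prompt's context determines
    # what the model retrieves and how it relates the ideas in the corpus.
    context = "\n\n".join(documents)
    prompt = (
        "Using only the material below, answer the question.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )
    return generate(prompt)

# The same corpus supports arbitrarily phrased queries, for example:
# query_corpus(book_chapters, "How do the author's ideas about memory evolve across these books?")
```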

The MosaicML work is part of a broad movement to make so-called generative AI programs like ChatGPT more relevant for practical business purposes. 

Also: Why open source is essential to allaying AI fears, according to Stability.ai founder

For example, Snorkel, a three-year-old AI startup based in San Francisco, offers tools that let companies write functions which automatically create labeled training data for so-called foundation models -- the largest neural nets that exist, such as OpenAI's GPT-4.
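For a sense of what such labeling functions look like, here is a minimal sketch in the style Snorkel popularized. The decorator and applier names follow Snorkel's public documentation but should be read as assumptions, and the heuristics are invented purely for illustration.

```python
# Sketch of programmatic labeling with weak heuristics (Snorkel-style API assumed).
import pandas as pd
from snorkel.labeling import labeling_function, PandasLFApplier

ABSTAIN, NOT_COMPLAINT, COMPLAINT = -1, 0, 1  # label conventions, not part of the API

@labeling_function()
def lf_mentions_refund(x):
    # Weak heuristic: tickets asking for refunds are probably complaints.
    return COMPLAINT if "refund" in x.text.lower() else ABSTAIN

@labeling_function()
def lf_says_thanks(x):
    return NOT_COMPLAINT if "thank you" in x.text.lower() else ABSTAIN

df = pd.DataFrame({"text": ["Please issue a refund.", "Thank you for the quick fix!"]})
label_matrix = PandasLFApplier([lf_mentions_refund, lf_says_thanks]).apply(df)
print(label_matrix)  # one weak label per (example, labeling function) pair
```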

And another startup, OctoML, last week unveiled a service to smooth the work of serving up inference.

The acquisition by Databricks brings MosaicML into a vibrant non-relational database market that has for several years been shifting the paradigm of a data store beyond row and column. 

That includes the data lake of Hadoop and techniques to operate on it, such as the map and reduce paradigm of Apache Spark, of which Databricks is the leading proponent. The market also includes streaming data technologies, where the store of data can in some sense be in the flow of data itself, known as "data in motion," such as the Apache Kafka software promoted by Confluent.

Also: The best AI chatbots: ChatGPT and other noteworthy alternatives

MosaicML, which raised $64 million prior to the deal, appealed to businesses with language models that are not so much generalists in the mold of ChatGPT but models focused on domain-specific business use cases, what Rao called "building experts."

The prevailing trend in artificial intelligence, including generative AI, has been to build programs that are more and more general, capable of handling tasks in all sorts of domains, from playing video games to engaging in chat to writing poems, captioning pictures, writing code, and even controlling a robotic arm stacking blocks.

The fervor over ChatGPT demonstrates how compelling such a broad program can be when it can be wielded to handle any number of requests. 

Also: AI startup Snorkel preps a new kind of expert for enterprise AI

And yet, the use of AI in the wild, by individuals and institutions, is likely to be dominated by approaches far more focused because they can be far more efficient. 

"I can build a smaller model for a particular domain that greatly outperforms a larger model," Rao told .

MosaicML made a name for itself by demonstrating its prowess in the MLPerf benchmark tests, which measure how fast a neural network can be trained. Among the secrets to speeding up AI is the observation that smaller neural networks, built with greater focus, can be more efficient.

That idea was explored extensively in a 2019 paper by MIT scientists Jonathan Frankle and Michael Carbin that won a best paper award that year at the International Conference on Learning Representations. The paper introduced the "lottery ticket hypothesis," the notion that every big neural net contains "sub-networks" that can be just as accurate as the total network, but with less compute effort. 
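In code, the idea amounts to training a dense network, keeping only its largest-magnitude weights, and rewinding those survivors to their original initialization before retraining. The sketch below is a simplified, single-shot version of that procedure; the paper itself prunes iteratively and layer by layer.

```python
# Simplified single-shot sketch of lottery-ticket-style magnitude pruning.
import torch

def magnitude_mask(weights: torch.Tensor, sparsity: float) -> torch.Tensor:
    # Keep the largest-magnitude (1 - sparsity) fraction of weights.
    keep = int(weights.numel() * (1.0 - sparsity))
    threshold = weights.abs().flatten().kthvalue(weights.numel() - keep + 1).values
    return (weights.abs() >= threshold).float()

layer = torch.nn.Linear(784, 300)
initial_weights = layer.weight.detach().clone()  # saved before any training

# ... train the dense network here ...

mask = magnitude_mask(layer.weight.detach(), sparsity=0.8)  # drop 80% of weights
with torch.no_grad():
    layer.weight.copy_(initial_weights * mask)   # rewind the "winning ticket" to its init
# ... retrain; only the surviving sub-network carries the learning signal ...
```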

Also: Six skills you need to become an AI prompt engineer

Frankle and Carbin have been advisors to MosaicML. 

MosaicML also draws explicitly on techniques explored by Google's DeepMind unit that show there is an optimal balance between the amount of training data and the size of a neural network. By boosting the amount of training data by as much as double, it's possible to make a smaller network much more accurate than a bigger one of the same type. 
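A back-of-the-envelope version of that balance, using the commonly cited rule of thumb of roughly 20 training tokens per parameter and about six floating-point operations per parameter per token, looks like the sketch below. The constants are approximations for illustration, not DeepMind's exact fit.

```python
# Rough compute-optimal scaling arithmetic (approximate constants, for illustration).
TOKENS_PER_PARAM = 20          # rough rule of thumb from the DeepMind scaling work
FLOPS_PER_PARAM_PER_TOKEN = 6  # standard training-cost approximation

def compute_optimal(params: float):
    tokens = TOKENS_PER_PARAM * params
    flops = FLOPS_PER_PARAM_PER_TOKEN * params * tokens
    return tokens, flops

for params in (1e9, 10e9, 70e9):
    tokens, flops = compute_optimal(params)
    print(f"{params / 1e9:>4.0f}B params -> ~{tokens / 1e12:.2f}T tokens, ~{flops:.1e} training FLOPs")
```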

All of those efficiencies are encapsulated by Rao in what he calls a kind of Moore's Law of the speed-up of networks. Moore's Law is the semiconductor rule of thumb which posited, roughly, that the number of transistors in a chip would double every 18 months for the same price. This is the economic miracle that made possible the PC revolution, followed by the smartphone revolution.

Also: Google, Nvidia split top marks in MLPerf AI training benchmark

In Rao's version, neural nets can become four times faster with every generation, just by applying the tricks of smart compute with the MosaicML Composer tool.
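Taken at face value, that claim compounds quickly, as the toy calculation below shows; the four-times figure is Rao's, and the baseline cost is hypothetical.

```python
# Toy illustration of a "Moore's Law for networks": 4x speedup per generation (Rao's figure).
baseline_gpu_hours = 1000.0   # hypothetical cost to train a given model today
speedup_per_generation = 4.0

for generation in range(4):
    cost = baseline_gpu_hours / speedup_per_generation ** generation
    print(f"generation {generation}: ~{cost:,.1f} GPU-hours")
# generation 0: ~1,000.0 GPU-hours ... generation 3: ~15.6 GPU-hours
```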

Several surprising insights come from such an approach. One, contrary to the oft-repeated claim that machine learning forms of AI require massive amounts of data, it may be that smaller data sets can work well if applied in the optimal balance of data and model size.

