Database vendor Rockset is raising $44 million in new funding, as demand for its real-time indexing capabilities grows in the modern generative AI era.

The new funding follows the company’s Series B round and brings the San Mateo, California-based company’s total funding to date to $105 million. Icon Ventures led the new round, with participation from Glynn Capital, Four Rivers, K5 Global, Sequoia and Greylock.

Throughout 2023, Rockset has been expanding its technology, which is built on RocksDB, the open-source persistent key-value store originally created at Meta (formerly Facebook). In March, Rockset rolled out a platform update designed to make its real-time indexing database dramatically faster. That update was followed in April by vector embedding support to help enable AI use cases.

“We’re getting pulled in more and more into AI applications that are getting built, and that is a very, very big platform shift that’s happening,” Venkat Venkataramani, cofounder and CEO of Rockset, told VentureBeat. “Fundamentally what we do is real-time indexing, and it turns out applications also need real-time indexing on vector embeddings.”

Vector support is about more than just a new data type

The use of vector embeddings, stored in some form of vector database, has grown in 2023 with the rise of generative AI.

Vectors, numerical representations of data, are used to help power large language models (LLMs). There are a number of purpose-built vector databases, including Pinecone and Milvus, which join a growing number of existing database technologies including DataStax, MongoDB and Neo4j that support vector embeddings.

Inside Rockset, vector embeddings are supported as a data type known as an “array of floats” in the existing database. Venkataramani emphasized, however, that simply supporting vectors as a data type isn’t what’s particularly interesting to him.
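To make the “array of floats” idea concrete, here is a minimal sketch in plain Python. It is not Rockset code: the record layout, the field names and the `cosine_similarity` helper are assumptions chosen for illustration. It shows an embedding stored as an ordinary list of floats next to regular fields, and one common way to compare two such vectors.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero float vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical record: the embedding is just an array of floats
# stored alongside ordinary metadata fields.
record = {
    "id": "doc-1",
    "title": "Real-time indexing",
    "embedding": [0.2, 0.7, 0.1],
}

query_embedding = [0.25, 0.65, 0.05]
score = cosine_similarity(record["embedding"], query_embedding)
```

Scores closer to 1.0 mean the two vectors point in nearly the same direction, which is typically read as the underlying items being more similar.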

What interests him more is that Rockset has built real-time indexing for vector embeddings. An index is a lookup structure that lets the database search a set of data without scanning every record. Keeping that index updated in real time is critical for production use cases that need the most current information possible.

As it turns out, the same basic approach that Rockset has built for real-time indexing of metadata also works well for vectors. Having a real-time index that can query both regular data and vectors is useful for modern AI applications, according to Venkataramani.

“Every AI application we were dealing with doesn’t only work with vectors. There are always all these other database metadata fields associated with every one of them — and the application needs to query on all of them,” he said.

How Rockset has built a real-time index for vector embeddings

At the foundation of Rockset’s real-time database is the RocksDB data store, which the company has extended with the RocksDB Cloud technology.

Venkataramani explained that Rockset has developed a number of advanced techniques with RocksDB Cloud that help accelerate indexing for all data types. He noted that RocksDB Cloud now has an approximate nearest neighbor (ANN) indexing implementation, which is critical to enabling real-time search on vector data.

“Now, like any other index in Rockset, once you build a similarity ANN index for a vector embeddings column, it’s always up-to-date,” Venkataramani said. “It just automatically keeps itself up-to-date across inserts, updates and deletes.”
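The behavior Venkataramani describes, an index that reflects inserts, updates and deletes as soon as they happen, can be illustrated with a toy in-memory class. This is a sketch under stated assumptions, not Rockset's or RocksDB Cloud's implementation: the class name `LiveVectorIndex` and its methods are hypothetical, and it uses an exact brute-force nearest-neighbor scan in place of a real ANN structure to keep the example short.

```python
import math

class LiveVectorIndex:
    """Toy index: every write is visible to the very next search.

    Assumption: exact brute-force search stands in for an ANN index.
    """

    def __init__(self):
        self._vectors = {}  # key -> list of floats

    def upsert(self, key, vector):
        # Covers both insert (new key) and update (existing key).
        self._vectors[key] = list(vector)

    def delete(self, key):
        # Removing a missing key is a no-op.
        self._vectors.pop(key, None)

    def search(self, query, k=1):
        # Rank all stored keys by Euclidean distance to the query.
        def distance(vector):
            return math.sqrt(sum((a - b) ** 2 for a, b in zip(query, vector)))
        ranked = sorted(self._vectors, key=lambda key: distance(self._vectors[key]))
        return ranked[:k]

index = LiveVectorIndex()
index.upsert("a", [0.0, 0.0])
index.upsert("b", [1.0, 1.0])
after_insert = index.search([0.9, 0.9])   # "b" is nearest

index.upsert("a", [0.9, 0.9])             # update moves "a" onto the query
after_update = index.search([0.9, 0.9])   # "a" is now nearest

index.delete("a")
after_delete = index.search([0.9, 0.9])   # "b" is nearest again
```

A production ANN index trades some accuracy for speed and has to do real work to stay current under writes; the sketch only shows the contract that a search never sees stale data.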

Rockset also integrates a distributed SQL engine for fast data queries. Venkataramani noted that the company’s SQL engine is now able to execute real-time queries across all supported data types on the database.

“You can now literally, in a single SQL query, do a whole bunch of filters and joins and aggregations, and also use a vector embedding to do ranking relevance in a similarity search use case,” he said. “A single SQL query is extremely efficient and very, very fast, because the SQL engine is built to power applications and not analysts that are waiting for reports.”
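The pattern in that quote, metadata filters combined with similarity ranking in one query, can be sketched in plain Python over in-memory rows. This is an illustration only: it does not use Rockset's SQL dialect or client, and the `hybrid_query` function, the row fields and the sample data are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero float vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def hybrid_query(rows, category, min_rating, query_vector, limit):
    """Filter on metadata first, then rank the survivors by vector similarity."""
    matches = [
        row for row in rows
        if row["category"] == category and row["rating"] >= min_rating
    ]
    matches.sort(
        key=lambda row: cosine_similarity(row["embedding"], query_vector),
        reverse=True,
    )
    return [row["id"] for row in matches[:limit]]

# Hypothetical rows mixing regular fields with an embedding column.
rows = [
    {"id": 1, "category": "shoes", "rating": 4.5, "embedding": [0.9, 0.1]},
    {"id": 2, "category": "shoes", "rating": 3.0, "embedding": [1.0, 0.0]},
    {"id": 3, "category": "shoes", "rating": 4.8, "embedding": [0.2, 0.8]},
    {"id": 4, "category": "hats", "rating": 4.9, "embedding": [1.0, 0.0]},
]

result = hybrid_query(rows, "shoes", 4.0, [1.0, 0.0], limit=2)
```

Rows 2 and 4 match the query vector exactly but are excluded by the rating and category filters, which is why an application needs both kinds of predicate evaluated together.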

Venkataramani expects much more development of AI capabilities in Rockset. Among the future capabilities he is anticipating is support for GPU acceleration to further speed queries for LLMs and generative AI use cases.

“This industry is just getting started. This platform shift is not a fad; this is going to be a core part of every application,” he said.
