Comet partners with Snowflake to enhance the reproducibility of machine learning datasets 


MLOps platform Comet today announced a strategic partnership with Snowflake aimed at helping data scientists build better machine learning (ML) models faster and at strengthening data-driven decision-making.

The company said that the collaboration will integrate Comet’s solutions into Snowflake’s unified platform, enabling developers to track and version their Snowflake queries and datasets within their Snowflake environment. 

Comet expects the integration to make it easier to trace the lineage of models and their performance, giving teams better visibility into the development process and into how data changes affect model performance. Customers who train on Snowflake data should see a more streamlined and transparent model development process.

Faster model training, deployment and monitoring

Snowflake’s Data Cloud and Comet’s ML platform combined will allow customers worldwide to build, train, deploy and monitor models significantly faster, according to the companies.

“In addition, this partnership fosters a feedback loop between model development in Comet and data management in Snowflake,” Comet CEO Gideon Mendels told VentureBeat. 

This loop can continuously improve models and bridge the gap between experimenting and deploying them, fulfilling the key promise of ML — the ability to learn and adapt over time. Clear versioning between datasets and models can enable organizations to define actionable steps to address data changes and their impact on models in production.

Comet’s new offering follows its recent release of a suite of tools and integrations designed to accelerate workflows for data scientists working with large language models (LLMs).

Enhancing ML models through constant feedback 

When data scientists or developers execute queries to extract datasets from Snowflake for their ML models, Comet can log, version and directly link these queries to the resulting models. 

Mendels said this approach offers several advantages, including increased reproducibility, collaboration, auditability and iterative improvement.

“The integration between Comet and Snowflake aims to provide a more robust, transparent and efficient framework for ML development by enabling the tracking and versioning of Snowflake queries and datasets within Snowflake itself,” he explained. “By versioning the SQL queries and datasets, data scientists can always trace back to the exact version of the data that was used to train a specific model version. This is crucial for model reproducibility.”
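To illustrate the idea of versioning queries and datasets, here is a minimal Python sketch that fingerprints a SQL query and the rows it returned, then stores both hashes alongside a model version. It uses only the standard library and does not call Comet or Snowflake. The function names (`fingerprint_query`, `fingerprint_rows`, `build_training_record`) and the record layout are hypothetical, chosen for illustration, and are not the integration's actual API.

```python
import hashlib
import json


def fingerprint_query(sql: str) -> str:
    """Hash a SQL query after collapsing runs of whitespace.

    Simplification: whitespace inside string literals is collapsed too.
    """
    normalized = " ".join(sql.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def fingerprint_rows(rows) -> str:
    """Hash the rows a query returned. Row order affects the result."""
    digest = hashlib.sha256()
    for row in rows:
        # sort_keys makes the hash independent of column order in each dict.
        encoded = json.dumps(row, sort_keys=True, default=str)
        digest.update(encoded.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def build_training_record(model_version: str, sql: str, rows) -> dict:
    """Link a model version to the exact query and data used to train it."""
    return {
        "model_version": model_version,
        "sql": sql,
        "query_hash": fingerprint_query(sql),
        "dataset_hash": fingerprint_rows(rows),
    }
```

If a later training run produces the same `query_hash` but a different `dataset_hash`, the underlying table changed even though the query did not, which is the kind of distinction that versioning both is meant to expose.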

Connecting changes in model performance to data alterations

In ML, training data is as important as the model itself. Changes to the data, such as introducing new features, handling missing values or shifting data distributions, can profoundly affect a model’s performance.

The company says that tracing a model’s lineage makes it possible to connect changes in model performance to specific alterations in the data. This not only aids in debugging and understanding performance but also guides data quality and feature engineering work.
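As a rough illustration of connecting a performance change to a data change, the Python sketch below compares two recorded training runs and reports which inputs differed. The `TrainingRun` fields, the `explain_change` function and the use of accuracy as the metric are all assumptions made for this example; the article does not describe how Comet stores or compares lineage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingRun:
    """Hypothetical lineage record for one trained model version."""
    model_version: str
    query_hash: str
    dataset_hash: str
    accuracy: float


def explain_change(previous: TrainingRun, current: TrainingRun) -> dict:
    """Summarize the metric delta between two runs and what data inputs changed."""
    data_changes = []
    if previous.query_hash != current.query_hash:
        data_changes.append("query changed")
    if previous.dataset_hash != current.dataset_hash:
        data_changes.append("dataset changed")
    return {
        "from": previous.model_version,
        "to": current.model_version,
        "accuracy_delta": round(current.accuracy - previous.accuracy, 4),
        "data_changes": data_changes,
    }
```

A report showing a drop in accuracy together with "dataset changed" but no query change would point the investigation at the source table rather than at the extraction logic.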

Mendels said that tracking queries and data over time can create a feedback loop that drives continuous improvements in both data management and model development stages.

“Model lineage can facilitate collaboration among a team of data scientists, as it allows anyone to understand a model’s history and how it was developed without the need for extensive documentation,” said Mendels. “This is particularly useful when team members leave or when new members join the team, allowing for seamless knowledge transfer.”

What’s next for Comet? 

The company claims that customers using Comet — such as Uber, Etsy and Shopify — typically report a 70% to 80% improvement in their ML velocity. 

“This is due to faster research cycles, the ability to understand model performance and detect issues faster, better collaboration and more,” said Mendels. “With the joint solution, this should increase even more as today there are still challenges in bridging the two systems. Customers save on ingress and consumption costs by keeping the data within Snowflake instead of transferring it over the wire and saving it in other locations.”

Mendels said that Comet aims to establish itself as the de facto AI development platform. 

“Our view is that businesses will only see real value from AI after they deploy these models based on their own data,” he said. “Whether they are training from scratch, fine-tuning an OSS model or using context injection to ChatGPT, Comet’s mandate is to make this process seamless and bridge the gap between research and production.”
