This Google leader says ML infrastructure is ‘conduit’ to company’s AI success

Two years ago, Google spun out a new group focused on machine learning infrastructure, led by a VP of engineering from its artificial intelligence research division — part of a push to make “substantial gains” in AI. At this year’s Google I/O, it became clear that this Core ML group, developed to serve as a “center of gravity” in applying ML to Google products, had certainly succeeded in its mission.

“I could see the fingerprints of the team on everything happening on stage,” Nadav Eiron, who built and leads the 1,200-member team, told VentureBeat. “It was an extremely proud moment for me.”

In an exclusive interview, Eiron discussed the essential role Core ML has played in Google’s recent race to implement generative AI in its products — particularly how ML infrastructure serves as a “conduit” between research teams at Google DeepMind and the company’s product teams. (Editor’s note: This interview has been edited for length and clarity.)

VentureBeat: How do you describe the Core ML team’s mission at Google?

Nadav Eiron: We look to the Core ML team to enable innovations to become actual products. I always tell my team that we need to look at the entire journey from the point the researcher has a great idea or product has a need and finds a researcher to solve it — all the way to the point that a billion people’s lives have been changed by that idea. That journey is especially interesting these days because ML is going through an accelerated journey of becoming an industry, while up until two or three years ago, it was just the subject of academic research.

VB: How does your team sit within the Google organization?

Eiron: We sit in an infrastructure organization, and our goal is to provide services to all of Google’s products as well as externally, things like the entire TensorFlow ecosystem, open-source projects that my team owns and develops.

The journey from a great idea to a great product is very, very long and complicated. It’s especially complicated and expensive when it’s not one product but like 25, or however many were announced at Google I/O. And there is the complexity that comes with doing all that in a way that’s scalable, responsible, sustainable and maintainable.

We build a partnership, on the one hand, with Google DeepMind to help them, from the get-go, to think about how their ideas can influence products and what does it mean for those ideas to be built in a way that they’re easy to incorporate into products later. But there is also a tight partnership with the people building the products — providing them with tools, services, technology that they can incorporate into their products. 

As we look at what’s been happening in the past few months, this field has really accelerated because building a generative AI experience is complicated. It’s much more software than just being able to provide input to a model and then take the output from that model. There’s a lot more that goes into that, including owning the model once it’s no longer a research thing, but actually becomes a piece of infrastructure.

VB: This gives me a whole other view into what Google is doing. From your standpoint, what is your team doing that you think people don’t really know about when it comes to Google?

Eiron: So it’s about Google, but I think it’s a wider trend about how ML turns from an academic pursuit into an industry. If you think of a lot of big changes in society, the internet started as a big research project, 20 years later it became an industry and people turned it into a business. I think ML is on the precipice of doing the same thing. If you create this change in a deliberate way, you can make the process happen faster and have better results.

There are things that you do differently with an industry versus in research. I look at it as an infrastructure builder. We really want to make sure that there are industry standards. I gave this example to my team the other day: If you want to optimize shipping, you might argue over whether a shipping container needs to be 35 or 40 or 45 feet. But once you decide shipping containers are the way to go, the fact that everybody agrees on the size is a lot more important than what the size is.

That’s just an example of the kind of stuff that you optimize when you do research and you don’t want to worry about when you build an industry. So this is why, for example, we created the OpenXLA [an open-source ML compiler ecosystem co-developed by AI/ML industry leaders to compile and optimize models from all leading ML frameworks] because the interface into the compiler in the middle is something that would benefit everybody if it’s commoditized and standardized.

VB: How would you describe the way a project goes from a Google DeepMind research paper to a Google product?

Eiron: ML used to be about getting a bunch of data, figuring out the ML architecture, training a model from scratch, evaluating it, rinse and repeat. What we see today is ML turns out to be a lot more like software. You train a foundational model and then you need to fine-tune it and then the foundational model changes and then maybe your fine-tuning data changes and then maybe you want to use it for a different task. So it creates a workflow. That means you need different tools and different things matter. You want these models to have longevity and continuity.

So we ask ourselves questions like, “How can you make updates to the model without people being jarred by it?” That’s a big problem when you build software because you’re going to have many people building the prompts, and you want to be able to update the base model without having 20 products returned. You could say that these unique problems come from scale. You can also say they come from the need to provide continuity to the end user, or from focusing on really delivering the product experience. There’s a big gap between “We have a great model” and “We have a great generative AI experience.”

VB: What is your day-to-day work like?

Eiron: A lot of it is creating connections between different parts of the organization that think differently about things. For example, we talked about the different ways product people think about problems versus researchers. Because we work with all of these folks, we can represent them to each other. We find ourselves in research forums representing the common good of all of the products. We find ourselves in product forums, helping them understand where research is coming from and how we can help them. And obviously, a lot of time is spent with folks supporting the product — responsible AI experts, policy experts — exploring what is possible and what is desirable.

The team basically spans the entire stack — from low-level hardware and software co-design all the way to applied AI — working with the products, advising them on what models to use, helping them build the tools and being full partners in the launch.

VB: Were there any products announced at Google I/O that you really felt strongly about in terms of all the work that your team had put in?

Eiron: I particularly like our collaborations with Google Workspace for a variety of reasons. One, I believe Workspace has a unique opportunity in the generative AI space because generative AI is about generating content and Workspace tools are a lot about creating content. And I feel like having the AI with you in the tool, basically having a little angel sit on your shoulder as you do your work is a super powerful thing to do. 

I’m also especially proud of that because I think the Workspace team came into this generative AI revolution with less expertise and contact with our own research teams than some of the other teams. For example, Search has a long-standing tradition of working on state-of-the-art ML. But Workspace needed more of my team’s help, as the centralized team that has experts and has tools that they can take off the shelf and use.

VB: I know you’ve been at Google for over 17 years, but I’m really curious about what the last six months have been like. Is there a tremendous amount of pressure now?

Eiron: What has changed is this acceleration of the use of generative AI in products. The pace of work has definitely gone up. It’s been crazy. I haven’t taken a real vacation in way too long.

But there’s also a lot of energy coming from that. Again, from the perspective of someone who builds infrastructure and is interested in this transition from research to industry into product, it creates pressure to accelerate that transition.

For example, we were able to show that a single foundational model can be used across different products, which accelerated the development of products that used this technology and allowed us to have a front-row seat to see how people actually use technology to build products.

I strongly believe that the best infrastructure comes from the experience of trying to do the thing without having the infrastructure. Because of this time pressure and the number of people working on it, the best and brightest, we were able to see: Here’s what product people do when they have to launch a generative AI experience, and here’s where as infrastructure providers we can give them better tools, services and building blocks to be able to do it faster next time.

VB: Can you talk about how the Core ML team is organized?

Eiron: In layers. There are people that focus on hardware-software co-design and optimization on compilers, the lower layers of the stack. The people in the middle build the building blocks for ML — so they will build a training service, a data management service and an inference service. They also build frameworks — so they’re responsible for JAX, TensorFlow and other frameworks.

And then at the top we have folks that are focused on the applied ML experience for product builders — so they are working shoulder-to-shoulder with the product people and bringing back this knowledge of what it takes to actually build a product as well as infrastructure. That’s really the cutting edge of where we interact with products on the one hand and research on the other hand.

We’re a little bit of a conduit of the technology moving across space. But we own a lot of this infrastructure. For example, we talk about building this whole new stack of services to create a generative AI experience. Like, how do you manage RLHF? How do you manage filtering? How do you manage takedowns? How do you manage the data curation for fine-tuning for these products? All these are components that we own for the long run. It’s not just “Here’s the thing you need,” it’s more “I noticed this is a thing that a lot of people need now, so I build it and I provide it.”

VB: Is there anything you’re doing or see coming to improve infrastructure?

Eiron: One of the things that I’m very excited about is providing API access to these models. You really see not just the open-source community, but independent software vendors building products on top of these generative AI experiences. I think we’re very early in this journey of generative AI; we’re gonna see a lot of products coming to the market. I hope many of them will come from Google, but I know many ideas, many good ideas will happen elsewhere. And I think really creating an open environment where people can innovate on top of these awesome pieces of technology is something that’s really exciting to me. I think we’re gonna see a lot of interesting things happening over the next few years.
