How do you start assembling an AI team? Well, hire unicorns who can understand the business problem, can translate it into the “right” AI building blocks, and can deliver on the implementation and production deployment. Sounds easy! Except that sightings of such unicorns are extremely rare. Even if you find a unicorn, chances are you won’t be able to afford it!
In my experience leading Data+AI products and platforms over the past two decades, a more effective strategy is to focus on recruiting solid performers who cumulatively support seven specific skill personas in the team.
The 7 skill personas of a unicorn AI team
Datasets interpreter persona
The lifeblood of an AI project is data. Finding the right datasets, preparing the data, and ensuring high quality on an ongoing basis are key skills. There is a lot of tribal knowledge about datasets, so you need someone who specializes in tracking the meaning of data attributes and the origins of different datasets. A related challenge is reconciling multiple definitions of the same business metric within the organization. In one of my projects, we were dealing with eight definitions of “monthly new customers” across sales, finance, and marketing. A good starting point for this skill persona is a traditional data warehouse engineer who has strong data modeling skills and an inherent curiosity to correlate the meaning of data attributes with application and business operations.
Pipeline builder persona
Getting data from multiple sources to AI models requires data pipelines. Within the pipeline, data is cleaned, prepared, transformed, and converted into ML features. These data pipelines (known as Extract-Transform-Load or ETL in traditional data warehousing) can get quite complicated. Organizations typically have pipeline jungles with thousands of pipelines built using heterogeneous big data technologies such as Spark, Hive, and Presto. The pipeline builder persona focuses on building and running pipelines at scale with the right robustness and performance. The best place to find this persona is among data engineers with years of experience developing batch as well as real-time event pipelines.
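As a rough illustration of the clean, transform, and featurize steps described above, here is a minimal single-process sketch in plain Python. The record shape (`RawEvent`), the feature names (`order_count`, `avg_amount`), and the function names are all hypothetical, chosen only for this example; a production pipeline would run on an engine such as Spark and handle far more edge cases.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Optional, Tuple


@dataclass
class RawEvent:
    """Hypothetical raw record as it might arrive from a source system."""
    customer_id: str
    amount: Optional[str]  # raw text; may be missing or malformed


def clean(events: Iterable[RawEvent]) -> Iterator[Tuple[str, float]]:
    """Normalize customer ids and drop records whose amount is unusable."""
    for event in events:
        try:
            yield event.customer_id.strip().lower(), float(event.amount)
        except (TypeError, ValueError):
            # Missing (None) or non-numeric amount: skip the record.
            continue


def to_features(rows: Iterable[Tuple[str, float]]) -> Dict[str, Dict[str, float]]:
    """Aggregate cleaned rows into per-customer features."""
    totals: Dict[str, Tuple[int, float]] = {}
    for customer_id, amount in rows:
        count, total = totals.get(customer_id, (0, 0.0))
        totals[customer_id] = (count + 1, total + amount)
    return {
        customer_id: {"order_count": count, "avg_amount": total / count}
        for customer_id, (count, total) in totals.items()
    }


def run_pipeline(events: Iterable[RawEvent]) -> Dict[str, Dict[str, float]]:
    """Chain the stages: raw events -> cleaned rows -> ML features."""
    return to_features(clean(events))
```

Keeping each stage a small, separately testable function is one way to keep a growing set of pipelines maintainable; it is a design choice for this sketch, not a prescription.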
AI full-stack persona
AI is inherently iterative, cycling through design, training, deployment, and re-training. Building ML models requires hundreds of experiments across different permutations of code, features, datasets, and model configurations. This persona combines AI domain knowledge with strong system-building skills. They specialize in existing AI platforms, such as TensorFlow and PyTorch, or in cloud-based solutions from providers such as AWS, Google, and Azure. With the democratization of these AI platforms and the wide availability of online courses, this persona is no longer scarce. In my experience, a strong background in software engineering combined with the curiosity to gain mastery in AI is an extremely effective combination. In hiring for this persona, it is easy to run into geniuses who like to fly solo instead of being team players. Be on the lookout and weed them out early.
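To make the "hundreds of experiments" point concrete, the sketch below enumerates every combination in a small search space. The dimension names and values (`feature_set`, `dataset_version`, `learning_rate`) are made up for illustration; the point is only how quickly permutations multiply.

```python
from itertools import product
from typing import Any, Dict, List


def experiment_grid(space: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Return one config dict per combination of values in the search space."""
    keys = list(space)
    return [
        dict(zip(keys, combo))
        for combo in product(*(space[key] for key in keys))
    ]


# Hypothetical search space: 2 feature sets x 2 dataset versions x 3 learning rates.
search_space = {
    "feature_set": ["basic", "basic+recency"],
    "dataset_version": ["2021-01", "2021-02"],
    "learning_rate": [0.1, 0.01, 0.001],
}

runs = experiment_grid(search_space)
print(len(runs))  # 2 * 2 * 3 = 12 experiment configurations
```

Adding one more dimension with five values would already push this toy space to 60 runs, which is why tracking and automating experiments matters for this persona.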
AI algorithms persona
AI projects seldom need to start from scratch or implement new algorithms. The role of this persona is to guide the team through the search space of AI algorithms and techniques within the context of the problem. They help reduce dead-ends with course correction and help balance solution accuracy against complexity. This persona is hard to hire given the high demand at places focusing on AI algorithmic innovations. If you cannot afford to get someone full time for this skill, consider getting an expert as a consultant or a startup advisor. Another option is to invest in training the full-stack team by giving them time to learn research advancements and algorithmic internals.
Data+AI operations persona
After the AI solution is deployed in production, it needs to be continuously monitored to ensure it is working correctly. A lot of things can go wrong in production: data pipelines failing, bad quality data, an under-provisioned model inference endpoint, drift in the correctness of model predictions, uncoordinated changes in business metric definitions, and so on. This persona focuses on building the right monitoring and automation to ensure seamless operations. Compared with traditional DevOps for software products, Data+AI Ops is significantly more complex given the number of moving pieces. Google researchers aptly summarized this complexity as the CACE principle: Changing Anything Changes Everything. A good starting point to find this persona is experienced DataOps engineers aspiring to learn the Data+AI space.
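One of the failure modes listed above, drift in model predictions, can be watched with even a very simple check. The sketch below is a deliberately naive heuristic, not a standard drift test: it flags drift when the mean of recent prediction scores moves more than a chosen number of baseline standard deviations. The function names and the default threshold of 2.0 are assumptions for this example.

```python
from statistics import mean, pstdev
from typing import Sequence


def drift_score(baseline: Sequence[float], recent: Sequence[float]) -> float:
    """Shift of the recent mean, measured in baseline standard deviations."""
    spread = pstdev(baseline)
    shift = abs(mean(recent) - mean(baseline))
    if spread == 0:
        # Constant baseline: any shift at all counts as unbounded drift.
        return 0.0 if shift == 0 else float("inf")
    return shift / spread


def has_drifted(
    baseline: Sequence[float], recent: Sequence[float], threshold: float = 2.0
) -> bool:
    """Flag drift when the score exceeds the (assumed) alerting threshold."""
    return drift_score(baseline, recent) > threshold
```

In practice a team would likely track several such signals (input data quality, endpoint latency, prediction distributions) and tune thresholds per model; this only shows the shape of one check.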
Hypothesis planner persona
AI projects are full of surprises! The journey from raw data to usable AI intelligence is not a straight line. You need flexible project planning that adapts as hypotheses about datasets, features, model accuracy, and customer experience are proved or disproved. A good place to find this skill persona is among traditional data analysts with experience working on multiple concurrent projects with tight deadlines. They can act as excellent project managers given their instincts to track and parallelize hypotheses.
Impact owner persona
An impact owner is intimately familiar with the details of how the AI offering will be deployed to deliver value. For instance, when solving a problem related to improving customer retention using AI, this persona will have a complete understanding of the journey map associated with customer acquisition, retention, and attrition. They will be responsible for defining how the customer attrition predictions from the AI solution will be implemented by the support team specialist to reduce churn. The best place to find this persona is within the existing business team — ideally, an engineer with strong product instincts and pragmatism. Without this persona, teams end up building what is technically feasible rather than being pragmatic on what is actually required in the end-to-end workflow to generate value.
To summarize, these seven skill personas are a must-have for every AI team. The importance of each persona varies depending on the maturity of the data, the type of AI problems, and the skillsets available within the broader data and application teams. For instance, the datasets interpreter persona is much more critical in organizations with data in a large number of small tables compared to those with a small number of large tables. These factors should be taken into account in determining the right seniority and cardinality for each of the skill personas within the AI team. Hopefully, you can now start building your AI team instead of waiting for unicorns to show up!
Sandeep Uttamchandani is Chief Data Officer and VP of Product Engineering at Unravel Data Systems. He is an entrepreneur with more than two decades of experience building Data+AI products and author of the book The Self-Service Data Roadmap: Democratize Data and Reduce Time to Insight (O’Reilly, 2020).