Cloud leader AWS shifts its database focus to DataZone and Zero-ETL

To further strengthen our commitment to providing industry-leading coverage of data technology, VentureBeat is excited to welcome Andrew Brust and Tony Baer as regular contributors. Watch for their articles in the Data Pipeline.

Cloud leader AWS’s influence on IT trends takes many forms, but none has been more significant than its stable of database services.

At most of its yearly re:Invent conferences, AWS has rolled out a shiny new database that affirms the company’s standing among cloud database providers. These were sometimes open source and often purpose-built.

But this year was different. At AWS re:Invent 2022, the company turned its sights toward making its existing array of cloud data tools more palatable to enterprise IT. That means data integration and data management are now due for attention.

To that end, the company introduced Amazon DataZone, a data management service to catalog and govern data stored on the AWS cloud and on-premises. DataZone can also support third-party sources via APIs, AWS said, mentioning partners Databricks, Snowflake, and Tableau in this context.


The timing is right. Enterprises find that the number of data sources they need to combine is growing dramatically. Management and governance of scattered data holdings grow onerous.

Combining data feeds

Now as before, cost effectiveness drives IT to the cloud, AWS CEO Adam Selipsky told re:Invent attendees. For AWS today, cost-effective data engines begin with Aurora, AWS’s relational database compatible with open-source MySQL and PostgreSQL, and Redshift, the columnar MPP data warehouse that upended the economics of data analytics with its 2012 introduction.

The database procession that brought Aurora and Redshift also included RDS, Neptune, DynamoDB, DocumentDB, ElastiCache, Timestream, and the Quantum Ledger Database (QLDB). Some of these stirred controversy as start-ups wrestled with cloud giant AWS’s aggressive approach to open-source licensing.

Selipsky did not come to re:Invent to tout a new database – though there were updates to several existing engines. Instead, he promoted the notion of tying the existing portfolio together more effectively. 

“Having all these tools to store and analyze data reveals the next challenge that people face … you need to be able to combine information across these different methods of data exploration to see the full picture and truly gain insights,” he said.

Give ‘em ETL 

In his re:Invent address, Selipsky took aim at the integration challenges around extract, transform, load (ETL), the long-simmering backwater of high tech that innovators have lately been revisiting.

He announced new integrations said to eliminate the need for ETL between Amazon Aurora and Amazon Redshift services, and between Spark and Redshift.

Selipsky’s aim here is clear-eyed. With low-code/no-code on the rise, it may be time to dial up “Zero ETL.” ETL is a stage in data processing that involves a lot of repetitive custom scripting. It is necessary, yet it is generally glossed over when digital transformation is the enterprise’s ultimate goal.

The dull work of ETL data preparation can stand in the way of progress. To show IT’s frustration with the process, Selipsky read an excerpt from a customer’s letter that described ETL as a “thankless, unsustainable black hole.” The new Aurora and Redshift capabilities help customers move toward a Zero-ETL future on AWS, he said.
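To make the “repetitive custom scripting” concrete, here is a minimal sketch of the kind of hand-written ETL job that a Zero-ETL integration is meant to make unnecessary. It is not AWS code and does not use any AWS API. Two in-memory SQLite databases stand in for a transactional store and an analytics warehouse, and the table names (orders, orders_fact), columns, and function names are all hypothetical, chosen only for illustration.

```python
import sqlite3


def extract(source):
    """Extract: pull raw order rows out of the transactional store."""
    return source.execute(
        "SELECT id, customer, amount_cents FROM orders"
    ).fetchall()


def transform(rows):
    """Transform: tidy customer names and convert cents to dollars."""
    return [
        (order_id, customer.strip().title(), cents / 100.0)
        for order_id, customer, cents in rows
    ]


def load(warehouse, rows):
    """Load: write the cleaned rows into the analytics store."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS orders_fact "
        "(id INTEGER PRIMARY KEY, customer TEXT, amount_usd REAL)"
    )
    # INSERT OR REPLACE keeps the job safe to re-run on the same data.
    warehouse.executemany(
        "INSERT OR REPLACE INTO orders_fact VALUES (?, ?, ?)", rows
    )
    warehouse.commit()


def run_etl(source, warehouse):
    """Run one full extract-transform-load pass; return the row count."""
    rows = transform(extract(source))
    load(warehouse, rows)
    return len(rows)


def make_demo_source():
    """Build a tiny stand-in transactional database (hypothetical data)."""
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE orders "
        "(id INTEGER PRIMARY KEY, customer TEXT, amount_cents INTEGER)"
    )
    source.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "  alice smith ", 1250), (2, "BOB JONES", 9900)],
    )
    source.commit()
    return source


if __name__ == "__main__":
    source = make_demo_source()
    warehouse = sqlite3.connect(":memory:")
    copied = run_etl(source, warehouse)
    print(f"copied {copied} rows")
```

Every new source table or schema change typically means another script like this one to write, schedule, and maintain, which is the overhead the customer letter complained about.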

Echoes of Tableau

Although perhaps overshadowed by machine learning and other announcements, the focus on larger data management issues at re:Invent 2022 suggests new maturity in AWS’s approach to IT’s data needs.

There is also the implication here that Adam Selipsky is setting a new course for AWS cloud. Given his years at the helm of business intelligence provider Tableau, this isn’t entirely unexpected.

Under his watch, Tableau distinguished itself for innovation in visual data presentation and established itself as an expert in ease of use and drag-and-drop integration support for both structured and unstructured data sets. 

AWS’s DataZone and Zero-ETL fit neatly into a similar picture of cloud data evolution. Future moves will be closely watched to see if AWS is moving further up the stack in the data edifice.

