Commercial Trino vendor Starburst released a public beta version of its new Galaxy managed service for Trino.
The capabilities of the open source Trino query engine continue to expand as usage grows and vendors come out with new commercial services, some of which were revealed this week at the Datanova 2021 virtual conference run by Starburst.
Trino was formerly known as Presto SQL and was rebranded in December of 2020 to help avoid confusion with the rival Presto DB effort, which is led by the Linux Foundation.
Among the leading Trino vendors is Starburst, based in Boston, which raised $100 million in funding on Jan. 8.
At the Datanova event, users, including the Royal Bank of Canada (RBC) and Scotiabank, outlined some of their data challenges, and Starburst executives provided insights into how Trino is now supporting the open source Apache Iceberg technology to improve data management and analytics.
Presto/Trino comes north to the Royal Bank of Canada
In a technical session at Datanova on Feb. 9, Ahmad El-Kays, senior director of enterprise data and analytics architecture for RBC, explained how the organization is using Starburst to enable what he referred to as enterprise data domains. Data domains are groupings of data sources based on governance requirements from the RBC.
RBC is one of the largest financial institutions in Canada and operates around the world. El-Kays noted that RBC faces multiple data challenges, including data stored in many different places and large volumes of data that are often needed for analytics.
“The challenge we’re trying to tackle is how do we move from the governance-based domains to business-usable domains, while keeping those domains aligned,” El-Kays said.
That challenge has led RBC to evaluate Trino and Starburst. El-Kays noted that Starburst and Trino make it possible to create semantic views, which in turn enable business-usable data domains.
RBC has several proof-of-concept evaluations with Starburst in progress.
“Looking at tools like Starburst Presto, and implementing Data Domain concepts, what it allows for is the creation of the abstraction layer between the storage systems and our end users,” El-Kays said.
El-Kays noted that RBC’s business users today need to know what data store to access and how to put different data elements together to get the analysis they want. Using Starburst as an abstraction layer for the underlying data sources will make it easier for business users to do their jobs, he said.
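The abstraction-layer idea El-Kays describes can be sketched in a few lines. The example below is an illustration only, not RBC's or Starburst's implementation: it uses Python's built-in sqlite3 module as a stand-in for a federated query engine, with two attached in-memory databases playing the role of separate data stores. The schema names (crm, ledger), table names, the view name customer_overview and the sample rows are all hypothetical.

```python
import sqlite3


def build_demo_connection():
    """Create two separate stand-in data stores and one view over both."""
    conn = sqlite3.connect(":memory:")

    # Two attached in-memory databases stand in for separate storage systems.
    # (Hypothetical names; a real deployment would point at real sources.)
    conn.execute("ATTACH DATABASE ':memory:' AS crm")
    conn.execute("ATTACH DATABASE ':memory:' AS ledger")

    conn.execute(
        "CREATE TABLE crm.customers (customer_id INTEGER PRIMARY KEY, name TEXT)"
    )
    conn.execute(
        "CREATE TABLE ledger.balances (customer_id INTEGER, balance REAL)"
    )
    conn.executemany(
        "INSERT INTO crm.customers VALUES (?, ?)",
        [(1, "Ada"), (2, "Grace")],
    )
    conn.executemany(
        "INSERT INTO ledger.balances VALUES (?, ?)",
        [(1, 120.0), (1, 30.0), (2, 75.5)],
    )

    # The view is the abstraction layer: it hides which store each column
    # comes from and how the pieces are joined. SQLite only lets TEMP views
    # span attached databases, hence CREATE TEMP VIEW.
    conn.execute(
        """
        CREATE TEMP VIEW customer_overview AS
        SELECT c.customer_id, c.name, SUM(b.balance) AS total_balance
        FROM crm.customers AS c
        JOIN ledger.balances AS b ON b.customer_id = c.customer_id
        GROUP BY c.customer_id, c.name
        """
    )
    return conn


def customer_overview(conn):
    """Query the business-facing view without naming any underlying store."""
    return conn.execute(
        "SELECT name, total_balance FROM customer_overview ORDER BY name"
    ).fetchall()


if __name__ == "__main__":
    for name, total in customer_overview(build_demo_connection()):
        print(name, total)
```

A business user in this sketch queries only customer_overview and never needs to know that names live in one store and balances in another, which is the burden El-Kays said users carry today.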
Apache Iceberg support comes to Trino
During a keynote panel on Feb. 10, the co-founders of the Presto SQL project, now known as Trino, outlined recent developments in the open source project.
Among the additions is support for Apache Iceberg, which was originally created at Netflix and now has a growing community of users and contributors including Apple, Adobe, LinkedIn and Expedia.
Iceberg provides a scalable format for organizing large data tables. Dain Sundstrom, a CTO at Starburst, said Iceberg is an active project that is focused on the difficult problem of how to organize data.
David Phillips, also a CTO at Starburst, told SearchDataManagement that Iceberg support is generally available in Trino, meaning that it can be used in production. That said, Phillips noted that developers of the Apache Iceberg project itself are working on new features.
“The Iceberg connector in Trino is also under active development to take advantage of these new features and to add other improvements and performance enhancements,” Phillips said in an email. “I personally expect that this will continue for at least the next several years, as more people adopt Iceberg and it sees wider production usage.”
Starburst Galaxy previews managed Trino cloud service
To date, Starburst’s primary commercial platform has been Starburst Enterprise. In a product session on Tuesday at Datanova, Colleen Tartow, director of engineering at Starburst, introduced a preview of the Starburst Galaxy service that brings a managed instance of Starburst’s Trino technology to the cloud.
The underlying engine for Starburst Galaxy is the Starburst Enterprise platform. The difference is that Galaxy is fully managed, enabling organizations to get up and running faster than they likely could on their own, Tartow said. Galaxy also scales the infrastructure as needed to handle growing data workloads.
“With Starburst Galaxy, you get a managed data platform that connects directly to your data sources,” Tartow said.