Starburst, the commercial entity behind the open source Presto-based SQL query engine Trino, has announced a new fully managed, cross-cloud analytics product that allows companies to query data hosted on any of the “big three” cloud infrastructures without moving the data from its original location.
While many large cloud data analytics vendors support the growing multicloud movement by making their products available on every platform, it remains difficult to access data stored across multiple environments. Companies still have to find a way to “pool” data from these different silos, typically by moving it to a single cloud or data warehouse, which is not only time consuming but can also incur so-called “egress” fees for transferring the data. Starburst is now addressing this by expanding its fully managed software-as-a-service (SaaS) product, allowing its customers to analyze data across the major clouds with a single SQL query.
From Presto to Trino
Starburst has followed a circuitous path to where it is today. The company’s foundations can be traced back to 2012, when a group of Facebook engineers developed a distributed SQL query engine called Presto to help the company’s in-house data scientists and data analysts run faster queries across large data sets. Facebook open-sourced Presto the following year, but after a prolonged dispute with Facebook’s management, Presto’s creators eventually left the social network and launched a fork called PrestoSQL, which was rebranded as Trino last December.
Like many similar open source projects, Trino has a commercial counterpart, now known as Starburst, whose founders include the original Presto creators alongside other early Presto adopters. Initially, Starburst was offered in a single “enterprise” flavor that could be self-managed and hosted on-premises or on any public cloud. Earlier this year, Starburst launched a new fully managed SaaS offering called Starburst Galaxy, which includes an integrated SQL editor for querying data out of the box and connectors for integrating with data sources.
Starburst Galaxy was originally available only on AWS, but to support Starburst’s push into cross-cloud analytics, the company is now adding support for Microsoft’s Azure and Google Cloud Platform (GCP). Notably, Starburst previously introduced a cross-cloud analytics product called Stargate for its self-managed version. Now Starburst is bringing the same functionality to its fully managed service, where it handles all the infrastructure and the customer doesn’t have to worry about what’s going on under the hood.
“This allows us to extend our cross-cloud analytics capabilities to any and all departments without the help of central IT,” Starburst cofounder Matt Fuller told VentureBeat. “This allows domain experts to own the data they know best and distribute it to the rest of the organization as a product.”
So what’s the big brouhaha around multicloud? Isn’t it easier for companies to choose one public cloud and stick with it? In some cases that may be true, but companies often take a multicloud approach for any number of reasons.
Some clouds are better than others in certain respects, in which case it may make sense to use GCP for one thing and AWS for another. Cost and compliance considerations can also push a company toward a multicloud or hybrid-cloud approach, mixing on-premises infrastructure with one or more public clouds. And sometimes companies find themselves in a multicloud world by accident, either by acquiring companies that use different clouds or by letting different internal departments choose the cloud that best suits their needs.
Cross-cloud analytics goes some way toward helping these companies avoid the data silos that arise from all these different scenarios.
“By having data in these different clouds, it creates a further extension of the data silo problem where data not only exists in different data sources, but is now also in very different locations,” Fuller said. “That’s why cross-cloud analytics is essential. Otherwise, data will have to be moved to a single cloud, much like the previous solution to the problem of trying to move all data into a single data warehouse.”
It is also worth noting that even if a company uses a single cloud provider, it may have to store data in different cloud “regions” to meet local data residency requirements. In such cases, alternative analytics solutions that involve transferring data between systems or locations are not an option, and this is where Starburst’s latest product can really shine.
“Cross-cloud analytics allows the processing to be pushed to the region where the data resides, so that only the aggregate insight leaves,” Fuller explained. “If restricted data must be released, it can be masked to comply with the requirements.”
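The pattern Fuller describes can be illustrated with a small Python sketch. It is not Starburst’s or Trino’s implementation: the region names, record fields, and helper functions are all hypothetical, and the hash-based masking is only a stand-in for whatever masking policy a company actually applies. Each region aggregates its own rows locally, and only the per-region summaries are combined centrally.

```python
import hashlib
from collections import defaultdict

# Hypothetical per-region datasets; the raw rows are meant to stay in their region.
REGION_DATA = {
    "eu-west": [
        {"customer_email": "a@example.com", "country": "DE", "amount": 120.0},
        {"customer_email": "b@example.com", "country": "FR", "amount": 80.0},
    ],
    "us-east": [
        {"customer_email": "c@example.com", "country": "US", "amount": 200.0},
        {"customer_email": "d@example.com", "country": "US", "amount": 50.0},
    ],
}


def mask(value: str) -> str:
    """Replace a restricted value with a truncated SHA-256 digest (illustrative only)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]


def aggregate_in_region(rows):
    """Runs inside the region: returns per-country totals, never the raw rows."""
    totals = defaultdict(lambda: {"orders": 0, "revenue": 0.0})
    for row in rows:
        bucket = totals[row["country"]]
        bucket["orders"] += 1
        bucket["revenue"] += row["amount"]
    return dict(totals)


def masked_rows_in_region(rows):
    """Runs inside the region: releases rows with the restricted field masked."""
    return [{**row, "customer_email": mask(row["customer_email"])} for row in rows]


def combine(partials):
    """Runs centrally: merges the per-region summaries into one result."""
    merged = defaultdict(lambda: {"orders": 0, "revenue": 0.0})
    for partial in partials:
        for country, stats in partial.items():
            merged[country]["orders"] += stats["orders"]
            merged[country]["revenue"] += stats["revenue"]
    return dict(merged)


# Only the small per-region summaries cross the region boundary.
summary = combine(aggregate_in_region(rows) for rows in REGION_DATA.values())
```

In this sketch, `summary` holds order counts and revenue per country, while the individual customer records stay where they are stored. When row-level data does have to be released, `masked_rows_in_region` shows the restricted field being masked before anything leaves the region.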