The open source Presto SQL query engine is continuing to move forward, with large users such as Uber, Alibaba and Facebook using the technology at ever-growing scale.
At the PrestoCon Day virtual event on March 24, Presto users and developers gathered to discuss how the technology is being used and where it's headed in the future.
Presto is a SQL query engine originally developed by Facebook and currently run as an open source project under the governance of the Presto Software Foundation, which itself is operated by the Linux Foundation. Until December 2020, there were two distinct versions of Presto, with PrestoDB run by the Presto Software Foundation, and PrestoSQL, which was rebranded as Trino by its backers, including Starburst and Varada, among others.
With the split and confusion of two different Presto versions now in the past, the Presto project is looking to highlight its successes. While Presto is still strongly influenced by Facebook, it has found enterprise support and adoption as well, with vendors such as Ahana, which provides a managed Presto service that runs on AWS.
Presto users detail benefits
During a user session at the virtual PrestoCon Day event, several Ahana customers detailed how they use Presto. Among those users is revenue management platform vendor Carbon, which is headquartered in New York City. Jordan Hoggart, data engineer at Carbon, described how the company moved from Amazon Athena to PrestoDB managed by Ahana. Hoggart explained that Athena is essentially an implementation of Presto, albeit one that has been customized by Amazon. Hoggart said Athena did not provide the scalability that Carbon needed to process multiple types of data queries with different sets of parameters.
"With Ahana, another thing we could do was experiment with using different clusters for different workloads. With Athena we were stuck with one queue that served everything," Hoggart said. "Whereas now if we want, we can spin up a couple different clusters that have different configurations."
Meanwhile, B2B e-commerce marketplace vendor Cartona, based in Giza, Egypt, is also using Ahana's supported version of Presto.
During another user session, Omar Mohamed, senior data engineer at Cartona, explained that the vendor was encountering challenges analyzing data across multiple data sources, including transactional and analytics databases.
Mohamed noted that Cartona was getting roughly 200,000 events coming into its data sources every 12 hours. With Cartona expecting to keep growing and having to deal with even more data in the coming months and years, the vendor decided to use Presto to enable fast data queries across the disparate data sources.
"So now we're able to join data queries across our different databases without having to copy or ingest data," Mohamed said. "It's all done in Presto, which saved us hours of planning and manual work."
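The kind of federated join Mohamed describes relies on Presto's catalog mechanism, where each connected data source is addressed as `catalog.schema.table` in a single query. The sketch below is illustrative only; the catalog, schema, table and column names are hypothetical, assuming a transactional database exposed through a `postgresql` catalog and an analytics store exposed through a `hive` catalog:

```sql
-- Join live orders from a transactional PostgreSQL catalog against
-- aggregated order history in a Hive analytics catalog, without
-- copying or ingesting data between the two systems.
SELECT o.order_id,
       o.retailer_id,
       f.lifetime_order_count
FROM postgresql.public.orders AS o
JOIN hive.analytics.order_facts AS f
  ON o.retailer_id = f.retailer_id
WHERE o.created_at >= DATE '2021-03-01';
```

Because Presto pushes work down to each connector where it can, a query like this reads both sources in place rather than requiring a prior ETL step.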
Ride-share giant Uber is one of the biggest contributors to and users of Presto. In a technical session, Girish Baliga, engineering manager at Uber, noted that over the last six months Uber has established Presto as its de facto SQL query engine for most data and analytics applications. Baliga emphasized that Uber, like many others that use Presto, benefits significantly from Facebook's continued contributions and testing of Presto at scale.
"Facebook does most of the work; we still have to do some original work here because we use different technologies, so we do some additional testing," Baliga said. "But the stability and scalability is really sussed out by the Facebook testing process."
The state of Presto, from Facebook's perspective
In a keynote session, Biswapesh Chattopadhyay, tech lead for data infrastructure and compute at Facebook, outlined the state of Presto and where it's headed in terms of technical direction.
One of the new capabilities the team has been working on is a process known as "auto-awesomize," which automatically configures and tunes Presto for optimal operational deployment.
"There's less and less patience for people to actually spend hours fine-tuning their query performance, because they need to run ad hoc queries very quickly and, you know, time is money," Chattopadhyay said.
He noted that users just want their query engines to automatically figure out how to run a query quickly against any given data set. Presto is developing the auto-awesomize capabilities with adaptive query execution technology, as well as history-based optimizations that learn from past queries.
Presto is sometimes compared with Apache Spark, which also provides a query engine, as well as a distributed parallel processing framework for running queries at massive scale. There is now an effort to enable Presto to run on top of Spark, using Presto as the query engine and Spark as the underlying framework for parallel processing.
"Presto-on-Spark is something that I'm really excited about, because we see it essentially breaking down the scalability limits of Presto," Chattopadhyay said.
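In the Presto-on-Spark design, a Presto query is submitted as an ordinary Spark application via `spark-submit`, so Spark handles resource allocation, shuffles and fault tolerance while Presto compiles and evaluates the SQL. The command below is a sketch based on the PrestoDB documentation; the file names, paths and cluster settings are placeholders, not a ready-to-run configuration:

```
# Run a Presto SQL file as a Spark job (paths and settings are illustrative).
spark-submit \
  --master yarn \
  --executor-cores 4 \
  --conf spark.task.cpus=4 \
  --class com.facebook.presto.spark.launcher.PrestoSparkLauncher \
  presto-spark-launcher-*.jar \
  --package presto-spark-package-*.tar.gz \
  --config ./config.properties \
  --catalogs ./catalogs \
  --catalog hive \
  --schema default \
  --file query.sql
```

Running through Spark's scheduler in this way is what lets long, shuffle-heavy queries scale past the limits of Presto's own in-memory execution model, which is the scalability gain Chattopadhyay describes.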