Amazon unveiled AWS Glue DataBrew, a new data preparation tool that expands the analytics capabilities of the AWS platform by adding a no-code version of its existing data preparation tool.
The new capability, revealed Nov. 11, is now available in select parts of the U.S., Europe and Asia, with availability in more regions expected in the near future.
In the U.S., it’s currently available in just Northern Virginia, Ohio and Oregon.
Amazon first introduced AWS Glue in 2016 to enable data engineers to extract, transform and load data from the various AWS storage platforms to prepare it for analysis. AWS Glue, however, is a code-based tool and requires users to understand how to write code to wrangle and ready their data.
AWS Glue DataBrew, using a point-and-click interface, gives data engineers the same ability to extract, transform and load their data to get it ready for analysis, but does so without requiring them to write code. Also, given its no-code nature, it expands the audience of potential users to include business analysts.
“They’re filling a huge need for everyone who’s just starting out working with data [storage] and trying to get data into the cloud real quick,” said Rick Sherman, founder and managing partner of Athena IT Solutions. “What [no-code tools] do is great and highly useful.”
Sherman cautioned, however, that no-code data preparation tools have limitations, and that depending on the nature of the data, at a certain point in the data integration lifecycle a certain amount of coding knowledge may be necessary.
No-code tools are able to move data from one location to another. But for getting data from disparate sources to be consistent — each of the disparate sources might define “customer” differently, for example, and those differences have to be resolved before the data is ready for analysis or modeling — code is required.
“With complexity comes the need for sophistication,” Sherman said. “There’s more to data integration.”
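To make the "customer" example concrete, here is a minimal Python sketch of the kind of reconciliation code that might be needed. It is illustrative only and is not part of AWS Glue or DataBrew; the two source systems, their field names and the choice to match records on email address are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """One shared definition of "customer" that every source is mapped onto."""
    customer_id: str
    name: str
    email: str


def from_crm(row: dict) -> Customer:
    # Hypothetical CRM export: a customer is a contact with id, full_name, email.
    return Customer(
        customer_id=f"crm-{row['id']}",
        name=row["full_name"].strip().title(),
        email=row["email"].strip().lower(),
    )


def from_billing(row: dict) -> Customer:
    # Hypothetical billing export: a customer is an account with split name fields.
    name = f"{row['first']} {row['last']}"
    return Customer(
        customer_id=f"bill-{row['acct_no']}",
        name=name.strip().title(),
        email=row["contact_email"].strip().lower(),
    )


def merge_by_email(customers: list) -> list:
    # Assumption: two records with the same normalized email are the same customer.
    # The first record seen for each email wins.
    merged = {}
    for customer in customers:
        merged.setdefault(customer.email, customer)
    return list(merged.values())
```

Real reconciliation rules are usually messier than a single email match, which is the point Sherman is making: the matching logic has to be decided and written down somewhere.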
According to AWS, DataBrew offers customers more than 250 pre-built transformations to enable them to automate data preparation tasks, including filtering anomalies, standardizing formats and correcting invalid values, that in some cases could take weeks for a data scientist to complete.
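For readers unfamiliar with these tasks, the following is a rough sketch in plain Python of what the three named operations look like when written by hand. It does not use or represent the DataBrew API; the record layout, the accepted date formats, the `max_amount` threshold and the median fallback are all assumptions chosen for illustration.

```python
from datetime import datetime
from statistics import median


def standardize_date(value: str) -> str:
    # Standardize formats: accept two assumed input styles, emit ISO 8601.
    for fmt in ("%Y-%m-%d", "%m/%d/%Y"):
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {value!r}")


def clean_orders(rows: list, max_amount: float = 10_000.0) -> list:
    # Amounts considered valid: present, non-negative and under the threshold.
    valid = [
        r["amount"] for r in rows
        if r["amount"] is not None and 0 <= r["amount"] <= max_amount
    ]
    fallback = median(valid) if valid else 0.0

    cleaned = []
    for r in rows:
        amount = r["amount"]
        if amount is not None and amount > max_amount:
            continue  # Filter anomalies: drop implausibly large amounts.
        if amount is None or amount < 0:
            amount = fallback  # Correct invalid values with the median.
        cleaned.append({"date": standardize_date(r["date"]), "amount": amount})
    return cleaned
```

A no-code tool packages steps like these behind menu options, so the user picks the operation rather than writing the loop.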
Once the data is prepared, AWS customers can use the analytics and machine learning platforms of their choice, whether from AWS or third-party vendors, to query the data and train machine learning models.
Raju Gulabani, vice president of database and analytics at AWS, said in a release that customers told AWS they were spending too much time on data preparation tasks. In addition, because AWS Glue required the ability to write code, the number of people within a given organization who could prepare data was severely limited.
The tech giant developed AWS Glue DataBrew in response.