eCommerceNews Asia - Technology news for digital commerce decision-makers
Google adds AI.AGG for BigQuery natural-language SQL

Tue, 30th Jun 2026
Sean Mitchell, Publisher

Google has introduced AI.AGG() in preview for BigQuery. The function lets users run natural-language aggregation queries over unstructured data in SQL.

The addition extends BigQuery's existing AI functions beyond row-by-row analysis to grouped analysis across large volumes of text and image data. It can summarise or synthesise information from millions of rows within a single SQL statement.

Users can apply the tool to datasets such as system logs, product descriptions, documents and image collections. The aim is to make it easier for analysts to ask broader questions of unstructured data, including recurring product complaints, common system errors and patterns in customer support failures.

One example focused on system logging. Engineering teams can use the function to examine large numbers of log messages together and identify patterns such as repeated retries, latency spikes or inefficiencies that may not appear as fatal errors.

Google said its BigQuery engineering team used that approach during development to identify edge cases in input handling. In a demonstration using a public Apache Spark log dataset, the function grouped logs by component and produced short summaries of normal operations alongside unusual patterns.

Product categories

Google also outlined how the function could be used with retail-style datasets that combine structured and unstructured information. In a sample pet supply catalogue, AI.AGG() was first used to identify broad product categories from names and descriptions, then to return those categories in JSON format for use in later steps.

That output was then passed to another BigQuery function, AI.CLASSIFY(), to assign a category to each product. Once products had been labelled, users could combine conventional SQL aggregation, such as row counts, with AI-generated summaries for each group.

The same dataset was also used to show image analysis. Because the function supports multimodal inputs, users can point BigQuery at image files stored in object tables and ask for an aggregated summary of what the collection contains.

How it works

Google said AI.AGG() addresses large language model context limits by splitting input rows into batches, aggregating each batch, and then aggregating the intermediate results into a final output. This removes the need to manually divide large datasets to fit within a model's context window.

It warned, however, that individual rows still need to be small enough to fit within the model context. If a single row is too large, it will be skipped rather than split across multiple batches.
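
The batching behaviour described above can be illustrated with a minimal Python sketch. It is not Google's implementation: the word-count tokenizer, the token budget and the `keyword_summary` stand-in for the model are all assumptions made for illustration. It shows the general shape only: pack rows into batches, summarise each batch, summarise the summaries, and skip any row that exceeds the budget on its own.

```python
from typing import Callable, List, Tuple


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def make_batches(items: List[str], budget: int) -> Tuple[List[List[str]], int]:
    """Greedily pack items into batches within the budget; count items too large to fit."""
    batches: List[List[str]] = []
    current: List[str] = []
    used = 0
    skipped = 0
    for item in items:
        size = count_tokens(item)
        if size > budget:
            skipped += 1  # an oversized row is skipped, never split across batches
            continue
        if current and used + size > budget:
            batches.append(current)
            current, used = [], 0
        current.append(item)
        used += size
    if current:
        batches.append(current)
    return batches, skipped


def aggregate(rows: List[str], budget: int,
              summarise: Callable[[List[str]], str]) -> Tuple[str, int]:
    """Summarise each batch, then summarise the summaries until one result remains."""
    batches, skipped = make_batches(rows, budget)
    partials = [summarise(batch) for batch in batches]
    while len(partials) > 1:
        groups, dropped = make_batches(partials, budget)
        if dropped or len(groups) >= len(partials):
            # Summaries are not shrinking: merge them in one call instead of looping forever.
            groups = [partials]
        partials = [summarise(group) for group in groups]
    return (partials[0] if partials else ""), skipped


def keyword_summary(batch: List[str]) -> str:
    # Hypothetical stand-in for the model: keep the distinct upper-case keywords.
    return " ".join(sorted({w for row in batch for w in row.split() if w.isupper()}))


logs = [
    "RETRY connect host a",
    "TIMEOUT on shard b",
    "RETRY connect host c",
    "payload " * 50,  # 50 tokens: larger than the budget, so it is skipped
    "OK heartbeat received now",
]
result, skipped = aggregate(logs, budget=8, summarise=keyword_summary)
print(result, skipped)
```

With these toy inputs the script prints `OK RETRY TIMEOUT 1`: two first-stage batches are reduced to one summary, and the single oversized row is counted as skipped.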

Token use can also rise above the raw token count of the original table because of the multi-stage aggregation process. Users are advised to reduce input size where possible by filtering or limiting data before running the function.
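
A rough back-of-envelope estimate shows why token use can exceed the raw table size. The sketch below is not Google's billing formula. The batch budget, summary length and per-call prompt overhead are hypothetical parameters, and only input tokens are counted. The point is that every intermediate summary is read again in a later stage.

```python
import math


def estimate_input_tokens(raw_tokens: int, batch_budget: int,
                          summary_tokens: int, prompt_tokens: int = 0) -> int:
    """Estimate total input tokens read across all aggregation stages."""
    if summary_tokens * 2 > batch_budget:
        # Assumption: summaries must be at most half a batch so each stage shrinks.
        raise ValueError("summary_tokens must be at most half of batch_budget")
    total = 0
    stage_tokens = raw_tokens
    while True:
        calls = max(1, math.ceil(stage_tokens / batch_budget))
        total += stage_tokens + calls * prompt_tokens
        if calls == 1:
            return total
        # Each call emits one summary, and those summaries feed the next stage.
        stage_tokens = calls * summary_tokens


raw = 1_000_000
total = estimate_input_tokens(raw, batch_budget=10_000, summary_tokens=500, prompt_tokens=50)
print(total, f"{(total - raw) / raw:.2%} overhead")
```

Under these assumed numbers the model reads 1,057,800 input tokens for a 1,000,000-token table, about 5.78% more than the raw count. Filtering or limiting rows first reduces both the first-stage cost and the number of later stages.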

The feature lets users specify a model endpoint directly in the SQL call. If no endpoint is set, BigQuery defaults to a recent model, though explicit model selection may be preferable in production data pipelines.

Operational limits

AI.AGG() supports text input, references to text files and image data, as well as arrays of those types in some cases. The output is always a string, even when the user requests structured formats such as JSON or Markdown.

That means the model can be instructed to return machine-readable text, but the database engine does not enforce the structure. Multimodal output such as generated images is not supported.
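
Because the structure is not enforced, downstream code may want to validate the returned string before relying on it. The following Python sketch assumes the prompt asked for a JSON array of category names. The function name `parse_categories` and the expected shape are illustrative choices, not part of BigQuery.

```python
import json
from typing import List, Optional

FENCE = "`" * 3  # Markdown code fence that models sometimes wrap around JSON


def parse_categories(raw: str) -> Optional[List[str]]:
    """Return a list of category names, or None if the string is not the expected JSON."""
    text = raw.strip()
    if text.startswith(FENCE) and text.endswith(FENCE):
        text = text[len(FENCE):-len(FENCE)]
        # Drop an optional language tag such as "json" on the opening fence line.
        first_line, _, rest = text.partition("\n")
        if not first_line.strip() or first_line.strip().isalpha():
            text = rest
    try:
        value = json.loads(text)
    except json.JSONDecodeError:
        return None
    if isinstance(value, list) and all(isinstance(v, str) for v in value):
        return value
    return None


print(parse_categories('["Toys", "Food"]'))
print(parse_categories("Here are the categories: Toys, Food"))
```

Returning `None` for anything unexpected lets a pipeline retry or fall back, so malformed text is not passed on to a later step such as classification.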

Google also highlighted how the function handles NULL values. It skips NULL input rows automatically, and with structured inputs, a single NULL field can cause the entire row to be treated as NULL and excluded unless the user adds fallback values.
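
The effect of that NULL rule can be sketched in Python, with `None` standing in for SQL NULL. The row layout and the `usable_rows` helper are hypothetical. In SQL the fallback would typically be supplied with a standard function such as `COALESCE` or `IFNULL` before the rows reach the aggregate.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]


def usable_rows(rows: List[Optional[Row]],
                fallbacks: Optional[Dict[str, str]] = None) -> List[Dict[str, str]]:
    """Keep only rows with no missing fields after applying per-field fallbacks."""
    fallbacks = fallbacks or {}
    kept = []
    for row in rows:
        if row is None:
            continue  # a NULL input row is skipped outright
        filled = {k: (v if v is not None else fallbacks.get(k)) for k, v in row.items()}
        if any(v is None for v in filled.values()):
            continue  # one missing field makes the whole structured row unusable
        kept.append(filled)
    return kept


products = [
    {"name": "Cat tree", "description": "Sisal posts"},
    {"name": "Dog lead", "description": None},
    None,
]
print(len(usable_rows(products)))
print(len(usable_rows(products, fallbacks={"description": ""})))
```

Without a fallback only one of the three sample rows survives. Supplying an empty-string fallback for `description` keeps the second product in the aggregation as well.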

On error handling, Google said the function will attempt to return partial results if it encounters invalid input or model processing errors. Rows rejected during processing are excluded from the final output, and users can inspect job statistics in BigQuery to see how many rows failed.

AI.AGG() is available in preview to BigQuery users alongside other managed AI functions, including AI.CLASSIFY(), AI.IF(), AI.SCORE() and the general-purpose AI.GENERATE() function.