path: root/yql/essentials/sql/v1/aggregation.h
Commit message | Author | Age | Files | Lines
* YQL-20436: Translate GROUP BY to YqlSelect | vitya-smirnov | 2025-12-08 | 1 | -0/+33
This patch adds support for basic aggregation on `YqlSelect` and includes translation, type annotation, and expansion changes. `count`, `min`, `max`, `sum`, `avg` and other aggregations without parameters are supported. `HAVING` support will be added in a future PR. Support for aggregations with two or more arguments is postponed, as it is not required for TPCH queries.

### Translator

`std::expected` is passed through the function call hierarchy so that unsupported aggregation functions are rejected gracefully.

The translation unit `select_yql_aggregation` was introduced. It implements an alternative translation of aggregations into the `YqlAggFactory` and `YqlAgg` callables. It reuses the `extractor` body and `aggregation_traits_factory` from the legacy code via a dirty hack with the `friend` keyword, as I decided to make minimal changes to the existing code.

For a query fragment:

```yql
... Sum(body) ...
```

The resulting YQLs AST looks like this:

```yqls
(YqlAggApply
    (YqlAggFactory '"sum")
    '()    # <-- Options
    (Void) # <-- Result type stub
    body')
```

### Type Annotation

I followed the plan described at https://nda.ya.ru/t/psA5Gaji7PDe9R. Besides the new `PgXXX` to `YqlXXX` renames, `YqlAggFactoryWrapper` and `YqlAggWrapper` were introduced. The first only checks the argument count and types, but the second does more sophisticated work.

`YqlAggWrapper` acts in 2 stages. In the first stage it writes an expression with a `TypeOf` of a `Result` into the result type stub and then requests a repeat of the transformation pipeline. In the second stage the result type stub is already typed, so the wrapper simply sets its own type to the type of the stub.

The tricky part is constructing this expression with the `Result` type. To do this, the aggregation traits factory has to be instantiated with a correct `list_type` and an `extractor`. The `list_type` is just `(ListType (TypeOf row))`, where `row` is an `Argument` from the enclosing `lambda` at `YqlResultItem`. The `extractor` is just `(lambda (row) body')`.
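As a rough illustration of the two expressions just described, the sketch below builds `(ListType (TypeOf row))` and `(lambda (row) body')` as plain s-expression trees. The `SExpr` type and the helpers `MakeAtom`, `MakeList`, `Print`, `MakeListType` and `MakeExtractor` are hypothetical names invented for this example; they are not the YQL expression node API and do not reflect how the patch actually builds nodes.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy s-expression node, for illustration only.
// It is NOT the real YQL expression node API.
struct SExpr {
    std::string Atom;          // non-empty for a leaf such as `row`
    std::vector<SExpr> Items;  // children for a list such as `(TypeOf row)`
};

SExpr MakeAtom(std::string name) { return SExpr{std::move(name), {}}; }
SExpr MakeList(std::vector<SExpr> items) { return SExpr{"", std::move(items)}; }

// Renders a node back to s-expression text.
std::string Print(const SExpr& e) {
    if (!e.Atom.empty()) return e.Atom;
    std::string out = "(";
    for (std::size_t i = 0; i < e.Items.size(); ++i) {
        if (i != 0) out += ' ';
        out += Print(e.Items[i]);
    }
    return out + ")";
}

// list_type = (ListType (TypeOf row))
SExpr MakeListType(const SExpr& row) {
    return MakeList({MakeAtom("ListType"), MakeList({MakeAtom("TypeOf"), row})});
}

// extractor = (lambda (row) body'), in the notation of the commit message
SExpr MakeExtractor(const SExpr& row, const SExpr& body) {
    return MakeList({MakeAtom("lambda"), MakeList({row}), body});
}
```

Calling `Print(MakeListType(MakeAtom("row")))` on this toy model yields the same text as the `list_type` expression above, which makes the shape of the two inputs to the traits factory easy to inspect.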
The traits are imported from `mount/lib/aggregate.yqls`, then they are beta-reduced with the constructed `list_type` and `extractor` expressions. Then the `state` method is extracted and beta-reduced with a `body`. Then the `finish` method is extracted and beta-reduced with some `state` expression. This yields the resulting expression.

### Expansion

`YqlAgg` is expanded at `BuildAggregationTraits` from `yql_co_pgselect`, which is nicely integrated into the `PgSelect` expansion infrastructure. There it is sufficient to import an aggregation traits factory and beta-reduce it with `list_type` and `extractor`, which are provided by the `PgSelect`-related code.

### Refactoring

I extracted the logic for loading `ExprNode`s from a module into `ImportReadonly` and `ImportDeeplyCopied` in `yql_module_helpers`.

commit_hash:9303f00567dd423bedc334e7c7514584f5bd8cff
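The beta-reduction steps mentioned under Type Annotation and Expansion can be pictured with a small toy model: applying `(lambda (params...) body)` to arguments means substituting each argument for its parameter in the body. Everything in this sketch is an assumption made for illustration: the `SExpr` type, the helpers `Substitute` and `BetaReduce`, and the made-up factory body `(Traits list_type extractor)`. It does not show the contents of `mount/lib/aggregate.yqls` or the real reduction code, and the substitution is naive (no shadowing or capture handling).

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy s-expression node, for illustration only.
struct SExpr {
    std::string Atom;          // non-empty for a leaf
    std::vector<SExpr> Items;  // children for a list
};

SExpr MakeAtom(std::string name) { return SExpr{std::move(name), {}}; }
SExpr MakeList(std::vector<SExpr> items) { return SExpr{"", std::move(items)}; }

// Renders a node back to s-expression text.
std::string Print(const SExpr& e) {
    if (!e.Atom.empty()) return e.Atom;
    std::string out = "(";
    for (std::size_t i = 0; i < e.Items.size(); ++i) {
        if (i != 0) out += ' ';
        out += Print(e.Items[i]);
    }
    return out + ")";
}

// Replaces every atom named `name` inside `e` with `value`.
// Naive on purpose: shadowing and variable capture are not handled.
SExpr Substitute(const SExpr& e, const std::string& name, const SExpr& value) {
    if (!e.Atom.empty()) return e.Atom == name ? value : e;
    std::vector<SExpr> items;
    for (const SExpr& item : e.Items) {
        items.push_back(Substitute(item, name, value));
    }
    return MakeList(std::move(items));
}

// Beta-reduces `(lambda (p1 p2 ...) body)` with the given arguments.
// Assumes the lambda has exactly that shape and args.size() matches.
SExpr BetaReduce(const SExpr& lambda, const std::vector<SExpr>& args) {
    const SExpr& params = lambda.Items[1];
    SExpr result = lambda.Items[2];
    for (std::size_t i = 0; i < params.Items.size(); ++i) {
        result = Substitute(result, params.Items[i].Atom, args[i]);
    }
    return result;
}
```

In this model, reducing a factory of the form `(lambda (list_type extractor) (Traits list_type extractor))` with the two expressions from the Type Annotation section produces a body in which both parameters have been replaced by the concrete `list_type` and `extractor` trees.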