Reason
Multiple SELECT * queries, or repeated runs of the same query, could bloat object storage with unnecessary copies of result files.
Design
There are two separate things to tackle here, and each could probably be its own PR. Before programmatically inspecting a query, sanitize it (e.g. trim whitespace and normalize case). If a query is determined to be unique, send the original, non-sanitized version to DataFusion.
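A minimal sketch of the sanitizing step, assuming that trimming, collapsing internal whitespace, dropping a trailing semicolon, and lowercasing are enough to build a comparison key. The function name normalize_query is hypothetical. Note that lowercasing the whole string also lowercases string literals, so two queries that differ only in literal case would compare equal; a real implementation may need a SQL parser to avoid that.

```python
import re


def normalize_query(sql: str) -> str:
    """Build a comparison key for a query (hypothetical helper).

    The key is only used to compare queries with each other. It is never
    executed, so the original text must be kept alongside it.
    """
    # Trim the ends and collapse any run of whitespace into one space.
    collapsed = re.sub(r"\s+", " ", sql.strip())
    # Drop trailing semicolons/spaces, then fold case.
    # Assumption: case folding is acceptable even inside string literals.
    return collapsed.rstrip("; ").lower()
```

The key is what gets compared; the untouched original string is what would be passed on for execution when no match is found.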
Handling SELECT * queries: if a query is exactly SELECT * FROM table_x (and nothing more) AND the original file is in CSV format, return the original file as the query result instead of going through DataFusion.
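The check described above could be sketched roughly as follows. This assumes a plain, unquoted table identifier and uses a regular expression rather than a SQL parser, so quoted or schema-qualified names would not match and would simply fall through to the normal path. The names bare_select_all_table and can_return_original, and the file_format parameter, are hypothetical.

```python
import re
from typing import Optional

# Matches "SELECT * FROM <identifier>" with an optional trailing semicolon
# and nothing else. Assumption: the table name is a simple unquoted identifier.
_BARE_SELECT_ALL = re.compile(
    r"^\s*select\s+\*\s+from\s+([A-Za-z_][A-Za-z0-9_]*)\s*;?\s*$",
    re.IGNORECASE,
)


def bare_select_all_table(sql: str) -> Optional[str]:
    """Return the table name if the query is a bare SELECT *, else None."""
    match = _BARE_SELECT_ALL.match(sql)
    return match.group(1) if match else None


def can_return_original(sql: str, file_format: str) -> bool:
    """True when the original file could stand in for the query result."""
    return bare_select_all_table(sql) is not None and file_format.lower() == "csv"
```

Anything with a WHERE clause, a column list, or a non-CSV source file returns False here and would go through the usual execution path.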
Handling duplicated queries: if an identical query already exists, look up the previous query ID, then the corresponding report ID, then the result file ID, and return that result file to the user. If all instances of the previous query failed, send the query to DataFusion.
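One way the query-to-report-to-result-file lookup might look, sketched against an in-memory SQLite database. The schema here (tables queries and reports, the columns normalized_sql, status, and result_file_id, and the 'succeeded' status value) is entirely hypothetical and stands in for whatever the real repository layer uses.

```python
import sqlite3
from typing import Optional

# Hypothetical schema, for illustration only.
SCHEMA = """
CREATE TABLE queries (
    id INTEGER PRIMARY KEY,
    normalized_sql TEXT NOT NULL
);
CREATE TABLE reports (
    id INTEGER PRIMARY KEY,
    query_id INTEGER NOT NULL REFERENCES queries(id),
    status TEXT NOT NULL,
    result_file_id INTEGER
);
"""


def find_existing_result_file(
    conn: sqlite3.Connection, normalized_sql: str
) -> Optional[int]:
    """Return the result file ID of the newest successful run, or None.

    None means the query is new or every earlier run failed, in which case
    the caller would execute the query as usual.
    """
    row = conn.execute(
        """
        SELECT r.result_file_id
        FROM queries AS q
        JOIN reports AS r ON r.query_id = q.id
        WHERE q.normalized_sql = ?
          AND r.status = 'succeeded'
          AND r.result_file_id IS NOT NULL
        ORDER BY r.id DESC
        LIMIT 1
        """,
        (normalized_sql,),
    ).fetchone()
    return row[0] if row else None
```

Filtering on the status in the join covers the "all previous instances failed" case without a second query: no successful row means no reuse.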
Impact
Adds logic to the query report repository layer to save on storage.