[#5071] Improvement (test): Add integration tests for Trino cascading queries #5073

Merged 41 commits on Oct 29, 2024
1060a7f
Add integration test for Trino cascading query
diqiu50 Oct 8, 2024
b7b0ce1
Add a pipeline to trigger the Trino cascading query integration test
diqiu50 Oct 8, 2024
2b70537
Fix error for the trino_test.sh
diqiu50 Oct 8, 2024
834bd45
Add logic to try create metalake
diqiu50 Oct 8, 2024
47c5b44
Fix path miss match on the dowload trino cascading connector
diqiu50 Oct 8, 2024
92f1182
Fix error of checking gravitino status
diqiu50 Oct 8, 2024
eacb79b
Fix error of trino-cascading-env/inspect_ip.sh
diqiu50 Oct 8, 2024
e9a781c
Add container logs for Trino cascading query testers
diqiu50 Oct 9, 2024
87c5b28
fix ci error
diqiu50 Oct 9, 2024
c46c152
Fix ci error
diqiu50 Oct 9, 2024
137480d
fix ci failed
diqiu50 Oct 9, 2024
eb5720d
Fix ci error
diqiu50 Oct 9, 2024
b3acc8f
fix ci
diqiu50 Oct 9, 2024
abbe5a8
Fix ci error
diqiu50 Oct 9, 2024
9af0ce4
Fix ci error
diqiu50 Oct 9, 2024
445194a
Fix ci error
diqiu50 Oct 9, 2024
1739e6c
Update trino cascading connector download url
diqiu50 Oct 10, 2024
4b068d9
Update for review
diqiu50 Oct 12, 2024
636e912
fix ci error
diqiu50 Oct 22, 2024
79f5bcd
Fix ci error
diqiu50 Oct 22, 2024
7eba430
Fix ci error
diqiu50 Oct 23, 2024
d28256d
fix ci error
diqiu50 Oct 23, 2024
f194abb
fix ci error
diqiu50 Oct 23, 2024
aa10633
fix ci error
diqiu50 Oct 23, 2024
524adcf
fix ci error
diqiu50 Oct 23, 2024
8c5b8eb
fix ci error
diqiu50 Oct 23, 2024
7486f2e
fix ci error
diqiu50 Oct 23, 2024
d76cca2
fix ci error
diqiu50 Oct 23, 2024
5a27e85
Add tester trigger
diqiu50 Oct 23, 2024
0cb677c
Fix ci error
diqiu50 Oct 23, 2024
93c3fa3
Fix ci error
diqiu50 Oct 24, 2024
bc25f0e
Fix ci error
diqiu50 Oct 24, 2024
1ea9c15
Fix ci error
diqiu50 Oct 24, 2024
8b181ee
Fix ci error
diqiu50 Oct 24, 2024
23cf698
Fix ci error
diqiu50 Oct 24, 2024
f3428e3
Fix ci error
diqiu50 Oct 24, 2024
30de930
Fix ci error
diqiu50 Oct 24, 2024
1ee9781
Update for review
diqiu50 Oct 25, 2024
5eece2e
Add comments for the run_test script
diqiu50 Oct 28, 2024
1d6a3c0
increase timeout of trino integration test
diqiu50 Oct 28, 2024
68b6b7b
Update for review
diqiu50 Oct 28, 2024
12 changes: 7 additions & 5 deletions .github/workflows/trino-integration-test.yml
@@ -51,7 +51,7 @@ jobs:
needs: changes
if: needs.changes.outputs.source_changes == 'true'
runs-on: ubuntu-latest
-timeout-minutes: 30
+timeout-minutes: 60
strategy:
matrix:
architecture: [linux/amd64]
@@ -76,7 +76,7 @@ jobs:

- name: Package Gravitino
run: |
-./gradlew compileDistribution -x test -PjdkVersion=${{ matrix.java-version }}
+./gradlew compileDistribution compileTrinoConnector -x test -PjdkVersion=${{ matrix.java-version }}
Contributor:

Is the task compileTrinoConnector NOT included in compileDistribution?

Contributor Author:

Correct, the compileDistribution task does not include compileTrinoConnector.


- name: Free up disk space
run: |
@@ -87,6 +87,7 @@ jobs:
run: |
./gradlew -PskipTests -PtestMode=embedded -PjdkVersion=${{ matrix.java-version }} -PskipDockerTests=false :trino-connector:integration-test:test
./gradlew -PskipTests -PtestMode=deploy -PjdkVersion=${{ matrix.java-version }} -PskipDockerTests=false :trino-connector:integration-test:test
+trino-connector/integration-test/trino-test-tools/run_test.sh
Contributor:

What is the difference between this test script and the above?

Contributor Author:

The script uses a different test set and testing environment to test the Trino connector. It can also specify a custom test set and testing environment for any test cases. Additionally, the above command can be modified to use the script.

Contributor:

"a different test set" means tpcds and tpch, right? If so, please add a comment explaining it; otherwise it is unclear why the script is run separately.

Contributor Author (diqiu50, Oct 28, 2024):

"a different test set" means the test set for Trino cascading queries. Currently that test set only includes TPCH and TPCDS queries.


- name: Upload integrate tests reports
uses: actions/upload-artifact@v3
@@ -95,10 +96,11 @@ jobs:
name: trino-connector-integrate-test-reports-${{ matrix.java-version }}
path: |
build/reports
-trino-connector/integrate-test/build/*.log
-trino-connector/integrate-test/build/*.tar
+trino-connector/integration-test/build/*.log
+trino-connector/integration-test/build/*.tar
+trino-connector/integration-test/build/trino-cascading-env
integration-test-common/build/trino-ci-container-log
distribution/package/logs/gravitino-server.out
distribution/package/logs/gravitino-server.log
catalogs/**/*.log
-catalogs/**/*.tar
+catalogs/**/*.tar
4 changes: 1 addition & 3 deletions build.gradle.kts
@@ -488,11 +488,9 @@ tasks.rat {
"dev/docker/kerberos-hive/kadm5.acl",
"**/*.log",
"**/*.out",
-"**/testsets",
+"**/trino-ci-testset",
"**/licenses/*.txt",
"**/licenses/*.md",
-"integration-test/**/*.sql",
-"integration-test/**/*.txt",
"docs/**/*.md",
"spark-connector/spark-common/src/test/resources/**",
"web/web/.**",
6 changes: 4 additions & 2 deletions integration-test-common/docker-script/shutdown.sh
@@ -21,7 +21,9 @@
cd "$(dirname "$0")"

LOG_DIR=../build/trino-ci-container-log
-docker cp trino-ci-hive:/usr/local/hadoop/logs $LOG_DIR/hdfs
-docker cp trino-ci-hive:/tmp/root $LOG_DIR/hive
+if [ -d $LOG_DIR ]; then
+  docker cp trino-ci-hive:/usr/local/hadoop/logs $LOG_DIR/hdfs
+  docker cp trino-ci-hive:/tmp/root $LOG_DIR/hive
+fi

docker compose down
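
The guard added to shutdown.sh skips the log copy when the log directory is missing. As an illustration only, and not part of this PR, the same idea can be written as a small helper. The name `run_if_dir_exists` is hypothetical, and the sketch quotes the directory argument, which the committed script does not.

```shell
# Hypothetical helper (not part of the PR): run a command only when the
# given directory exists, mirroring the `if [ -d $LOG_DIR ]` guard above.
run_if_dir_exists() {
  guard_dir="$1"   # directory that must exist before the command runs
  shift            # remaining arguments form the command to execute
  if [ -d "$guard_dir" ]; then
    "$@"
  else
    # Report the skip on stderr so stdout carries only the command's output.
    echo "skip: $guard_dir does not exist" >&2
  fi
}

# Usage, assuming the container name and paths used in shutdown.sh:
#   run_if_dir_exists "$LOG_DIR" docker cp trino-ci-hive:/tmp/root "$LOG_DIR/hive"
```

Quoting `"$guard_dir"` keeps the test well-formed even if the path is empty or contains spaces.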
1 change: 0 additions & 1 deletion trino-connector/integration-test/build.gradle.kts
@@ -120,7 +120,6 @@ tasks.test {
}

tasks.register<JavaExec>("TrinoTest") {
-dependsOn("build")
classpath = sourceSets["test"].runtimeClasspath
mainClass.set("org.apache.gravitino.trino.connector.integration.test.TrinoQueryTestTool")

@@ -74,12 +74,6 @@ public class TrinoQueryIT extends TrinoQueryITBase {
static {
testsetsDir = TrinoQueryIT.class.getClassLoader().getResource("trino-ci-testset").getPath();
testsetsDir = ITUtils.joinPath(testsetsDir, "testsets");
-
-ciTestsets.add("hive");
-ciTestsets.add("lakehouse-iceberg");
-ciTestsets.add("jdbc-mysql");
-ciTestsets.add("jdbc-postgresql");
-ciTestsets.add("tpch");
}

@BeforeAll
@@ -0,0 +1,72 @@
select count(*) from call_center;
select count(*) from catalog_page;
select count(*) from catalog_returns;
select count(*) from catalog_sales;
select count(*) from customer;
select count(*) from customer_address;
select count(*) from customer_demographics;
select count(*) from date_dim;
select count(*) from household_demographics;
select count(*) from income_band;
select count(*) from inventory;
select count(*) from item;
select count(*) from promotion;
select count(*) from reason;
select count(*) from ship_mode;
select count(*) from store;
select count(*) from store_returns;
select count(*) from store_sales;
select count(*) from time_dim;
select count(*) from warehouse;
select count(*) from web_page;
select count(*) from web_returns;
select count(*) from web_sales;
select count(*) from web_site;

SELECT * FROM call_center ORDER BY cc_call_center_sk, cc_call_center_id, cc_rec_start_date LIMIT 10;

SELECT * FROM catalog_page ORDER BY cp_catalog_page_sk, cp_catalog_page_id, cp_start_date_sk LIMIT 10;

SELECT * FROM catalog_returns ORDER BY cr_returned_date_sk, cr_returned_time_sk, cr_item_sk LIMIT 10;

SELECT * FROM catalog_sales ORDER BY cs_sold_date_sk, cs_sold_time_sk, cs_ship_date_sk LIMIT 10;

SELECT * FROM customer ORDER BY c_customer_sk, c_customer_id, c_current_cdemo_sk LIMIT 10;

SELECT * FROM customer_address ORDER BY ca_address_sk, ca_address_id, ca_street_number LIMIT 10;

SELECT * FROM customer_demographics ORDER BY cd_demo_sk, cd_gender, cd_marital_status LIMIT 10;

SELECT * FROM date_dim ORDER BY d_date_sk, d_date_id, d_date LIMIT 10;

SELECT * FROM household_demographics ORDER BY hd_demo_sk, hd_income_band_sk, hd_buy_potential LIMIT 10;

SELECT * FROM income_band ORDER BY ib_income_band_sk, ib_lower_bound, ib_upper_bound LIMIT 10;

SELECT * FROM inventory ORDER BY inv_date_sk, inv_item_sk, inv_warehouse_sk LIMIT 10;

SELECT * FROM item ORDER BY i_item_sk, i_item_id, i_rec_start_date LIMIT 10;

SELECT * FROM promotion ORDER BY p_promo_sk, p_promo_id, p_start_date_sk LIMIT 10;

SELECT * FROM reason ORDER BY r_reason_sk, r_reason_id, r_reason_desc LIMIT 10;

SELECT * FROM ship_mode ORDER BY sm_ship_mode_sk, sm_ship_mode_id, sm_type LIMIT 10;

SELECT * FROM store ORDER BY s_store_sk, s_store_id, s_rec_start_date LIMIT 10;

SELECT * FROM store_returns ORDER BY sr_returned_date_sk, sr_return_time_sk, sr_item_sk LIMIT 10;

SELECT * FROM store_sales ORDER BY ss_sold_date_sk, ss_sold_time_sk, ss_item_sk LIMIT 10;

SELECT * FROM time_dim ORDER BY t_time_sk, t_time_id, t_time LIMIT 10;

SELECT * FROM warehouse ORDER BY w_warehouse_sk, w_warehouse_id, w_warehouse_name LIMIT 10;

SELECT * FROM web_page ORDER BY wp_web_page_sk, wp_web_page_id, wp_rec_start_date LIMIT 10;

SELECT * FROM web_returns ORDER BY wr_returned_date_sk, wr_returned_time_sk, wr_item_sk LIMIT 10;

SELECT * FROM web_sales ORDER BY ws_sold_date_sk, ws_sold_time_sk, ws_ship_date_sk LIMIT 10;

SELECT * FROM web_site ORDER BY web_site_sk, web_site_id, web_rec_start_date LIMIT 10;
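
The row-count checks at the top of this test set follow one pattern per table. The PR commits them as a static file. Purely as an illustration, a hypothetical helper named `gen_count_queries` (not part of the PR) could emit the same statements from a list of table names:

```shell
# Hypothetical generator (not part of the PR): print one row-count query
# per table name passed as an argument.
gen_count_queries() {
  for table_name in "$@"; do
    printf 'select count(*) from %s;\n' "$table_name"
  done
}

# Example with a few of the tables from the test set above.
gen_count_queries call_center catalog_page catalog_returns
```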