Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#4551] feat(iceberg): add S3 and GCS support for IcebergRESTService docker image #5243

Merged
merged 3 commits into from
Oct 30, 2024

Conversation

FANNG1
Copy link
Contributor

@FANNG1 FANNG1 commented Oct 24, 2024

What changes were proposed in this pull request?

  1. add AWS and GCP bundle jar to IcebergRESTServer docker image
  2. use environment variable to change the config

Why are the changes needed?

Fix: #4551

Does this PR introduce any user-facing change?

no

How was this patch tested?

run SQL with access S3 and GCS data

@FANNG1 FANNG1 changed the title [SIP] support docker image to support credential vending [#4551] feat(iceberg): add S3 and GCS support for IcebergRESTService docker image Oct 29, 2024
@FANNG1
Copy link
Contributor Author

FANNG1 commented Oct 29, 2024

@jerryshao @yuqi1129 @jerqi please help to review, thanks

docs/iceberg-rest-service.md Outdated Show resolved Hide resolved
docs/iceberg-rest-service.md Outdated Show resolved Hide resolved

cd ${iceberg_rest_server_dir}

python bin/rewrite_config.py
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the python command installed by default?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it's installed in docker image

dev/docker/iceberg-rest-server/rewrite_config.py Outdated Show resolved Hide resolved
@yuqi1129
Copy link
Contributor

Please update the doc docker-image-details.md also.

@FANNG1
Copy link
Contributor Author

FANNG1 commented Oct 30, 2024

Please update the doc docker-image-details.md also.

updated, please help to review again

bundles/gcp-bundle/build.gradle.kts Show resolved Hide resolved
config_map = parse_config_file(config_file_path)

for k, v in init_config.items():
update_config(config_map, k, v)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can make the dict config_map as the method parameter and do not need to call the method several time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it could improve the performance, but the current implementation is more extensiable, I think this is more important, WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by more extensiable? do you mean we can easily support more properties in this way?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, like what env_map is processed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean we can change it into the following kind:

update_config(config_map, init_config):
    for key, value in init_config.items():
          config_map[config_prefix + key] = value

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then how to process env_map?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the method contains only one simple sentence, I can't find the difference that you replace it with config[config_prefix + key] = value directly if insist on this the way.

def update_config(config, key, value):
    config[config_prefix + key] = value

Copy link
Contributor Author

@FANNG1 FANNG1 Oct 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If just for performance, I prefer not to change it. You suggest to pass map to update_config, seems difficult to process env_map ?

@FANNG1
Copy link
Contributor Author

FANNG1 commented Oct 30, 2024

@yuqi1129 any other comments? @jerryshao do you have time to review again?

@yuqi1129
Copy link
Contributor

@yuqi1129 any other comments? @jerryshao do you have time to review again?

I have no further comments.

@FANNG1 FANNG1 self-assigned this Oct 30, 2024
@FANNG1 FANNG1 added the branch-0.7 Automatically cherry-pick commit to branch-0.7 label Oct 30, 2024
@FANNG1 FANNG1 merged commit 0232b3c into apache:main Oct 30, 2024
26 checks passed
github-actions bot pushed a commit that referenced this pull request Oct 30, 2024
…docker image (#5243)

### What changes were proposed in this pull request?
1.  add AWS and GCP bundle jar to IcebergRESTServer docker image
2.  use environment variable to change the config 

### Why are the changes needed?


Fix: #4551 

### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
run SQL with access S3 and GCS data
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch-0.7 Automatically cherry-pick commit to branch-0.7
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Improvement] add S3 and GCS support for Gravitino Iceberg REST server docker image
2 participants