read csvs with AWS-standard client-side encryption #11

Open
seamusabshere opened this issue Oct 3, 2018 · 3 comments
Labels
help wanted (extra attention is needed)

Comments

@seamusabshere

It would be really cool if I could store a secret key in my Stitch Data integration and then have this tap decrypt files transparently.

From the official AWS Ruby SDK [1] (edited for clarity):

# just a random secret for now, but you get the idea
require 'openssl'
require 'aws-sdk-s3'
key = OpenSSL::PKey::RSA.new(1024)

# encryption client
s3 = Aws::S3::Encryption::Client.new(encryption_key: key)

# round-trip an object, encrypted/decrypted locally
s3.put_object(bucket:'aws-sdk', key:'hipaa.csv', body:'lots,of,health,data')
s3.get_object(bucket:'aws-sdk', key:'hipaa.csv').body.read
#=> 'lots,of,health,data'

# reading the encrypted object without the encryption client
# returns the cipher text
Aws::S3::Client.new.get_object(bucket:'aws-sdk', key:'hipaa.csv').body.read
#=> "... cipher text ..."

There is apparently a Python port of this [2], but its example is significantly less clear, so I won't reproduce it here, even though it's probably what you would want to use, since taps are written in Python.

Key things:

  • (at least in Rubyland) it decrypts the file in a streaming manner, which I imagine is a requirement for taps (you don't want to pull the whole file locally just to read it)
  • we don't want to involve AWS KMS. We just want to store a secret at https://app.stitchdata.com in the integration configuration

[1] https://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/S3/Encryption.html
[2] https://github.com/boldfield/s3-encryption (see issue boldfield/s3-encryption#9 for a slight clarification)
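
The streaming requirement above can be illustrated with a minimal Python sketch. It is not the tap's code and not the API of either library linked above: `DecryptingReader`, `iter_csv_rows`, and `xor_placeholder` are hypothetical names, and the XOR transform is only a stand-in so the example runs with the standard library. A real implementation would swap in a proper cipher driven by the secret from the integration configuration.

```python
import csv
import io
from typing import Callable, Iterable, Iterator, List


class DecryptingReader(io.RawIOBase):
    """Read-only stream that decrypts ciphertext chunks only when asked for data."""

    def __init__(self, chunks: Iterable[bytes], decrypt_chunk: Callable[[bytes], bytes]):
        self._chunks = iter(chunks)
        self._decrypt_chunk = decrypt_chunk
        self._buffer = b""

    def readable(self) -> bool:
        return True

    def readinto(self, target) -> int:
        # Fetch and decrypt the next chunk only when the plaintext buffer is empty.
        while not self._buffer:
            try:
                chunk = next(self._chunks)
            except StopIteration:
                return 0  # end of stream
            self._buffer = self._decrypt_chunk(chunk)
        n = min(len(target), len(self._buffer))
        target[:n] = self._buffer[:n]
        self._buffer = self._buffer[n:]
        return n


def iter_csv_rows(chunks: Iterable[bytes],
                  decrypt_chunk: Callable[[bytes], bytes]) -> Iterator[List[str]]:
    """Yield CSV rows one at a time without holding the whole file in memory."""
    raw = DecryptingReader(chunks, decrypt_chunk)
    text = io.TextIOWrapper(io.BufferedReader(raw), encoding="utf-8", newline="")
    yield from csv.reader(text)


def xor_placeholder(key: int) -> Callable[[bytes], bytes]:
    """PLACEHOLDER ONLY: a toy XOR transform, not real encryption."""
    def transform(data: bytes) -> bytes:
        return bytes(b ^ key for b in data)
    return transform


# Usage: "encrypt" a small CSV, split it into chunks, and read it back row by row.
cipher = xor_placeholder(0x5A)
ciphertext = cipher(b"id,name\n1,alice\n2,bob\n")
chunks = [ciphertext[i:i + 5] for i in range(0, len(ciphertext), 5)]
rows = list(iter_csv_rows(chunks, cipher))
```

The point of the wrapper is that the CSV parser pulls data through it, so only one chunk of ciphertext is decrypted at a time. In a real tap the `chunks` iterable would presumably come from the S3 response body rather than an in-memory list.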

@nick-mccoy
Contributor

Hi @seamusabshere, that's an interesting idea! If you want to make the changes locally, test, and submit a PR, we would consider merging it and adding the corresponding field on the integration's settings page.

@dmosorast added the "help wanted" label on Oct 8, 2018
@aaronsteers

I arrived here because I'm interested in adding support for KMS encryption on the target side, for target-s3-csv. I think it would be a great addition if both could support server-side encryption. I'll post back here if I have updates on that front.

For reference, I did find this link, although it is primarily focused on KMS: https://www.justdocloud.com/2018/09/21/upload-download-s3-using-aws-kms-python/

@aaronsteers

aaronsteers commented Jan 10, 2020

I've created a related issue on the pipelinewise target-s3-csv repo here: transferwise/pipelinewise-target-s3-csv#5

I imagine the code to accomplish both is very similar, and it would be great if the settings/config needed on both sides were similar or identical.

UPDATE:

After further research, I've found that decryption of KMS-encrypted S3 objects occurs transparently on read, as long as the user has access to the applied KMS key. In that case, we can probably accomplish KMS integration without any change to this tap (feel free to correct me if that seems wrong).

https://aws.amazon.com/premiumsupport/knowledge-center/decrypt-kms-encrypted-objects-s3/
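
If that holds, reading a KMS-encrypted object should look no different from reading a plain one. Here is a minimal sketch, assuming a boto3-style S3 client is passed in. `iter_s3_csv_rows` is a hypothetical helper, not a function from this tap, and the bucket and key names in the usage comment are made up.

```python
import csv
from typing import Iterator, List


def iter_s3_csv_rows(s3_client, bucket: str, key: str) -> Iterator[List[str]]:
    """Stream CSV rows from an S3 object using a boto3-style client."""
    # The call is the same for plain and server-side KMS-encrypted objects:
    # S3 decrypts before returning the body, provided the caller is allowed
    # to use the KMS key. No key material is handled here.
    response = s3_client.get_object(Bucket=bucket, Key=key)
    # iter_lines() yields bytes without line endings, so quoted fields that
    # contain embedded newlines are not handled by this sketch.
    lines = (line.decode("utf-8") for line in response["Body"].iter_lines())
    yield from csv.reader(lines)


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   for row in iter_s3_csv_rows(boto3.client("s3"), "example-bucket", "data.csv"):
#       print(row)
```

Taking the client as a parameter keeps the helper independent of how credentials are set up. If the caller lacks permission on the key, the `get_object` call itself is expected to fail rather than return cipher text.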
