Hi @itsvikramagr
I'm using the Qubole connector to read from Amazon Kinesis streams in Spark Structured Streaming mode.
If I change the log level from INFO to DEBUG in log4j.properties, I see the physical plan getting dumped into target/unit-tests.log. The physical plan contains the sensitive values awsSecretKey and awsAccessKeyId. This is a security concern, since they appear in plain text.
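For reference, the log-level change described above is the standard edit to Spark's conf/log4j.properties; only the rootCategory line differs from Spark's shipped template (the appender lines below are the template defaults):

```properties
# conf/log4j.properties — switch the root logger from INFO to DEBUG
log4j.rootCategory=DEBUG, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```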
Snippet of the physical plan:
```
18/10/25 20:04:05.023 main TRACE BaseSessionStateBuilder$$anon$1:
=== Applying Rule org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences ===
!'Project [unresolvedalias(cast('approximateArrivalTimestamp as timestamp), None)] 'Project [unresolvedalias(cast(approximateArrivalTimestamp#4 as timestamp), None)]
+- StreamingRelation DataSource(org.apache.spark.sql.SparkSession@4d0b0fd4,kinesis,List(),None,List(),None,Map(awsAccessKeyId -> AKIAF6LDVCA3FMAD5FCV, endpointUrl -> kinesis.us-west-1.amazonaws.com, awsSecretKey -> 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, startingposition -> LATEST, streamName -> bdstest),None), kinesis, [data#0, streamName#1, partitionKey#2, sequenceNumber#3, approximateArrivalTimestamp#4] +- StreamingRelation DataSource(org.apache.spark.sql.SparkSession@4d0b0fd4,kinesis,List(),None,List(),None,Map(awsAccessKeyId -> AKIAF6LDVCA3FMAD5FCV, endpointUrl -> kinesis.us-west-1.amazonaws.com, awsSecretKey -> 80vlwE5oxcvar9XPlFr1hooYuG9At47nB/ujbgKi8, startingposition -> LATEST, streamName -> bdstest),None), kinesis, [data#0, streamName#1, partitionKey#2, sequenceNumber#3, approximateArrivalTimestamp#4]
18/10/25 20:04:05.031 main TRACE BaseSessionStateBuilder$$anon$1:
```
The driver passes awsSecretKey and awsAccessKeyId to Spark in plain text, so they get dumped in the physical plan in the Spark logs. Ideally the connector should take care of encrypting or redacting these credentials before they reach the logs, which is not done today.
Can an encryption/decryption or redaction mechanism be added for awsSecretKey and awsAccessKeyId before they are passed to Spark?
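A minimal sketch of the kind of masking being asked for here, in Python purely for illustration (the connector itself runs on the JVM): before the data source's option map is rendered into a plan string or log line, values for secret-bearing keys are replaced with a placeholder. The pattern and helper name are assumptions for this sketch, not the connector's actual API.

```python
import re

# Option keys whose values must never appear in logs or plan dumps.
# The key list here is illustrative; a real implementation would make it configurable.
SENSITIVE_OPTION_PATTERN = re.compile(r"(?i)secret|password|token|awsaccesskeyid")

def redact_options(options: dict) -> dict:
    """Return a copy of the options map with sensitive values masked."""
    return {
        key: "*********" if SENSITIVE_OPTION_PATTERN.search(key) else value
        for key, value in options.items()
    }

# Example with the same option keys that appear in the plan dump above
# (credential values shortened):
options = {
    "endpointUrl": "kinesis.us-west-1.amazonaws.com",
    "streamName": "bdstest",
    "awsAccessKeyId": "AKIA...",
    "awsSecretKey": "90lw...",
}
print(redact_options(options))
```

Spark applies a similar idea to configuration values via its `spark.redaction.regex` setting; the request here is effectively to extend that kind of masking to the data source options that show up in the plan output.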
Thanks,
Anuja