Add a PathHandler implementation for Azure blob. #17
* Add support for downloading directories
Can you move .gitignore to a separate PR with just that change?
Discussed with @davides and we decided the remaining issue(s) can be addressed in a follow-up.
@davides has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Notes
Adds a new PathHandler to support read/write with Azure Blob Storage. Some controls have been put in place so that read/write operations use a known amount of memory when dealing with larger files:
* _open("wb", buffering=<buffer-size>) will buffer up to the requested amount of data in memory before flushing it to the service with the PutBlock operation
* _open("rb", buffering=<buffer-size>) will use the Blob client's chunk iterator to only download a fixed amount of data at a time
* _close() in write-mode will flush any buffered data with one more PutBlock, and finalize the blob with PutBlockList

The block-based approach should work for both block blobs and append blobs (see the Azure docs).
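For reviewers who want a picture of the write path described above, here is a minimal sketch of the buffering idea. It is not the code in this PR: `InMemoryBlockStore`, `BufferedBlockWriter`, `put_block`, and `put_block_list` are hypothetical names invented for illustration, and the store is an in-memory stand-in rather than the Azure SDK. In the `azure-storage-blob` package, the PutBlock and PutBlockList operations are exposed as `BlobClient.stage_block` and `BlobClient.commit_block_list`.

```python
import base64


class InMemoryBlockStore:
    """Hypothetical stand-in for a blob client; this is not the Azure SDK."""

    def __init__(self):
        self.staged = {}      # block_id -> bytes, uploaded but not yet committed
        self.committed = b""  # blob content after the block list is committed

    def put_block(self, block_id, data):
        # Plays the role of PutBlock: stage one block under an ID.
        self.staged[block_id] = bytes(data)

    def put_block_list(self, block_ids):
        # Plays the role of PutBlockList: assemble staged blocks in order.
        self.committed = b"".join(self.staged[b] for b in block_ids)


class BufferedBlockWriter:
    """Holds at most buffer_size bytes in memory before staging a block."""

    def __init__(self, store, buffer_size):
        if buffer_size <= 0:
            raise ValueError("buffer_size must be positive")
        self._store = store
        self._buffer_size = buffer_size
        self._buffer = bytearray()
        self._block_ids = []
        self._closed = False

    def write(self, data):
        view = memoryview(data)
        while len(view) > 0:
            # Copy only as much as fits, so the buffer never exceeds its limit.
            room = self._buffer_size - len(self._buffer)
            self._buffer.extend(view[:room])
            view = view[room:]
            if len(self._buffer) == self._buffer_size:
                self._flush_block()
        return len(data)

    def _flush_block(self):
        # Azure expects Base64 block IDs of equal length within one blob,
        # so the sketch encodes a zero-padded counter.
        raw_id = f"{len(self._block_ids):08d}".encode()
        block_id = base64.b64encode(raw_id).decode()
        self._store.put_block(block_id, bytes(self._buffer))
        self._block_ids.append(block_id)
        self._buffer.clear()

    def close(self):
        if self._closed:
            return
        if self._buffer:
            self._flush_block()  # one more block for the leftover bytes
        self._store.put_block_list(self._block_ids)  # finalize the blob
        self._closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()


store = InMemoryBlockStore()
with BufferedBlockWriter(store, buffer_size=4) as writer:
    writer.write(b"abcdefghij")
print(store.committed)  # b'abcdefghij'
```

With a 4-byte buffer, the 10-byte write stages two full blocks immediately and leaves two bytes buffered; closing the writer stages those as a third block and commits the list, which mirrors the `_close()` behavior described in the notes.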
Testing