Missing S3 data #9

Open
rnowling opened this issue Jun 6, 2014 · 1 comment

Comments

rnowling commented Jun 6, 2014

All of the data except for the 5node sequence-snappy crawl data seems to be missing from S3. I'm accessing S3 directly using wget and my web browser, both from outside of AWS.

See the listing here:
https://big-data-benchmark.s3.amazonaws.com/

sohan commented Aug 15, 2014

That listing shows only the first 1000 keys in the bucket, because S3 returns at most 1000 keys per list request. Try using s3cmd to navigate the directories, e.g.:

$ s3cmd ls s3://big-data-benchmark/pavlo/
                       DIR   s3://big-data-benchmark/pavlo/sequence-snappy/
                       DIR   s3://big-data-benchmark/pavlo/sequence/
                       DIR   s3://big-data-benchmark/pavlo/text-deflate/
                       DIR   s3://big-data-benchmark/pavlo/text/
2013-05-27 21:30         0   s3://big-data-benchmark/pavlo/sequence-snappy_$folder$
2013-05-08 21:53         0   s3://big-data-benchmark/pavlo/sequence_$folder$
2013-05-27 21:28         0   s3://big-data-benchmark/pavlo/text-deflate_$folder$
2013-05-08 21:24         0   s3://big-data-benchmark/pavlo/text_$folder$
$ s3cmd ls s3://big-data-benchmark/pavlo/text/tiny/rankings/
2013-05-08 21:23      7738   s3://big-data-benchmark/pavlo/text/tiny/rankings/part-00000
2013-05-08 21:23      7771   s3://big-data-benchmark/pavlo/text/tiny/rankings/part-00001
2013-05-08 21:23      7556   s3://big-data-benchmark/pavlo/text/tiny/rankings/part-00002
2013-05-08 21:23      7271   s3://big-data-benchmark/pavlo/text/tiny/rankings/part-00003
2013-05-08 21:23      7241   s3://big-data-benchmark/pavlo/text/tiny/rankings/part-00004
2013-05-08 21:23      7783   s3://big-data-benchmark/pavlo/text/tiny/rankings/part-00005
2013-05-08 21:23      7102   s3://big-data-benchmark/pavlo/text/tiny/rankings/part-00006
2013-05-08 21:23      7343   s3://big-data-benchmark/pavlo/text/tiny/rankings/part-00007
2013-05-08 21:23      7699   s3://big-data-benchmark/pavlo/text/tiny/rankings/part-00008
2013-05-08 21:23      7063   s3://big-data-benchmark/pavlo/text/tiny/rankings/part-00009
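
If s3cmd is not available, the same 1000-key limit can be worked around over plain HTTP by paging through the bucket listing. The following is a rough sketch only, assuming the bucket still allows anonymous listing and answers with the standard S3 `ListBucketResult` XML. The names `list_keys`, `fetch_page`, and `BUCKET_URL` are made up for this example and are not part of the benchmark project.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# XML namespace used by S3 bucket listing responses.
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"

# Bucket URL taken from the issue; whether it is still publicly
# listable is an assumption.
BUCKET_URL = "https://big-data-benchmark.s3.amazonaws.com/"


def fetch_page(url):
    """Download one page of the bucket listing as raw XML bytes."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def list_keys(bucket_url=BUCKET_URL, prefix="", fetch=fetch_page):
    """Yield every key under `prefix`, following S3's pagination.

    Each response holds at most 1000 keys. While <IsTruncated> is
    "true", the last key seen is sent back as the `marker` parameter
    to request the next page.
    """
    marker = ""
    while True:
        params = {}
        if prefix:
            params["prefix"] = prefix
        if marker:
            params["marker"] = marker
        url = bucket_url
        if params:
            url += "?" + urllib.parse.urlencode(params)

        root = ET.fromstring(fetch(url))
        keys = [c.findtext(S3_NS + "Key") for c in root.iter(S3_NS + "Contents")]
        yield from keys

        truncated = root.findtext(S3_NS + "IsTruncated") == "true"
        if not truncated or not keys:
            break
        marker = keys[-1]
```

For instance, `list(list_keys(prefix="pavlo/text/tiny/rankings/"))` should return the same part files that the s3cmd call above prints, provided the bucket is still reachable.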
