Commit

Merge pull request #207 from mlcommons/tfds_croissant
Change GroupRecordSet call to return the sample directly.
ccl-core authored Sep 7, 2023
2 parents 660285a + 4a50992 commit 2f77166
Showing 10 changed files with 1,358 additions and 1,359 deletions.
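The fixture diffs in this commit all follow one pattern: each JSONL line used to wrap the sample under its record set name, and now holds the sample itself. The sketch below illustrates that transformation on already-serialized lines. It is not the Croissant library's implementation; `unwrap_record` and `migrate_jsonl` are hypothetical helper names introduced only for this example.

```python
import json
from typing import Any, Iterable, Iterator


def unwrap_record(record: dict[str, Any], record_set_name: str) -> dict[str, Any]:
    """Return the inner sample if `record` uses the old wrapped layout.

    Hypothetical helper: the old layout is assumed to be a dict whose only
    key is the record set name and whose value is the sample dict.
    """
    inner = record.get(record_set_name)
    if set(record) == {record_set_name} and isinstance(inner, dict):
        return inner
    # Already in the new layout (or not recognisable): leave untouched.
    return record


def migrate_jsonl(lines: Iterable[str], record_set_name: str) -> Iterator[str]:
    """Yield JSONL lines with the record-set wrapper removed."""
    for line in lines:
        if line.strip():  # skip blank lines
            yield json.dumps(unwrap_record(json.loads(line), record_set_name))


old_lines = [
    '{"persons": {"name": "person0", "age": 0}}',
    '{"persons": {"name": "person1", "age": 1}}',
]
new_lines = list(migrate_jsonl(old_lines, "persons"))
for line in new_lines:
    print(line)
```

Records that are already unwrapped pass through unchanged, so running the helper twice is harmless. A record set that legitimately has a single dict-valued field named after the record set itself would be unwrapped by mistake; this sketch does not handle that case.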
8 changes: 4 additions & 4 deletions datasets/coco2014-mini/output/captions.jsonl
@@ -1,4 +1,4 @@
-{"captions": {"id": 48, "image_id": 318556, "caption": "A very clean and well decorated empty bathroom", "split": "train"}}
-{"captions": {"id": 67, "image_id": 116100, "caption": "A panoramic view of a kitchen and all of its appliances.", "split": "train"}}
-{"captions": {"id": 126, "image_id": 318556, "caption": "A blue and white bathroom with butterfly themed wall tiles.", "split": "train"}}
-{"captions": {"id": 148, "image_id": 116100, "caption": "A panoramic photo of a kitchen and dining room", "split": "train"}}
+{"id": 48, "image_id": 318556, "caption": "A very clean and well decorated empty bathroom", "split": "train"}
+{"id": 67, "image_id": 116100, "caption": "A panoramic view of a kitchen and all of its appliances.", "split": "train"}
+{"id": 126, "image_id": 318556, "caption": "A blue and white bathroom with butterfly themed wall tiles.", "split": "train"}
+{"id": 148, "image_id": 116100, "caption": "A panoramic photo of a kitchen and dining room", "split": "train"}
4 changes: 2 additions & 2 deletions datasets/coco2014-mini/output/images.jsonl
@@ -1,2 +1,2 @@
-{"images": {"image_filename": "COCO_train2014_000000467840.jpg", "image_content": "<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=3x2 at <MEMORY_ADDRESS>>", "split": "train"}}
-{"images": {"image_filename": "COCO_train2014_000000533055.jpg", "image_content": "<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=3x2 at <MEMORY_ADDRESS>>", "split": "train"}}
+{"image_filename": "COCO_train2014_000000467840.jpg", "image_content": "<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=3x2 at <MEMORY_ADDRESS>>", "split": "train"}
+{"image_filename": "COCO_train2014_000000533055.jpg", "image_content": "<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=3x2 at <MEMORY_ADDRESS>>", "split": "train"}
20 changes: 10 additions & 10 deletions datasets/gpt-3/output/default.jsonl
@@ -1,10 +1,10 @@
-{"default": {"context": "\n\nQ: What is 65360 plus 16204?\n\nA:", "completion": "81564", "task": "five_digit_addition"}}
-{"default": {"context": "\n\nQ: What is 91169 plus 57223?\n\nA:", "completion": "148392", "task": "five_digit_addition"}}
-{"default": {"context": "\n\nQ: What is 52888 plus 52240?\n\nA:", "completion": "105128", "task": "five_digit_addition"}}
-{"default": {"context": "\n\nQ: What is 35742 plus 78660?\n\nA:", "completion": "114402", "task": "five_digit_addition"}}
-{"default": {"context": "\n\nQ: What is 69074 plus 90431?\n\nA:", "completion": "159505", "task": "five_digit_addition"}}
-{"default": {"context": "\n\nQ: What is 61530 plus 83035?\n\nA:", "completion": "144565", "task": "five_digit_addition"}}
-{"default": {"context": "\n\nQ: What is 98901 plus 6004?\n\nA:", "completion": "104905", "task": "five_digit_addition"}}
-{"default": {"context": "\n\nQ: What is 60097 plus 38097?\n\nA:", "completion": "98194", "task": "five_digit_addition"}}
-{"default": {"context": "\n\nQ: What is 35779 plus 79717?\n\nA:", "completion": "115496", "task": "five_digit_addition"}}
-{"default": {"context": "\n\nQ: What is 67255 plus 99168?\n\nA:", "completion": "166423", "task": "five_digit_addition"}}
+{"context": "\n\nQ: What is 65360 plus 16204?\n\nA:", "completion": "81564", "task": "five_digit_addition"}
+{"context": "\n\nQ: What is 91169 plus 57223?\n\nA:", "completion": "148392", "task": "five_digit_addition"}
+{"context": "\n\nQ: What is 52888 plus 52240?\n\nA:", "completion": "105128", "task": "five_digit_addition"}
+{"context": "\n\nQ: What is 35742 plus 78660?\n\nA:", "completion": "114402", "task": "five_digit_addition"}
+{"context": "\n\nQ: What is 69074 plus 90431?\n\nA:", "completion": "159505", "task": "five_digit_addition"}
+{"context": "\n\nQ: What is 61530 plus 83035?\n\nA:", "completion": "144565", "task": "five_digit_addition"}
+{"context": "\n\nQ: What is 98901 plus 6004?\n\nA:", "completion": "104905", "task": "five_digit_addition"}
+{"context": "\n\nQ: What is 60097 plus 38097?\n\nA:", "completion": "98194", "task": "five_digit_addition"}
+{"context": "\n\nQ: What is 35779 plus 79717?\n\nA:", "completion": "115496", "task": "five_digit_addition"}
+{"context": "\n\nQ: What is 67255 plus 99168?\n\nA:", "completion": "166423", "task": "five_digit_addition"}
2 changes: 1 addition & 1 deletion datasets/huggingface-c4/output/en.jsonl
@@ -1 +1 @@
-{"en": {"text": "Beginners BBQ Class Taking Place in Missoula!\nDo you want to get better at making delicious BBQ? You will have the opportunity, put this on your calendar now. Thursday, September 22nd join World Class BBQ Champion, Tony Balay from Lonestar Smoke Rangers. He will be teaching a beginner level class for everyone who wants to get better with their culinary skills.\nHe will teach you everything you need to know to compete in a KCBS BBQ competition, including techniques, recipes, timelines, meat selection and trimming, plus smoker and fire information.\nThe cost to be in the class is $35 per person, and for spectators it is free. Included in the cost will be either a t-shirt or apron and you will be tasting samples of each meat that is prepared.", "timestamp": "2019-04-25T12:57:54Z", "url": "https://klyq.com/beginners-bbq-class-taking-place-in-missoula/"}}
+{"text": "Beginners BBQ Class Taking Place in Missoula!\nDo you want to get better at making delicious BBQ? You will have the opportunity, put this on your calendar now. Thursday, September 22nd join World Class BBQ Champion, Tony Balay from Lonestar Smoke Rangers. He will be teaching a beginner level class for everyone who wants to get better with their culinary skills.\nHe will teach you everything you need to know to compete in a KCBS BBQ competition, including techniques, recipes, timelines, meat selection and trimming, plus smoker and fire information.\nThe cost to be in the class is $35 per person, and for spectators it is free. Included in the cost will be either a t-shirt or apron and you will be tasting samples of each meat that is prepared.", "timestamp": "2019-04-25T12:57:54Z", "url": "https://klyq.com/beginners-bbq-class-taking-place-in-missoula/"}
20 changes: 10 additions & 10 deletions datasets/huggingface-mnist/output/default.jsonl
@@ -1,10 +1,10 @@
-{"default": {"image": "<PIL.PngImagePlugin.PngImageFile image mode=L size=28x28 at <MEMORY_ADDRESS>>", "label": 7}}
-{"default": {"image": "<PIL.PngImagePlugin.PngImageFile image mode=L size=28x28 at <MEMORY_ADDRESS>>", "label": 2}}
-{"default": {"image": "<PIL.PngImagePlugin.PngImageFile image mode=L size=28x28 at <MEMORY_ADDRESS>>", "label": 1}}
-{"default": {"image": "<PIL.PngImagePlugin.PngImageFile image mode=L size=28x28 at <MEMORY_ADDRESS>>", "label": 0}}
-{"default": {"image": "<PIL.PngImagePlugin.PngImageFile image mode=L size=28x28 at <MEMORY_ADDRESS>>", "label": 4}}
-{"default": {"image": "<PIL.PngImagePlugin.PngImageFile image mode=L size=28x28 at <MEMORY_ADDRESS>>", "label": 1}}
-{"default": {"image": "<PIL.PngImagePlugin.PngImageFile image mode=L size=28x28 at <MEMORY_ADDRESS>>", "label": 4}}
-{"default": {"image": "<PIL.PngImagePlugin.PngImageFile image mode=L size=28x28 at <MEMORY_ADDRESS>>", "label": 9}}
-{"default": {"image": "<PIL.PngImagePlugin.PngImageFile image mode=L size=28x28 at <MEMORY_ADDRESS>>", "label": 5}}
-{"default": {"image": "<PIL.PngImagePlugin.PngImageFile image mode=L size=28x28 at <MEMORY_ADDRESS>>", "label": 9}}
+{"image": "<PIL.PngImagePlugin.PngImageFile image mode=L size=28x28 at <MEMORY_ADDRESS>>", "label": 7}
+{"image": "<PIL.PngImagePlugin.PngImageFile image mode=L size=28x28 at <MEMORY_ADDRESS>>", "label": 2}
+{"image": "<PIL.PngImagePlugin.PngImageFile image mode=L size=28x28 at <MEMORY_ADDRESS>>", "label": 1}
+{"image": "<PIL.PngImagePlugin.PngImageFile image mode=L size=28x28 at <MEMORY_ADDRESS>>", "label": 0}
+{"image": "<PIL.PngImagePlugin.PngImageFile image mode=L size=28x28 at <MEMORY_ADDRESS>>", "label": 4}
+{"image": "<PIL.PngImagePlugin.PngImageFile image mode=L size=28x28 at <MEMORY_ADDRESS>>", "label": 1}
+{"image": "<PIL.PngImagePlugin.PngImageFile image mode=L size=28x28 at <MEMORY_ADDRESS>>", "label": 4}
+{"image": "<PIL.PngImagePlugin.PngImageFile image mode=L size=28x28 at <MEMORY_ADDRESS>>", "label": 9}
+{"image": "<PIL.PngImagePlugin.PngImageFile image mode=L size=28x28 at <MEMORY_ADDRESS>>", "label": 5}
+{"image": "<PIL.PngImagePlugin.PngImageFile image mode=L size=28x28 at <MEMORY_ADDRESS>>", "label": 9}
16 changes: 8 additions & 8 deletions datasets/pass-mini/output/images.jsonl
@@ -1,8 +1,8 @@
-{"images": {"creator_uname": "PaperBird+Photography%3C3", "latitude": null, "longitude": null, "date_taken": "2007-05-06 06:11:48", "hash": "75f7305b1fd94044e14bdcdde469dbb2", "image_content": "<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=3x2 at <MEMORY_ADDRESS>>"}}
-{"images": {"creator_uname": "Chiara+Marra", "latitude": 38.23818, "longitude": 13.183593, "date_taken": "2007-05-04 15:46:43", "hash": "dd571a41a015354d92a859f7ef31201", "image_content": "<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=3x2 at <MEMORY_ADDRESS>>"}}
-{"images": {"creator_uname": "maplesbranch", "latitude": null, "longitude": null, "date_taken": "2006-05-01 07:34:13", "hash": "598ad3bc7e6e876e61af116693c7ad9", "image_content": "<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=3x2 at <MEMORY_ADDRESS>>"}}
-{"images": {"creator_uname": "maplesbranch", "latitude": null, "longitude": null, "date_taken": "2006-04-23 19:20:40", "hash": "e48d6d552465c5728585b82a53d6e02c", "image_content": "<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=3x2 at <MEMORY_ADDRESS>>"}}
-{"images": {"creator_uname": "quinnums", "latitude": null, "longitude": null, "date_taken": "2004-05-17 00:44:29", "hash": "ffd3eb12a16cb83138f26e6f36dec967", "image_content": "<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=3x2 at <MEMORY_ADDRESS>>"}}
-{"images": {"creator_uname": "striatic", "latitude": 53.535233, "longitude": -113.565075, "date_taken": "2004-05-11 02:00:33", "hash": "fff0eece99cc71c2e91fe716051599", "image_content": "<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=3x2 at <MEMORY_ADDRESS>>"}}
-{"images": {"creator_uname": "striatic", "latitude": null, "longitude": null, "date_taken": "2004-05-27 10:34:28", "hash": "fedefe9f11bf2a749a749bfca8bf28", "image_content": "<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=3x2 at <MEMORY_ADDRESS>>"}}
-{"images": {"creator_uname": "quinnums", "latitude": null, "longitude": null, "date_taken": "2004-05-29 02:14:36", "hash": "ff379727f52bcec4dfb237ace41627", "image_content": "<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=3x2 at <MEMORY_ADDRESS>>"}}
+{"creator_uname": "PaperBird+Photography%3C3", "latitude": null, "longitude": null, "date_taken": "2007-05-06 06:11:48", "hash": "75f7305b1fd94044e14bdcdde469dbb2", "image_content": "<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=3x2 at <MEMORY_ADDRESS>>"}
+{"creator_uname": "Chiara+Marra", "latitude": 38.23818, "longitude": 13.183593, "date_taken": "2007-05-04 15:46:43", "hash": "dd571a41a015354d92a859f7ef31201", "image_content": "<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=3x2 at <MEMORY_ADDRESS>>"}
+{"creator_uname": "maplesbranch", "latitude": null, "longitude": null, "date_taken": "2006-05-01 07:34:13", "hash": "598ad3bc7e6e876e61af116693c7ad9", "image_content": "<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=3x2 at <MEMORY_ADDRESS>>"}
+{"creator_uname": "maplesbranch", "latitude": null, "longitude": null, "date_taken": "2006-04-23 19:20:40", "hash": "e48d6d552465c5728585b82a53d6e02c", "image_content": "<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=3x2 at <MEMORY_ADDRESS>>"}
+{"creator_uname": "quinnums", "latitude": null, "longitude": null, "date_taken": "2004-05-17 00:44:29", "hash": "ffd3eb12a16cb83138f26e6f36dec967", "image_content": "<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=3x2 at <MEMORY_ADDRESS>>"}
+{"creator_uname": "striatic", "latitude": 53.535233, "longitude": -113.565075, "date_taken": "2004-05-11 02:00:33", "hash": "fff0eece99cc71c2e91fe716051599", "image_content": "<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=3x2 at <MEMORY_ADDRESS>>"}
+{"creator_uname": "striatic", "latitude": null, "longitude": null, "date_taken": "2004-05-27 10:34:28", "hash": "fedefe9f11bf2a749a749bfca8bf28", "image_content": "<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=3x2 at <MEMORY_ADDRESS>>"}
+{"creator_uname": "quinnums", "latitude": null, "longitude": null, "date_taken": "2004-05-29 02:14:36", "hash": "ff379727f52bcec4dfb237ace41627", "image_content": "<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=3x2 at <MEMORY_ADDRESS>>"}
6 changes: 3 additions & 3 deletions datasets/simple-join/output/publications_by_user.jsonl
@@ -1,3 +1,3 @@
-{"publications_by_user": {"title": "A New Approach to Machine Learning Using Neural Networks", "author_email": "[email protected]", "author_fullname": "John Smith"}}
-{"publications_by_user": {"title": "The Application of Machine Learning to Natural Language Processing", "author_email": "[email protected]", "author_fullname": "Jane Doe"}}
-{"publications_by_user": {"title": "The Use of Machine Learning to Predict the Stock Market", "author_email": "[email protected]", "author_fullname": "David Lee"}}
+{"title": "A New Approach to Machine Learning Using Neural Networks", "author_email": "[email protected]", "author_fullname": "John Smith"}
+{"title": "The Application of Machine Learning to Natural Language Processing", "author_email": "[email protected]", "author_fullname": "Jane Doe"}
+{"title": "The Use of Machine Learning to Predict the Stock Market", "author_email": "[email protected]", "author_fullname": "David Lee"}
20 changes: 10 additions & 10 deletions datasets/simple-parquet/output/persons.jsonl
@@ -1,10 +1,10 @@
-{"persons": {"name": "person0", "age": 0}}
-{"persons": {"name": "person1", "age": 1}}
-{"persons": {"name": "person2", "age": 2}}
-{"persons": {"name": "person3", "age": 3}}
-{"persons": {"name": "person4", "age": 4}}
-{"persons": {"name": "person5", "age": 5}}
-{"persons": {"name": "person6", "age": 6}}
-{"persons": {"name": "person7", "age": 7}}
-{"persons": {"name": "person8", "age": 8}}
-{"persons": {"name": "person9", "age": 9}}
+{"name": "person0", "age": 0}
+{"name": "person1", "age": 1}
+{"name": "person2", "age": 2}
+{"name": "person3", "age": 3}
+{"name": "person4", "age": 4}
+{"name": "person5", "age": 5}
+{"name": "person6", "age": 6}
+{"name": "person7", "age": 7}
+{"name": "person8", "age": 8}
+{"name": "person9", "age": 9}