5. Datasets 💽
John Yang edited this page Jun 27, 2023 · 1 revision
The main task paradigm that the current Intercode environment supports is NL Query to Code/Answers. The Intercode environment supports datasets via the `IntercodeDataLoader` abstraction, which requires two fields to capture this task.
- `query`: The NL query is a human-readable question that specifies some desired standard output (e.g. `cat` a file) or environment modification (e.g. move files from one folder to another).
- `gold`: A command that accomplishes the task conveyed by the `query`.
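To make the two fields concrete, here is a minimal sketch of what a pair of Bash-style records might look like. The queries and commands are hypothetical illustrations, not entries from any shipped dataset, and the list-of-dictionaries layout is an assumption about how records are stored.

```python
import json

# Hypothetical example records; the queries and commands are illustrative only.
records = [
    {
        # Task that asks for some standard output
        "query": "Print the contents of notes.txt",
        "gold": "cat notes.txt",
    },
    {
        # Task that asks for an environment modification
        "query": "Move all .log files from ./tmp into ./archive",
        "gold": "mv ./tmp/*.log ./archive/",
    },
]

# Every record carries exactly the two fields, and both are strings.
for record in records:
    assert set(record) == {"query", "gold"}
    assert all(isinstance(value, str) for value in record.values())

# Serialize to JSON text, one of the file formats listed for data_path.
serialized = json.dumps(records, indent=2)
```

The first record illustrates a query whose answer is standard output, and the second a query whose answer is a change to the environment.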
The `IntercodeDataLoader` takes in a `data_path` as an argument. The `data_path` must point at a file that satisfies the following requirements:
- Must be a `csv`, `tsv`, `json`, or `pickle` file
- Must have the fields/columns `query` (`str`) and `gold` (`str`, executable code)
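The requirements above can be checked before handing a file to the loader. The following is a minimal sketch using only the Python standard library; it does not call or reproduce `IntercodeDataLoader` itself. The helper names `load_records` and `check_dataset_file` are hypothetical, and the sketch assumes that `json` and `pickle` files hold a list of dictionaries and that pickle files use a `.pkl` or `.pickle` extension.

```python
import csv
import json
import pickle
from pathlib import Path

REQUIRED_FIELDS = ("query", "gold")
DELIMITERS = {".csv": ",", ".tsv": "\t"}


def load_records(data_path):
    """Read a csv, tsv, json, or pickle file into a list of dicts (hypothetical helper)."""
    path = Path(data_path)
    suffix = path.suffix.lower()
    if suffix in DELIMITERS:
        # csv and tsv differ only in the delimiter; the header row names the columns.
        with path.open(newline="", encoding="utf-8") as handle:
            return list(csv.DictReader(handle, delimiter=DELIMITERS[suffix]))
    if suffix == ".json":
        # Assumption: the file holds a list of {"query": ..., "gold": ...} objects.
        with path.open(encoding="utf-8") as handle:
            return json.load(handle)
    if suffix in (".pkl", ".pickle"):
        # Assumption: the pickle holds the same list-of-dicts layout.
        with path.open("rb") as handle:
            return pickle.load(handle)
    raise ValueError(f"Unsupported file type: {suffix or '(none)'}")


def check_dataset_file(data_path):
    """Return the records if each one has string `query` and `gold` fields."""
    records = load_records(data_path)
    for index, record in enumerate(records):
        for field in REQUIRED_FIELDS:
            if not isinstance(record.get(field), str):
                raise ValueError(f"Record {index} lacks a string '{field}' field")
    return records
```

Calling `check_dataset_file("my_dataset.json")` on a file of your own would either return the parsed records or raise a `ValueError` naming the first record that is missing a field. Only unpickle files from sources you trust, since `pickle.load` can execute arbitrary code.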
Supported Datasets
The following existing datasets can be used for evaluation on the Intercode platform.
Dataset | Language | Website | Scripts | File |
---|---|---|---|---|
Spider | SQL | Homepage | Link | data/spider/dev_spider.json |
NL2Bash | Bash | Homepage | Link | data/nl2bash/nl2bash.json |