
4. Experiments 🧪


Run Existing Experiments

The easiest way to recreate the experiments is to clone this repository and run the corresponding scripts listed below. If you'd like to build from source, make sure you have the following (a consolidated setup sketch appears after the key template below):

  • The experiments/ and scripts/ folders downloaded
  • The intercode package built from source or installed from PyPI
  • The pip dependencies listed in environment.yml installed
  • A keys.cfg file in the root directory of the repository, created by copying and filling out the following template (not all keys are necessary if you are only interested in running a subset of the models):
OPENAI_API_KEY: '<OpenAI Key Here>'
HF_TOKEN: '<HuggingFace Token Here>'
HF_API_URL: '<HuggingFace Endpoint URL>'
PALM_API_KEY: '<PaLM Key Here>'
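
If it helps, here is a minimal setup sketch covering the steps above. The repository URL and the conda environment name are assumptions (check environment.yml for the actual name), and conda is only one way to install the listed dependencies:

git clone https://github.com/princeton-nlp/intercode.git   # repository URL assumed
cd intercode

# Create the environment from environment.yml; the environment name "intercode"
# below is an assumption -- use whatever name the file declares.
conda env create -f environment.yml
conda activate intercode

# Build the intercode package from source (alternatively, install it from PyPI).
pip install -e .

# Place keys.cfg in the repository root; fill in only the keys you need.
cat > keys.cfg << 'EOF'
OPENAI_API_KEY: '<OpenAI Key Here>'
HF_TOKEN: '<HuggingFace Token Here>'
HF_API_URL: '<HuggingFace Endpoint URL>'
PALM_API_KEY: '<PaLM Key Here>'
EOF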

The following table lists each runnable experiment along with the script that invokes it and the file that implements it (each implementation includes a set of flags). Example invocations appear after the table and its footnotes.

Experiment (Prompt Strategy)    Script                              File
Try Again *                     ./scripts/expr_multi_turn.sh        ./experiments/eval_n_turn.py
                                ./scripts/expr_n_turn_others.sh     ./experiments/eval_n_turn_others.py
Plan & Solve [1]**              ./scripts/expr_plan_solve.sh        ./experiments/eval_plan_solve.py
ReAct [2]**                     ./scripts/expr_react.sh             ./experiments/eval_react.py
  • * - eval_n_turn.py handles the Try Again experiments for the GPT family of models, while eval_n_turn_others.py handles the PaLM and open-source models mentioned in the paper.
  • ** - At the moment, these experiments have only been test-run with the GPT-3.5 model.
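
As a concrete example, once the setup above is complete and keys.cfg is in place, the scripts in the table can be launched directly from the repository root; check the corresponding eval_*.py file for the flags each script sets:

# Try Again, GPT family
./scripts/expr_multi_turn.sh

# Try Again, PaLM and open-source models
./scripts/expr_n_turn_others.sh

# Plan & Solve and ReAct (only test-run with GPT-3.5 so far)
./scripts/expr_plan_solve.sh
./scripts/expr_react.sh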

Results

The output .json files containing the reward and interaction history for the task instances of each experiment discussed in the main paper can be found in the ./data/results/ folder.
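
To take a quick look at these results, you can list the folder and pretty-print any one of the files; the file name below is a placeholder, substitute one from the listing:

ls ./data/results/

# <result_file>.json stands in for any file from the listing above.
python -m json.tool ./data/results/<result_file>.json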
