Merge pull request #2 from backend-developers-ltd/experiment_results

reference experiment results

Showing 55 changed files with 6,744 additions and 0 deletions.
6 changes (6 additions, 0 deletions): ...sults/vllm_llama_3_70b_instruct_awq/2024-08-22_12-16-19_1x_a100_sxm4_80gb/experiment.yaml

comment: 1x A100 SXM4 80GB
experiment: vllm_llama_3_70b_instruct_awq
experiment_hash: exp_hash_v1:7aa490
run_id: vllm_llama_3_70b_instruct_awq/2024-08-22_12-16-19_1x_a100_sxm4_80gb
slug: 1x_a100_sxm4_80gb
timestamp: 2024-08-22_12-16-19
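The run_id above appears to be the experiment name joined with the timestamp and hardware slug. A minimal sketch of that composition (the rule is inferred from the field values in this file, not confirmed elsewhere in the diff):

```python
# Fields copied from experiment.yaml; the composition rule is inferred.
experiment = "vllm_llama_3_70b_instruct_awq"
timestamp = "2024-08-22_12-16-19"
slug = "1x_a100_sxm4_80gb"

run_id = f"{experiment}/{timestamp}_{slug}"
assert run_id == "vllm_llama_3_70b_instruct_awq/2024-08-22_12-16-19_1x_a100_sxm4_80gb"
```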
8 changes (8 additions, 0 deletions): ...n/results/vllm_llama_3_70b_instruct_awq/2024-08-22_12-16-19_1x_a100_sxm4_80gb/output.yaml

Count to 1000, skip unpopular numbers: 5fa4c4a18a1534b96c2eb2c5a30f63da0237b338aebf745d27d3d73dbc8dedfa2aed7070799440ac37e8610f9dd4926371f77a98e79c50a2c8b5b583cbf7c86e
Describe justice system in UK vs USA in 2000-5000 words: 83c0ec6b7f37d53b798093724f72a40195572be308b65471e8d2aae18379ef79655233858eb842ebf73967b058c38685fbea9543a3d1b3b4f41684b5fd95eede
Describe schooling system in UK vs USA in 2000-5000 words: f5d13dd9ee6b6b0540bd3e4adf6baec37ff5d4dc3e1158344f5ab2c690880de0ac1263d3f2691d6b904271298ba0b023adf541ba2f7fb1add50ba27f7a67d3a1
Explain me some random problem for me in 2000-5000 words: 143fc78fb373d10e8b27bdc3bcd5a5a9b5154c8a9dfeb72102d610a87cf47d5cfeb7a4be0136bf0ba275e3fa46e8b6cfcbeb63af6c45714abcd2875bb7bd577c
Tell me entire history of USA: 210fa7578650d083ad35cae251f8ef272bdc61c35daa08eb27852b3ddc59262718300971b1ac9725c9ac08f63240a1a13845d6c853d2e08520567288d54b5518
Write a ballad. Pick a random theme.: 21c8744c38338c8e8c4a9f0efc580b9040d51837573924ef731180e7cc2fb21cb96968c901803abad6df1b4f035096ec0fc75339144f133c754a8303a3f378e3
Write an epic story about a dragon and a knight: 81ff9b82399502e2d3b0fd8f625d3c3f6141c4c179488a247c0c0cc3ccd77828f0920c3d8c03621dfe426e401f58820a6094db5f3786ab7f12bfb13d6224ef94
Write an essay about being a Senior developer.: 0921d5c3b2e04616dbb655e6ba4648911b9461a4ecdb0d435ebf190d903a92c20cf1343d98de65b6e9690f5e6b1c8f3bfc58e720168fa54dc0e293f0f595505c
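Each digest above is 128 hex characters, which matches a SHA-512 hex digest. A minimal sketch, assuming the fingerprints are plain SHA-512 over the generated text (the `fingerprint` helper name is hypothetical; the project's actual hashing scheme is not shown in this diff):

```python
import hashlib

def fingerprint(text: str) -> str:
    """SHA-512 hex digest of a model response (hypothetical helper;
    deterministic-ml's real hashing scheme may differ)."""
    return hashlib.sha512(text.encode("utf-8")).hexdigest()

digest = fingerprint("example model output")
assert len(digest) == 128                             # SHA-512 yields 128 hex chars
assert digest == fingerprint("example model output")  # identical text, identical digest
```

Because the digest depends on every byte of the response, two runs produce the same map only if generation was bit-for-bit identical.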
15 changes (15 additions, 0 deletions): ...results/vllm_llama_3_70b_instruct_awq/2024-08-22_12-16-19_1x_a100_sxm4_80gb/run.local.log
2024-08-22 12:16:19,452 - __main__ - INFO - Starting experiment vllm_llama_3_70b_instruct_awq with comment: 1x A100 SXM4 80GB
2024-08-22 12:16:19,455 - __main__ - INFO - Local log file: /home/rooter/dev/bac/deterministic-ml/tests/integration/results/vllm_llama_3_70b_instruct_awq/2024-08-22_12-16-19_1x_a100_sxm4_80gb/run.local.log
2024-08-22 12:16:19,564 - paramiko.transport - INFO - Connected (version 2.0, client OpenSSH_8.9p1)
2024-08-22 12:16:19,769 - paramiko.transport - INFO - Auth banner: b'Welcome to vast.ai. If authentication fails, try again after a few seconds, and double check your ssh key.\nHave fun!\n'
2024-08-22 12:16:19,772 - paramiko.transport - INFO - Authentication (publickey) successful!
2024-08-22 12:16:19,774 - __main__ - INFO - Syncing files to remote
2024-08-22 12:16:19,961 - tools.ssh - INFO - Command: 'mkdir -p ~/experiments/vllm_llama_3_70b_instruct_awq/2024-08-22_12-16-19_1x_a100_sxm4_80gb/output' stdout: '' stderr: '' status_code: 0
2024-08-22 12:16:22,432 - __main__ - INFO - Setting up remote environment
2024-08-22 12:16:25,588 - tools.ssh - INFO - Command: '\n set -exo pipefail\n \n curl -LsSf https://astral.sh/uv/install.sh | sh\n export PATH=$HOME/.cargo/bin:$PATH\n \n cd ~/experiments/vllm_llama_3_70b_instruct_awq/2024-08-22_12-16-19_1x_a100_sxm4_80gb\n uv venv -p python3.11 --python-preference managed\n source .venv/bin/activate \n uv pip install ./deterministic_ml*.whl pyyaml -r vllm_llama_3_70b_instruct_awq/requirements.txt\n ' stdout: "installing to /root/.cargo/bin\n uv\n uvx\neverything's installed!\n" stderr: "+ curl -LsSf https://astral.sh/uv/install.sh\n+ sh\ndownloading uv 0.3.1 x86_64-unknown-linux-gnu\n+ export PATH=/root/.cargo/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin\n+ PATH=/root/.cargo/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin\n+ cd /root/experiments/vllm_llama_3_70b_instruct_awq/2024-08-22_12-16-19_1x_a100_sxm4_80gb\n+ uv venv -p python3.11 --python-preference managed\nUsing Python 3.11.9\nCreating virtualenv at: .venv\nActivate with: source .venv/bin/activate\n+ source .venv/bin/activate\n++ '[' -n x ']'\n++ SCRIPT_PATH=.venv/bin/activate\n++ '[' .venv/bin/activate = bash ']'\n++ deactivate nondestructive\n++ unset -f pydoc\n++ '[' -z '' ']'\n++ '[' -z '' ']'\n++ hash -r\n++ '[' -z '' ']'\n++ unset VIRTUAL_ENV\n++ unset VIRTUAL_ENV_PROMPT\n++ '[' '!' nondestructive = nondestructive ']'\n++ VIRTUAL_ENV=/root/experiments/vllm_llama_3_70b_instruct_awq/2024-08-22_12-16-19_1x_a100_sxm4_80gb/.venv\n++ '[' linux-gnu = cygwin ']'\n++ '[' linux-gnu = msys ']'\n++ export VIRTUAL_ENV\n++ _OLD_VIRTUAL_PATH=/root/.cargo/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin\n++ PATH=/root/experiments/vllm_llama_3_70b_instruct_awq/2024-08-22_12-16-19_1x_a100_sxm4_80gb/.venv/bin:/root/.cargo/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin\n++ export PATH\n++ '[' x2024-08-22_12-16-19_1x_a100_sxm4_80gb '!=' x ']'\n++ VIRTUAL_ENV_PROMPT=2024-08-22_12-16-19_1x_a100_sxm4_80gb\n++ export VIRTUAL_ENV_PROMPT\n++ '[' -z '' ']'\n++ '[' -z '' ']'\n++ _OLD_VIRTUAL_PS1=\n++ PS1='(2024-08-22_12-16-19_1x_a100_sxm4_80gb) '\n++ export PS1\n++ alias pydoc\n++ true\n++ hash -r\n+ uv pip install ./deterministic_ml-0.1.dev2+g218f083.d20240822-py3-none-any.whl pyyaml -r vllm_llama_3_70b_instruct_awq/requirements.txt\nResolved 108 packages in 57ms\nPrepared 1 package in 2ms\nInstalled 108 packages in 473ms\n + aiohappyeyeballs==2.4.0\n + aiohttp==3.10.5\n + aiosignal==1.3.1\n + annotated-types==0.7.0\n + anyio==4.4.0\n + attrs==24.2.0\n + certifi==2024.7.4\n + charset-normalizer==3.3.2\n + click==8.1.7\n + cloudpickle==3.0.0\n + cmake==3.30.2\n + datasets==2.21.0\n + deterministic-ml==0.1.dev2+g218f083.d20240822 (from file:///root/experiments/vllm_llama_3_70b_instruct_awq/2024-08-22_12-16-19_1x_a100_sxm4_80gb/deterministic_ml-0.1.dev2+g218f083.d20240822-py3-none-any.whl)\n + dill==0.3.8\n + diskcache==5.6.3\n + distro==1.9.0\n + fastapi==0.112.1\n + filelock==3.15.4\n + frozenlist==1.4.1\n + fsspec==2024.6.1\n + h11==0.14.0\n + httpcore==1.0.5\n + httptools==0.6.1\n + httpx==0.27.0\n + huggingface-hub==0.24.6\n + idna==3.7\n + interegular==0.3.3\n + jinja2==3.1.4\n + jiter==0.5.0\n + jsonschema==4.23.0\n + jsonschema-specifications==2023.12.1\n + lark==1.2.2\n + llvmlite==0.43.0\n + lm-format-enforcer==0.10.3\n + markupsafe==2.1.5\n + mpmath==1.3.0\n + msgpack==1.0.8\n + multidict==6.0.5\n + multiprocess==0.70.16\n + nest-asyncio==1.6.0\n + networkx==3.3\n + ninja==1.11.1.1\n + numba==0.60.0\n + numpy==1.26.4\n + nvidia-cublas-cu12==12.1.3.1\n + nvidia-cuda-cupti-cu12==12.1.105\n + nvidia-cuda-nvrtc-cu12==12.1.105\n + nvidia-cuda-runtime-cu12==12.1.105\n + nvidia-cudnn-cu12==9.1.0.70\n + nvidia-cufft-cu12==11.0.2.54\n + nvidia-curand-cu12==10.3.2.106\n + nvidia-cusolver-cu12==11.4.5.107\n + nvidia-cusparse-cu12==12.1.0.106\n + nvidia-ml-py==12.560.30\n + nvidia-nccl-cu12==2.20.5\n + nvidia-nvjitlink-cu12==12.6.20\n + nvidia-nvtx-cu12==12.1.105\n + openai==1.42.0\n + outlines==0.0.46\n + packaging==24.1\n + pandas==2.2.2\n + pillow==10.4.0\n + prometheus-client==0.20.0\n + prometheus-fastapi-instrumentator==7.0.0\n + protobuf==5.27.3\n + psutil==6.0.0\n + py-cpuinfo==9.0.0\n + pyairports==2.1.1\n + pyarrow==17.0.0\n + pycountry==24.6.1\n + pydantic==2.8.2\n + pydantic-core==2.20.1\n + python-dateutil==2.9.0.post0\n + python-dotenv==1.0.1\n + pytz==2024.1\n + pyyaml==6.0.2\n + pyzmq==26.2.0\n + ray==2.34.0\n + referencing==0.35.1\n + regex==2024.7.24\n + requests==2.32.3\n + rpds-py==0.20.0\n + safetensors==0.4.4\n + sentencepiece==0.2.0\n + setuptools==73.0.1\n + six==1.16.0\n + sniffio==1.3.1\n + starlette==0.38.2\n + sympy==1.13.2\n + tiktoken==0.7.0\n + tokenizers==0.19.1\n + torch==2.4.0\n + torchvision==0.19.0\n + tqdm==4.66.5\n + transformers==4.44.1\n + triton==3.0.0\n + typing-extensions==4.12.2\n + tzdata==2024.1\n + urllib3==2.2.2\n + uvicorn==0.30.6\n + uvloop==0.20.0\n + vllm==0.5.4\n + vllm-flash-attn==2.6.1\n + watchfiles==0.23.0\n + websockets==13.0\n + xformers==0.0.27.post2\n + xxhash==3.5.0\n + yarl==1.9.4\n" status_code: 0
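The remote setup command embedded in the log entry above, restated as a standalone script for readability (commands copied from the log; the wheel filename glob is as logged, and the target path matches this run's directory):

```shell
set -exo pipefail

# Install the uv package manager (version 0.3.1 at the time of this run)
curl -LsSf https://astral.sh/uv/install.sh | sh
export PATH=$HOME/.cargo/bin:$PATH

# Create and activate a managed Python 3.11 virtualenv in the run directory
cd ~/experiments/vllm_llama_3_70b_instruct_awq/2024-08-22_12-16-19_1x_a100_sxm4_80gb
uv venv -p python3.11 --python-preference managed
source .venv/bin/activate

# Install the deterministic_ml wheel plus experiment requirements (vllm etc.)
uv pip install ./deterministic_ml*.whl pyyaml -r vllm_llama_3_70b_instruct_awq/requirements.txt
```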
2024-08-22 12:16:25,608 - __main__ - INFO - Gathering system info
2024-08-22 12:16:28,471 - tools.ssh - INFO - Command: '\n set -exo pipefail\n \n cd ~/experiments/vllm_llama_3_70b_instruct_awq/2024-08-22_12-16-19_1x_a100_sxm4_80gb\n export PATH=$HOME/.cargo/bin:$PATH\n source .venv/bin/activate;\n python -m deterministic_ml._internal.sysinfo > ~/experiments/vllm_llama_3_70b_instruct_awq/2024-08-22_12-16-19_1x_a100_sxm4_80gb/output/sysinfo.yaml' stdout: '' stderr: "+ cd /root/experiments/vllm_llama_3_70b_instruct_awq/2024-08-22_12-16-19_1x_a100_sxm4_80gb\n+ export PATH=/root/.cargo/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin\n+ PATH=/root/.cargo/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin\n+ source .venv/bin/activate\n++ '[' -n x ']'\n++ SCRIPT_PATH=.venv/bin/activate\n++ '[' .venv/bin/activate = bash ']'\n++ deactivate nondestructive\n++ unset -f pydoc\n++ '[' -z '' ']'\n++ '[' -z '' ']'\n++ hash -r\n++ '[' -z '' ']'\n++ unset VIRTUAL_ENV\n++ unset VIRTUAL_ENV_PROMPT\n++ '[' '!' nondestructive = nondestructive ']'\n++ VIRTUAL_ENV=/root/experiments/vllm_llama_3_70b_instruct_awq/2024-08-22_12-16-19_1x_a100_sxm4_80gb/.venv\n++ '[' linux-gnu = cygwin ']'\n++ '[' linux-gnu = msys ']'\n++ export VIRTUAL_ENV\n++ _OLD_VIRTUAL_PATH=/root/.cargo/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin\n++ PATH=/root/experiments/vllm_llama_3_70b_instruct_awq/2024-08-22_12-16-19_1x_a100_sxm4_80gb/.venv/bin:/root/.cargo/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin\n++ export PATH\n++ '[' x2024-08-22_12-16-19_1x_a100_sxm4_80gb '!=' x ']'\n++ VIRTUAL_ENV_PROMPT=2024-08-22_12-16-19_1x_a100_sxm4_80gb\n++ export VIRTUAL_ENV_PROMPT\n++ '[' -z '' ']'\n++ '[' -z '' ']'\n++ _OLD_VIRTUAL_PS1=\n++ PS1='(2024-08-22_12-16-19_1x_a100_sxm4_80gb) '\n++ export PS1\n++ alias pydoc\n++ true\n++ hash -r\n+ python -m deterministic_ml._internal.sysinfo\n" status_code: 0
2024-08-22 12:16:28,485 - __main__ - INFO - Running experiment code on remote
2024-08-22 12:20:56,768 - tools.ssh - INFO - Command: '\n set -exo pipefail\n \n cd ~/experiments/vllm_llama_3_70b_instruct_awq/2024-08-22_12-16-19_1x_a100_sxm4_80gb\n export PATH=$HOME/.cargo/bin:$PATH\n source .venv/bin/activate;\n python -m vllm_llama_3_70b_instruct_awq ~/experiments/vllm_llama_3_70b_instruct_awq/2024-08-22_12-16-19_1x_a100_sxm4_80gb/output | tee ~/experiments/vllm_llama_3_70b_instruct_awq/2024-08-22_12-16-19_1x_a100_sxm4_80gb/output/stdout.txt' stdout: "gpu_count=1\nStarting model loading\nINFO 08-22 10:16:34 awq_marlin.py:89] The model is convertible to awq_marlin during runtime. Using awq_marlin kernel.\nINFO 08-22 10:16:34 llm_engine.py:174] Initializing an LLM engine (v0.5.4) with config: model='casperhansen/llama-3-70b-instruct-awq', speculative_config=None, tokenizer='casperhansen/llama-3-70b-instruct-awq', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, rope_scaling=None, rope_theta=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.float16, max_seq_len=8192, download_dir=None, load_format=LoadFormat.AUTO, tensor_parallel_size=1, pipeline_parallel_size=1, disable_custom_all_reduce=False, quantization=awq_marlin, enforce_eager=True, kv_cache_dtype=auto, quantization_param_path=None, device_config=cuda, decoding_config=DecodingConfig(guided_decoding_backend='outlines'), observability_config=ObservabilityConfig(otlp_traces_endpoint=None), seed=0, served_model_name=casperhansen/llama-3-70b-instruct-awq, use_v2_block_manager=False, enable_prefix_caching=False)\nINFO 08-22 10:16:35 model_runner.py:720] Starting to load model casperhansen/llama-3-70b-instruct-awq...\nINFO 08-22 10:16:36 weight_utils.py:225] Using model weights format ['*.safetensors']\nINFO 08-22 10:17:10 model_runner.py:732] Loading model weights took 37.0561 GB\nINFO 08-22 10:17:16 gpu_executor.py:102] # GPU blocks: 6068, # CPU blocks: 819\nmodel loading took 46.38 seconds\nStarting 8 responses generation\n8 responses generation took 213.59 seconds\n{'Count to 1000, skip unpopular numbers': '5fa4c4a18a1534b96c2eb2c5a30f63da0237b338aebf745d27d3d73dbc8dedfa2aed7070799440ac37e8610f9dd4926371f77a98e79c50a2c8b5b583cbf7c86e',\n 'Describe justice system in UK vs USA in 2000-5000 words': '83c0ec6b7f37d53b798093724f72a40195572be308b65471e8d2aae18379ef79655233858eb842ebf73967b058c38685fbea9543a3d1b3b4f41684b5fd95eede',\n 'Describe schooling system in UK vs USA in 2000-5000 words': 'f5d13dd9ee6b6b0540bd3e4adf6baec37ff5d4dc3e1158344f5ab2c690880de0ac1263d3f2691d6b904271298ba0b023adf541ba2f7fb1add50ba27f7a67d3a1',\n 'Explain me some random problem for me in 2000-5000 words': '143fc78fb373d10e8b27bdc3bcd5a5a9b5154c8a9dfeb72102d610a87cf47d5cfeb7a4be0136bf0ba275e3fa46e8b6cfcbeb63af6c45714abcd2875bb7bd577c',\n 'Tell me entire history of USA': '210fa7578650d083ad35cae251f8ef272bdc61c35daa08eb27852b3ddc59262718300971b1ac9725c9ac08f63240a1a13845d6c853d2e08520567288d54b5518',\n 'Write a ballad. Pick a random theme.': '21c8744c38338c8e8c4a9f0efc580b9040d51837573924ef731180e7cc2fb21cb96968c901803abad6df1b4f035096ec0fc75339144f133c754a8303a3f378e3',\n 'Write an epic story about a dragon and a knight': '81ff9b82399502e2d3b0fd8f625d3c3f6141c4c179488a247c0c0cc3ccd77828f0920c3d8c03621dfe426e401f58820a6094db5f3786ab7f12bfb13d6224ef94',\n 'Write an essay about being a Senior developer.': '0921d5c3b2e04616dbb655e6ba4648911b9461a4ecdb0d435ebf190d903a92c20cf1343d98de65b6e9690f5e6b1c8f3bfc58e720168fa54dc0e293f0f595505c'}\n" stderr: "+ cd /root/experiments/vllm_llama_3_70b_instruct_awq/2024-08-22_12-16-19_1x_a100_sxm4_80gb\n+ export PATH=/root/.cargo/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin\n+ PATH=/root/.cargo/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin\n+ source .venv/bin/activate\n++ '[' -n x ']'\n++ SCRIPT_PATH=.venv/bin/activate\n++ '[' .venv/bin/activate = bash ']'\n++ deactivate nondestructive\n++ unset -f pydoc\n++ '[' -z '' ']'\n++ '[' -z '' ']'\n++ hash -r\n++ '[' -z '' ']'\n++ unset VIRTUAL_ENV\n++ unset VIRTUAL_ENV_PROMPT\n++ '[' '!' nondestructive = nondestructive ']'\n++ VIRTUAL_ENV=/root/experiments/vllm_llama_3_70b_instruct_awq/2024-08-22_12-16-19_1x_a100_sxm4_80gb/.venv\n++ '[' linux-gnu = cygwin ']'\n++ '[' linux-gnu = msys ']'\n++ export VIRTUAL_ENV\n++ _OLD_VIRTUAL_PATH=/root/.cargo/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin\n++ PATH=/root/experiments/vllm_llama_3_70b_instruct_awq/2024-08-22_12-16-19_1x_a100_sxm4_80gb/.venv/bin:/root/.cargo/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin\n++ export PATH\n++ '[' x2024-08-22_12-16-19_1x_a100_sxm4_80gb '!=' x ']'\n++ VIRTUAL_ENV_PROMPT=2024-08-22_12-16-19_1x_a100_sxm4_80gb\n++ export VIRTUAL_ENV_PROMPT\n++ '[' -z '' ']'\n++ '[' -z '' ']'\n++ _OLD_VIRTUAL_PS1=\n++ PS1='(2024-08-22_12-16-19_1x_a100_sxm4_80gb) '\n++ export PS1\n++ alias pydoc\n++ true\n++ hash -r\n+ python -m vllm_llama_3_70b_instruct_awq /root/experiments/vllm_llama_3_70b_instruct_awq/2024-08-22_12-16-19_1x_a100_sxm4_80gb/output\n+ tee /root/experiments/vllm_llama_3_70b_instruct_awq/2024-08-22_12-16-19_1x_a100_sxm4_80gb/output/stdout.txt\n\rLoading safetensors checkpoint shards: 0% Completed | 0/9 [00:00<?, ?it/s]\n\rLoading safetensors checkpoint shards: 11% Completed | 1/9 [00:01<00:12, 1.55s/it]\n\rLoading safetensors checkpoint shards: 22% Completed | 2/9 [00:03<00:14, 2.06s/it]\n\rLoading safetensors checkpoint shards: 33% Completed | 3/9 [00:06<00:13, 2.31s/it]\n\rLoading safetensors checkpoint shards: 44% Completed | 4/9 [00:09<00:12, 2.42s/it]\n\rLoading safetensors checkpoint shards: 56% Completed | 5/9 [00:11<00:09, 2.44s/it]\n\rLoading safetensors checkpoint shards: 67% Completed | 6/9 [00:15<00:08, 2.97s/it]\n\rLoading safetensors checkpoint shards: 78% Completed | 7/9 [00:17<00:05, 2.72s/it]\n\rLoading safetensors checkpoint shards: 89% Completed | 8/9 [00:21<00:03, 3.06s/it]\n\rLoading safetensors checkpoint shards: 100% Completed | 9/9 [00:22<00:00, 2.40s/it]\n\rLoading safetensors checkpoint shards: 100% Completed | 9/9 [00:22<00:00, 2.51s/it]\n\n/root/experiments/vllm_llama_3_70b_instruct_awq/2024-08-22_12-16-19_1x_a100_sxm4_80gb/.venv/lib/python3.11/site-packages/vllm/model_executor/layers/sampler.py:287: UserWarning: cumsum_cuda_kernel does not have a deterministic implementation, but you set 'torch.use_deterministic_algorithms(True, warn_only=True)'. You can file an issue at https://github.com/pytorch/pytorch/issues to help us prioritize adding deterministic support for this operation. (Triggered internally at ../aten/src/ATen/Context.cpp:83.)\n probs_sum = probs_sort.cumsum(dim=-1)\n\rProcessed prompts: 0%| | 0/8 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, output: 0.00 toks/s]\rProcessed prompts: 12%|█▎ | 1/8 [03:33<24:55, 213.59s/it, est. speed input: 0.15 toks/s, output: 19.18 toks/s]\rProcessed prompts: 100%|██████████| 8/8 [03:33<00:00, 26.70s/it, est. speed input: 1.32 toks/s, output: 153.42 toks/s]\n" status_code: 0
2024-08-22 12:20:56,801 - __main__ - INFO - Syncing output back to local
2024-08-22 12:20:57,304 - __main__ - INFO - Done
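With each run writing a prompt-to-digest map like output.yaml above, determinism across machines can be checked by diffing two such maps. A minimal sketch (`compare_runs` is a hypothetical helper, not part of deterministic-ml; the digests below are shortened placeholders, not real values):

```python
def compare_runs(a: dict[str, str], b: dict[str, str]) -> list[str]:
    """Return the prompts whose output digests differ between two runs
    (hypothetical helper; reads the mappings parsed from two output.yaml files)."""
    return [p for p in sorted(set(a) | set(b)) if a.get(p) != b.get(p)]

run_a = {"Tell me entire history of USA": "210fa757",
         "Write a ballad. Pick a random theme.": "21c8744c"}
run_b = {"Tell me entire history of USA": "210fa757",
         "Write a ballad. Pick a random theme.": "deadbeef"}

# Only the ballad prompt diverged between the two runs
assert compare_runs(run_a, run_b) == ["Write a ballad. Pick a random theme."]
```

An empty result means the two runs produced byte-identical outputs for every prompt.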