Is it possible that the tokenizer is corrupted? Can you re-download the tokenizer and try again?
The problem occurs again. Oddly, both the llama2 tokenizer and test_tiktoken run successfully.
Root Cause (first observed failure):
[0]:
time : 2024-08-05_10:01:43
host : iZuf6ct0ygsd4zjh2lit8uZ
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 46669)
error_file: /tmp/torchelastic_i4d4ivao/none_jzj2c4lc/attempt_0/0/error.json
traceback : Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
return f(*args, **kwargs)
File "/ncluster/dushuai/torchtitan/train.py", line 207, in main
tokenizer = create_tokenizer(tokenizer_type, job_config.model.tokenizer_path)
File "/ncluster/dushuai/torchtitan/torchtitan/datasets/tokenizer/__init__.py", line 19, in create_tokenizer
return TikTokenizer(tokenizer_path)
File "/ncluster/dushuai/torchtitan/torchtitan/datasets/tokenizer/tiktoken.py", line 52, in __init__
mergeable_ranks = load_tiktoken_bpe(model_path)
File "/usr/local/lib/python3.10/dist-packages/tiktoken/load.py", line 148, in load_tiktoken_bpe
return {
File "/usr/local/lib/python3.10/dist-packages/tiktoken/load.py", line 149, in
base64.b64decode(token): int(rank)
ValueError: invalid literal for int() with base 10: b'coding=utf-8'
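The ValueError suggests that load_tiktoken_bpe read a file that is not in the plain-text BPE vocabulary format it expects (one "<base64 token> <integer rank>" pair per line): b'coding=utf-8' looks like the first line of a Python source file, so tokenizer_path may be pointing at the wrong file or a corrupted download. A minimal sanity check is sketched below (check_tiktoken_bpe is a hypothetical helper, not part of tiktoken):

```python
import base64

def check_tiktoken_bpe(path: str) -> bool:
    """Verify that each non-empty line of a tiktoken BPE vocab file
    is '<base64 token> <integer rank>', the format load_tiktoken_bpe parses."""
    with open(path, "rb") as f:
        for lineno, line in enumerate(f, 1):
            line = line.strip()
            if not line:
                continue
            try:
                token, rank = line.split()  # exactly two fields expected
                base64.b64decode(token)     # token must be valid base64
                int(rank)                   # rank must be an integer
            except ValueError as e:
                print(f"line {lineno} is not valid BPE data: {line[:40]!r} ({e})")
                return False
    return True
```

Running this against the file at job_config.model.tokenizer_path should immediately reveal whether the file on disk is the expected vocabulary or something else (e.g. a Python script saved under the tokenizer's name), in which case re-downloading the correct tokenizer.model should fix the crash.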