Reminder
System Info
llamafactory version: 0.9.1.dev0
Reproduction
10/31/2024 01:03:08 - INFO - llamafactory.hparams.parser - Process rank: 0, device: cuda:0, n_gpu: 1, distributed training: False, compute dtype: torch.bfloat16
[INFO|configuration_utils.py:731] 2024-10-31 01:03:08,680 >> loading configuration file /home/wladmin/ai/Qwen2.5-7B-Instruct/config.json
[INFO|configuration_utils.py:800] 2024-10-31 01:03:08,681 >> Model config Qwen2Config {
"_name_or_path": "/home/wladmin/ai/Qwen2.5-7B-Instruct",
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 3584,
"initializer_range": 0.02,
"intermediate_size": 18944,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 28,
"num_hidden_layers": 28,
"num_key_value_heads": 4,
"rms_norm_eps": 1e-06,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.43.4",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 152064
}
[INFO|tokenization_utils_base.py:2287] 2024-10-31 01:03:08,681 >> loading file vocab.json
[INFO|tokenization_utils_base.py:2287] 2024-10-31 01:03:08,681 >> loading file merges.txt
[INFO|tokenization_utils_base.py:2287] 2024-10-31 01:03:08,681 >> loading file tokenizer.json
[INFO|tokenization_utils_base.py:2287] 2024-10-31 01:03:08,681 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:2287] 2024-10-31 01:03:08,681 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:2287] 2024-10-31 01:03:08,681 >> loading file tokenizer_config.json
[INFO|tokenization_utils_base.py:2533] 2024-10-31 01:03:08,780 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[INFO|configuration_utils.py:731] 2024-10-31 01:03:08,781 >> loading configuration file /home/wladmin/ai/Qwen2.5-7B-Instruct/config.json
[INFO|configuration_utils.py:800] 2024-10-31 01:03:08,781 >> Model config Qwen2Config {
"_name_or_path": "/home/wladmin/ai/Qwen2.5-7B-Instruct",
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 3584,
"initializer_range": 0.02,
"intermediate_size": 18944,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 28,
"num_hidden_layers": 28,
"num_key_value_heads": 4,
"rms_norm_eps": 1e-06,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.43.4",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 152064
}
[INFO|tokenization_utils_base.py:2287] 2024-10-31 01:03:08,781 >> loading file vocab.json
[INFO|tokenization_utils_base.py:2287] 2024-10-31 01:03:08,781 >> loading file merges.txt
[INFO|tokenization_utils_base.py:2287] 2024-10-31 01:03:08,781 >> loading file tokenizer.json
[INFO|tokenization_utils_base.py:2287] 2024-10-31 01:03:08,781 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:2287] 2024-10-31 01:03:08,781 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:2287] 2024-10-31 01:03:08,781 >> loading file tokenizer_config.json
[INFO|tokenization_utils_base.py:2533] 2024-10-31 01:03:08,872 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
10/31/2024 01:03:08 - WARNING - llamafactory.model.loader - Processor was not found: 'Qwen2Config' object has no attribute 'vision_config'.
10/31/2024 01:03:08 - INFO - llamafactory.data.template - Replace eos token: <|im_end|>
10/31/2024 01:03:08 - INFO - llamafactory.data.loader - Loading dataset 英中_专利_记忆库.json...
training example:
input_ids:
[151644, 8948, 198, 2610, 525, 264, 10950, 17847, 13, 151645, 198, 151644, 872, 198, 14880, 108965, 44063, 107083, 105205, 105395, 17714, 104811, 198, 1944, 6730, 369, 1818, 332, 89244, 553, 81345, 9299, 354, 2283, 66848, 6988, 151645, 198, 151644, 77091, 198, 100359, 38212, 99272, 44956, 103697, 24339, 115391, 120143, 9370, 102360, 39907, 151645]
inputs:
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
请帮我将这段英文翻译为中文
test Method for Plutonium by Controlled-Potential Coulometry<|im_end|>
<|im_start|>assistant
控制电势库仑法测定钚的试验方法<|im_end|>
label_ids:
[-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 100359, 38212, 99272, 44956, 103697, 24339, 115391, 120143, 9370, 102360, 39907, 151645]
labels:
控制电势库仑法测定钚的试验方法<|im_end|>
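For reference, the -100 entries in label_ids mask the system and user prompt so that only the assistant response is scored during SFT; PyTorch's cross-entropy ignores index -100 by default. A minimal sketch of that masking behavior with toy tensors (not LLaMA-Factory's actual code):

```python
import torch
import torch.nn.functional as F

# Toy logits for 5 positions over a 10-token vocabulary.
logits = torch.randn(5, 10)
# The first three positions (the "prompt") are labeled -100 and are
# excluded from the loss; only the last two (the "response") are scored.
labels = torch.tensor([-100, -100, -100, 4, 9])
loss = F.cross_entropy(logits, labels)  # ignore_index defaults to -100
print(loss)
```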
[INFO|configuration_utils.py:731] 2024-10-31 01:03:09,836 >> loading configuration file /home/wladmin/ai/Qwen2.5-7B-Instruct/config.json
[INFO|configuration_utils.py:800] 2024-10-31 01:03:09,836 >> Model config Qwen2Config {
"_name_or_path": "/home/wladmin/ai/Qwen2.5-7B-Instruct",
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 3584,
"initializer_range": 0.02,
"intermediate_size": 18944,
"max_position_embeddings": 32768,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 28,
"num_hidden_layers": 28,
"num_key_value_heads": 4,
"rms_norm_eps": 1e-06,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.43.4",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 152064
}
[INFO|modeling_utils.py:3641] 2024-10-31 01:03:09,847 >> loading weights file /home/wladmin/ai/Qwen2.5-7B-Instruct/model.safetensors.index.json
[INFO|modeling_utils.py:1572] 2024-10-31 01:03:09,847 >> Instantiating Qwen2ForCausalLM model under default dtype torch.bfloat16.
[INFO|configuration_utils.py:1038] 2024-10-31 01:03:09,848 >> Generate config GenerationConfig {
"bos_token_id": 151643,
"eos_token_id": 151645
}
Loading checkpoint shards: 100%|██████████████████| 4/4 [00:01<00:00, 2.23it/s]
[INFO|modeling_utils.py:4473] 2024-10-31 01:03:11,758 >> All model checkpoint weights were used when initializing Qwen2ForCausalLM.
[INFO|modeling_utils.py:4481] 2024-10-31 01:03:11,758 >> All the weights of Qwen2ForCausalLM were initialized from the model checkpoint at /home/wladmin/ai/Qwen2.5-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use Qwen2ForCausalLM for predictions without further training.
[INFO|configuration_utils.py:991] 2024-10-31 01:03:11,765 >> loading configuration file /home/wladmin/ai/Qwen2.5-7B-Instruct/generation_config.json
[INFO|configuration_utils.py:1038] 2024-10-31 01:03:11,765 >> Generate config GenerationConfig {
"bos_token_id": 151643,
"do_sample": true,
"eos_token_id": [
151645,
151643
],
"pad_token_id": 151643,
"repetition_penalty": 1.05,
"temperature": 0.7,
"top_k": 20,
"top_p": 0.8
}
Traceback (most recent call last):
File "/home/wladmin/anaconda3/bin/llamafactory-cli", line 8, in
sys.exit(main())
^^^^^^
File "/home/wladmin/ai/LLaMA-Factory/src/llamafactory/cli.py", line 111, in main
run_exp()
File "/home/wladmin/ai/LLaMA-Factory/src/llamafactory/train/tuner.py", line 50, in run_exp
run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
File "/home/wladmin/ai/LLaMA-Factory/src/llamafactory/train/sft/workflow.py", line 48, in run_sft
model = load_model(tokenizer, model_args, finetuning_args, training_args.do_train)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/wladmin/ai/LLaMA-Factory/src/llamafactory/model/loader.py", line 160, in load_model
model = load_class.from_pretrained(**init_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/wladmin/anaconda3/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
return model_class.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/wladmin/anaconda3/lib/python3.11/site-packages/transformers/modeling_utils.py", line 4000, in from_pretrained
dispatch_model(model, **device_map_kwargs)
File "/home/wladmin/anaconda3/lib/python3.11/site-packages/accelerate/big_modeling.py", line 494, in dispatch_model
model.to(device)
File "/home/wladmin/anaconda3/lib/python3.11/site-packages/transformers/modeling_utils.py", line 2871, in to
return super().to(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/wladmin/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1174, in to
return self._apply(convert)
^^^^^^^^^^^^^^^^^^^^
File "/home/wladmin/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py", line 780, in _apply
module._apply(fn)
File "/home/wladmin/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py", line 780, in _apply
module._apply(fn)
File "/home/wladmin/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py", line 780, in _apply
module._apply(fn)
[Previous line repeated 2 more times]
File "/home/wladmin/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py", line 854, in _apply
self._buffers[key] = fn(buf)
^^^^^^^
File "/home/wladmin/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1160, in convert
return t.to(
^^^^^
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacity of 15.69 GiB of which 22.50 MiB is free. Process 1548 has 239.88 MiB memory in use. Process 23735 has 362.46 MiB memory in use. Including non-PyTorch memory, this process has 14.76 GiB memory in use. Of the allocated memory 14.38 GiB is allocated by PyTorch, and 148.86 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
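For context, the OOM is expected from the model size alone. A back-of-the-envelope estimate from the Qwen2Config dumped above (a sketch; biases and layer norms are omitted, so the true count is marginally higher):

```python
# Parameter estimate for Qwen2.5-7B-Instruct from its config values.
hidden, inter, layers = 3584, 18944, 28
vocab, kv_heads, head_dim = 152064, 4, 128          # head_dim = 3584 / 28 heads
embed = vocab * hidden * 2                          # embeddings + untied lm_head
attn = hidden * hidden * 2 + hidden * kv_heads * head_dim * 2  # q,o + k,v (GQA)
mlp = hidden * inter * 3                            # gate, up, down projections
params = embed + layers * (attn + mlp)
print(f"~{params / 1e9:.2f}B parameters")              # ~7.62B
print(f"bf16 weights: ~{params * 2 / 2**30:.1f} GiB")  # ~14.2 GiB
```

Roughly 14 GiB of bf16 weights plus about 1 GiB already held by desktop processes (see the nvidia-smi output below) leaves essentially nothing free on a 15.69 GiB card, before any gradients, optimizer state, or activations are allocated.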
nvidia-smi
Thu Oct 31 01:06:37 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.171.04 Driver Version: 535.171.04 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 4080 Off | 00000000:01:00.0 On | N/A |
| 0% 40C P2 51W / 340W | 939MiB / 16376MiB | 4% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 0 N/A N/A 1406 G /usr/lib/xorg/Xorg 127MiB |
| 0 N/A N/A 1548 C+G ...libexec/gnome-remote-desktop-daemon 239MiB |
| 0 N/A N/A 1601 G /usr/bin/gnome-shell 41MiB |
| 0 N/A N/A 23735 C+G /opt/todesk/bin/ToDesk_Session 362MiB |
| 0 N/A N/A 468326 G ...irefox/5134/usr/lib/firefox/firefox 138MiB |
+---------------------------------------------------------------------------------------+
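Because the bf16 weights alone nearly fill the card, the allocator hint from the error message (PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True) addresses fragmentation, not capacity, and is unlikely to help here. The usual way to fit a 7B model on 16 GiB is to load it quantized. A minimal sketch using the plain transformers/bitsandbytes API (not LLaMA-Factory's exact code path; in a LLaMA-Factory config the corresponding knobs would be quantization_bit: 4 with finetuning_type: lora):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# NF4 quantization shrinks the 7B weights from ~14 GiB (bf16) to roughly
# 4-5 GiB, leaving headroom for LoRA adapters, activations, and optimizer state.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "/home/wladmin/ai/Qwen2.5-7B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
```

Freeing the ~900 MiB held by Xorg, gnome-remote-desktop, and ToDesk would buy a little room, but not enough for full bf16 fine-tuning of a 7B model on this GPU.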
Expected behavior
How should this be handled? What is the specific method?
Others
No response