RedPanda training inference error #115

Open
qrpike opened this issue May 10, 2023 · 1 comment

qrpike commented May 10, 2023

Describe the bug
After following the RedPanda fine-tuning tutorial, running the bot inference script on the resulting model fails with the error below.

$ python ./inference/bot.py --model=model_ckpts/hf/
Loading model_ckpts/hf/ to cuda:0...
Welcome to OpenChatKit shell.   Type /help or /? to list commands.

>>> who is allen turing?
Traceback (most recent call last):
  File "/home/ubuntu/OpenChatKit/./inference/bot.py", line 285, in <module>
    main()
  File "/home/ubuntu/OpenChatKit/./inference/bot.py", line 281, in main
    ).cmdloop()
  File "/home/ubuntu/miniconda3/envs/OpenChatKit/lib/python3.10/cmd.py", line 138, in cmdloop
    stop = self.onecmd(line)
  File "/home/ubuntu/miniconda3/envs/OpenChatKit/lib/python3.10/cmd.py", line 217, in onecmd
    return func(arg)
  File "/home/ubuntu/OpenChatKit/./inference/bot.py", line 150, in do_say
    output = self._model.do_inference(
  File "/home/ubuntu/OpenChatKit/./inference/bot.py", line 92, in do_inference
    outputs = self._model.generate(
  File "/home/ubuntu/miniconda3/envs/OpenChatKit/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/ubuntu/miniconda3/envs/OpenChatKit/lib/python3.10/site-packages/transformers/generation_utils.py", line 1326, in generate
    return self.sample(
  File "/home/ubuntu/miniconda3/envs/OpenChatKit/lib/python3.10/site-packages/transformers/generation_utils.py", line 1981, in sample
    next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
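
The error message says the probabilities handed to `torch.multinomial` contained `inf`, `nan`, or a negative value. As a rough illustration only, here is a plain-Python sketch of that validity check and of one way such values can arise: a single infinite logit turns the whole softmax output into `nan`. The helper names `softmax` and `invalid_entries` are hypothetical and are not part of OpenChatKit, PyTorch, or transformers. Whether overflowing logits are the actual cause in this issue is an assumption, not something the traceback confirms.

```python
import math


def softmax(logits):
    """Softmax over a list of floats, subtracting the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def invalid_entries(probs):
    """Return (index, value) pairs that a sampler would reject:
    nan, inf, or negative probabilities."""
    return [
        (i, p)
        for i, p in enumerate(probs)
        if math.isnan(p) or math.isinf(p) or p < 0
    ]


if __name__ == "__main__":
    # Finite logits give a valid distribution.
    good = softmax([2.0, 1.0, 0.0])
    print("finite logits   ->", invalid_entries(good))

    # Assumption for illustration: one logit has overflowed to inf.
    # inf - inf is nan, so the normaliser and every probability become nan.
    bad = softmax([float("inf"), 0.0, -1.0])
    print("overflowed logit ->", [i for i, _ in invalid_entries(bad)])
```

Running the same kind of check on the model's logits just before sampling would show whether the fine-tuned checkpoint is already producing non-finite values.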

To Reproduce
Steps to reproduce the behavior:

Expected behavior
Inference to run properly.

Environment:
The code is running on a Lambda Labs 8x A100 40GB SXM4 instance.

@ChengYen-Tang

#86 (comment)
