Alpaca format mapping? #1704
Hi folks, I have a dataset that is structured like this:

```json
{
  "instruction": "<my instruction>",
  "input": "<my input>",
  "output": "<my response>"
}
```

I'm doing SFT training with `--template alpaca`. I ran an evaluation of my LoRA training, and it reported the format it's using. The model I'm doing the SFT training against expects a different format. Looking in template.py, the result I'm getting seems to be what the code is aiming to do, but I don't see a template that will produce the format the model expects. Is there any way of accomplishing this?
Replies: 1 comment 1 reply
We adopted the alpaca template without an input field, since most samples in alpaca 52k have no input. Using the alpaca template in this framework, you can choose to pre-process the training set by prepending the tokens to the inputs:

```json
{
  "instruction": "<your instruction>",
  "input": "### Input:\n<your input>",
  "output": "<your response>"
}
```
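If it helps, here is a minimal sketch of that pre-processing step in Python. Only the `### Input:\n` prefix comes from the reply above; the function name and the file-conversion helper are my own illustration, not part of the framework:

```python
import json


def prepend_input_marker(sample):
    """Prepend the Alpaca '### Input:' header to a sample's input.

    Samples with an empty input are left unchanged, matching the
    template's behavior of omitting the input section entirely.
    """
    if sample.get("input"):
        return dict(sample, input="### Input:\n" + sample["input"])
    return sample


def convert_dataset(in_path, out_path):
    # Hypothetical helper: rewrite a JSON list of samples on disk.
    with open(in_path, encoding="utf-8") as f:
        data = json.load(f)
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump([prepend_input_marker(s) for s in data], f,
                  ensure_ascii=False, indent=2)
```

Run `convert_dataset("data.json", "data_converted.json")` once before training, then point the framework at the converted file.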