Alpaca format mapping? #1704
Hi folks, I have a dataset that is structured like this:

```json
{
  "instruction": "<my instruction>",
  "input": "<my input>",
  "output": "<my response>"
}
```

I'm doing SFT training with `--template alpaca`. I ran an evaluation of my LoRA training, and it reported the format it's using. The model I'm doing the SFT training against expects a different format. Looking in template.py, the result I'm getting seems to be what the code is aiming to do, but I don't see a template that will produce the format the model expects. Is there any way of accomplishing this?
Replies: 1 comment 1 reply
We adopted the alpaca template without an input field, since most samples in alpaca 52k have no input. Using the alpaca template in this framework, you can choose to pre-process the training set by prepending the tokens to the inputs:

```json
{
  "instruction": "<your instruction>",
  "input": "### Input:\n<your input>",
  "output": "<your response>"
}
```
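If it helps, here is a minimal sketch of that pre-processing step in Python. Only the `### Input:\n` prefix comes from the reply above; the function name and the file-conversion helper are my own illustration, not part of the framework:

```python
import json


def prepend_input_marker(sample):
    """Prepend the Alpaca '### Input:' header to a sample's input.

    Samples with an empty input are left unchanged, matching the
    template's behavior of omitting the input section entirely.
    """
    if sample.get("input"):
        return dict(sample, input="### Input:\n" + sample["input"])
    return sample


def convert_dataset(in_path, out_path):
    # Hypothetical helper: rewrite a JSON list of samples on disk.
    with open(in_path, encoding="utf-8") as f:
        data = json.load(f)
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump([prepend_input_marker(s) for s in data], f,
                  ensure_ascii=False, indent=2)
```

Run `convert_dataset("data.json", "data_converted.json")` once before training, then point the framework at the converted file.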