clue prompt templates #808
    jinja: "Question: \"{{question}}\"\nAnswer choices: {{ answer_choices[:-1] | join(',\
      \ ') }}, or {{ answer_choices[-1] }}?\nPassage: {% for statement in context\
      \ %} \n{{ statement }}\n{% endfor %}\n|||\n{{ answer }}"
This renders as e.g. ['Given the dialogue / passage below, what is the answer for the question "根据对话,可以知道什么?"\nAnswer choices: 今天天气不好, 比赛时间变了, or  校长忘了时间?\n \n男:足球比赛是明天上午八点开始吧?\n \n女:因为天气不好,比赛改到后天下午三点了。', '比赛时间变了']
How should the model know where the passage actually ends?
It could just as reasonably continue the passage. I think fine-tuning on such examples may degrade generation quality. cc @thomasw21
I don't understand the point you're making. I don't know why the rendering is a list ...
My bad: the first element is the input and the second is the answer. So if we separate them with a whitespace, the model will get:
Given the dialogue / passage below, what is the answer for the question "根据对话,可以知道什么?"\nAnswer choices: 今天天气不好, 比赛时间变了, or 校长忘了时间?\n \n男:足球比赛是明天上午八点开始吧?\n \n女:因为天气不好,比赛改到后天下午三点了。 比赛时间变了
Ah, I understand. Yeah, good point. I think putting the passage above makes more sense:
{passage} 
Given the dialogue / passage below, what is the answer for the question "根据对话,可以知道什么?"\nAnswer choices: 今天天气不好, 比赛时间变了, or 校长忘了时间?
One other way of doing it is have a between input and target, which we've been avoiding but in this case it might make sense?
Nit: you could also remove the answer choices and let the model figure them out (a much harder task, but it might help training quite a bit?).
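To make the ambiguity concrete, here is a minimal sketch in plain Python that builds both layouts and joins input and target with a single space. The helper names (`passage_last`, `passage_first`, `join_with_space`) are hypothetical and only mimic the template under review; they are not the project's rendering code.

```python
# Hypothetical helpers for illustration only; the strings come from the
# example rendered earlier in this thread.
QUESTION = "根据对话,可以知道什么?"
CHOICES = ["今天天气不好", "比赛时间变了", "校长忘了时间"]
PASSAGE = [
    "男:足球比赛是明天上午八点开始吧?",
    "女:因为天气不好,比赛改到后天下午三点了。",
]
ANSWER = "比赛时间变了"


def format_choices(choices):
    """Render choices as 'a, b, or c?' like the template under review."""
    return f"{', '.join(choices[:-1])}, or {choices[-1]}?"


def passage_last(question, choices, passage):
    """Layout questioned in the review: the passage closes the input."""
    header = f'Question: "{question}"\nAnswer choices: {format_choices(choices)}'
    return header + "\nPassage:\n" + "\n".join(passage)


def passage_first(question, choices, passage):
    """Proposed layout: passage first, task description last."""
    body = "Passage:\n" + "\n".join(passage)
    return body + f'\nQuestion: "{question}"\nAnswer choices: {format_choices(choices)}'


def join_with_space(prompt_input, target):
    """Assumed training format: input and target separated by one space."""
    return f"{prompt_input} {target}"


# With the passage last, the target reads like one more passage sentence.
print(join_with_space(passage_last(QUESTION, CHOICES, PASSAGE), ANSWER))
# With the passage first, the target directly follows the question.
print(join_with_space(passage_first(QUESTION, CHOICES, PASSAGE), ANSWER))
```

In the passage-last layout the joined text ends with the final passage line followed by the answer, so nothing marks where the passage stops. In the passage-first layout the answer follows the question mark after the choices.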
"is have a between": do you mean have an EOS token?
Agreed, let's add another prompt without answer choices if you agree @yongzx?
Yes, whoops.
Yep, I can do that (moving the passage before the task description, and adding prompts without answer choices).
@Muennighoff @thomasw21 For prompts without answer choices, should I mark them as non-original? I think we should use ROUGE or other generation metrics instead of the original metric, accuracy, since we are no longer choosing the answer from the given options.
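As a rough illustration of why the metric changes, the sketch below contrasts exact-match accuracy with a character-level unigram overlap F1. The overlap score is only a simplified stand-in for ROUGE-1 (an assumption for illustration, not an actual ROUGE implementation), and both function names are hypothetical.

```python
from collections import Counter


def exact_match(prediction: str, reference: str) -> bool:
    """Accuracy-style scoring: the prediction must equal the reference."""
    return prediction.strip() == reference.strip()


def char_overlap_f1(prediction: str, reference: str) -> float:
    """Character-level unigram F1; a simplified stand-in for ROUGE-1."""
    pred_chars = Counter(prediction.strip())
    ref_chars = Counter(reference.strip())
    # Multiset intersection counts characters shared by both strings.
    overlap = sum((pred_chars & ref_chars).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_chars.values())
    recall = overlap / sum(ref_chars.values())
    return 2 * precision * recall / (precision + recall)


reference = "比赛时间变了"
free_form = "比赛时间改了"  # hypothetical generation that paraphrases the answer
print(exact_match(free_form, reference))      # False: no credit under accuracy
print(char_overlap_f1(free_form, reference))  # partial credit under overlap
```

With answer choices in the prompt, the model can copy an option verbatim, so exact match is meaningful. Without them, a near-miss paraphrase scores zero under accuracy but gets partial credit under an overlap metric.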
I don't know, I'm not too familiar with how that terminology is used. Perhaps @VictorSanh can help with this kind of thing.
subset: csl
templates:
  219679f8-a02f-4ee3-91c7-9ed4726dd828: !Template
    answer_choices: no ||| yes
Suggested change:
- answer_choices: no ||| yes
+ answer_choices: yes ||| no
Currently I get the below:
['Do these keywords "纳米粒子, 干细胞, 氧化物, 顺磁性" represent key concepts in the abstract "目的探讨常见氧化铁纳米粒子几种神经干细胞标记技术的标记效率.材料与方法使用超顺磁性氧化铁纳米粒子(SPIO)和超微超顺磁性氧化铁纳米粒子(USPIO)以25μgFe/ml分别单独标记、与多聚赖氨酸(PLL)及脂质体联合标记神经干细胞,以未标记细胞做对照,采用普鲁士蓝染色评价细胞标记率,并采用4.7TMRIT2WI多回波序列测量T2弛豫率(R2)评价细胞内的铁摄取量,比较各组R2的差异.结果①普鲁士蓝染色结果:SPIO及USPIO单独标记组标记率为60%~70%,低于联合标记组的100%;②MRI结果:未标记细胞R2为(2.10±0.11)/s,SPIO、USPIO单独标记组细胞R2分别为(3.39±0.21)/s、(3.16±0.32)/s,SPIO-脂质体联合标记组及USPIO-脂质体联合标记组R2分别为(4.03±025)/s、(3.61±0.32)/s,SPIO-PLL联合标记组及USPIO-PLL联合标记组R2分别为(5.38±0.52)/s、(4.44±0.35)/s,SPIO、USPIO与PLL联合标记组R2大于SPIO、USPIO与脂质体联合标记组(P<0.05);而与脂质体联合标记组R2大于单独标记组(P<0.05);SPIO与USPIO单独标记细胞时R2差异无统计学意义(P>0.05),SPIO与脂质体或PLL联合标记时R2高于USPIO(P<0.05).结论SPIO、USPIO单独标记及与PLL、脂质体联合标记均可以成功标记神经干细胞,提高R2,其中SPIO与PLL联合标记效率最高."?', 'no']
I think it should be yes.
I follow the labeling described in the CLUE paper: https://arxiv.org/pdf/2004.05986.pdf (Table 5), and label 0 corresponds to false.
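The mapping at stake can be sketched as follows, assuming the `answer_choices` string is split on `|||` and indexed by the integer label, as the `{{ answer_choices[label] }}` expression in these templates suggests. The helper names are hypothetical, not the library's API.

```python
from typing import List


def parse_answer_choices(spec: str) -> List[str]:
    """Split an answer_choices string on '|||' and trim each option."""
    return [choice.strip() for choice in spec.split("|||")]


def label_to_answer(spec: str, label: int) -> str:
    """Pick the option whose position equals the integer label."""
    return parse_answer_choices(spec)[label]


# Ordering used in the PR: label 0 (false in the CLUE paper) maps to "no".
print(label_to_answer("no ||| yes", 0))
# Suggested ordering: the same label 0 would map to "yes" instead.
print(label_to_answer("yes ||| no", 0))
```

Under this assumption, the order of the options alone decides which label means "yes", so it has to match the dataset's label convention.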
Hmm, so CSL appears to be very noisy, see this issue.
The example from the paper also appears with both labels.
Let's leave it as is, but I'm not sure we should use it for fine-tuning.
    name: write_keywords_after_abstract
    reference: ''
  2e851dd2-2677-415a-ad90-5d885aa91fdc: !Template
    answer_choices: no ||| yes
Suggested change:
- answer_choices: no ||| yes
+ answer_choices: yes ||| no
Disagree, for the reason given above (in the CLUE paper, label 0 corresponds to false).
    name: generate_keywords
    reference: ''
  aaf47f6f-fd8f-4180-8d85-e4c7df088ac6: !Template
    answer_choices: no ||| yes
Suggested change:
- answer_choices: no ||| yes
+ answer_choices: yes ||| no
Disagree, for the reason given above (in the CLUE paper, label 0 corresponds to false).
Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>
    jinja: 'Do "{{ sentence1 }}" and "{{ sentence2 }}" express the same thing?

      |||

      {{ answer_choices[label] }}'
Stupid question: how exactly does this generate samples? In particular, what happens to the \n and whitespace at the beginning and the end? Do they always get trimmed?
Yeah, the \n and whitespace before and after ||| get trimmed away.
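A minimal sketch of the trimming described here, assuming the rendered text is split on `|||` and each part is stripped. `split_rendered` is a hypothetical helper written for illustration; it is not the library's own function.

```python
from typing import List


def split_rendered(rendered: str, separator: str = "|||") -> List[str]:
    """Split rendered text into [input, target] and trim surrounding whitespace."""
    return [part.strip() for part in rendered.split(separator)]


# Rendered form of the template above for a hypothetical sentence pair,
# including the blank lines around the separator.
rendered = 'Do "A" and "B" express the same thing?\n\n|||\n\nyes'
print(split_rendered(rendered))
# ['Do "A" and "B" express the same thing?', 'yes']
```

Under this assumption, the blank lines around `|||` in the YAML only affect readability of the template, not the resulting input and target strings.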

Add prompt templates to the CLUE benchmark tasks.
Currently:
WIP: