This is the GitHub repository for "AdvisorQA: Towards Helpful and Harmless Advice-seeking Question Answering with Collective Intelligence", accepted at NAACL 2025. The paper discusses the hurdles to progress in subjective QA, focusing on post-training (alignment).
The AdvisorQA dataset is available at "[data link]". If you download it as JSON files, move them to the 'data' directory for post-training (SFT, DPO, and PPO).
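To confirm the files landed in the right place before starting post-training, a quick check along these lines may help. This is a minimal sketch, not part of the repository: the helper name `load_json_files` is hypothetical, and it assumes each downloaded file is a single JSON document with a `.json` extension placed directly under 'data'. It makes no assumption about the fields inside each file.

```python
import json
from pathlib import Path


def load_json_files(data_dir="data"):
    """Return {filename: parsed JSON} for every *.json file in data_dir.

    Hypothetical helper for a sanity check; the directory name 'data'
    comes from the README, everything else is an assumption.
    """
    root = Path(data_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"Expected a directory at {root.resolve()}")

    loaded = {}
    # Sort for a stable, reproducible listing order.
    for path in sorted(root.glob("*.json")):
        with path.open(encoding="utf-8") as f:
            loaded[path.name] = json.load(f)
    return loaded


if __name__ == "__main__":
    # Report how many top-level entries each file holds.
    for name, content in load_json_files("data").items():
        size = len(content) if isinstance(content, (list, dict)) else 1
        print(f"{name}: {size} top-level entries")
```

If the script raises `FileNotFoundError` or prints nothing, the JSON files are probably not yet in the 'data' directory.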
Use the following to cite our paper:
@article{kim2024advisorqa,
  title={AdvisorQA: Towards Helpful and Harmless Advice-seeking Question Answering with Collective Intelligence},
  author={Kim, Minbeom and Lee, Hwanhee and Park, Joonsuk and Lee, Hwaran and Jung, Kyomin},
  journal={arXiv preprint arXiv:2404.11826},
  year={2024}
}