
Prompt-based Grounded Dialogue System Exploiting Pre-trained Language Models

Author: 张笑涵
  • Student ID
    2019******
  • Degree
    Master's
  • Email
    qzx******com
  • Defense date
    2022.05.20
  • Advisor
    唐杰
  • Discipline
    Data Science and Information Technology
  • Pages
    69
  • Confidentiality
    Public
  • Degree-granting unit
    601 Global Innovation Exchange, Tsinghua University
  • Keywords (Chinese)
    对话系统,自然语言生成,预训练语言模型,提示工程
  • Keywords (English)
    Dialogue System, Natural Language Generation, Pre-trained Language Model, Prompt Engineering

Abstract


Dialogue systems are one of the crucial challenges in the field of artificial intelligence. In recent years, breakthroughs in large-scale pre-trained language models (PLMs) have shown promising prospects for various downstream tasks, including dialogue systems. Researchers have also found that grounding generation on external information can effectively improve the quality of dialogue systems and yield more informative and engaging responses. However, individual developers still face many challenges, such as acquiring a high-quality conversational training corpus, integrating external information resources, and the expensive cost of fine-tuning large models. To tackle these obstacles, this thesis proposes XDAI, a lightweight, tuning-free framework for eXploiting large-scale pre-trained language models in building grounded Dialogue AI systems. Specifically, we design a dialogue-oriented prompt construction method for PLMs that naturally integrates the multi-turn dialogue history with injected external information, leveraging the zero-shot learning potential of PLMs. The system can be grounded not only on open-domain or domain-specific knowledge, but also on persona, emotion, linguistic style, and other personalized information for controlled generation. In addition, the framework provides an easy-to-use mechanism for managing ready-to-use open-domain external knowledge resources, together with straightforward adaptation to specific domains. Furthermore, we offer a lightweight deployment scheme that lets users interact with the system through instant-messaging platforms such as WeChat and other front ends. With XDAI, developers can leverage large-scale language models without any fine-tuning cost to quickly create open-domain dialogue systems and easily customize their own domain-specific systems.
Extensive experiments, including multi-dimensional human evaluation, a Turing test, and online evaluation, demonstrate the competitive performance of the proposed system compared with state-of-the-art general PLMs and models specifically fine-tuned for dialogue generation. We also conducted pilot studies on exploiting the capabilities of pre-trained language models and obtained intriguing findings, which we hope will inspire future research on other PLM-based applications.
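The core idea of the prompt construction described in the abstract — concatenating optional grounding information (knowledge, persona) with the multi-turn dialogue history into a single text prompt that a PLM completes zero-shot — can be sketched as follows. This is a minimal illustration under assumed conventions, not the thesis's actual template; the function name `build_prompt`, the label strings, and the trailing speaker cue are all hypothetical.

```python
def build_prompt(history, knowledge=None, persona=None, bot_name="AI"):
    """Assemble a zero-shot dialogue prompt for a PLM.

    history:   list of (speaker, utterance) pairs, oldest first
    knowledge: optional grounding text (open-domain or domain-specific)
    persona:   optional personalized information for controlled generation

    The exact prompt format used by XDAI may differ; this sketch only
    shows the general pattern of prefixing grounding information before
    the dialogue turns.
    """
    parts = []
    if persona:
        parts.append(f"{bot_name}'s persona: {persona}")
    if knowledge:
        parts.append(f"Relevant knowledge: {knowledge}")
    for speaker, utterance in history:
        parts.append(f"{speaker}: {utterance}")
    # End with the bot's speaker tag so the PLM continues as the bot.
    parts.append(f"{bot_name}:")
    return "\n".join(parts)
```

The resulting string would then be fed to a generation API or a local PLM, with the model's continuation taken as the next system utterance; swapping the `knowledge` or `persona` text is what allows domain-specific customization without any fine-tuning.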