[No. 57] Seminar on Automatic Question Answering and Automatic Summarization

October 29, 2021, 13:21
New IT Excellence Lecture Series (新IT卓越大讲堂)

Seminar on Automatic Question Answering and Automatic Summarization

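Format: hybrid seminar (online and in person)

Time: November 5, 2021, 9:00-11:30

Host: School of Computer Science and Technology, Shandong University of Finance and Economics

In-person venue: Room 3306, Teaching Building 3, Yanshan Campus (conference room of the Shandong Key Laboratory of Blockchain Finance)

Online meeting: Tencent Meeting ID: 143985164, password: 1105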

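Schedule:

1. 9:00-9:10: Opening remarks by the school leadership

2. 9:10-10:10: Invited talk (Xu Weiran, Knowledge Acquisition and Utilization in Few-Shot or Zero-Shot Scenarios)

3. 10:10-10:50: Student talk (Sun Ming, MMCN: Memory-augmented Multi-modal Co-attention Network for Visual Question Answering)

4. 10:50-11:30: Expert guidance and discussion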

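Talks:

1. Title: Knowledge Acquisition and Utilization in Few-Shot or Zero-Shot Scenarios

Speaker: Xu Weiran, Associate Professor, Beijing University of Posts and Telecommunications

Speaker bio: Xu Weiran is an associate professor and master's supervisor at the Pattern Recognition Laboratory, Beijing University of Posts and Telecommunications. Xu's research focuses on machine learning for text data. Since 2004, Xu has led or taken part in well-known domestic and international evaluations in information retrieval and information extraction, including TREC, TAC, 863, and COAE, placing first in individual tasks and in overall rankings on multiple occasions. Xu has led, or participated in as a key member, multiple projects under the National Natural Science Foundation of China, the 863 Program, and the National Science and Technology Major Projects, and has published more than 40 papers, including at top conferences such as ACL, AAAI, SIGIR, and EMNLP. Current research centers on multi-turn human-machine dialogue and sentiment analysis, automatic summary extraction and generation, and knowledge-base question answering. Xu specializes in building prototype systems; the multi-turn human-machine dialogue and sentiment analysis technology developed by Xu's team for visually impaired people at the 2022 Winter Paralympics will soon be deployed in the Games' service system for visually impaired people.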

2. Title: MMCN: Memory-augmented Multi-modal Co-attention Network for Visual Question Answering

Speaker: Sun Ming, master's student, School of Information Science and Engineering, University of Jinan

Abstract: Visual Question Answering (VQA) is a multi-modal learning task in which a model infers the correct answer to a natural language question about the content of an image. Most recent VQA models use a transformer-based architecture with an attention mechanism to focus on specific visual and textual information. However, because models of this kind learn only the similarity of input pairs, they ignore high-level semantic information that could lead to more accurate inference. To alleviate this problem, we propose the Memory-augmented Multi-modal Co-attention Network (MMCN). MMCN aims to learn high-level semantic information together with dense interaction information both within and across modalities. The memory-augmented co-attention layer is the core of the framework. It contains a Memory-augmented Self-Attention (MSA) unit and a Memory-augmented Guided Attention (MGA) unit. The MSA unit learns high-level semantic information and dense intra-modal interactions, while the MGA unit learns high-level semantic information and dense inter-modal interactions. To verify the effectiveness of the proposed model, we carried out experiments on the VQA 2.0 dataset. Experimental results demonstrate that MMCN outperforms state-of-the-art methods.
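
The abstract does not say how memory enters the MSA and MGA units. As a rough illustration only, the sketch below assumes one common design: a small set of learned memory slots is appended to the keys and values before scaled dot-product attention, so each query can attend to the input tokens and to the memory. The function name memory_augmented_attention, the toy dimensions, the random "memory" values, and the omission of learned projections and multiple heads are all assumptions made for this example, not details taken from MMCN.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def memory_augmented_attention(queries, keys, values, mem_keys, mem_values):
    """Scaled dot-product attention with extra memory slots (hypothetical design).

    The memory slots are concatenated to the keys and values, so every query
    distributes its attention over len(keys) + len(mem_keys) positions.
    """
    d = len(queries[0])
    k_aug = keys + mem_keys        # (n_keys + n_mem) x d
    v_aug = values + mem_values    # (n_keys + n_mem) x d
    scores = matmul(queries, transpose(k_aug))
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v_aug), weights


def rand_matrix(rows, cols):
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
question = rand_matrix(4, 8)   # 4 question tokens, feature size 8 (toy values)
image = rand_matrix(6, 8)      # 6 image regions, feature size 8 (toy values)
mem_k = rand_matrix(2, 8)      # 2 memory slots; learned in a real model
mem_v = rand_matrix(2, 8)

# Self-attention flavour (intra-modal): question tokens attend to themselves plus memory.
msa_out, msa_w = memory_augmented_attention(question, question, question, mem_k, mem_v)

# Guided-attention flavour (inter-modal): image regions attend to question tokens plus memory.
mga_out, mga_w = memory_augmented_attention(image, question, question, mem_k, mem_v)
```

In this sketch the same function covers both cases: intra-modal attention uses one modality for queries, keys, and values, while inter-modal attention takes queries from one modality and keys and values from the other. Which modality guides which in MMCN is not stated in the abstract, so the direction shown here is also an assumption.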