Bridging Neurons and Symbols for Natural Language Processing and Knowledge Graphs Reasoning @ Coling 2025

Overview

Recent exploration shows that LLMs, e.g., ChatGPT, may pass the Turing test in human-like chatting but have limited capability even for simple reasoning tasks (Biever, 2023). It remains unclear whether LLMs reason or not (Mitchell, 2023). Human reasoning has been characterized as a dual-process phenomenon (see Sun, 2023, for a general overview) or as mechanisms of fast and slow thinking (Kahneman, 2011). These findings suggest two directions for exploring neural reasoning: starting from existing neural networks and enhancing their reasoning performance towards symbolic-level reasoning, or starting from symbolic reasoning and exploring novel neural implementations of it (Dong et al., 2024). These two directions will ideally meet somewhere in the middle and lead to representations that can act as a bridge for novel neural computing, which qualitatively differs from traditional neural networks, and for novel symbolic computing, which inherits the good features of neural computing. Hence the name of our workshop, with a focus on Natural Language Processing and Knowledge Graph reasoning. This workshop promotes research in both directions, particularly seeking novel proposals from the second direction.

Keynote Speakers

Heng Ji

University of Illinois Urbana-Champaign

Symbolic Bridge between Low-Level Visual Perception and High-Level Language Reasoning

Abstract: Contemporary visual semantic representations predominantly revolve around common objects found in everyday images and videos, ranging from ladybugs and bunnies to airplanes. However, crucial visual cues extend beyond mere object recognition and interaction. They encompass a spectrum of richer semantics, including vector graphics (e.g., angles, mazes), fine-grained attributes and affordances, and scientific charts. Moreover, they entail intricate visual dynamics, such as object interactions, actions, activities and logical reasoning. Regrettably, traditional visual representations relying solely on pixels and regions fail to fully encapsulate these nuances. In this talk, I propose to design intermediate symbolic semantic representations to precisely describe and aggregate these low-level visual signals. This augmentation promises to enhance their utility as inputs for large language models or vision-language models, thereby facilitating high-level knowledge reasoning and discovery tasks. I will present several applications, ranging from playful maze solving, fine-grained concept recognition and video activity detection to new drug and material discovery.

Bio: Heng Ji is a professor at the Siebel School of Computing and Data Science, and an affiliated faculty member at the Electrical and Computer Engineering Department, the Coordinated Science Laboratory, and the Carl R. Woese Institute for Genomic Biology of the University of Illinois Urbana-Champaign. She is an Amazon Scholar. She is the Founding Director of the Amazon-Illinois Center on AI for Interactive Conversational Experiences (AICE). She received her B.A. and M.A. in Computational Linguistics from Tsinghua University, and her M.S. and Ph.D. in Computer Science from New York University. Her research interests focus on Natural Language Processing, especially on Multimedia Multilingual Information Extraction, Knowledge-enhanced Large Language Models and Vision-Language Models, and AI for Science. The awards she has received include an Outstanding Paper Award at ACL2024, two Outstanding Paper Awards at NAACL2024, "Young Scientist" by the World Laureates Association in 2023 and 2024, "Young Scientist" and a member of the Global Future Council on the Future of Computing by the World Economic Forum in 2016 and 2017, "Women Leaders of Conversational AI" (Class of 2023) by Project Voice, "AI's 10 to Watch" Award by IEEE Intelligent Systems in 2013, NSF CAREER award in 2009, PACLIC2012 Best paper runner-up, "Best of ICDM2013" paper award, "Best of SDM2013" paper award, ACL2018 Best Demo paper nomination, ACL2020 Best Demo Paper Award, NAACL2021 Best Demo Paper Award, Google Research Award in 2009 and 2014, IBM Watson Faculty Award in 2012 and 2014 and Bosch Research Award in 2014-2018. She was invited to testify to the U.S. House Cybersecurity, Data Analytics, & IT Committee as an AI expert in 2023. She was selected to participate in DARPA AI Forward in 2023. She has served as an associate editor for IEEE/ACM Transactions on Audio, Speech, and Language Processing, and as the Program Committee Co-Chair of many conferences including NAACL-HLT2018 and AACL-IJCNLP2022. She was elected secretary of the North American Chapter of the Association for Computational Linguistics (NAACL) for 2020-2023.

Erhard Hinrichs

University of Tübingen

Large Language Models and the Death of Lexicography?

Abstract: The advent of large language models and generative AI tools has had a profound impact on many scientific disciplines. In this presentation, I examine their impact on the field of lexicography. Some lexicographers expect the imminent death of lexicography. They predict that lexicography as a scholarly practice of human experts will soon be replaced by generative AI that will produce lexica by purely automatic means. In order to evaluate these predictions, I will examine in some detail two tasks that need to be addressed by lexicography: (1) the generation of suitable example sentences that illustrate word senses in context, and (2) the automatic lemmatization of complex words as the basis for inclusion in digital lexica.

Bio: Erhard Hinrichs is Senior Professor of General and Computational Linguistics and director of the Computational Linguistics research group at Tübingen University, Germany. He obtained a PhD in Linguistics from The Ohio State University in 1985. His previous positions include Research Fellow in Cognitive Science at the Beckman Institute for Advanced Science and Technology and Assistant Professor in Linguistics, both at the University of Illinois, Urbana-Champaign, and Research Scientist in the Artificial Intelligence Department at Bolt Beranek and Newman Laboratories, Cambridge, Mass. His research interests include the computational modelling of language (particularly of morphology, syntax, and semantics) and of language variation, with special emphasis on the use of machine learning approaches. His current research focuses on the use of large language models for natural language processing of German. Erhard Hinrichs is an Honorary Lifetime Member of the Linguistic Society of America, an Honorary Member of the Foundation of Logic, Language, and Information, and the recipient of a Medal of Merit from the Bulgarian Academy of Sciences.

Yansong Feng

Peking University

Can Large Language Models Really Understand Grammar Rules for Low-resource Language Translation?

Abstract: Large language models (LLMs) have shown impressive capabilities in various challenging tasks. Recent studies even try to "teach" LLMs to understand an unseen language through reading dictionaries and grammar books. While promising, it remains unclear whether LLMs truly comprehend the grammar rules and apply them effectively, or as we expect. In this talk, we delve into the question of whether LLMs can leverage grammatical rules to improve low-resource language translation. We focus on the Zhuang language, an extremely low-resource language in China, for which we carefully annotate a set of grammatical rules. Our study reveals that current LLMs still struggle to accurately interpret and apply these grammar rules in the context of low-resource language translation. We will discuss the challenges we encountered and potential research directions to bridge this gap.

Bio: Yansong Feng is an associate professor at the Wangxuan Institute of Computer Technology, Peking University. Before that, he obtained his PhD and worked as an RA in ICCS (now ILCC) at the University of Edinburgh. His current research focuses on natural language processing, especially harnessing large language models for complex reasoning and supporting intelligent applications in legal domains. He has published over 90 papers in prestigious journals and conferences, including IEEE TPAMI, Artificial Intelligence, ACL, EMNLP, NAACL, and EACL. He was the program co-chair of NLPCC 2021, CCKS 2022 and EMNLP 2023 System Demos. He has served as senior action editor or area chair for ACL ARR and *ACL conferences. Yansong received the IBM Faculty Award in 2014 and 2015, and the IBM Global Shared University Research Award in 2016.

Yixin Cao

Singapore Management University

Towards complex reasoning by meaningful learning with symbolic & neuron systems

Abstract: Reasoning is a core capability of large language models (LLMs). However, LLMs often perform poorly in complex real-world tasks. We identify two main reasons for this: First, complex tasks often require various reasoning abilities, such as logical reasoning, numerical reasoning, and causal reasoning. How can we teach LLMs the missing reasoning skills or rules? Second, language models are inherently probabilistic and may not always be reliable. The integration of multiple reasoning techniques can exacerbate this unreliability. To address these issues, we introduce a variety of symbolic systems that offer reliable reasoning patterns, which can complement LLMs, forming a hybrid symbolic-neural system. Based on this, we summarize a general learning paradigm—meaningful learning—through a series of works, enabling LLMs to acquire the missing reasoning skills. Meaningful learning emphasizes that reasoning patterns should generalize across different contexts. If correct reasoning is only possible in certain situations, it may indicate that the model is either relying on memorization or has not fully mastered the reasoning skill.

Bio: Cao Yixin is a tenure-track professor at the School of Computer Science, Fudan University. He obtained his Ph.D. from Tsinghua University and has held positions as a research fellow, research assistant professor, and assistant professor at the National University of Singapore, Nanyang Technological University, and Singapore Management University. His research areas include natural language processing, knowledge engineering, and multimodal alignment. He has published over 60 papers at internationally renowned conferences and journals, with more than 6,900 citations on Google Scholar, and has been recognized as an outstanding oral presenter by top international conferences in the field. His research achievements have been awarded the Best Paper/Nomination at two international conferences. He has received the Lee Kong Chian Fellowship, Google South Asia & Southeast Asia Awards, the AI2000 Most Influential Scholar honorable mention, and Top 2% Global Scientists of 2024 by Elsevier. Cao Yixin serves as the demonstration program chair or area chair for multiple international conferences, and as a reviewer for international journals.

Tiansi Dong

Fraunhofer IAIS & University of Cambridge

Can Supervised Deep-Learning achieve the rigour of logical reasoning?

Abstract: In this talk, I will argue that supervised deep learning cannot achieve the rigour of syllogistic reasoning, and, thus, will not reach the rigour of logical reasoning. I will spatialise syllogistic statements into part-whole relations between regions and define the neural criterion that is equivalent to the rigour of the symbolic level of syllogistic reasoning. By dissecting Euler Net (EN), a well-designed supervised deep learning system for syllogistic reasoning (reaching 99.8% accuracy on the benchmark dataset), I will show that EN will not reach 100% accuracy, due to the methodology of reasoning through a combination table to establish the mapping from premises to conclusions. As the Key-Query-Value structures of Transformers are automatically learned combination tables, Transformers and neural networks built upon them will not reach the rigour of syllogistic reasoning either. RNNs are Turing complete under unbounded computation time. However, they cannot reach the criterion, as there is no consistent training data that covers all valid syllogistic reasoning types. This talk raises the question: which neural architecture can reach the rigour of syllogistic reasoning?

Bio: Dr. Tiansi Dong is the team lead of neurosymbolic representation learning at Fraunhofer IAIS, a visiting fellow of the Computer Lab at the University of Cambridge, and the primary contact chair of the Neural Reasoning and Mathematical Discovery Workshop (New Mad AI Workshop) at AAAI’25.