[NEW] Our recent benchmark YesBut about humorous contradiction understanding was accepted at NeurIPS 2024 for oral presentation. Congratulations to Zhe! [Paper] [Project] [Github] [Hugging Face]
[NEW] Our recent research work AnglE for angle-optimized text embeddings was published at ACL 2024 and achieved over 1 million downloads on Hugging Face last month (over 5 million downloads in under a year). Congrats to Xianming, and thanks to the Hugging Face team! [Paper] [Github] [Hugging Face].
[NEW] Our recent research work with SUSTech, LLM4Decompile, for Decompiling Binary Code with Large Language Models was published at EMNLP 2024 and achieved over 3.2k stars (☆) on Github. Congrats to Hanzhuo! [Paper] [Github].
[NEW] Congratulations on the establishment of the PolyU Embodied Artificial Intelligence Lab under PolyU COMP.
I'm excited to work with a wonderful team of research students and staff! Drop me your CV if you are interested in joining our team as a PhD/Research Associate/Research Assistant/Post-doc. I may not be able to reply to all emails (but I do read them)!
Dr. Jing Li has been an Assistant Professor in the Department of Computing, The Hong Kong Polytechnic University (PolyU), since 2019. She established and currently leads the PolyU Embodied Artificial Intelligence Lab under PolyU-COMP, and is a member of the Research Centre of Data Sciences and Artificial Intelligence (RC-DSAI). Before joining PolyU, she worked in the Natural Language Processing Center, Tencent AI Lab as a senior researcher from 2017 to 2019. Jing obtained her PhD degree from the Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong in 2017 under the supervision of Professor Kam-Fai Wong. Before that, she received her B.S. degree from the Department of Machine Intelligence, Peking University in 2013. Jing has broad research interests in Natural Language Processing (NLP), Computational Social Science (CSS), and Machine Learning (ML). In particular, she researches novel algorithms for language representation learning, social media language understanding, conversational and social interaction modeling, and language grounding for human-centered applications. The mission of her research is to make NLP models more human-like, with a human-centered focus on understanding humans, communicating with humans, and coexisting with humans in society.
- Pre-training and Language Representation Learning
- Natural Language Understanding (NLU) for Social Media Contents
- Human-Centered Natural Language Processing (NLP)
- Large Language Models for Embodied Agents
Selected Projects (as PI/PC)
- Aug 2024 - Jul 2029: PolyU Embodied Artificial Intelligence Lab. Gift Fund from Boyushenhang.
- Jul 2024 - Jun 2026: Simultaneous Translation for Multilingual Meetings. Gift Fund from Microware (Co-PIs: Dr. Jibin Wu and Dr. Yancheng Yuan)
- Jun 2024 - May 2025: Socially-Aligned Large Language Model and Its Applications. Gift Fund from Huawei.
- Apr 2023 - Apr 2026: AI-Care: A Multimodal Financial Assistant for the Visually Impaired. ITF (Innovation Technology Fund).
- Jul 2022 - Mar 2024: NLP-Enhanced STEM Education for Hong Kong Adolescents: A Trial Study on Secondary School Students. Gift Fund from HCL.
- Mar 2022 - Feb 2023: Knowledge-Enhanced Automatic Essay Grading Research. Gift Fund from Zhongjiaoyunzhi.
- Jan 2022 - Dec 2024: Social-Transformers: A Deep Pre-training Framework for Social Media Language Understanding. RGC Early Career Scheme (ECS).
- July 2021 - June 2022: Development of a 3-hour Online Programme on Artificial Intelligence and Data Analytics. PolyU Internal Fund under Freshman Seminar for the Online Teaching Development and Educational Research Grant (Co-PI: Dr. Richard Lui). Feel free to explore the AIDA Interactive Playground (only for internal use of PolyU students)!
- Jan 2022 - Dec 2022: Pre-training Methods for Short Texts. CCF-Baidu Open Fund.
- Jan 2021 - Dec 2021: Comment-Aware Weakly-Supervised Classification for Social Media Texts. CCF-Tencent Rhino-Bird Young Faculty Open Research Fund.
- Jan 2021 - Dec 2023: Characterize, Detect, and Neutralize: Context-Aware Computational Methods for Media Bias on Social Platforms. NSFC (Young Scientists Fund).
- Oct 2019 - Sep 2022: Discourse Parsing for Online Conversations. PolyU Internal Fund.
[Google Scholar]
- Zhe Hu, Tuo Liang, Jing Li, Yiren Lu, Yunlai Zhou, Yiran Qiao, Jing Ma, Yu Yin
Cracking the Code of Juxtaposition: Can AI Models Understand the Humorous Contradictions
NeurIPS 2024 (oral). [Project]
- Zhe Hu, Yixiao Ren, Jing Li, Yin Yu
VIVA: A Benchmark for Vision-Grounded Decision-Making with Human Values
EMNLP 2024. [Project]
- Hanzhuo Tan, Qi Luo, Jing Li, Yuqun Zhang
LLM4Decompile: Decompiling Binary Code with Large Language Models
EMNLP 2024. [Github] (3.2k stars)
- Libo Zhao, Jing Li, Ziqian Zeng
PsFuture: A Pseudo-Future-based Zero-Shot Adaptive Policy for Simultaneous Machine Translation
EMNLP 2024. [Github]
- Erxin Yu, Jing Li, Ming Liao, Siqi Wang, Gao Zuchen, Fei Mi, Lanqing Hong
CoSafe: Evaluating Large Language Model Safety in Multi-Turn Dialogue Coreference
EMNLP 2024 (short paper). [Github]
- Xianming Li and Jing Li
Generative Deduplication For Social Media Data Selection
EMNLP 2024 (Findings). [Github]
- Siqi Wang, Chao Liang, Yunfan Gao, Yang Liu, Jing Li, Haofen Wang
Decoding Urban Industrial Complexity: Enhancing Knowledge-Driven Insights via IndustryScopeGPT
ACM MM 2024. [Github]
- Erxin Yu, Jing Li, Chunpu Xu
RePALM: Popular Quote Tweet Generation via Auto-Response Augmentation
ACL 2024 (Findings).
- Sirry Chen, Shuo Feng, Liang Songsong, Chen-Chen Zong, Jing Li, Piji Li
CACL: Community-Aware Heterogeneous Graph Contrastive Learning for Social Media Bot Detection
ACL 2024 (Findings). [Github]
- Jiashuo Wang, Chunpu Xu, Chak Tou Leong, Wenjie Li, Jing Li
Muffin: Mitigating Unhelpfulness in Emotional Support Conversations with Multifaceted AI Feedback
ACL 2024 (Findings). [Github]
- Xianming Li and Jing Li
AoE: Angle-optimized Embeddings for Semantic Textual Similarity (a.k.a., AnglE)
ACL 2024. [Github] [Hugging Face] (over 5 million downloads in under a year)
- Luyang Lin, Lingzhi Wang, Xiaoyan Zhao, Jing Li, Kam-Fai Wong
IndiVec: An Exploration of Leveraging Large Language Models for Media Bias Detection with Fine-Grained Bias Indicators
EACL 2024 (Findings). [Github]
- Xianming Li and Jing Li
BeLLM: Backward Dependency Enhanced Large Language Model for Sentence Embeddings
NAACL 2024. [Github]
- Erxin Yu, Jing Li, Chunpu Xu
PopALM: Popularity-Aligned Language Models for Social Media Trendy Response Prediction
LREC-COLING 2024. [Github]
- Hanzhuo Tan, Chunpu Xu, Jing Li, Yuqun Zhang, Zeyang Fang, Zeyu Chen, Baohua Lai
HICL: Hashtag-Driven In-Context Learning for Social Media Natural Language Understanding
TNNLS 2024. [Github]
- Yuji Zhang, Jing Li, Wenjie Li
VIBE: Topic-Driven Temporal Adaptation for Twitter Classification
EMNLP 2023. [Github]
- Renzhi Wang, Jing Li, Piji Li
InfoDiffusion: Information Entropy Aware Diffusion Process for Non-Autoregressive Text Generation
EMNLP 2023 (Findings). [Github]
- Chunpu Xu, Jing Li, Piji Li, Min Yang
Topic-Guided Self-Introduction Generation for Social Media Users
ACL 2023 (Findings). [Github]
- Yibing Liu, Haoliang Li, Yangyang Guo, Chengqi Kong, Jing Li, Shiqi Wang
Rethinking Attention-Model Explainability through Faithfulness Violation Test
ICML 2022. [Github]
- Zhengran Zeng, Hanzhuo Tan, Haotian Zhang, Jing Li, Yuqun Zhang, Lingming Zhang
An Extensive Study on Pre-trained Models for Program Understanding and Generation
ISSTA 2022. [Github]
- Chunpu Xu and Jing Li
Borrowing Human Senses: Comment-Aware Self-Training for Social Media Multimodal Classification
EMNLP 2022. [Github]
- Chunpu Xu, Hanzhuo Tan, Jing Li, and Piji Li
Understanding Social Media Cross-Modality Discourse in Linguistic Space
EMNLP 2022 (Findings). [Github]
- Kaifa Zhao, Le Yu, Shiyao Zhou, Jing Li, Xiapu Luo, Yat Fei Aemon Chiu, Yutong Liu
A Fine-grained Chinese Software Privacy Policy Dataset for Sequence Labeling and Regulation Compliant Identification
EMNLP 2022. [Github]
- Xiaoxin Lu, Yubo Zhang, Jing Li, Shi Zong
Doctor Recommendation in Online Health Forums via Expertise Learning
ACL 2022. [Github]
- Lingzhi Wang, Jing Li, Xingshan Zeng, Kam-Fai Wong
Successful New-entry Prediction for Multi-Party Online Conversations via Latent Topics and Discourse Modeling
WWW 2022. [Github]
- Yuji Zhang, Yubo Zhang, Chunpu Xu, Jing Li, Ziyan Jiang, and Baolin Peng
#HowYouTagTweets: Learning User Hashtagging Preferences via Personalized Topic Attention
EMNLP 2021. [Github]
- Xingshan Zeng, Jing Li, Lingzhi Wang, and Kam-Fai Wong
Modeling Global and Local Interactions for Online Conversation Recommendation
ACM TOIS 2021.
- Rong Xiang, Jing Li, Mingyu Wan, Jinghang Gu, Qin Lu, Wenjie Li, Chu-Ren Huang
Affective Awareness in Neural Sentiment Analysis
KBS Journal Volume 226 (2021).
- Zexin Lu, Keyang Ding, Yuji Zhang, Jing Li, Baolin Peng, and Lemao Liu
Engage the Public: Poll Question Generation for Social Media Posts
ACL-IJCNLP 2021. [Github]
- Lu Ji, Zhongyu Wei, Jing Li, Qi Zhang, and Xuanjing Huang.
Discrete Argument Representation Learning for Interactive Argument Pair Identification
NAACL 2021.
- Lei Chen, Zhongyu Wei, Jing Li, Baohua Zhou, Qi Zhang, and Xuanjing Huang.
Modeling Evolution of Message Interaction for Rumor Resolution [Code]
COLING 2020.
- Keyang Ding, Jing Li, and Yuji Zhang.
Hashtags, Emotions, and Comments: A Large-Scale Dataset to Understand Fine-Grained Social Emotions to Online Topics [Data]
EMNLP 2020 (short paper).
- Lingzhi Wang, Jing Li, Xingshan Zeng, Haisong Zhang, and Kam-Fai Wong.
Continuity of Topic, Interaction, and Query: Learning to Quote in Online Conversations
EMNLP 2020.
- Yue Wang, Jing Li, Michael R. Lyu, and Irwin King.
Cross-Media Keyphrase Prediction: A Unified Framework with Multi-Modality Multi-Head Attention and Image Wordings [Github]
EMNLP 2020.
- Xingshan Zeng, Jing Li, Lu Wang, Zhiming Mao, and Kam-Fai Wong.
Dynamic Online Conversation Recommendation [Github]
ACL 2020.
- Jichuan Zeng, Jing Li, Yulan He, Cuiyun Gao, Michael R. Lyu, and Irwin King.
What Changed Your Mind: The Roles of Dynamic Topics and Discourse in Argumentation Process [Github]
WWW 2020.
- Ming Liao, Jing Li, Haisong Zhang, Lingzhi Wang, Xixin Wu, and Kam-Fai Wong.
Coupling Global and Local Context for Unsupervised Aspect Extraction
EMNLP 2019.
- Xingshan Zeng, Jing Li, Lu Wang, and Kam-Fai Wong.
Neural Conversation Recommendation with Online Interaction Modeling [Github]
EMNLP 2019.
- Yue Wang, Jing Li, Hou Pong Chan, Irwin King, Michael R. Lyu, and Shuming Shi.
Topic-Aware Neural Keyphrase Generation for Social Media Language
[Github]
ACL 2019.
- Xingshan Zeng, Jing Li, Lu Wang, and Kam-Fai Wong.
Joint Effects of Context and User History for Predicting Online Conversation Re-entries
[Github]
ACL 2019.
- Yue Wang, Jing Li, Irwin King, Michael R. Lyu, and Shuming Shi.
Microblog Hashtag Generation via Encoding Conversation Contexts
NAACL 2019.
- Jichuan Zeng, Jing Li, Yulan He, Cuiyun Gao, Michael R. Lyu, and Irwin King
What You Say and How You Say it: Joint Modeling of Topics and Discourse in Microblog Conversations [Github]
TACL 2019 (presented at ACL 2019).
- Jing Li, Yan Song, Zhongyu Wei, and Kam-Fai Wong
A Joint Model of Conversational Discourse and Latent Topics on Microblogs
CL 2018. (Volume 44, Issue 4)
- Jichuan Zeng, Jing Li, Yan Song, Cuiyun Gao, Michael R. Lyu, and Irwin King
Topic Memory Networks for Short Text Classification [Code]
EMNLP 2018.
- Dingmin Wang, Yan Song, Jing Li, Jialong Han, and Haisong Zhang
A Hybrid Approach to Automatic Corpus Generation for Chinese Spelling Check [Github]
EMNLP 2018.
- Xingshan Zeng, Jing Li, Lu Wang, Nicholas Beauchamp, Sarah Shugars, and Kam-Fai Wong
Microblog Conversation Recommendation via Joint Modeling of Topics and Discourse
[Data]
NAACL 2018.
- Yingyi Zhang, Jing Li, Yan Song, and Chengzhi Zhang
Encoding Conversation Context for Neural Keyphrase Extraction from Microblog Posts
[Data]
NAACL 2018.
- Jing Li
Microblog Summarization Using Conversation Structures
PhD thesis. 2017
- Jing Li, Ming Liao, Wei Gao, Yulan He, and Kam-Fai Wong
Topic Extraction from Microblog Posts Using Conversation Structures [Data] [Code]
ACL 2016.
- Jing Li, Wei Gao, Zhongyu Wei, Baolin Peng, and Kam-Fai Wong
Using Content-level Structures for Summarizing Microblog Repost Trees [Data]
EMNLP 2015.
Research Experience
- Visiting PhD at Aston University, Birmingham, UK, Jan - Apr, 2016, Supervisor: Prof. Yulan He (Now with King's College London.)
- Visiting PhD at Northeastern University, Boston, USA, Feb - May, 2017, Supervisor: Prof. Lu Wang (Now with University of Michigan.)
Organization Committee Member:
- 2024: ACL (D&I), LREC-COLING (sponsorship), NLPCC (tutorial), EMNLP (Internal Communication)
- 2023: NLPCC (student workshop), EMNLP (D&I)
- 2022: AACL (D&I)
- 2021: ACL-IJCNLP (sponsorship), NLPCC (publicity)
- 2020: EMNLP (publication/findings), ICONIP (tutorial)
Programme Committee Member (including area chairs and senior members):
- 2025: NAACL (Senior Area Chair)
- 2024: NAACL (area chair), ACL (area chair), LREC-COLING (area chair), NLPCC (area chair), IJCAI
- 2023: EACL, AAAI (senior member), ICASSP (meta reviewer), ACL (area chair), EMNLP (area chair)
- 2022: ACL (area chair), ICASSP (meta reviewer), AAAI, EMNLP.
- 2021: ACL (area chair), IJCAI (senior member), CCL (area chair), AAAI, EACL, NAACL
- 2020: AAAI, ACL, ICONIP (senior member)
- 2019: ACL, EMNLP, NAACL, AAAI
- 2018: ACL and EMNLP (Best reviewer award in EMNLP 2018)
- 2017: EACL and EMNLP
- 2016: EMNLP
- 2015: EMNLP
Reviewer
- CL: Sep 2024 - now.
- TACL: July 2021 - now.
- ACL Rolling Review (action editor/area chair).
Faculty of Engineering Merit Award in Teaching 2023 (Team award as team leader with Dr. Richard Lui)
- Spring 2025: [COMP1433] Introduction to Data Analytics (co-teaching)
- Fall 2024: [COMP5423] Natural Language Processing
- Spring 2024: [COMP1433] Introduction to Data Analytics (co-teaching)
- Fall 2023: [COMP5423] Natural Language Processing
- Spring 2023: [COMP1433] Introduction to Data Analytics (co-teaching)
- Spring 2023: [COMP1004] Introduction to Artificial Intelligence and Data Analytics (co-teaching)
- Fall 2022: [COMP1004] Introduction to Artificial Intelligence and Data Analytics (co-teaching)
- Spring 2022: [COMP1433] Introduction to Data Analytics
- Spring 2021: [FH6051] Computational Linguistics (co-teaching)
- Spring 2021: [COMP5511] Artificial Intelligence Concepts
- Spring 2021: [COMP1433] Introduction to Data Analytics
- Spring 2020: [COMP1433] Introduction to Data Analytics
- Fall 2019: [COMP6701] Advanced Topics in Computer Algorithms (co-teaching)
- Fall 2019: [COMP4122] Game Design and Development (co-teaching)
Research Students and Staff at PolyU
- Xuan Luo, PhD student (dual-degree programme with HIT). Sep 2024-now. Co-supervisor: Prof. Ruifeng Xu
- Guanzhong Liu, Research Associate. Sep 2024 - now.
- Yuanhang Yang, Research Associate. Sep 2024 - now.
- Andrew Tsz Fung Lee, Research Associate. Aug 2024 - now.
- Yixiao Ren, Undergraduate Student (PolyU-URIS programme). June 2024 - now. [EMNLP 2024]
- Zhe Hu, PhD student. Jan 2024-now. Publications: [NeurIPS 2024] [EMNLP 2024]
- Ming Liao, Postdoc Fellow. Nov 2023 - now. Publication: [EMNLP 2024].
- Qiqun Geng, Research Associate. Nov 2023 - now.
- Rong Xiang, Postdoc Fellow. May 2022-now. Publication: [KBS 2021].
- Xianming Li, PhD student. Sep 2023-now. Publication: [EMNLP 2024 (Findings)] [ACL 2024] [NAACL 2024]
- Yi Zhao, PhD student. Sep 2023-now.
- Siqi Wang, PhD student (dual-degree programme with Tongji). Sep 2023-now. Co-supervisor: Prof. Haofen Wang (Tongji). Publication: [ACM MM 2024] [EMNLP 2024]
- Libo Zhao, PhD student (dual-degree programme with SCUT). Sep 2023-now. Co-supervisor: Prof. Ziqian Zeng (SCUT). Publication: [EMNLP 2024].
- Erxin Yu, PhD student. Sep 2022-now. Co-supervisor: Prof. Maggie Wenjie Li. Publication: [EMNLP 2024] [ACL 2024 (Findings)] [COLING 2024].
- Zuchen Gao, PhD student (collaborative programme with SUSTech). Sep 2022-now. Co-supervisors: Prof. Daniel Xiapu Luo and Prof. Yuqun Zhang (SUSTech). Publication: [EMNLP 2024].
- Hanzhuo Tan, PhD student (collaborative programme with SUSTech). Jan 2022-now. Co-supervisors: Chair Prof. Changwen Chen and Prof. Yuqun Zhang (SUSTech). Publications: [EMNLP 2024] [TNNLS 2024] [EMNLP 2022 (Findings)].
- Chunpu Xu, PhD student. Aug 2021-now. Co-supervisor: Prof. Daniel Xiapu Luo. Publications: [ACL 2024 (Findings)] [ACL 2024 (Findings)] [COLING 2024] [TNNLS 2024] [EMNLP 2022] [EMNLP 2022 (Findings)] [EMNLP 2021].
Alumni at PolyU:
- Yuji Zhang, PhD student. Aug 2020-Aug 2024. Thesis: Forecast the future: dynamic natural language understanding in evolving social media environments. Co-supervisor: Prof. Maggie Wenjie Li. Publications: [EMNLP 2023] [EMNLP 2021] [ACL-IJCNLP 2021] [EMNLP 2020]. Now a Postdoc at UIUC.
- Zexin Lu, PhD student. Sep 2019 - Nov 2022. Thesis: Machine-Aided Online User Engagements. Co-supervisor: Chair Prof. Qing Li. Publications: [ACL-IJCNLP 2021] [SLT 2021]. Now a Researcher at Huawei.
- Yilin Zhang, Research Assistant, Jan 2024 - Jul 2024. Now a Master Student at CMU.
- Yubo Zhang, Undergraduate Student (under PolyU URIS project), Aug 2020-Aug 2023. Publications: [ACL 2022] [EMNLP 2021]. Now a PhD student at USC.
- Xiaoxin Lu, Master student and Project Assistant. Jan 2021-June 2022. Dissertation: Doctor Recommendation in Online Health Forums via Expertise Learning. Publication: [ACL 2022]. Now a PhD student at PSU.
- Yibing Liu, Research Assistant. Apr 2021 - June 2021. Now a PhD student at CityU. Publication: [ICML 2022].
- Keyang Ding, Master Student and Research Assistant. Apr 2021 - Aug 2022. Dissertation: Hashtags, Emotions, and Comments: A Large-Scale Dataset to Understand Fine-Grained Social Emotions to Online Topics. Publications: [EMNLP 2020] [ACL-IJCNLP 2021]. Now a PhD student at HIT.
- Bing Wang, Visiting PhD student from Oxford. Mar 2021 - June 2021. Now an Assistant Professor at PolyU.
- Junfeng Jiang, Master student. Oct 2019-Mar 2021. Dissertation: Online Medical-consultation Recommendation System with Topic Model. Now a software engineer at Oppo.
- Hongliang Sun, Master student. Oct 2019-Mar 2021. Dissertation: Domain-Specific Language Model Continue Pretraining for Chinese Weibo. Now a PhD student at HIT.
- Jiancheng Wen, Research Assistant. Aug 2020-Feb 2021.
Previous Intern Students at Tencent:
- Yingyi Zhang, PhD student from NJUST. Oct 2017 - May 2018. Publications: [NAACL 2018] [JASIST 2019]. Now a Lecturer at Soochow University.
- Jichuan Zeng, PhD student from CUHK. Dec 2017 - Aug 2019. Publications: [EMNLP 2018] [TACL 2019] [WWW 2020]. Now a senior research engineer at ByteDance.
- Yue Wang, PhD student from CUHK. May 2018 - Aug 2019. Publications: [NAACL 2019] [ACL 2019] [EMNLP 2020]. Now a senior research scientist at Salesforce.
- Lu Ji, Master student from Fudan. Apr - Aug 2019. Publication: [NAACL 2021]. Tencent Rhino-Bird Elite Training Program. Now an engineer at Pinduoduo.
- Ming Liao, PhD student from CUHK. May - Sep 2018. Publication: [EMNLP 2019]. Now a Postdoc Fellow at PolyU.
- Xiaoxue Liu, Master student from Nanjing University. May - Sep 2018. Now an engineer at Tencent.
Other Student Collaborators