[NEW] Our ongoing work, AnglE, for angle-optimized text embeddings has achieved over 400k downloads on Hugging Face last month (peaked at over 500k). Congrats to Xianming, and thanks to the Hugging Face team! [Manuscript] [Github] [Hugging Face].
[NEW] Our ongoing work with SUSTech, LLM4Decompile, for Decompiling Binary Code with Large Language Models has achieved over 2.5k stars (☆) on Github. Congrats to Hanzhuo! [Manuscript] [Github].
[NEW] Our new work, HICL, a pre-trained model for social media in-context learning, has been accepted by the TNNLS journal. Congrats to Hanzhuo! [Paper] [Github]
[NEW] Our new work, BeLLM, a Backward Dependency Enhanced Large Language Model for Sentence Embeddings, has been accepted at NAACL 2024. Congrats to Xianming! [Paper] [Github]
[NEW] Our new work, PopALM, a popularity-aligned LLM for popular comment prediction, has been accepted at LREC-COLING 2024. Congrats to Erxin! [Paper] [Github]
I'll be more than happy to work with self-motivated students and staff. Drop me your CV if you are interested in joining our team as a PhD student, Research Associate, Research Assistant, or Postdoc. I may not be able to reply to all emails (but I do read them)!
Dr. Jing Li has been an Assistant Professor in the Department of Computing, The Hong Kong Polytechnic University (PolyU), since 2019. She is a member of the Research Centre of Data Sciences and Artificial Intelligence (RC-DSAI). Before joining PolyU, she worked in the Natural Language Processing Center, Tencent AI Lab, as a senior researcher from 2017 to 2019. Jing obtained her PhD degree from the Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, in 2017 under the supervision of Professor Kam-Fai Wong. Before that, she received her B.S. degree from the Department of Machine Intelligence, Peking University, in 2013. Jing has broad research interests in Natural Language Processing (NLP), Computational Social Science (CSS), and Machine Learning (ML). In particular, she works on novel algorithms for language representation learning, social media language understanding, conversation and social interaction modeling, and robust NLP and multimodal applications in noisy real-world settings.
- Pre-training and Language Representation Learning
- Natural Language Understanding (NLU) for Social Media Contents
- Natural Language Processing (NLP) and Multimodal Applications
- Large Language Models and Embodied Agents
Selected Projects (as PI/PC)
- Apr 2023 - Apr 2026: AI-Care: A Multimodal Financial Assistant for the Visually Impaired. ITF (Innovation and Technology Fund).
- Mar 2022 - Feb 2023: Knowledge-Enhanced Automatic Essay Grading Research. Gift Fund from Zhongjiaoyunzhi (matched with RGC-RMGS).
- Jan 2022 - Dec 2024: Social-Transformers: A Deep Pre-training Framework for Social Media Language Understanding. RGC Early Career Scheme (ECS).
- July 2021 - June 2022: Development of a 3-hour Online Programme on Artificial Intelligence and Data Analytics. PolyU Internal Fund under Freshman Seminar for the Online Teaching Development and Educational Research Grant as PI with Co-PI Dr. Richard Lui. Feel free to explore the AIDA Interactive Playground (only for internal use of PolyU students)!
- Jan 2022 - Dec 2022: Pre-training Methods for Short Texts. CCF-Baidu Open Fund (matched with RGC-RMGS).
- Jan 2021 - Dec 2021: Comment-Aware Weakly-Supervised Classification for Social Media Texts. CCF-Tencent Rhino-Bird Young Faculty Open Research Fund (matched with RGC-RMGS).
- Jan 2021 - Dec 2023: Characterize, Detect, and Neutralize: Context-Aware Computational Methods for Media Bias on Social Platforms. NSFC (Young Scientists Fund).
- Oct 2019 - Sep 2022: Discourse Parsing for Online Conversations. PolyU Internal Fund.
[Google Scholar]
- Xianming Li and Jing Li
BeLLM: Backward Dependency Enhanced Large Language Model for Sentence Embeddings
NAACL 2024. [Github]
- Erxin Yu, Jing Li, Chunpu Xu
PopALM: Popularity-Aligned Language Models for Social Media Trendy Response Prediction
LREC-COLING 2024. [Github]
- Hanzhuo Tan, Chunpu Xu, Jing Li, Yuqun Zhang, Zeyang Fang, Zeyu Chen, Baohua Lai
HICL: Hashtag-Driven In-Context Learning for Social Media Natural Language Understanding
TNNLS 2024. [Github]
- Yuji Zhang, Jing Li, Wenjie Li
VIBE: Topic-Driven Temporal Adaptation for Twitter Classification
EMNLP 2023. [Github]
- Renzhi Wang, Jing Li, Piji Li
InfoDiffusion: Information Entropy Aware Diffusion Process for Non-Autoregressive Text Generation
EMNLP 2023 (Findings). [Github]
- Chunpu Xu, Jing Li, Piji Li, Min Yang
Topic-Guided Self-Introduction Generation for Social Media Users
ACL 2023 (Findings). [Github]
- Yibing Liu, Haoliang Li, Yangyang Guo, Chengqi Kong, Jing Li, Shiqi Wang
Rethinking Attention-Model Explainability through Faithfulness Violation Test
ICML 2022. [Github]
- Zhengran Zeng, Hanzhuo Tan, Haotian Zhang, Jing Li, Yuqun Zhang, Lingming Zhang
An Extensive Study on Pre-trained Models for Program Understanding and Generation
ISSTA 2022. [Github]
- Chunpu Xu and Jing Li
Borrowing Human Senses: Comment-Aware Self-Training for Social Media Multimodal Classification
EMNLP 2022. [Github]
- Chunpu Xu, Hanzhuo Tan, Jing Li, and Piji Li
Understanding Social Media Cross-Modality Discourse in Linguistic Space
EMNLP 2022 (Findings). [Github]
- Kaifa Zhao, Le Yu, Shiyao Zhou, Jing Li, Xiapu Luo, Yat Fei Aemon Chiu, Yutong Liu
A Fine-grained Chinese Software Privacy Policy Dataset for Sequence Labeling and Regulation Compliant Identification
EMNLP 2022. [Github]
- Xiaoxin Lu, Yubo Zhang, Jing Li, Shi Zong
Doctor Recommendation in Online Health Forums via Expertise Learning
ACL 2022. [Github]
- Lingzhi Wang, Jing Li, Xingshan Zeng, Kam-Fai Wong
Successful New-entry Prediction for Multi-Party Online Conversations via Latent Topics and Discourse Modeling
WWW 2022. [Github]
- Yuji Zhang, Yubo Zhang, Chunpu Xu, Jing Li, Ziyan Jiang, and Baolin Peng
#HowYouTagTweets: Learning User Hashtagging Preferences via Personalized Topic Attention
EMNLP 2021. [Github]
- Xingshan Zeng, Jing Li, Lingzhi Wang, and Kam-Fai Wong
Modeling Global and Local Interactions for Online Conversation Recommendation
ACM TOIS 2021.
- Rong Xiang, Jing Li, Mingyu Wan, Jinghang Gu, Qin Lu, Wenjie Li, Chu-Ren Huang
Affective Awareness in Neural Sentiment Analysis
KBS Journal Volume 226 (2021).
- Zexin Lu, Keyang Ding, Yuji Zhang, Jing Li, Baolin Peng, and Lemao Liu
Engage the Public: Poll Question Generation for Social Media Posts
ACL-IJCNLP 2021. [Github]
- Lu Ji, Zhongyu Wei, Jing Li, Qi Zhang, and Xuanjing Huang.
Discrete Argument Representation Learning for Interactive Argument Pair Identification
NAACL 2021.
- Lei Chen, Zhongyu Wei, Jing Li, Baohua Zhou, Qi Zhang, and Xuanjing Huang.
Modeling Evolution of Message Interaction for Rumor Resolution [Code]
COLING 2020.
- Keyang Ding, Jing Li, and Yuji Zhang.
Hashtags, Emotions, and Comments: A Large-Scale Dataset to Understand Fine-Grained Social Emotions to Online Topics [Data]
EMNLP 2020 (short paper).
- Lingzhi Wang, Jing Li, Xingshan Zeng, Haisong Zhang, and Kam-Fai Wong.
Continuity of Topic, Interaction, and Query: Learning to Quote in Online Conversations
EMNLP 2020.
- Yue Wang, Jing Li, Michael R. Lyu, and Irwin King.
Cross-Media Keyphrase Prediction: A Unified Framework with Multi-Modality Multi-Head Attention and Image Wordings [Github]
EMNLP 2020.
- Xingshan Zeng, Jing Li, Lu Wang, Zhiming Mao, and Kam-Fai Wong.
Dynamic Online Conversation Recommendation [Github]
ACL 2020.
- Jichuan Zeng, Jing Li, Yulan He, Cuiyun Gao, Michael R. Lyu, and Irwin King.
What Changed Your Mind: The Roles of Dynamic Topics and Discourse in Argumentation Process [Github]
WWW 2020.
- Ming Liao, Jing Li, Haisong Zhang, Lingzhi Wang, Xixin Wu, and Kam-Fai Wong.
Coupling Global and Local Context for Unsupervised Aspect Extraction
EMNLP 2019.
- Xingshan Zeng, Jing Li, Lu Wang, and Kam-Fai Wong.
Neural Conversation Recommendation with Online Interaction Modeling [Github]
EMNLP 2019.
- Yue Wang, Jing Li, Hou Pong Chan, Irwin King, Michael R. Lyu, and Shuming Shi.
Topic-Aware Neural Keyphrase Generation for Social Media Language
[Github]
ACL 2019.
- Xingshan Zeng, Jing Li, Lu Wang, and Kam-Fai Wong.
Joint Effects of Context and User History for Predicting Online Conversation Re-entries
[Github]
ACL 2019.
- Yue Wang, Jing Li, Irwin King, Michael R. Lyu, and Shuming Shi.
Microblog Hashtag Generation via Encoding Conversation Contexts
NAACL 2019.
- Jichuan Zeng, Jing Li, Yulan He, Cuiyun Gao, Michael R. Lyu, and Irwin King
What You Say and How You Say it: Joint Modeling of Topics and Discourse in Microblog Conversations [Github]
TACL 2019 (presented at ACL 2019).
- Jing Li, Yan Song, Zhongyu Wei, and Kam-Fai Wong
A Joint Model of Conversational Discourse and Latent Topics on Microblogs
CL 2018. (Volume 44, Issue 4)
- Jichuan Zeng, Jing Li, Yan Song, Cuiyun Gao, Michael R. Lyu, and Irwin King
Topic Memory Networks for Short Text Classification [Code]
EMNLP 2018.
- Dingmin Wang, Yan Song, Jing Li, Jialong Han, and Haisong Zhang
A Hybrid Approach to Automatic Corpus Generation for Chinese Spelling Check [Github]
EMNLP 2018.
- Xingshan Zeng, Jing Li, Lu Wang, Nicholas Beauchamp, Sarah Schugars, and Kam-Fai Wong
Microblog Conversation Recommendation via Joint Modeling of Topics and Discourse
[Data]
NAACL 2018.
- Yingyi Zhang, Jing Li, Yan Song, and Chengzhi Zhang
Encoding Conversation Context for Neural Keyphrase Extraction from Microblog Posts
[Data]
NAACL 2018.
- Jing Li
Microblog Summarization Using Conversation Structures
PhD thesis. 2017
- Jing Li, Ming Liao, Wei Gao, Yulan He, and Kam-Fai Wong
Topic Extraction from Microblog Posts Using Conversation Structures [Data] [Code]
ACL 2016.
- Jing Li, Wei Gao, Zhongyu Wei, Baolin Peng, and Kam-Fai Wong
Using Content-level Structures for Summarizing Microblog Repost Trees [Data]
EMNLP 2015.
Research Experience
- Visiting PhD at Aston University, Birmingham, UK, Jan - Apr, 2016, Supervisor: Prof. Yulan He (Now with King's College London.)
- Visiting PhD at Northeastern University, Boston, USA, Feb - May, 2017, Supervisor: Prof. Lu Wang (Now with University of Michigan.)
Organizing Committee Member:
- 2024: ACL (D&I), LREC-COLING (sponsorship), NLPCC (tutorial)
- 2023: NLPCC (student workshop), EMNLP (D&I)
- 2022: AACL (D&I)
- 2021: ACL-IJCNLP (sponsorship), NLPCC (publicity)
- 2020: EMNLP (publication/findings), ICONIP (tutorial)
Programme Committee Member (including area chairs and senior members):
- 2024: NAACL (area chair), ACL (area chair), LREC-COLING (area chair), NLPCC (area chair), IJCAI
- 2023: EACL, AAAI (senior member), ICASSP (meta reviewer), ACL (area chair), EMNLP (area chair)
- 2022: ACL (area chair), ICASSP (meta reviewer), AAAI, EMNLP
- 2021: ACL (area chair), IJCAI (senior member), CCL (area chair), AAAI, EACL, NAACL
- 2020: AAAI, ACL, ICONIP (senior member)
- 2019: ACL, EMNLP, NAACL, AAAI
- 2018: ACL and EMNLP (Best reviewer award in EMNLP 2018)
- 2017: EACL and EMNLP
- 2016: EMNLP
- 2015: EMNLP
Reviewer
- TACL: July 2021 - June 2023.
- ACL Rolling Review (action editor/area chair).
- Spring 2024: [COMP1433] Introduction to Data Analytics (co-teaching)
- Fall 2023: [COMP5423] Natural Language Processing
- Spring 2023: [COMP1433] Introduction to Data Analytics (co-teaching)
- Spring 2023: [COMP1004] Introduction to Artificial Intelligence and Data Analytics (co-teaching)
- Fall 2022: [COMP1004] Introduction to Artificial Intelligence and Data Analytics (co-teaching)
- Spring 2022: [COMP1433] Introduction to Data Analytics
- Spring 2021: [FH6051] Computational Linguistics (co-teaching)
- Spring 2021: [COMP5511] Artificial Intelligence Concepts
- Spring 2021: [COMP1433] Introduction to Data Analytics
- Spring 2020: [COMP1433] Introduction to Data Analytics
- Fall 2019: [COMP6701] Advanced Topics in Computer Algorithms (co-teaching)
- Fall 2019: [COMP4122] Game Design and Development (co-teaching)
Research Students and Staff at PolyU
- Rong Xiang, Postdoc Fellow. May 2022-now. Publication: [KBS 2021].
- Ming Liao, Postdoc Fellow. Nov 2023 - now.
- Qiqun Geng, Research Associate. Nov 2023 - now.
- Yilin Zhang, Research Assistant. Jan 2024 - now.
- Zhe Hu, PhD student. Jan 2024-now.
- Xianming Li, MPhil student. Sep 2023-now. Publication: [NAACL 2024]
- Yi Zhao, PhD student. Sep 2023-now.
- Siqi Wang, PhD student (dual-degree programme with Tongji). Sep 2023-now. Co-supervisor: Prof. Haofen Wang (Tongji).
- Libo Zhao, PhD student (dual-degree programme with SCUT). Sep 2023-now. Co-supervisor: Prof. Ziqian Zeng (SCUT).
- Erxin Yu, PhD student. Sep 2022-now. Co-supervisor: Prof. Maggie Wenjie Li. Publication: [COLING 2024].
- Zuchen Gao, PhD student (collaborative programme with SUSTech). Sep 2022-now. Co-supervisors: Prof. Daniel Xiapu Luo and Prof. Yuqun Zhang (SUSTech).
- Hanzhuo Tan, PhD student (collaborative programme with SUSTech). Jan 2022-now. Co-supervisors: Chair Prof. Changwen Chen and Prof. Yuqun Zhang (SUSTech). Publications: [TNNLS 2024] [EMNLP 2022 (Findings)].
- Chunpu Xu, PhD student. Aug 2021-now. Co-supervisor: Prof. Daniel Xiapu Luo. Publications: [COLING 2024] [TNNLS 2024] [EMNLP 2022] [EMNLP 2022 (Findings)] [EMNLP 2021].
- Yuji Zhang, PhD student (Now visiting UIUC). Aug 2020-now. Co-supervisor: Prof. Maggie Wenjie Li. Publications: [EMNLP 2023] [EMNLP 2021] [ACL-IJCNLP 2021] [EMNLP 2020].
Alumni at PolyU:
- Yubo Zhang, Undergraduate Student (under PolyU URIS project). Aug 2020-Aug 2023. Publications: [ACL 2022] [EMNLP 2021]. Now a PhD student at USC.
- Xiaoxin Lu, Master student and Project Assistant. Jan 2021-June 2022. Dissertation: Doctor Recommendation in Online Health Forums via Expertise Learning. Publication: [ACL 2022]. Now a PhD student at PSU.
- Zexin Lu, PhD student. Sep 2019 - Nov 2022. Thesis: Machine-Aided Online User Engagements. Co-supervisor: Chair Prof. Qing Li. Publications: [ACL-IJCNLP 2021] [SLT 2021]. Now a Postdoc Fellow at PolyU.
- Yibing Liu, Research Assistant. Apr 2021 - June 2021. Now a PhD student at CityU. Publication: [ICML 2022].
- Keyang Ding, Master Student and Research Assistant. Apr 2021 - Aug 2022. Dissertation: Hashtags, Emotions, and Comments: A Large-Scale Dataset to Understand Fine-Grained Social Emotions to Online Topics. Publications: [EMNLP 2020] [ACL-IJCNLP 2021]. Now a PhD student at HIT.
- Bing Wang, Visiting PhD student from Oxford. Mar 2021 - June 2021. Now an Assistant Professor at PolyU.
- Junfeng Jiang, Master student. Oct 2019-Mar 2021. Dissertation: Online Medical-consultation Recommendation System with Topic Model. Now a software engineer at Oppo.
- Hongliang Sun, Master student. Oct 2019-Mar 2021. Dissertation: Domain-Specific Language Model Continue Pretraining for Chinese Weibo. Now a PhD student at HIT.
- Jiancheng Wen, Research Assistant. Aug 2020-Feb 2021.
Previous Intern Students at Tencent:
- Yingyi Zhang, PhD student from NJUST. Oct 2017 - May 2018. Publications: [NAACL 2018] [JASIST 2019]. Now a Lecturer at Soochow University.
- Jichuan Zeng, PhD student from CUHK. Dec 2017 - Aug 2019. Publications: [EMNLP 2018] [TACL 2019] [WWW 2020]. Now a senior research engineer at ByteDance.
- Yue Wang, PhD student from CUHK. May 2018 - Aug 2019. Publications: [NAACL 2019] [ACL 2019] [EMNLP 2020]. Now a senior research scientist at Salesforce.
- Lu Ji, Master student from Fudan. Apr - Aug 2019. Publication: [NAACL 2021]. Tencent Rhino-Bird Elite Training Program. Now an engineer at Pinduoduo.
- Ming Liao, PhD student from CUHK. May - Sep 2018. Publication: [EMNLP 2019]. Now a Postdoc Fellow at PolyU.
- Xiaoxue Liu, Master student from Nanjing University. May - Sep 2018. Now an engineer at Tencent.
Other Student Collaborators