|
Lei Zhang
Chair Professor of Computer Vision and Image Analysis
Fellow of IEEE
Office: PQ816
I am also with OPPO Research Institute. |
|
Education
|
3/1998~10/2001 |
PhD |
Dept. of Automatic Control,
Northwestern Polytechnical University,
Xi'an, China. |
|
9/1995~3/1998 |
M.Sc |
Dept. of Automatic Control,
Northwestern Polytechnical University,
Xi'an, China. |
|
9/1991~7/1995 |
B.Sc |
Dept. of Aeronautical
Engineering, Shenyang
Inst. of Aeronautical Engineering, Shenyang, China. |
Work Experience
|
7/2017~present |
Chair Professor, Dept. of
Computing, Hong Kong Polytechnic University, Hong Kong. |
|
7/2015~6/2017 |
Professor, Dept. of
Computing, Hong Kong Polytechnic University, Hong Kong. |
|
9/2010~6/2015 |
Associate Professor, Dept.
of Computing, Hong Kong Polytechnic University, Hong Kong. |
|
1/2006~8/2010 |
Assistant Professor, Dept. of
Computing, Hong Kong Polytechnic University, Hong Kong. |
|
1/2003~1/2006 |
Postdoctoral Fellow, Dept. of Electrical and Computer
Engineering, McMaster University,
Canada. |
|
1/2001~1/2003 |
Research
Assistant/Associate, Dept. of Computing, Hong Kong Polytechnic University,
Hong Kong. |
|
Visual Computing Lab (our
mission): Y learning and beyond: for future visual enhancement and
understanding. |
My Google Scholar Citation Profile:
http://scholar.google.com/citations?user=tAK5l1IAAAAJ
|
News
|
1.
Several PhD student positions, with joint training at OPPO Research Institute, are available.
The research topics include Image/Video
Restoration/Enhancement, Image/Video Generation, LLM/VLM, Mobile MLLM, etc. Please send me your CV if you are interested. |
|
2.
Several Postdoctoral Fellow or Research Associate positions on Image/Video Generation and Restoration, LLM/VLM, and Visual Understanding are
available. Please send me your CV if you are interested. |
|
3.
Research intern positions on Image/Video Enhancement, Image/Video Quality Assessment, Image/Video
Generation, Unified Models, Mobile MLLM, etc., are available at OPPO
Research Institute. Please send me your CV if
you are interested. |
Newly accepted
|
1.
Y.
Wu, C. Xie, R. Li, L. Chen, Q. Yi, L. Zhang, "CoCoEdit:
Content-Consistent Image Editing via Region Regularized Reinforcement
Learning," in ICML 2026. (paper) (code) (Edit the image as you instruct without changing the background
details!) |
|
2.
T.
Wu, R. Li, L. Zhang, K. Ma, "Diversity-Preserved Distribution Matching
Distillation for Fast Visual Synthesis," in ICML 2026. (paper) (code) (Completely address the loss of diversity in DMD distillation!) |
|
3.
G.
Li, K. Cen, B. Zhao, Y. Xin, S. Luo, G. Zhai, L. Zhang, X. Liu,
"LayerT2V: A Unified Multi-Layer Video Generation Framework," in
ICML 2026. (paper) (code) (Generating videos with editable layers!) |
|
4.
R.
Wu, L. Sun, Z. Zhang, X. Kong, J. Zhao, S. Wang, L. Zhang, "VOSR: A
Vision-Only Generative Model for Image Super-Resolution," in CVPR 2026. (paper) (code) (Train your strong generative SR models
from scratch without using text-image pairs!) |
|
5.
Q.
Yi, S. Li, R. Wu, L. Sun, Z. Zhang, L. Zhang, "GDPO-SR: Group Direct
Preference Optimization for One-Step Generative Image Super-Resolution,"
in CVPR 2026. (paper) (code) (Can we apply RL to one-step diffusion SR
models?) |
|
6.
C.
Xiao, Z. Zhang, L. Zhang, "BinaryAttention: One-Bit QK-Attention for
Vision and Diffusion Transformers," in CVPR 2026. (paper) (code) (Extremely low-bit attention without performance degradation!) |
|
7.
L.
Chen, P. Wang, G. Zhang, Z. Ma, L. Zhang, "Omni-3DEdit: Generalized
Versatile 3D Editing in One-Pass," in CVPR 2026. (Highlight!) (paper) (code) (The first generalized 3D editing model,
with fast speed!) |
|
8.
X.
Wei, K. Cen, H. Wei, Z. Guo, B. Li, Z. Wang, J. Zhang, L. Zhang,
"MICo-150K: A Comprehensive Dataset Advancing Multi-Image
Composition," in CVPR 2026. (paper) (code) (An elaborately constructed dataset and a
strong baseline model for multi-image composition!) |
|
9.
S.
Wang, G. Chen, D. Huang, Z. Li, M. Li, G. Li, J.M. Alvarez, L. Zhang, Z. Yu,
"VideoITG: Improving Multimodal Video Understanding with Instructed
Temporal Grounding," in CVPR 2026. (Highlight!) (paper) (code) (A plug-and-play approach and a dataset to
improve video understanding tasks!) |
|
10.
X. Liang,
Z. Ma, L. Sun, Y. Guo, L. Zhang, "Photo3D:
Advancing Photorealistic 3D Generation through Structure‑Aligned Detail
Enhancement," in CVPR 2026. (paper) (code) (To make 3D generation results more realistic!) |
|
11.
W.
Zhu, Y. Zhang, X. Jin, W. Zeng, L. Zhang, "ANTS: Shaping the Adaptive
Negative Textual Space by MLLM for OOD Detection," in CVPR 2026. (Oral!) (paper) (code) (Can MLLMs help OOD detection?) |
|
12.
L.
Qu, S. Zhou, J. Liang, H. Zeng, L. Zhang, J. Yang, "It Takes Two: A Duet
of Periodicity and Directionality for Burst Flicker Removal," in CVPR
2026. (paper) (code) (To capture your precious moments without annoying flicker!) |
Preprint
|
1.
S.
Wang, S. Liu, Y. Kuang, X. Wei, Y. Liu, Z. Li, Y. Man, G. Chen, A. Tao, J.
Kautz, G. Liu, L. Zhang, Z. Yu, "LocateAnything: Fast and High-Quality
Vision-Language Grounding with Parallel Box Decoding," preprint. (paper) (code) (Fast and Accurate Object Grounding: A New Paradigm!) |
|
2.
X.
Kong, J. Zhao, L. Sun, R. Wu, L. Zhang, "GGT-100K: Generative Ground
Truth for Generalizable Real-World Image Restoration," preprint. (paper) (code) (Can multimodal foundation models be the
solution for generalizable real-world image restoration?) |
|
3.
R.
Li, T. Yang, F. Ai, T. Wu, S. Wen, B. Peng, L. Zhang, "Long-Horizon
Streaming Video Generation via Hybrid Attention with Decoupled
Distillation," preprint. (paper) (code) (Video generation at 29.5 FPS (832x480) on
a single H100 GPU without quantization or model compression!) |
|
4.
Y.
Guo, Z. Zhang, P. Wang, X. Liang, Z. Ma, L. Zhang, "Memorize When
Needed: Decoupled Memory Control for Spatially Consistent Long-Horizon Video
Generation," preprint. (paper) (code) (Efficient training for spatially
consistent long-horizon video generation!) |
|
5.
W.
Li, Z. Qi, Z. Zhao, K. Zhang, L. Zhang, "Weighted Reverse Convolution
for Feature Upsampling," preprint. (paper) (code) (Making the features of vision foundation
models stronger!) |
|
6.
Z.
Zheng, C. He, S. Wang, Y. Li, M. Cheng, L. Zhang, "DEL: Digit Entropy
Loss for Numerical Learning of Large Language Models," preprint. (paper) (code) (A simple yet effective loss to improve the
numerical learning capability of LLMs!) |
|
7.
H.
Wang, C. Shen, L. Zhang, Z. Cheng, "ATSS: Detecting AI-Generated Videos
via Anomalous Temporal Self-Similarity," preprint. (paper) (code) (A highly effective algorithm to detect
AI-generated videos!) |
|
8.
L.
Sun, R. Wu, Z. Zhang, R. Li, Y. Sun, S. Liu, L. Zhang,
"Self-transcendence: Is External Feature Guidance Indispensable for
Accelerating Diffusion Transformer Training?" preprint. (paper) (code) (Do we really need pre-trained external
feature representations to accelerate DiT training?) |
|
9.
J.
Zhang, C. Xiao, A. Wu, X. Zhang, L. Zhang, "Pretraining
A Large Language Model using Distributed GPUs: A Memory-Efficient
Decentralized Paradigm," preprint. (paper) (code) (Can we train large-scale LLMs using GPUs
with low memory?) |
|
10. Z. Wang, K. Wang, L. Zhang,
"PhyDetEx: Detecting and Explaining the Physical Plausibility of T2V
Models," preprint. (paper) (code) (Is the generated video physically plausible and why?) |
|
11. Z. Wang, X. Wei, B. Li, Z. Guo, J. Zhang,
H. Wei, K. Wang, L. Zhang, "VideoVerse: Does Your T2V Generator Have
World Model Capability to Synthesize Videos?" preprint. (paper) (code) (To evaluate how strong your T2V model is!) |
|
12. X. Kong, R. Wu, S. Liu, L. Sun, L. Zhang,
"NSARM: Next-Scale Autoregressive Modeling for Robust Real-World Image
Super-Resolution," preprint. (paper) (code) (An efficient and robust AR model for
real-world super-resolution!) |
|
13. X. Wei, J. Zhang, Z. Wang, H. Wei, Z.
Guo, L. Zhang, "TIIF-Bench: How Does Your T2I Model Follow Your
Instructions?" preprint. (paper) (code) (To accurately evaluate T2I models' real
performance!) |