Homepage

I am a Machine Learning Engineer at Airbnb. I received my PhD from the Department of Computer Science at Rice University, where I worked with Dr. Xia (Ben) Hu. My thesis is titled Efficient Methods for Deep Reinforcement Learning: Algorithms and Applications. I was a Research Intern at Meta in Summer 2021 and Summer 2022, working on reinforcement learning and machine learning systems with Dr. Louis Feng, Dr. Yuandong Tian, Dr. Liang Luo, and others. I was a Research Intern at Seattle AI Lab of Kuai Inc. in Summer 2020, working with Dr. Wenye Ma and Dr. Ji Liu. I received my Bachelor's degree in Computer Science from Wuhan University in 2018, where I worked with Dr. Chenliang Li.

My research mainly focuses on Data Mining and Reinforcement Learning (RL), with interests in Anomaly and Outlier Detection, Graph Neural Networks, Time-Series Analysis, Recommender Systems, and Machine Learning Systems.

📢News: Are you interested in data-centric AI? Please check out our newly released data-centric AI survey and awesome data-centric AI resources!

Open-Source Projects

RLCard: A Toolkit for Reinforcement Learning in Card Games

Presented at IJCAI 2020
[Website] | [Paper] | [Demo] | [Code] | [Video]

DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning

Presented at ICML 2021
[Demo] | [Paper] | [Code] | [Video]

AutoVideo: An Automated Video Action Recognition System

Presented at IJCAI 2022
[Paper] | [Code] | [Video]

TODS: An Automated Time-series Outlier Detection System

Presented at AAAI 2021
[Website] | [Paper] | [Code] | [Video]

BED: A Real-Time Object Detection System for Edge Devices

Presented at CIKM 2022
[Paper] | [Code] | [Video]

PyODDS: An End-to-end Outlier Detection System

Presented at WWW 2020
[Website] | [Paper] | [Code]

DiscoverPath: A Knowledge Refinement and Retrieval System for Interdisciplinarity on Biomedical Research

Presented at CIKM 2023
[Demo] | [Video] | [Paper] | [Code]

OpenGSL: A Comprehensive Benchmark for Graph Structure Learning

Presented at NeurIPS 2023
[Website] | [Paper] | [Code]

FinGPT: Democratizing Internet-scale Data for Financial Large Language Models

Presented at NeurIPS 2023
[Website] | [Paper] | [Code]

FinRL-Meta: A Metaverse of Market Environments and Benchmarks for Data-Driven Financial Reinforcement Learning

Presented at ACM ICAIF 2023
[Website] | [Paper] | [Code] | [FinRL Contest]

Tutorials

Data-centric AI: Techniques and Future Perspectives

Presented at KDD 2023
[Website] | [Slide] | [Video] | [Paper]

Publications

[Google Scholar]
* Equal contribution

2024

GAugLLM: Improving Graph Contrastive Learning for Text-Attributed Graphs with Large Language Models

Yi Fang, Dongzhe Fan, Daochen Zha, and Qiaoyu Tan
KDD 2024, ACM SIGKDD Conference on Knowledge Discovery and Data Mining

Modality-Aware Integration with Large Language Models for Knowledge-Based Visual Question Answering

Junnan Dong, Qinggang Zhang, Huachi Zhou, Daochen Zha, Pai Zheng, Xiao Huang
ACL 2024, Annual Meeting of the Association for Computational Linguistics
[Paper]

Denoising-Aware Contrastive Learning for Noisy Time Series

Shuang Zhou, Daochen Zha, Xiao Shen, Xiao Huang, Rui Zhang, Korris Chung
IJCAI 2024, International Joint Conference on Artificial Intelligence

Enhanced DouDiZhu Card Game Strategy Using Oracle Guiding and Adaptive Deep Monte Carlo Method

Qian Luo, Tien Ping Tan, Daochen Zha, Tianqiao Zhang
IJCAI 2024, International Joint Conference on Artificial Intelligence

DCAI: Data-centric Artificial Intelligence

Wei Jin, Haohan Wang, Daochen Zha, Qiaoyu Tan, Yao Ma, Sharon Li, Su-In Lee
WWW 2024, Web Conference, Workshop
[Paper]

Dynamic Datasets and Market Environments for Financial Reinforcement Learning

Xiao-Yang Liu, Ziyi Xia, Hongyang Yang, Jiechao Gao, Daochen Zha, Ming Zhu, Christina Dan Wang, Zhaoran Wang, Jian Guo
Machine Learning Journal
[Paper] | [Code]

2023

Data-centric Artificial Intelligence: A Survey

Daochen Zha, Zaid Pervaiz Bhat, Kwei-Herng Lai, Fan Yang, Zhimeng Jiang, Shaochen Zhong, Xia Hu
arXiv 2023, Preprint
[Paper] | [Resource]

Auto-PINN: Understanding and Optimizing Physics-Informed Neural Architecture

Yicheng Wang, Xiaotian Han, Chia-Yuan Chang, Daochen Zha, Ulisses Braga-Neto, Xia Hu
NeurIPS Workshop 2023, NeurIPS-2023 Workshop on AI for Science: from Theory to Practice
[Paper]

FinGPT: Democratizing Internet-scale Data for Financial Large Language Models

Xiao-Yang Liu, Guoxuan Wang, Hongyang Yang, Daochen Zha
NeurIPS Workshop 2023, NeurIPS-2023 Workshop on Instruction Tuning and Instruction Following
[Paper] | [Code]

Winner-Take-All Column Row Sampling for Memory Efficient Adaptation of Language Model

Zirui Liu*, Guanchu Wang*, Shaochen Zhong, Zhaozhuo Xu, Daochen Zha, Ruixiang Tang, Zhimeng Jiang, Kaixiong Zhou, Vipin Chaudhary, Shuai Xu, Xia Hu
NeurIPS 2023, Neural Information Processing Systems
[Paper]

OpenGSL: A Comprehensive Benchmark for Graph Structure Learning

Zhiyao Zhou, Sheng Zhou, Bochao Mao, Xuanyi Zhou, Jiawei Chen, Qiaoyu Tan, Daochen Zha, Can Wang, Yan Feng, Chun Chen
NeurIPS 2023, Neural Information Processing Systems
[Paper] | [Code]

One Less Reason for Filter Pruning: Gaining Free Adversarial Robustness with Structured Grouped Kernel Pruning

Shaochen Zhong, Zaichuan You, Jiamu Zhang, Sebastian Zhao, Zachary LeClaire, Zirui Liu, Daochen Zha, Vipin Chaudhary, Shuai Xu, and Xia Hu
NeurIPS 2023, Neural Information Processing Systems

Enhanced Generalization through Prioritization and Diversity in Self-Imitation Reinforcement Learning over Procedural Environments with Sparse Rewards

Alain Andres, Daochen Zha, Javier Del Ser
SSCI 2023, IEEE Symposium Series on Computational Intelligence

Double Wins: Boosting Accuracy and Efficiency of Graph Neural Networks by Reliable Knowledge Distillation

Qiaoyu Tan, Daochen Zha, Ninghao Liu, Soo-Hyun Choi, Li Li, Rui Chen and Xia Hu
ICDM 2023, IEEE International Conference on Data Mining

Tackling Diverse Minorities in Imbalanced Classification

Kwei-Herng Lai, Daochen Zha, Huiyuan Chen, Mangesh Bendre, Yuzhong Chen, Mashweta Das, Hao Yang and Xia Hu
CIKM 2023, ACM International Conference on Information and Knowledge Management
[Paper]

DiscoverPath: A Knowledge Refinement and Retrieval System for Interdisciplinarity on Biomedical Research

Yu-Neng Chuang, Guanchu Wang, Chia-Yuan Chang, Kwei-Herng Lai, Daochen Zha, Ruixiang Tang, Fan Yang, Alfredo Costilla Reyes, Kaixiong Zhou, Xiaoqian Jiang and Xia Hu
CIKM 2023, ACM International Conference on Information and Knowledge Management, demo track
Best Demo Paper Honorable Mention
[Paper] | [Code] | [Demo] | [Video]

SurCo: Learning SURrogate costs for COmbinatorial Nonlinear Optimization Problems

Aaron M Ferber, Taoan Huang, Daochen Zha, Martin Schubert, Benoit Steiner, Bistra Dilkina, Yuandong Tian
ICML 2023, International Conference on Machine Learning
[Paper] | [Code]

RSC: Accelerating Graph Neural Networks Training via Randomized Sparse Computations

Zirui Liu, Shengyuan Chen, Kaixiong Zhou, Daochen Zha, Xiao Huang, Xia Hu
ICML 2023, International Conference on Machine Learning
[Paper] | [Code]

Adaptive Popularity Debiasing Aggregator for Graph Collaborative Filtering

Huachi Zhou, Hao Chen, Junnan Dong, Daochen Zha, Chuang Zhou and Xiao Huang
SIGIR 2023, ACM SIGIR Conference on Research and Development in Information Retrieval
[Paper]

Pre-train and Search: Efficient Embedding Table Sharding with Pre-trained Neural Cost Models

Daochen Zha, Louis Feng, Liang Luo, Bhargav Bhushanam, Zirui Liu, Yusuo Hu, Jade Nie, Yuzhen Huang, Yuandong Tian, Arun Kejariwal, Xia Hu
MLSys 2023, Conference on Machine Learning and Systems
[Paper] | [Slide] | [Poster] | [Code]

Towards Handling Metastable Failures in Distributed Systems with Offline Reinforcement Learning

Yueying Li, Daochen Zha, Tianjun Zhang, Francis Y. Yan, G. Edward Suh, Christina Delimitrou
ICLR 2023, International Conference on Learning Representations, tiny papers track
[Paper]

Data-centric AI: Perspectives and Challenges

Daochen Zha, Zaid Pervaiz Bhat, Kwei-Herng Lai, Fan Yang, Xia Hu
SDM 2023, SIAM International Conference on Data Mining
[Paper] | [Slide] | [Poster]

Bring Your Own View: Graph Neural Networks for Link Prediction with Personalized Subgraph Selection

Qiaoyu Tan, Xin Zhang, Ninghao Liu, Daochen Zha, Li Li, Rui Chen, Soo-Hyun Choi, Xia Hu
WSDM 2023, ACM International Conference on Web Search and Data Mining
[Paper]

Active Ensemble Learning for Knowledge Graph Error Detection

Junnan Dong, Qinggang Zhang, Xiao Huang, Qiaoyu Tan, Daochen Zha, Zihao Zhao
WSDM 2023, ACM International Conference on Web Search and Data Mining
[Paper]

2022

DreamShard: Generalizable Embedding Table Placement for Recommender Systems

Daochen Zha, Louis Feng, Qiaoyu Tan, Zirui Liu, Kwei-Herng Lai, Bhargav Bhushanam, Yuandong Tian, Arun Kejariwal, Xia Hu
NeurIPS 2022, Neural Information Processing Systems
[Paper] | [Slide] | [Poster] | [Code]

Towards Automated Imbalanced Learning with Deep Hierarchical Reinforcement Learning

Daochen Zha, Kwei-Herng Lai, Qiaoyu Tan, Sirui Ding, Na Zou, Xia Hu
CIKM 2022, ACM International Conference on Information and Knowledge Management
[Paper] | [Slide] | [Poster] | [Code]

BED: A Real-Time Object Detection System for Edge Devices

Guanchu Wang*, Zaid Pervaiz Bhat*, Zhimeng Jiang*, Yi-Wei Chen*, Daochen Zha*, Alfredo Costilla Reyes*, Afshin Niktash, Gorkem Ulkar, Erman Okman, Xuanting Cai, Xia Hu
CIKM 2022, ACM International Conference on Information and Knowledge Management, demo track
Best Demo Paper Award
[Paper] | [Code] | [Video]

AutoShard: Automated Embedding Table Sharding for Recommender Systems

Daochen Zha, Louis Feng, Bhargav Bhushanam, Dhruv Choudhary, Jade Nie, Yuandong Tian, Jay Chae, Yinbin Ma, Arun Kejariwal, Xia Hu
KDD 2022, ACM SIGKDD Conference on Knowledge Discovery and Data Mining
[Paper] | [Slide] | [Poster] | [Code]

Towards Learning Disentangled Representations for Time Series

Yuening Li, Zhengzhang Chen, Daochen Zha, Mengnan Du, Jingchao Ni, Denghui Zhang, Haifeng Chen, Xia Hu
KDD 2022, ACM SIGKDD Conference on Knowledge Discovery and Data Mining
[Paper]

AutoVideo: An Automated Video Action Recognition System

Daochen Zha*, Zaid Pervaiz Bhat*, Yi-Wei Chen*, Yicheng Wang*, Sirui Ding*, Jiaben Chen*, Kwei-Herng Lai*, Mohammad Qazim Bhat*, Anmoll Kumar Jain, Alfredo Costilla Reyes, Na Zou, Xia Hu
IJCAI 2022, International Joint Conference on Artificial Intelligence, demo track
[Paper] | [Code] | [Poster] | [Video]

Fairly Predicting Graft Failure in Liver Transplant for Organ Assigning

Sirui Ding, Ruixiang Tang, Daochen Zha, Na Zou, Kai Zhang, Xiaoqian Jiang, Xia Hu
AMIA 2022, AMIA Annual Symposium
Best Student Paper Finalist

In-Processing Modeling Techniques for Machine Learning Fairness: A Survey

Mingyang Wan, Daochen Zha, Ninghao Liu, Na Zou
TKDD 2022, ACM Transactions on Knowledge Discovery from Data
[Paper]

Towards Similarity-Aware Time-Series Classification

Daochen Zha, Kwei-Herng Lai, Kaixiong Zhou, Xia Hu
SDM 2022, SIAM International Conference on Data Mining
[Paper] | [Slide] | [Poster] | [Code]

2021

Dirichlet Energy Constrained Learning for Deep Graph Neural Networks

Kaixiong Zhou, Xiao Huang, Daochen Zha, Rui Chen, Li Li, Soo-Hyun Choi, Xia Hu
NeurIPS 2021, Neural Information Processing Systems
[Paper] | [Code]

Revisiting Time Series Outlier Detection: Definitions and Benchmarks

Kwei-Herng Lai, Daochen Zha, Junjie Xu, Yue Zhao, Guanchu Wang, Xia Hu
NeurIPS 2021, Neural Information Processing Systems, Datasets and Benchmarks Track
[Paper] | [Slide] | [Code]

DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning

Daochen Zha, Jingru Xie, Wenye Ma, Sheng Zhang, Xiangru Lian, Xia Hu, Ji Liu
ICML 2021, International Conference on Machine Learning
[Paper] | [Code] | [Demo] | [Slide] | [Poster]

AutoAD: Automated Anomaly Detection via Curiosity-guided Search and Self-imitation Learning

Yuening Li, Zhengzhang Chen, Daochen Zha, Kaixiong Zhou, Haifeng Jin, Haifeng Chen, Xia Hu
TNNLS 2021, IEEE Transactions on Neural Networks and Learning Systems
[Paper]

Rank the Episodes: A Simple Approach for Exploration in Procedurally-Generated Environments

Daochen Zha, Wenye Ma, Lei Yuan, Xia Hu, Ji Liu
ICLR 2021, International Conference on Learning Representations
[Paper] | [Slide] | [Poster] | [Code]

AutoOD: Neural Architecture Search for Outlier Detection

Yuening Li, Zhengzhang Chen, Daochen Zha, Kaixiong Zhou, Haifeng Jin, Haifeng Chen, and Xia Hu
ICDE 2021, IEEE International Conference on Data Engineering
[PDF]

TODS: An Automated Time Series Outlier Detection System

Kwei-Herng Lai*, Daochen Zha*, Guanchu Wang, Junjie Xu, Yue Zhao, Devesh Kumar, Yile Chen, Purav Zumkhawaka, Mingyang Wan, Diego Martinez, Xia Hu
AAAI 2021, AAAI Conference on Artificial Intelligence, demo track
[Paper] | [Code] | [Video]

2020

Towards Deeper Graph Neural Networks with Differentiable Group Normalization

Kaixiong Zhou, Xiao Huang, Yuening Li, Daochen Zha, Rui Chen, and Xia Hu
NeurIPS 2020, Neural Information Processing Systems
[Paper] | [Code]

Meta-AAD: Active Anomaly Detection with Deep Reinforcement Learning

Daochen Zha, Kwei-Herng Lai, Mingyang Wan, and Xia Hu
ICDM 2020, IEEE International Conference on Data Mining
[Paper] | [Slide] | [Code]

PolicyGNN: Aggregation Optimization for Graph Neural Networks

Kwei-Herng Lai, Daochen Zha, Kaixiong Zhou, and Xia Hu
KDD 2020, ACM SIGKDD Conference on Knowledge Discovery and Data Mining
[Paper]

RLCard: A Platform for Reinforcement Learning in Card Games

Daochen Zha*, Kwei-Herng Lai*, Songyi Huang*, Yuanpu Cao, Keerthana Reddy, Juan Vargas, Alex Nguyen, Ruzhe Wei, Junyu Guo, and Xia Hu
IJCAI 2020, International Joint Conference on Artificial Intelligence, demo track
[Paper] | [Poster] | [Code] | [Demo] | [Video]

Dual Policy Distillation

Daochen Zha*, Kwei-Herng Lai*, Yuening Li, and Xia Hu
IJCAI 2020, International Joint Conference on Artificial Intelligence
[Paper] | [Code]

Multi-Channel Graph Neural Networks

Kaixiong Zhou, Qingquan Song, Xiao Huang, Daochen Zha, Na Zou, and Xia Hu
IJCAI 2020, International Joint Conference on Artificial Intelligence
[Paper]

PyODDS: An End-to-end Outlier Detection System with Automated Machine Learning

Yuening Li, Daochen Zha, Praveen Venugopal, Na Zou, and Xia Hu
WWW 2020, Web Conference, demo track
[Paper] | [Code]

RLCard: A Toolkit for Reinforcement Learning in Card Games

Daochen Zha, Kwei-Herng Lai, Yuanpu Cao, Songyi Huang, Ruzhe Wei, Junyu Guo, and Xia Hu
AAAI-Workshop 2020, AAAI-20 Workshop on Reinforcement Learning in Games
[Paper] | [Poster] | [Code] | [Video]

2019

Multi-Label Dataless Text Classification with Topic Modeling

Daochen Zha and Chenliang Li
KAIS 2019, Knowledge and Information Systems Journal
[Paper] | [Code]

Experience Replay Optimization

Daochen Zha, Kwei-Herng Lai, Kaixiong Zhou, and Xia Hu
IJCAI 2019, International Joint Conference on Artificial Intelligence
[Paper] | [Slide] | [Poster]

Service

Organizer: NewInML@NeurIPS’22, DCAI Workshop@WWW’24
Session Chair: KDD’23 Reinforcement Learning, IJCAI’20 Demo Track
Program Committee Member: NeurIPS’24 (Main and Dataset/Benchmark Tracks), KDD’24, ECCV’24, ICML’24, ICLR’24, CIKM’24, SDM’24, AAAI’24, IJCAI’24, CVPR’24, ASONAM’24, CIKM’23 (Long, Short, and Demo Tracks), KDD’23 (Research and ADS Tracks), NeurIPS’23 (Main and Dataset/Benchmark Tracks), ICLR’23, AAAI’23, IJCAI’23, CVPR’23, LOG’23, ECMLPKDD’23, Interactive Learning with Implicit Human Feedback@ICML’23, LOG’22, NeurIPS’22 (Main and Dataset/Benchmark Tracks), KDD’22, ICML’22, ICLR’22, WSDM’22, IJCAI’22, AAAI’22, RL4RealLife@NeurIPS’22, NeurIPS’21, NeurIPS’21 Datasets and Benchmarks Track, RL4RealLife@ICML’21, OOD@KDD’21.

Journal editorial board: Frontiers in Big Data, Frontiers in Artificial Intelligence, American Journal of Information Science and Technology

Journal reviewer: Transactions on Information Systems, Machine Learning, Transactions on Artificial Intelligence, Journal of Artificial Intelligence Research, Journal of Big Data, Expert Systems with Applications, Discrete Applied Mathematics, International Journal of Data Science and Analytics, Jordanian Journal of Computers and Information Technology, IEEE/CAA Journal of Automatica Sinica, Wireless Communications and Mobile Computing, Journal of Supercomputing, Journal of Computer Science and Technology, Multimedia Tools and Applications, International Journal of Machine Learning and Cybernetics, Journal of Experimental & Theoretical Artificial Intelligence