Hi there!
I am currently a final-year Ph.D. student in the Department of Computer Science and Engineering at Texas A&M University. I have been working at the DATA Lab under the supervision of Prof. Xia (Ben) Hu since 2019. My research interests lie in the general area of artificial intelligence, machine learning, and data science, and more recently in Large Language Models.
News
- 2024.04: Honored to receive an Honorable Mention for the Jane Street Graduate Research Fellowship.
- 2024.04: My LiteLLaMa has been downloaded over 120K times on HuggingFace!
- 2024.03: Implemented Triton-based flash self-extend. Please try FlashSelfExtend to enjoy our self-extend!
- 2024.01: One paper on Fairness Benchmark accepted by ICLR2024!
- 2024.01: Our Survey on LLMs accepted by TKDD!
- 2024.01: New preprint LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning!
- 2023.12: One paper accepted by AAAI2024-SRRAI.
- 2023.09: One paper accepted by NeurIPS2023.
- 2023.07: 🔥🔥 Thrilled to release my LiteLLaMa on HuggingFace, try it out!
- 2023.07: Our paper LLM for Clinical Text Mining accepted by AMIA2023!
- 2023.05: One paper accepted by TMLR, Retiring ∆DP!
- 2023.05: Thrilled to start my internship at Amazon.
- 2023.01: One paper accepted by ICLR2023, MLPInit.
- 2022.09: Thrilled to start my internship at Meta, working with Qifan Wang.
- 2022.07: Our paper $\mathcal{G}$-Mixup was awarded an Outstanding Paper Award at ICML 2022!
- 2022.05: Thrilled to start my internship at Snap Inc., working with Neil Shah.
- 2022.05: One paper accepted by ICML2022 (Oral).
- 2022.01: One paper accepted by ICLR2022.
- 2022.01: One paper accepted by TheWebConf2022.
- 2020.05: One paper accepted by RecSys2020.
I am on the 2023-2024 academic job market and am actively seeking a tenure-track faculty position. Please kindly contact me if there is a good fit.
Links:
<Google Scholar> <CV> <Research Statement> <Writing Sample 1> <Writing Sample 2> <Writing Sample 3> <Writing Sample 4>
Research Interest:
I envision democratizing cutting-edge machine learning for high-impact societal applications under limited resources. In doing so, I aim to unlock its potential for wider applications and foster substantial societal impact.
- Large Language Models: Long LLMs (Self-Extend), Efficient LLM pre-training (Preprint), Lite LLM Model (LiteLlama), LLM Survey (LLM Evolution Tree), LLM for Healthcare (AMIA2023).
- Efficient Machine Learning: Data-Efficient ML (ICML2022, WWW2022), Computation-Efficient ML (ICLR2023).
- Trustworthy Machine Learning: Fair ML (NeurIPS2023, KDDExplo), Fairness Evaluation (TMLR, ICLR2022, ICLR2024).
Research Highlights:
- ~ $\mathbf{20}$ ($\mathbf{11}$ first-author) peer-reviewed research papers published in ICML, ICLR, NeurIPS, WWW, KDD, AAAI, TMLR, etc.
- Trained an LLM, LiteLLaMa, from scratch on 1 trillion tokens; $120,000+$ downloads to date.
Selected Awards & Honors:
- ICML2022 Outstanding Paper Award (first author)
- Excellent Ph.D. Student Award (One Per Year), CSE @ Texas A&M University, 2023
- Best Paper Award, ADMA2018
- Outstanding Reviewer Award, ICML2022.
- Best Reviewer Award, CCF Transactions on Pervasive Computing and Interaction 2020
Selected Publications:
LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
Hongye Jin*, Xiaotian Han*, Jingfeng Yang, Zhimeng Jiang, Zirui Liu, Chia-Yuan Chang, Huiyuan Chen, Xia Hu
- This work elicits LLMs’ inherent ability to handle long contexts without fine-tuning.
GrowLength: Accelerating LLMs Pretraining by Progressively Growing Training Length
Xiaotian Han*, Hongye Jin*, Jingfeng Yang, Zhimeng Jiang, Chia-Yuan Chang, Xia Hu
- This paper introduces a novel, simple, and effective method named “GrowLength” to accelerate the pretraining process of LLMs.
FFB: A Fair Fairness Benchmark for In-Processing Group Fairness Methods
Xiaotian Han, Jianfeng Chi, Yu Chen, Qifan Wang, Han Zhao, Na Zou, Xia Hu, ICLR2024
- This paper introduces the Fair Fairness Benchmark (FFB), a benchmarking framework for in-processing group fairness methods.
$\mathcal{G}$-Mixup: Graph Augmentation for Graph Classification
Xiaotian Han, Zhimeng Jiang, Ninghao Liu, Xia Hu, ICML2022
- This work aims to solve the graph data scarcity problem.
Publications
ICLR2024
FFB: A Fair Fairness Benchmark for In-Processing Group Fairness Methods [PDF] [Github]- Xiaotian Han, Jianfeng Chi, Yu Chen, Qifan Wang, Han Zhao, Na Zou, Xia Hu.
- ICLR2024
TKDD
Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond [PDF] [Github]- Jingfeng Yang*, Hongye Jin*, Ruixiang Tang*, Xiaotian Han*, Qizhang Feng*, Haoming Jiang, Bing Yin, Xia Hu
- TKDD, 2023
NeurIPS2023
Chasing Fairness under Distribution Shift: a Model Weight Perturbation Approach [PDF]- Xiaotian Han*, Zhimeng Jiang*, Hongye Jin, Guanchu Wang, Rui Chen, Na Zou, Xia Hu.
- NeurIPS2023
KDDExp2022
Marginal Nodes Matter: Towards Structure Fairness in Graphs. [PDF]- Xiaotian Han, Kaixiong Zhou, Ting-Hsiang Wang, Jundong Li, Fei Wang, Na Zou
- KDD Explorations, 2022
AMIA2023
Does Synthetic Data Generation of LLMs Help Clinical Text Mining? [PDF]- Xiaotian Han*, Ruixiang Tang*, Xiaoqian Jiang, Xia Hu
- AMIA, 2023
TMLR2023
Retiring ∆DP: New Distribution-Level Metrics for Demographic Parity. [PDF]- Xiaotian Han*, Zhimeng Jiang*, Hongye Jin*, Zirui Liu, Na Zou, Qifan Wang, Xia Hu
- TMLR, 2023
ICLR2023
MLPInit: Embarrassingly Simple GNN Training Acceleration with MLP Initialization. [PDF] [SLIDES][CODE]- Xiaotian Han, Tong Zhao, Yozen Liu, Xia Hu, Neil Shah
- ICLR2023
ICML2022
$\mathcal{G}$-Mixup: Graph Augmentation for Graph Classification. [PDF] [SLIDES]- Xiaotian Han, Zhimeng Jiang, Ninghao Liu, Xia Hu.
- ICML2022, Outstanding Paper Award
WWW2022
Geometric Graph Representation Learning via Maximizing Rate Reduction. [PDF] [SLIDES] [CODE]- Xiaotian Han, Zhimeng Jiang, Ninghao Liu, Qingquan Song, Jundong Li, Xia Hu.
- TheWebConf2022
IJCAI2018
Aspect-Level Deep Collaborative Filtering via Heterogeneous Information Networks. [PDF] [CODE]- Xiaotian Han, Chuan Shi, Senzhang Wang, Philip S. Yu, Li Song.
- IJCAI2018
APWeb2018
Representation Learning with Depth and Breadth for Recommendation using Multi-view Data. [PDF]- Xiaotian Han, Chuan Shi, Lei Zheng, Philip S. Yu, Jianxin Li, Yuanfu Lu.
- APWeb-WAIM2018
- Do We Really Achieve Fairness with Explicit Sensitive Attributes?
- Xiaotian Han, Zhimeng Jiang, Ninghao Liu, Na Zou, Qifan Wang, Xia Hu
- Under review, 2023
- You Only Debias Once: Towards Flexible Accuracy-Fairness Trade-offs at Inference Time
- Xiaotian Han, Tianlong Chen, Kaixiong Zhou, Zhimeng Jiang, Zhangyang Wang, Xia Hu.
- 2023
- GrowLength: Accelerating LLMs Pretraining by Progressively Growing Training Length [PDF]
- Xiaotian Han, Hongye Jin, Jingfeng Yang, Zhimeng Jiang, Chia-Yuan Chang, Xia Hu
- 2023
AAAI2024-SRRAI
Chasing Fairness in Graphs: A GNN Architecture Perspective- Zhimeng Jiang, Xiaotian Han, Chao Fan, Zirui Liu, Na Zou, Ali Mostafavi, Xia Hu.
- AAAI, 2024, Special Track on Safe, Robust and Responsible AI (SRRAI).
ICLR2022
Generalized Demographic Parity for Group Fairness. [PDF]- Zhimeng Jiang, Xiaotian Han, Chao Fan, Fan Yang, Ali Mostafavi, Xia Hu.
- ICLR2022
Recsys2020
AutoRec: An Automated Recommender System. [PDF] [CODE]- Ting-Hsiang Wang, Qingquan Song, Xiaotian Han, Zirui Liu, Haifeng Jin, Xia Hu.
- Recsys2020, Demo
AAAI2020
FlowScope: Spotting Money Laundering Based on Graphs. [PDF]- Xiangfeng Li, Shenghua Liu, Zifeng Li, Xiaotian Han, Chuan Shi, Bryan Hooi, He Huang, Xueqi Cheng.
- AAAI2020
WWWJ2020
Embedding Geographic Information for Anomalous Trajectory Detection. [PDF]- Ding Xiao, Li Song, Ruijia Wang, Xiaotian Han, Yanan Cai, Chuan Shi.
- World Wide Web 2020
KDD2019
Metapath-guided Heterogeneous Graph Neural Network for Intent Recommendation. [PDF]- Shaohua Fan, Junxiong Zhu, Xiaotian Han, Chuan Shi, Linmei Hu, Biyu Ma, Yongliang Li.
- KDD2019
TKDE2019
Deep Collaborative Filtering with Multi-aspect Information in Heterogeneous Networks. [PDF]- Chuan Shi, Xiaotian Han, Li Song, Xiao Wang, Senzhang Wang, Junping Du, Philip S. Yu.
- TKDE2019
ADMA2018
Anomalous Trajectory Detection Using Recurrent Neural Network. [PDF]- Li Song, Ruijia Wang, Ding Xiao, Xiaotian Han, Yanan Cai, Chuan Shi.
- ADMA2018, Best Paper Award
- LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning [PDF]
- Hongye Jin*, Xiaotian Han*, Jingfeng Yang, Zhimeng Jiang, Zirui Liu, Chia-Yuan Chang, Huiyuan Chen, Xia Hu
- Preprint, 2024
- Fair Graph Message Passing with Transparency.
- Zhimeng Jiang, Xiaotian Han, Chao Fan, Zirui Liu, Na Zou, Ali Mostafavi, Xia Hu
- Under review, 2022
- Topology Matters in Fair Graph Learning: a Theoretical Pilot Study.
- Zhimeng Jiang, Xiaotian Han, Chao Fan, Zirui Liu, Xiao Huang, Na Zou, Ali Mostafavi, Xia Hu
- Under review, 2022
- Towards Assumption-free Bias Mitigation
- Chia-Yuan Chang, Yu-Neng Chuang, Kwei-Herng Lai, Xiaotian Han, Xia Hu, Na Zou.
- 2023
- Gradient Rewiring for Editable Graph Neural Network Training
- Zhimeng Jiang, Zirui Liu, Xiaotian Han, Qizhang Feng, Hongye Jin, Qiaoyu Tan, Kaixiong Zhou, Na Zou, Xia Hu.
- 2023
- Beyond Fairness: Age-Harmless Parkinson’s Detection via Voice
- Yicheng Wang, Xiaotian Han, Leisheng Yu, Na Zou.
- 2023
- PokeMQA: Programmable knowledge editing for Multi-hop Question Answering
- Hengrui Gu, Kaixiong Zhou, Xiaotian Han, Ninghao Liu, Ruobing Wang, Xin Wang.
- 2023
- Reducing Communication Overhead in Distributed GNN Training via Client-Server Knowledge Distillation
- Song Jiang, Xiaotian Han, Yinglong Xia, Qifan Wang, Yizhou Sun.
- 2023
Education
- Aug. 2019 - now, Ph.D. Student, Computer Science, Texas A&M University.
- Sept. 2016 - Jun. 2019, Master's Degree, Computer Science, Beijing Univ. of Posts and Telecommunications.
- Sept. 2011 - Jun. 2015, Bachelor's Degree, Communication Engineering, Shandong University.
Internships
- Amazon, Palo Alto, CA. May 2023 – Aug 2023
- Research Intern
- Large Language Model
- Work with Jingfeng Yang, Haoming Jiang, Qingyu Yin, Bin Bi, Chao Zhang.
- Meta, Menlo Park, CA. Sept. 2022 – April 2023
- Research Intern
- Understanding graph neural networks
- Work with: Qifan Wang
- Snap Research, Seattle, WA. Mar. 2022 - Aug. 2022
- Microsoft Research Asia, Beijing, China. Mar. 2019 - May 2019
- Research Intern
- Hyperparameter Optimization and AutoML.
- Alibaba Group, Hangzhou, China. Jun. 2018 - Sept. 2018
- Research Intern
- Query recommendation in the Taobao App.
Awards & Honors
- Outstanding Paper Award, ICML2022
- Excellent Ph.D. Student Award (One Per Year), Department of CSE, Texas A&M University, 2023
- NeurIPS2023 Scholar Award
- Grad School Research and Presentation Travel Award, Texas A&M University, 2023
- Best Paper Award, ADMA2018
- Travel Grant, Department of Computer Science & Engineering, Texas A&M University, 2022, 2023
- Travel Award, ICML2022.
- Outstanding Reviewer Award, ICML2022.
- Best Reviewer Award, CCF Transactions on Pervasive Computing and Interaction 2020
- 1st National Graduate Scholarship, Beijing University of Posts and Telecommunications, 2018
- 1st Student Scholarship, Beijing University of Posts and Telecommunications, 2017
Professional Activities
- Conference Reviewer:
- International Conference on Learning Representations (ICLR) - 2024
- ACM International Conference on Web Search and Data Mining (WSDM) - 2024
- Conference on Information and Knowledge Management (CIKM) - 2023
- International Conference on Machine Learning (ICML) - 2022, 2023
- Annual Conference on Neural Information Processing Systems (NeurIPS) - 2022, 2023
- AAAI Conference on Artificial Intelligence (AAAI) - 2021, 2022, 2023, 2024
- International Joint Conference on Artificial Intelligence (IJCAI) - 2021, 2023
- The ACM Web Conference (WWW) - 2023
- Empirical Methods in Natural Language Processing (EMNLP) - 2023
- International Conference on Data Mining (ICDM) - 2022
- ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) - 2023
- Session Chair:
- The ACM Web Conference (WWW) - 2023
- International Conference on Machine Learning (ICML) - 2022
- Volunteer:
- International Conference on Machine Learning (ICML) - 2022
- The North American Chapter of the Association for Computational Linguistics (NAACL) - 2022
Last updated on Jan 25, 2024.