Han Zhong

Cited by

	All	Since 2019
Citations	472	472
h-index	13	13
i10-index	14	14

Citations per year: 2021: 5; 2022: 40; 2023: 147; 2024: 278

Public access

Based on funding mandates: 5 articles available, 0 articles not available

Co-authors

Liwei Wang, Professor, Peking University. Verified email at cis.pku.edu.cn
Tong Zhang, UIUC. Verified email at tongzhang-ml.org
Wei Xiong, Computer Science, University of Illinois Urbana-Champaign. Verified email at illinois.edu
Zhaoran Wang, Assistant Professor at Northwestern University. Verified email at northwestern.edu
Zhuoran Yang, Yale University. Verified email at yale.edu
Simon Shaolei Du, Assistant Professor, School of Computer Science and Engineering, University of Washington. Verified email at cs.washington.edu
Yunchang Yang, Peking University. Verified email at pku.edu.cn
Tianhao Wu, University of California, Berkeley. Verified email at berkeley.edu
Chengshuai Shi, Electrical and Computer Engineering, University of Virginia. Verified email at virginia.edu
Cong Shen, Associate Professor, University of Virginia. Verified email at virginia.edu
Hanze Dong, Salesforce Research. Verified email at salesforce.com
Chenlu Ye, Hong Kong University of Science and Technology. Verified email at connect.ust.hk
Michael I. Jordan, Professor of Electrical Engineering and Computer Sciences and Professor of Statistics, UC Berkeley. Verified email at cs.berkeley.edu
Shenao Zhang, Northwestern University. Verified email at gatech.edu
Xiaoyu Chen, Peking University. Verified email at pku.edu.cn
Jose Blanchet, Stanford University. Verified email at stanford.edu
Rui Yang, Hong Kong University of Science and Technology. Verified email at connect.ust.hk
Jiyuan Tan, Stanford University. Verified email at stanford.edu
Lin F. Yang (杨林), Assistant Professor, Department of Electrical and Computer Engineering @ UCLA. Verified email at ee.ucla.edu
Jiayi Huang, Peking University. Verified email at stu.pku.edu.cn


Han Zhong

Peking University

Verified email at stu.pku.edu.cn

Machine Learning


Title	Cited by	Year
GEC: A Unified Framework for Interactive Decision Making in MDP, POMDP, and Beyond. H Zhong, W Xiong, S Zheng, L Wang, Z Wang, Z Yang, T Zhang. arXiv preprint arXiv:2211.01962, 2022	52*	2022
Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation. X Chen, H Zhong, Z Yang, Z Wang, L Wang. International Conference on Machine Learning, 3773-3793, 2022	47	2022
Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-constraint. W Xiong, H Dong, C Ye, Z Wang, H Zhong, H Ji, N Jiang, T Zhang. ICLR 2024 Workshop on Mathematical and Empirical Understanding of Foundation …, 2023	46*	2023
Nearly Minimax Optimal Offline Reinforcement Learning with Linear Function Approximation: Single-Agent MDP and Markov Game. W Xiong, H Zhong, C Shi, C Shen, L Wang, T Zhang. arXiv preprint arXiv:2205.15512, 2022	42	2022
Can Reinforcement Learning Find Stackelberg-Nash Equilibria in General-Sum Markov Games with Myopically Rational Followers? H Zhong, Z Yang, Z Wang, MI Jordan. Journal of Machine Learning Research 24 (35), 1-52, 2023	41*	2023
Pessimistic minimax value iteration: Provably efficient equilibrium learning from offline datasets. H Zhong, W Xiong, J Tan, L Wang, T Zhang, Z Wang, Z Yang. International Conference on Machine Learning, 27117-27142, 2022	40	2022
Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration. Z Liu, M Lu, W Xiong, H Zhong, H Hu, S Zhang, S Zheng, Z Yang, Z Wang. Thirty-seventh Conference on Neural Information Processing Systems, 2023	26*	2023
A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Games. W Xiong, H Zhong, C Shi, C Shen, T Zhang. International Conference on Machine Learning, 24496-24523, 2022	25	2022
A theoretical analysis of optimistic proximal policy optimization in linear markov decision processes. H Zhong, T Zhang. Advances in Neural Information Processing Systems 36, 2024	22	2024
Why robust generalization in deep learning is difficult: Perspective of expressive power. B Li, J Jin, H Zhong, J Hopcroft, L Wang. Advances in Neural Information Processing Systems 35, 4370-4384, 2022	21	2022
Double pessimism is provably efficient for distributionally robust offline reinforcement learning: Generic algorithm and robust partial coverage. J Blanchet, M Lu, T Zhang, H Zhong. Advances in Neural Information Processing Systems 36, 2024	18	2024
Optimistic Policy Optimization is Provably Efficient in Non-stationary MDPs. H Zhong, Z Yang, Z Wang, C Szepesvári. arXiv preprint arXiv:2110.08984, 2021	18	2021
Nearly optimal policy optimization with stable at any time guarantee. T Wu, Y Yang, H Zhong, L Wang, S Du, J Jiao. International Conference on Machine Learning, 24243-24265, 2022	14	2022
Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment. R Yang, X Pan, F Luo, S Qiu, H Zhong, D Yu, J Chen. arXiv preprint arXiv:2402.10207, 2024	10	2024
DPO Meets PPO: Reinforced Token Optimization for RLHF. H Zhong, G Feng, W Xiong, L Zhao, D He, J Bian, L Wang. arXiv preprint arXiv:2404.18922, 2024	9	2024
Towards Robust Offline Reinforcement Learning under Diverse Data Corruption. R Yang, H Zhong, J Xu, A Zhang, C Zhang, L Han, T Zhang. arXiv preprint arXiv:2310.12955, 2023	8	2023
Tackling heavy-tailed rewards in reinforcement learning with function approximation: Minimax optimal and instance-dependent regret bounds. J Huang, H Zhong, L Wang, L Yang. Advances in Neural Information Processing Systems 36, 2024	6	2024
Provable Sim-to-real Transfer in Continuous Domain with Partial Observations. J Hu, H Zhong, C Jin, L Wang. arXiv preprint arXiv:2210.15598, 2022	6	2022
Breaking the Moments Condition Barrier: No-Regret Algorithm for Bandits with Super Heavy-Tailed Payoffs. H Zhong, J Huang, L Yang, L Wang. Advances in Neural Information Processing Systems 34, 2021	6	2021
A Reduction-Based Framework for Conservative Bandits and Reinforcement Learning. Y Yang, T Wu, H Zhong, E Garcelon, M Pirotta, A Lazaric, L Wang, SS Du. International Conference on Learning Representations, 2021/9/29, 2021	6*	2021
