Philip Thomas
Title / Cited by / Year
Data-efficient off-policy policy evaluation for reinforcement learning
P Thomas, E Brunskill
International Conference on Machine Learning, 2139-2148, 2016
Cited by: 738 (2016)
Value function approximation in reinforcement learning using the Fourier basis
G Konidaris, S Osentoski, P Thomas
Proceedings of the AAAI conference on artificial intelligence 25 (1), 380-385, 2011
Cited by: 566 (2011)
High-confidence off-policy evaluation
P Thomas, G Theocharous, M Ghavamzadeh
Proceedings of the AAAI Conference on Artificial Intelligence 29 (1), 2015
Cited by: 305 (2015)
High confidence policy improvement
P Thomas, G Theocharous, M Ghavamzadeh
International Conference on Machine Learning, 2380-2388, 2015
Cited by: 216 (2015)
Ad recommendation systems for life-time value optimization
G Theocharous, PS Thomas, M Ghavamzadeh
Proceedings of the 24th international conference on world wide web, 1305-1310, 2015
Cited by: 192 (2015)
Preventing undesirable behavior of intelligent machines
P Thomas, B Castro da Silva, A Barto, S Giguere, Y Brun, E Brunskill
Science 366 (6468), 999-1004, 2019
Cited by: 189 (2019)
Learning action representations for reinforcement learning
Y Chandak, G Theocharous, J Kostas, S Jordan, P Thomas
International conference on machine learning, 941-950, 2019
Cited by: 181 (2019)
Increasing the action gap: New operators for reinforcement learning
MG Bellemare, G Ostrovski, A Guez, P Thomas, R Munos
Proceedings of the AAAI Conference on Artificial Intelligence 30 (1), 2016
Cited by: 168 (2016)
Bias in natural actor-critic algorithms
P Thomas
International conference on machine learning, 441-448, 2014
Cited by: 158 (2014)
Safe reinforcement learning
PS Thomas
Cited by: 115 (2015)
Is the policy gradient a gradient?
C Nota, PS Thomas
arXiv preprint arXiv:1906.07073, 2019
Cited by: 68 (2019)
Training an actor-critic reinforcement learning controller for arm movement using human-generated rewards
KM Jagodnik, PS Thomas, AJ van den Bogert, MS Branicky, RF Kirsch
IEEE Transactions on Neural Systems and Rehabilitation Engineering 25 (10 …, 2017
Cited by: 67 (2017)
Proximal reinforcement learning: A new theory of sequential decision making in primal-dual spaces
S Mahadevan, B Liu, P Thomas, W Dabney, S Giguere, N Jacek, I Gemp, ...
arXiv preprint arXiv:1405.6757, 2014
Cited by: 66 (2014)
Optimizing for the future in non-stationary MDPs
Y Chandak, G Theocharous, S Shankar, M White, S Mahadevan, ...
International Conference on Machine Learning, 1414-1425, 2020
Cited by: 64 (2020)
Predictive off-policy policy evaluation for nonstationary decision problems, with applications to digital marketing
P Thomas, G Theocharous, M Ghavamzadeh, I Durugkar, E Brunskill
Proceedings of the AAAI Conference on Artificial Intelligence 31 (2), 4740-4745, 2017
Cited by: 63 (2017)
Policy gradient methods for reinforcement learning with function approximation and action-dependent baselines
PS Thomas, E Brunskill
arXiv preprint arXiv:1706.06643, 2017
Cited by: 62 (2017)
Evaluating the performance of reinforcement learning algorithms
S Jordan, Y Chandak, D Cohen, M Zhang, P Thomas
International Conference on Machine Learning, 4962-4973, 2020
Cited by: 61 (2020)
Risk Quantification for Policy Deployment
PS Thomas, G Theocharous, M Ghavamzadeh
US Patent App. 14/552,047, 2016
Cited by: 54 (2016)
Importance Sampling for Fair Policy Selection.
S Doroudi, PS Thomas, E Brunskill
Grantee Submission, 2017
Cited by: 53 (2017)
Some recent applications of reinforcement learning
AG Barto, PS Thomas, RS Sutton
Proceedings of the eighteenth Yale workshop on adaptive and learning systems, 2017
Cited by: 51 (2017)
Articles 1–20