Marc Lanctot
Research Scientist, DeepMind
Verified email at google.com
Title
Cited by
Year
Mastering the game of Go with deep neural networks and tree search
D Silver, A Huang, CJ Maddison, A Guez, L Sifre, G Van Den Driessche, ...
Nature 529 (7587), 484-489, 2016
Cited by 17333, 2016
A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play
D Silver, T Hubert, J Schrittwieser, I Antonoglou, M Lai, A Guez, M Lanctot, ...
Science 362 (6419), 1140-1144, 2018
Cited by 5609*, 2018
Dueling Network Architectures for Deep Reinforcement Learning
Z Wang, T Schaul, M Hessel, H van Hasselt, M Lanctot, N de Freitas
arXiv preprint arXiv:1511.06581, 2016
Cited by 4243, 2016
Value-decomposition networks for cooperative multi-agent learning based on team reward
P Sunehag, G Lever, A Gruslys, WM Czarnecki, V Zambaldi, M Jaderberg, ...
Proceedings of the 17th international conference on autonomous agents and …, 2018
Cited by 1349*, 2018
Deep Q-learning from Demonstrations
T Hester, M Vecerik, O Pietquin, M Lanctot, T Schaul, B Piot, D Horgan, ...
Association for the Advancement of Artificial Intelligence (AAAI), 2018
Cited by 1075, 2018
Multi-agent Reinforcement Learning in Sequential Social Dilemmas
JZ Leibo, V Zambaldi, M Lanctot, J Marecki, T Graepel
AAMAS, 2017
Cited by 822, 2017
A unified game-theoretic approach to multiagent reinforcement learning
M Lanctot, V Zambaldi, A Gruslys, A Lazaridou, K Tuyls, J Pérolat, D Silver, ...
arXiv preprint arXiv:1711.00832, 2017
Cited by 641, 2017
Monte Carlo sampling for regret minimization in extensive games
M Lanctot, K Waugh, M Zinkevich, M Bowling
Advances in neural information processing systems 22, 1078-1086, 2009
Cited by 338, 2009
The Hanabi challenge: A new frontier for AI research
N Bard, JN Foerster, S Chandar, N Burch, M Lanctot, HF Song, E Parisotto, ...
Artificial Intelligence 280, 103216, 2020
Cited by 328, 2020
Fictitious Self-Play in Extensive-Form Games
J Heinrich, M Lanctot, D Silver
International Conference on Machine Learning, 2015
Cited by 309, 2015
Memory-efficient backpropagation through time
A Gruslys, R Munos, I Danihelka, M Lanctot, A Graves
Advances In Neural Information Processing Systems, 4125-4133, 2016
Cited by 224*, 2016
OpenSpiel: A Framework for Reinforcement Learning in Games
M Lanctot, E Lockhart, JB Lespiau, V Zambaldi, S Upadhyay, J Pérolat, ...
arXiv preprint arXiv:1908.09453, 2019
Cited by 203, 2019
Emergent Communication through Negotiation
K Cao, A Lazaridou, M Lanctot, JZ Leibo, K Tuyls, S Clark
arXiv preprint arXiv:1804.03980, 2018
Cited by 170, 2018
Actor-critic policy optimization in partially observable multiagent environments
S Srinivasan, M Lanctot, V Zambaldi, J Pérolat, K Tuyls, R Munos, ...
Advances in Neural Information Processing Systems, 3422-3435, 2018
Cited by 151, 2018
Convolution by evolution: Differentiable pattern producing networks
C Fernando, D Banarse, M Reynolds, F Besse, D Pfau, M Jaderberg, ...
Proceedings of the Genetic and Evolutionary Computation Conference 2016, 109-116, 2016
Cited by 128, 2016
Real-Time Monte-Carlo Tree Search in Ms Pac-Man
T Pepels, MHM Winands, M Lanctot
IEEE Transactions on Computational Intelligence and AI in Games, 2014
Cited by 114, 2014
α-Rank: Multi-Agent Evaluation by Evolution
S Omidshafiei, C Papadimitriou, G Piliouras, K Tuyls, M Rowland, ...
Scientific reports 9 (1), 9937, 2019
Cited by 111, 2019
Autocurricula and the Emergence of Innovation from Social Interaction: A Manifesto for Multi-Agent Intelligence Research
JZ Leibo, E Hughes, M Lanctot, T Graepel
arXiv preprint arXiv:1903.00742, 2019
Cited by 104, 2019
Efficient Nash equilibrium approximation through Monte Carlo counterfactual regret minimization
M Johanson, N Bard, M Lanctot, RG Gibson, M Bowling
AAMAS, 837-846, 2012
Cited by 102, 2012
Adversarial planning through strategy simulation
F Sailer, M Buro, M Lanctot
2007 IEEE Symposium on Computational Intelligence and Games, 80-87, 2007
Cited by 100, 2007