- Dimitri Bertsekas (ASU & MIT)
Batch Policy Learning in Average Reward Markov Decision Processes
Peng Liao (Harvard)
Challenges of (asymptotically safe) quantum gravity
Marc Schiffer Radboud Universiteit Nijmegen
The geometry of string compactifications
Lara Anderson Virginia Polytechnic Institute and State University
The Mean-Squared Error of Double Q-Learning
R. Srikant (University of Illinois at Urbana-Champaign)
Zap Q-learning with Nonlinear Function Approximation
Sean Meyn (University of Florida)
Special Topics in Astrophysics - Numerical Hydrodynamics - Lecture 22
Daniel Siegel University of Greifswald
Uniform Offline Policy Evaluation and Offline Learning in Tabular RL
Yu-Xiang Wang (UC Santa Barbara)
Testing Gauge Gravity duality with Matrix models
Denjoe O'Connor Dublin Institute for Advanced Studies
Batch Value-function Approximation with Only Realizability
Nan Jiang (University of Illinois at Urbana-Champaign)