[Probabilidad-Estadistica-Seminario] Probability and Statistics Seminar - Vittorio Puricelli (LAAS-CNRS, France)
seminarios at cmat.edu.uy
Wed Mar 25 08:00:07 -03 2026
Probability and Statistics Seminar
----------------------------------------
Title: "Online RL for infinite state space problems"
Speaker: Vittorio Puricelli (LAAS-CNRS, France)
Abstract:
For infinite state space problems, classical reinforcement learning (RL) algorithms
can fail to converge due to unstable behavior. In this talk, we present a path
toward addressing this problem and obtaining convergence guarantees. We will start
by introducing the fundamental concepts from Markov decision process theory,
focusing on the expected discounted cost problem. We then move on to online RL
methods, with a particular emphasis on the Q-learning algorithm. We comment on
the convergence of Q-learning for finite state space problems. We then argue
that this convergence can be extended to infinite state spaces under certain
stability assumptions. Next, we present a recent result where we show that
Q-learning by itself can fail to promote stability, and that a stabilizing
scheme is needed to ensure convergence. We will outline the key ideas of the
proof of this last result and discuss connections to self-interacting random
walk models. We will end by discussing work in progress in which we aim to
develop an online stabilizing scheme that guarantees convergence of Q-learning.
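
As background for the finite-state case mentioned in the abstract, the following is a minimal sketch of tabular Q-learning for the expected discounted cost problem. It is not taken from the talk: the two-state toy model `toy_step`, the epsilon-greedy exploration, and the 1/n step size are illustrative assumptions, and nothing here addresses the infinite state space setting or the stabilizing scheme discussed by the speaker.

```python
import random


def toy_step(state, action, rng):
    """Hypothetical two-state model: action 0 stays, action 1 switches state.

    The one-step cost equals the current state (0 or 1), so state 0 is free
    and state 1 costs one unit per step. `rng` is unused because this toy
    model is deterministic; it is kept so random models fit the same interface.
    """
    cost = float(state)
    next_state = state if action == 0 else 1 - state
    return cost, next_state


def q_learning(n_states, n_actions, step, gamma=0.9, n_steps=20000,
               epsilon=0.1, seed=0):
    """Tabular Q-learning that minimizes the expected discounted cost."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    visits = [[0] * n_actions for _ in range(n_states)]
    state = 0
    for _ in range(n_steps):
        # Epsilon-greedy: explore with probability epsilon,
        # otherwise take the action with the smallest estimated cost.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = min(range(n_actions), key=lambda a: q[state][a])
        cost, next_state = step(state, action, rng)
        # Step size 1/n per (state, action) pair.
        visits[state][action] += 1
        alpha = 1.0 / visits[state][action]
        # Q-learning update toward cost + gamma * min_a' Q(next_state, a').
        target = cost + gamma * min(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


q = q_learning(n_states=2, n_actions=2, step=toy_step)
greedy_policy = [min(range(2), key=lambda a: q[s][a]) for s in range(2)]
print(greedy_policy)
```

In this toy model the cheapest behavior is to stay in state 0 and to leave state 1, so the greedy policy read off from `q` should pick action 0 in state 0 and action 1 in state 1. The state space here is finite and every state is revisited, which is exactly the kind of stability the abstract says cannot be taken for granted on infinite state spaces.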
--------------------------------------------------------------------------------
Friday 27/3 at 10:30, FCEA: Room 1 of the EIP (entrance on Lauro Müller)
Contact: Laura Aspirot - laspirot at gmail.com
--------------------------------------------------------------------------------
https://salavirtual-udelar.zoom.us/j/87033011104?pwd=qnKGw4syp4Izilf5QekV7Ama7oyjXZ.1
--------------------------------------------------------------------------------
More seminars at: http://www.cmat.edu.uy/seminarios