Kempner Sketchpad with Ryan Truong (Gershman lab)

Name: Kempner Sketchpad with Ryan Truong (Gershman lab)
Start: 2026-05-27T14:00:00-04:00
End: 2026-05-27T15:00:00-04:00
Location: Kempner Large Conference Room (SEC 6.242)

Date: Wednesday, May 27, 2026
Time: 2:00 - 3:00pm

Kempner Sketchpad is a workshop series focused on works in progress where Kempner Graduate Fellows share early-stage ideas and work. In each session, a Fellow gives an informal presentation on something they are currently working on and would like feedback on—for example, research questions, analysis pipelines, figure design, puzzling results, or ways to bring in other disciplines. The focus is on constructive feedback, brainstorming, and community-building, rather than polished talks.

Abstract:
Recent projects such as the AI Gamestore (Ying et al., 2026) have demonstrated the potential of browser-based JavaScript games as flexible evaluation platforms for modern AI systems, particularly vision-language models. However, these games have historically been difficult to use as proper reinforcement learning (RL) environments due to limitations in interfacing, control, and training efficiency. This project introduces a framework for running fast reinforcement learning directly on raw JavaScript games, transforming browser-based games into scalable RL environments. The framework enables efficient interaction, training, and evaluation of RL agents across a diverse range of lightweight game environments, while preserving the flexibility and richness of web-native games.

In this presentation, I will introduce the framework architecture, present preliminary results on several baselines, and discuss the broader methodological motivation for leveraging JavaScript games as RL testbeds. I hope to gather feedback on which model baselines and game profiles are most informative, as well as on the broader utility of this approach beyond bespoke high-speed RL environments.

Discussion questions:

Which reinforcement learning or deep learning baselines would be most informative to include in evaluating this framework?
Are there domains in which the proposed approach would be most helpful?
Beyond speed and convenience, what methodological or practical bottlenecks could this framework help address in RL research?

Everyone welcome! There’s value in different perspectives, so even if this is slightly outside your subfield, please come and collaborate.