David Yu-Tung Hui 許宇同

I am currently unemployed. I used to be an AI researcher in deep reinforcement learning, where I wrote two works on improving the optimization stability of off-policy, gradient-based Q-learning algorithms.

  1. Stabilizing Q-Learning for Continuous Control
    David Yu-Tung Hui
    MSc Thesis, University of Montreal, 2022
    I described two principles for creating stable deep learning algorithms and applied them to deep reinforcement learning. The principles were 1) maximum entropy, from which many deep learning loss functions are derived, and 2) the neural tangent kernel, which provides a convergence analysis justifying the use of normalization layers and the ReLU activation function. In RL, I used maximum entropy to justify the design of a Q-learning family of algorithms and showed that LayerNorm reduced the divergence of these algorithms, especially in high-dimensional continuous control problems.
    [.pdf] [Errata]

  2. Double Gumbel Q-Learning
    David Yu-Tung Hui, Aaron Courville, Pierre-Luc Bacon
    Spotlight at NeurIPS 2023
    We showed that using deep neural networks in Q-learning introduces two heteroscedastic Gumbel noise sources. An algorithm modeling these noise sources attained just under 2 times the aggregate asymptotic performance of the popular SAC baseline.
    [.pdf] [Reviews] [Poster (.png)] [5-min talk] [1-hour seminar] [Code (GitHub)] [Errata]
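
To illustrate where LayerNorm sits in a Q-network, as mentioned in item 1, here is a minimal NumPy sketch. It is not the code from the thesis or from the doublegum repository. The function names, layer sizes, initialization, and the placement of LayerNorm before the ReLU are all assumptions made for this example, and the learned scale and shift of a full LayerNorm are omitted.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize each row to zero mean and unit variance (no learned scale or shift)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def init_params(rng, obs_dim, act_dim, hidden=64):
    """He-style initialization for a two-layer critic; all sizes are hypothetical."""
    in_dim = obs_dim + act_dim
    return {
        "w1": rng.normal(0.0, np.sqrt(2.0 / in_dim), (in_dim, hidden)),
        "b1": np.zeros(hidden),
        "w2": rng.normal(0.0, np.sqrt(2.0 / hidden), (hidden, 1)),
        "b2": np.zeros(1),
    }

def q_value(params, obs, act):
    """Q(s, a) for a batch: Linear -> LayerNorm -> ReLU -> Linear."""
    x = np.concatenate([obs, act], axis=-1)
    h = x @ params["w1"] + params["b1"]
    h = np.maximum(layer_norm(h), 0.0)  # normalized features have norm <= sqrt(hidden)
    return (h @ params["w2"] + params["b2"]).squeeze(-1)

rng = np.random.default_rng(0)
params = init_params(rng, obs_dim=17, act_dim=6)  # hypothetical dimensions
obs = rng.normal(size=(4, 17))
act = rng.uniform(-1.0, 1.0, size=(4, 6))

q = q_value(params, obs, act)
q_scaled = q_value(params, 1000.0 * obs, 1000.0 * act)  # inputs blown up 1000x
```

In this sketch the normalized hidden vector has Euclidean norm at most `sqrt(hidden)`, so the output magnitude is bounded by the size of the last layer's weights however large the inputs become. With the zero bias used here, `q_scaled` comes out almost identical to `q`. This is only one intuition for why normalization might help stability; the thesis should be consulted for the actual argument.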

The best way to contact me is by email. My email address is listed in one of my written works.

Pinned

  1. doublegum (Public)
    NeurIPS 2023 Spotlight
    Python, 10 stars, 4 forks

  2. causal-entropic-forces (Public)
    Python reimplementation of Wissner-Gross & Freer, 2013
    Jupyter Notebook, 11 stars, 4 forks