
Section 2.4.1 line of reasoning #3

@nlyu1


I fail to see how the claim that the Bellman optimality operator $\mathcal J^*$ is a $\gamma$-contraction map follows from the claim that the Bellman consistency operator $\mathcal J^\pi$ is a $\gamma$-contraction map.

The two operators have different expressions, and there's no reduction (maybe I'm wrong here) along the lines of $\mathcal J^*(v) = \mathcal J^{\pi(g(v))}$ for some $\pi$.

The convergence of value iteration (VI) is still guaranteed by a separate proof.
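
For reference, here is a sketch of one standard direct argument that $\mathcal J^*$ is a $\gamma$-contraction in the sup norm, without going through $\mathcal J^\pi$. It assumes the usual definition $(\mathcal J^* v)(s) = \max_a \big[ r(s,a) + \gamma\, \mathbb E_{s' \sim P(\cdot \mid s,a)} v(s') \big]$; the symbols $r$, $P$, and $u$ are my notation and may not match the book's. The only extra ingredient is the inequality $|\max_a f(a) - \max_a g(a)| \le \max_a |f(a) - g(a)|$.

$$
\begin{aligned}
\big| (\mathcal J^* v)(s) - (\mathcal J^* u)(s) \big|
&= \Big| \max_a \big[ r(s,a) + \gamma\, \mathbb E_{s'} v(s') \big] - \max_a \big[ r(s,a) + \gamma\, \mathbb E_{s'} u(s') \big] \Big| \\
&\le \max_a \gamma\, \big| \mathbb E_{s' \sim P(\cdot \mid s,a)} \big[ v(s') - u(s') \big] \big| \\
&\le \gamma\, \| v - u \|_\infty .
\end{aligned}
$$

Taking the maximum over $s$ on the left gives $\| \mathcal J^* v - \mathcal J^* u \|_\infty \le \gamma \| v - u \|_\infty$. This may or may not be the proof the section intends.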
