💡 Proposal: get_state() and set_state() on the Cache interface #680

@aaronsteers

Description

This proposal would add a public API for reading and writing stream state on the Cache interface.

The signatures might look something like:

from __future__ import annotations  # lets `CacheBase` and `StreamState` be referenced lazily

from typing import Literal


class CacheBase:
    ...

    def get_state(
        self,
        stream_name: str,
    ) -> StreamState | None:
        """Return a stream state object for the provided stream name.

        This is a thin wrapper around the internal `StateProvider` interface.
        """
        ...

    def set_state(
        self,
        stream_name: str,
        stream_state: StreamState | dict,
    ) -> None:
        """Set a stream state object for the provided stream name.

        This is a thin wrapper around the internal `StateWriter` interface.
        """
        ...

    def migrate_state(
        self,
        streams: list[str] | Literal["*"],
        to_cache: CacheBase,
    ) -> None:
        """Copy all matching stream states to the specified `Cache` object.

        This is a thin wrapper around the respective `Cache` objects'
        `get_state` and `set_state` methods.
        """
        ...
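
To make the intended semantics concrete, here is a minimal in-memory sketch of how `migrate_state()` could be layered on `get_state()` and `set_state()`. The `InMemoryCache` class is hypothetical and exists only for illustration; it is not the real `CacheBase`, it uses plain dicts in place of `StreamState`, and it assumes `"*"` means "every stream that currently has a state".

```python
from __future__ import annotations

from typing import Any, Literal


class InMemoryCache:
    """Hypothetical stand-in for a Cache; states are plain dicts keyed by stream name."""

    def __init__(self) -> None:
        self._states: dict[str, dict[str, Any]] = {}

    def get_state(self, stream_name: str) -> dict[str, Any] | None:
        """Return a copy of the stored state, or None if the stream has no state."""
        state = self._states.get(stream_name)
        return dict(state) if state is not None else None

    def set_state(self, stream_name: str, stream_state: dict[str, Any]) -> None:
        """Store a copy of the given state under the stream name."""
        self._states[stream_name] = dict(stream_state)

    def migrate_state(
        self,
        streams: list[str] | Literal["*"],
        to_cache: InMemoryCache,
    ) -> None:
        """Copy matching stream states to `to_cache` using only get_state/set_state."""
        # Assumption: "*" selects every stream that currently has a stored state.
        names = list(self._states) if streams == "*" else streams
        for name in names:
            state = self.get_state(name)
            if state is not None:  # streams without state are skipped
                to_cache.set_state(name, state)


old_cache = InMemoryCache()
old_cache.set_state("users", {"cursor": "2024-01-01"})
old_cache.set_state("orders", {"cursor": 42})

new_cache = InMemoryCache()
old_cache.migrate_state("*", to_cache=new_cache)
```

Whether a stream with no state should be skipped silently or raise an error is an open design choice; the sketch skips it.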

Other considerations:

  1. Since there are currently many different strongly typed State classes (StateMessage, StreamState, etc.), we'd want to think carefully about which (if any) we feel comfortable making part of the public interface.
  2. It might actually be cleaner to get/set values as dict objects, since that would fully avoid needing to expose a public API for the state object itself. (Tradeoff: dict objects may be difficult to parse.)
  3. It should be noted that manually modifying a state artifact for a stream basically always "voids the warranty", and sources are not guaranteed to have stable state artifact interfaces over time.
  4. This feature is actually more appropriate for migration of state, such as during renames, when moving from one state backend to another, or when moving from one table name/alias to a new one. For this reason, I've included a possible migrate_state() method above, which could streamline a "copy-all"-type operation.
  5. Internally, states are often cached in memory by the StateProvider class. We'd need to take care to invalidate or refresh those caches after a "set" action.
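
As a rough illustration of points 2 and 5, the sketch below stores each state as a JSON-serializable dict and drops the in-memory copy whenever a "set" happens. `CachedStateStore` is a hypothetical name that stands in for the `StateProvider`/`StateWriter` pair; the internals of the real classes are not shown in this issue, so everything here is an assumption.

```python
from __future__ import annotations

import json
from typing import Any


class CachedStateStore:
    """Hypothetical store that memoizes reads and invalidates on write."""

    def __init__(self) -> None:
        # Stand-in for the persistent state backend: JSON text per stream name.
        self._backend: dict[str, str] = {}
        # In-memory cache of parsed states (assumed to mirror what a provider keeps).
        self._memo: dict[str, dict[str, Any] | None] = {}
        self.backend_reads = 0  # counter used only to make the caching visible

    def get_state(self, stream_name: str) -> dict[str, Any] | None:
        """Return the state as a dict, reading the backend only on a cache miss."""
        if stream_name not in self._memo:
            self.backend_reads += 1
            raw = self._backend.get(stream_name)
            self._memo[stream_name] = json.loads(raw) if raw is not None else None
        cached = self._memo[stream_name]
        return dict(cached) if cached is not None else None

    def set_state(self, stream_name: str, stream_state: dict[str, Any]) -> None:
        """Persist the new state, then invalidate the stale in-memory entry."""
        self._backend[stream_name] = json.dumps(stream_state)
        self._memo.pop(stream_name, None)


store = CachedStateStore()
store.set_state("users", {"cursor": 1})
first = store.get_state("users")   # cache miss: reads the backend
again = store.get_state("users")   # cache hit: no backend read
store.set_state("users", {"cursor": 2})
second = store.get_state("users")  # entry was invalidated, so this re-reads
```

Invalidating on write keeps the write path simple; refreshing the cached entry in place would save one backend read at the cost of duplicating the parsing logic.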
