Explore a Flink catalog based on Recap #407

@gunnarmorling

Description

In Apache Flink, catalogs "provide metadata, such as databases, tables, partitions, views, and functions and information needed to access data stored in a database or other external systems".

There is an in-memory implementation, which keeps metadata only for the lifetime of a session. A persistent implementation is provided in the form of the HiveCatalog, which uses Apache Hive (more precisely, the Hive Metastore) as the underlying storage layer.

The purpose of this issue is to explore the feasibility of creating a persistent Flink catalog on top of Recap (or, specifically, its schema registry). This could be interesting to Flink users looking for an alternative to the Hive-based catalog implementation. Note that the Catalog contract has some facets that Recap probably does not support, e.g. the ability to store metadata about functions and statistics. Is Recap's core design extensible enough to allow storing this kind of additional information?
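To make the question a bit more concrete, here is a rough sketch of the shape such a catalog might take. It is only an illustration: `InMemoryRegistry`, `TableSchema`, and `RegistryBackedCatalog` are hypothetical names made up for this example, the registry is a plain dictionary standing in for a persistent store, and none of it reflects Recap's or Flink's actual APIs. The point is just that table metadata maps naturally onto a schema store, while function metadata has no obvious home.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class TableSchema:
    """Hypothetical stand-in for a schema kept in a registry."""
    columns: Dict[str, str]  # column name -> type name


class InMemoryRegistry:
    """Hypothetical stand-in for a persistent schema registry (not Recap's API)."""

    def __init__(self) -> None:
        self._schemas: Dict[Tuple[str, str], TableSchema] = {}

    def put(self, database: str, table: str, schema: TableSchema) -> None:
        self._schemas[(database, table)] = schema

    def get(self, database: str, table: str) -> TableSchema:
        return self._schemas[(database, table)]

    def keys(self) -> List[Tuple[str, str]]:
        return sorted(self._schemas)


class RegistryBackedCatalog:
    """Covers only the table-metadata part of a catalog-like contract (assumed shape)."""

    def __init__(self, registry: InMemoryRegistry) -> None:
        self._registry = registry

    def list_databases(self) -> List[str]:
        # Databases are derived from the (database, table) keys in the registry.
        return sorted({db for db, _ in self._registry.keys()})

    def list_tables(self, database: str) -> List[str]:
        return [table for db, table in self._registry.keys() if db == database]

    def get_table(self, database: str, table: str) -> TableSchema:
        return self._registry.get(database, table)

    def list_functions(self, database: str) -> List[str]:
        # A plain schema store has nowhere to keep function metadata;
        # this is the gap the issue asks about.
        raise NotImplementedError("function metadata is not stored in this sketch")


registry = InMemoryRegistry()
registry.put("sales", "orders", TableSchema({"id": "int64", "total": "float64"}))
catalog = RegistryBackedCatalog(registry)
```

In this sketch, `catalog.list_tables("sales")` is answered entirely from the registry, whereas `catalog.list_functions("sales")` raises. Statistics would presumably hit the same wall unless the underlying store can carry extra, non-schema metadata.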
