[FLINK-20625][pubsub,e2e] Add PubSubSource connector using FLIP-27 #18823
* WIP
* WIP
* Working WIP
* Clean up
* Place new Pub/Sub source into existing Pub/Sub connector module
* Clean up
* Apply Spotless code formatting
[FLINK-20625][pubsub,e2e] Attempt to support stopping the reader when stopmark is encountered
[FLINK-20625][pubsub,e2e] Add checkpointing and do some refactorings
* Simplify fetching from Pub/Sub in SplitReader
* Allow Pub/Sub source to be only continuous unbounded
* Add basic PubSubSource builder
* Add configuration options for SubscriberFactory to PubSubSource, remove unused collector
* Add checkpointing
[FLINK-20625][pubsub,e2e] Allow multiple records inside single Pub/Sub message for deserialization
[FLINK-20625][pubsub,e2e] Add Javadocs, README and clean up
[FLINK-20625][pubsub,e2e] Reduce visibility of classes and their members
[FLINK-20625][pubsub,e2e] Propagate Pub/Sub subscriber creation errors from SplitReader
[FLINK-20625][pubsub,e2e] Use constants for default Pub/Sub subscriber parameters
[FLINK-20625][pubsub,e2e] Fix dynamic Scala version in artifact example
[FLINK-20625][pubsub,e2e] Rename PubSubEnumeratorCheckpoint -> PubSubEnumeratorState
[FLINK-20625][pubsub,e2e] Add version checks for deserialization
[FLINK-20625][pubsub,e2e] Remove unnecessary declaration of exception-throwing
[FLINK-20625][pubsub,e2e] Remove disfunctional end-of-stream logic
[FLINK-20625][pubsub,e2e] Avoid concurrency issues with list of Pub/Sub messages to acknowledge
[FLINK-20625][pubsub,e2e] Refactor PubSubSourceBuilder
[FLINK-20625][pubsub,e2e] Clarify consistency guarantee description
[FLINK-20625][pubsub,e2e] Clarify Pub/Sub request timeout
[FLINK-20625][pubsub,e2e] Restructure and extend readme, add basic architecture info to docstring
[FLINK-20625][pubsub,e2e] Attempt to solve concurrency issues with checkpointing
Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
Automated Checks
Last check on commit 4b8aeb8 (Thu Feb 17 14:10:03 UTC 2022)
Warnings:
Mention the bot in a comment to re-run the automated checks.
Review Progress
Please see the Pull Request Review Guide for a full explanation of the review process. The bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.
Bot commands
The @flinkbot bot supports the following commands:
@RyanSkraba Thanks a lot for the PR! We're working on moving the first connectors out of the Flink repo. Have you already looked into porting the GCP Sink to the Unified Sink API by any chance?
Hello! I have been working on two specific Pub/Sub things: (1) updating unit tests here and (2) the FLIP-143 unified sink API version. I'm currently on vacation 🏖️ so they've been progressing slowly (I haven't claimed the JIRA for this reason). I've also been watching the elasticsearch migration. It's looking pretty good and meeting the goals of the externalization! When do you think other connectors should start moving?
@RyanSkraba Awesome stuff. Let me know if you want me to assign the Jira to you. I'm hoping that we'll be able to start on the other connectors in the next couple of weeks. We've already created a couple of new repos; if you also think you're ready for that, let me know so I can get it arranged.
Hi @RyanSkraba, thanks for your work on pushing this connector closer to the finish line! I'm interested in helping to test it out and get it merged. Can you outline what remaining areas need to be worked on?
@RyanSkraba It would indeed be nice if we could move this to the externalized repo now. @dchristle it would be great if you could help validate it so we can move it forward.
@MartijnVisser I've moved this to apache/flink-connector-gcp-pubsub#2! Thanks for the ping! @dchristle having some eyes check it out for validation would be the most useful next step.
What is the purpose of the change
Brief change log
Verifying this change
Please make sure both new and modified tests in this PR follow the conventions defined in our code quality guide: https://flink.apache.org/contributing/code-style-and-quality-common.html#testing
This change added tests and can be verified as follows:
Does this pull request potentially affect one of the following parts:
@Public(Evolving): no
Documentation