-
-
Notifications
You must be signed in to change notification settings - Fork 3.7k
Build: better support for empty default branch #12462
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Documentation build overview
Show files changed (6 files in total): 📝 6 modified | ➕ 0 added | ➖ 0 deleted
|
Documentation build overview
Show files changed (4 files in total): 📝 4 modified | ➕ 0 added | ➖ 0 deleted
|
| latest_version_identifier = ( | ||
| self.versions.filter(slug=LATEST, machine=True) | ||
| .values_list("identifier", flat=True) | ||
| .first() | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is introducing an extra query for each time latest is returned in a response, since we now rely on latest having the correct value of the default branch (similar to get_original_stable_version).
|
What happens if the default branch is |
The user who needed this isn't using it, so we can probably remove the feature flag for now, and simplify this logic. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks reasonable, but is a pretty big change to the build process. Hopefully it won't impact too much, but I do worry a bit about how much we're changing here for some edge cases, but I do like the idea of not specifying a branch checkout on initial clone to work around broken latest, which is a bad UX for users.
Yep, if default_branch isn't set we never reference latest for its branch name when cloning, but we update latest to the new default branch after building, so all links work properly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, everything here made sense to me 👍
|
I think I'm mostly OK with this change, since it is pretty bad UX for users to get stuck here (I just got stuck with a similar issue and it was super annoying...). @humitos Any more thoughts before we move forward? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reviewed this again and I still have the same thoughts.
I don't still understand why we need to keep the Project.default_branch in sync with the "default branch cloned when calling git clone on a repository". What's the reason for this?
My point is "if the Project.default_branch==None we don't need to do anything special when cloning the repository", and that's all. Why we can't do that and avoid all this complexity to keep this branch in sync?
| # We can skip the checkout step since we just cloned the repository, | ||
| # and the default branch is already checked out. | ||
| if not is_latest_without_default_branch: | ||
| identifier = self.data.build_commit or self.data.version.identifier | ||
| log.info("Checking out.", identifier=identifier) | ||
| self.vcs_repository.checkout(identifier) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove this conditional here and always execute this command, for simplification of the workflow. It's better to be explicit here as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems this won't be executed only at most, once, per project. Starting from the second time, the project will have default_branch set and we will execute this all the times.
I would simplify this and execute this command always to reduce complexity here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have the name of the branch for those projects, omitting this step is the purpose of this PR.
It seems this won't be executed only at most, once, per project. Starting from the second time, the project will have default_branch set and we will execute this all the times.
No, default_branch is kept empty for those projects.
| if is_latest_without_default_branch: | ||
| self.data.default_branch = self.data.build_director.vcs_repository.get_default_branch() | ||
| log.info( | ||
| "Default branch for the repository detected.", | ||
| default_branch=self.data.default_branch, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why we need to save this in the database. Why we are not always cloning the repository without the default branch set if it will clone the correct one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are confusing the default_branch from the project model. This is the default branch of the repository, which is used to keep the identifier of latest in sync, not the attribute from the project model.
| # Default branch for the repository. | ||
| # Only set when building the latest version, and the project | ||
| # doesn't have an explicit default branch. | ||
| default_branch: str | None = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we call it more explicitly like vcs_default_branch?
| # and the user hasn't defined a default branch (which means we need to use the default branch from the repo). | ||
| omit_remote_reference = self.version.is_machine_latest and not self.project.default_branch | ||
| if not omit_remote_reference: | ||
| remote_reference = self.get_remote_fetch_refspec() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that if we return None from this method get_remote_fetch_refspec when LATEST and not project.default_branch at https://github.com/readthedocs/readthedocs.org/blob/main/readthedocs/vcs_support/backends/git.py#L102-L115 we shouldn't need this PR and keep the project.default_branch in sync.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess my reply is the same as #12462 (comment).
This is needed to keep all the links for the version working. Latest is an alias in our code, so it needs to point to a valid version, we can't leave it just empty. |
The basic idea is to have the value of latest in sync with the value from the default branch, which we already do. But, in case the default branch is left empty, we were falling back to master, which is just a guess of the default, so instead of guessing the value, we just don't update it if we don't know its value. When cloning the repo, we have access to the default branch, so that's when we update latest for those projects.
Note this is only relevant for projects that aren't linked to a remote repository, since for those we know the default branch. So, when we try to build the latest version of a project that doesn't have a default branch, we:
There's still the problem about latest's identifier being out of sync when the default branch changes, but that's fixed after a new build, and this isn't a new problem (now users aren't blocked at least), and changing the default branch of a repo isn't something users do that often, and this only affects projects that aren't linked to a repository, so I think we are okay with that for now.
This replaces #10927
Closes #12231