Member

@matifali matifali commented Sep 1, 2025

This would prevent issues like the one I fixed in #404.

This change would have caught that error as:

2025-09-01 13:12:45.132 [erro]  ...
    msg= Error during "README parsing" phase of README validation:
         - "registry/umair/modules/digitalocean-region/README.md": incorrect source URL format: found "registry.coder.com/coder/digitalocean-region/coder", expected "registry.coder.com/umair/digitalocean-region/coder"
error: script "validate-readme" exited with code 1
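
For readers unfamiliar with the check, here is a minimal Go sketch of how a mismatch like the one in the log could be detected. It assumes the expected namespace and module name are already known (for example, from the README's directory path). `expectedSource` and `checkSource` are hypothetical names used for illustration, not the functions in `readmevalidation`.

```go
package main

import (
	"fmt"
)

// expectedSource builds the source URL a module README is expected to use.
// The "registry.coder.com/<namespace>/<module>/coder" layout is taken from
// the error message in the log above.
func expectedSource(namespace, module string) string {
	return fmt.Sprintf("registry.coder.com/%s/%s/coder", namespace, module)
}

// checkSource (hypothetical helper) returns an error when the source found in
// a README does not match the one implied by its namespace and module name.
func checkSource(found, namespace, module string) error {
	want := expectedSource(namespace, module)
	if found != want {
		return fmt.Errorf("incorrect source URL format: found %q, expected %q", found, want)
	}
	return nil
}

func main() {
	// The README lives under the "umair" namespace but points at "coder".
	err := checkSource(
		"registry.coder.com/coder/digitalocean-region/coder",
		"umair",
		"digitalocean-region",
	)
	fmt.Println(err)
}
```

Running it prints an error in the same shape as the CI message above; a matching source returns `nil`.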

diff --git a/.github/workflows/ci.yaml b/.github/workflows/ci.yaml
index 53b912b..eb3cf8b 100644
--- a/.github/workflows/ci.yaml
+++ b/.github/workflows/ci.yaml
@@ -63,8 +63,8 @@ jobs:
       - name: Set up Go
         uses: actions/setup-go@v5
         with:
-          go-version: "1.23.2"
-      - name: Validate contributors
+          go-version: "1.25.0"
+      - name: Validate Reademde
         run: go build ./cmd/readmevalidation && ./readmevalidation
       - name: Remove build file artifact
         run: rm ./readmevalidation

Signed-off-by: Muhammad Atif Ali <me@matifali.dev>
@matifali matifali requested a review from Parkreiner September 1, 2025 13:21
@matifali matifali self-assigned this Sep 1, 2025
@matifali matifali force-pushed the atif/validate-readme-source branch 3 times, most recently from 98efa49 to 898e773 Compare September 1, 2025 14:50
- Downgrade Go version in CI to 1.24 for consistency.
- Fix naming and path issues in `readmevalidation` code.
- Improve regex validation for module and namespace names.
- Correct typos and improve comments for clarity.
@matifali matifali force-pushed the atif/validate-readme-source branch from 898e773 to 3b6b1ba Compare September 1, 2025 14:53
@@ -63,8 +63,8 @@ jobs:
       - name: Set up Go
         uses: actions/setup-go@v5
         with:
-          go-version: "1.23.2"
-      - name: Validate contributors
+          go-version: "1.24"
Member Author

I was having weird issues with running Golang CI locally, and had to match the Go version.

Member Author

Tests generated by gemini CLI

@@ -25,7 +25,7 @@ var (
 	terraformVersionRe = regexp.MustCompile(`^\s*\bversion\s+=`)

 	// Matches the format "> [!INFO]". Deliberately using a broad pattern to catch formatting issues that can mess up
-	// the renderer for the Registry website
+	// the renderer for the Registry website.
Member Author

All these `.` characters were added by running golangci-lint locally, and I don't know how to remove them. CI also passes, so I assume it's fine to leave them.

Member

Oh, yeah, I feel like there should be a way to enforce this during the CI step, but this was my fault for not linting before opening one of the recent PRs

Comment on lines +320 to +326
 		errs = append(errs, xerrors.New("registry does not support nested GFM alerts"))
 		continue
 	}

 	leadingWhitespace := currentMatch[1]
 	if len(leadingWhitespace) != 1 {
-		errs = append(errs, errors.New("GFM alerts must have one space between the '>' and the start of the GFM brackets"))
+		errs = append(errs, xerrors.New("GFM alerts must have one space between the '>' and the start of the GFM brackets"))
Member Author

These changes were also made by golangci-lint.

@matifali matifali changed the title from "chore: Aadd validation for Terraform module source URLs" to "chore: add validation for Terraform module source URLs" Sep 1, 2025
@matifali matifali changed the title from "chore: add validation for Terraform module source URLs" to "chore: add validation for module source URLs" Sep 1, 2025
Member

@Parkreiner Parkreiner left a comment

This is close. I think there are a couple of changes we can make to improve the code a bit, though.

"strings"

"golang.org/x/xerrors"
)

var (
terraformSourceRe = regexp.MustCompile(`^\s*source\s*=\s*"([^"]+)"`)
Member

Do you have a good idea of what source fields typically look like? I feel like this regex is too broad. The [^"]+ alone feels too permissive: I don't think we should be supporting any arbitrary character that isn't a double quote.

Member

I'm seeing all of these on the Terraform docs site:
https://developer.hashicorp.com/terraform/language/modules/sources

I think I'd prefer to have more explicit patterns for each source type example listed there

Member

Oh, also, could we add a comment saying what general pattern we expect this to match?

Member

One last update: now that I've looked at the rest of this file, if we're always going to use registry.coder.com/ in the source, we could bake that into the regex pattern
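
As a rough illustration of baking the host into the pattern, here is a minimal Go sketch. The character set allowed for namespace and module names (lowercase letters, digits, and single hyphens) is an assumption made for this example, and `registryHost` and `strictSourceRe` are hypothetical names rather than what the PR ends up using.

```go
package main

import (
	"fmt"
	"regexp"
)

// registryHost is assumed to be the only host module sources may use.
const registryHost = "registry.coder.com"

// strictSourceRe matches lines of the general form
//
//	source = "registry.coder.com/<namespace>/<module>/coder"
//
// and captures the namespace and module name. The allowed name characters
// are an assumption for this sketch.
var strictSourceRe = regexp.MustCompile(
	`^\s*source\s*=\s*"` + regexp.QuoteMeta(registryHost) +
		`/([a-z0-9]+(?:-[a-z0-9]+)*)/([a-z0-9]+(?:-[a-z0-9]+)*)/coder"`,
)

func main() {
	lines := []string{
		`  source = "registry.coder.com/umair/digitalocean-region/coder"`,
		`  source = "github.com/example/some-module"`,
	}
	for _, line := range lines {
		match := strictSourceRe.FindStringSubmatch(line)
		if match == nil {
			fmt.Printf("no match: %s\n", line)
			continue
		}
		fmt.Printf("namespace=%s module=%s\n", match[1], match[2])
	}
}
```

The trade-off is that anything outside the registry host is simply not matched, so a caller would need to decide whether a non-matching `source` line is an error or just out of scope.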

return namespace, moduleName, nil
}

func validateModuleSourceURL(body string, filePath string) []error {
Member

I wish I knew more about Coder templates. Do templates have a source field (or anything equivalent) that might need to be validated, too?

Member Author

No, it's not applicable to templates.

Comment on lines 47 to 56
if strings.HasPrefix(nextLine, "```") {
	if strings.HasPrefix(nextLine, "```tf") && firstTerraformBlock {
		isInsideTerraform = true
		firstTerraformBlock = false
	} else if isInsideTerraform {
		// End of first terraform block.
		break
	}
	continue
}
Member

I think this can be rewritten as:

if strings.HasPrefix(nextLine, "```") {
	if strings.HasPrefix(nextLine, "```tf") {
		isInsideTerraform = true
		continue
	}
	if isInsideTerraform {
		break
	}
}

I don't see the point of the firstTerraformBlock variable. If we always break at the end of the first Terraform code block, we don't really need to track anything extra

Member Author

The goal is to check all tf blocks.

Member

@Parkreiner Parkreiner Sep 4, 2025

Maybe I'm missing something obvious, then. What validation requirement(s) are we trying to fulfill?

I was under the impression that we wanted the first Terraform block to contain a source field, but that other sources could be valid for the other blocks. Are we trying to keep all sources fully "boxed in" for Coder templates?

Not saying that the current code is wrong, but I'm not fully sure what we're trying to protect against, so I just want to make sure I'm giving good feedback

Member Author

We want to make sure all sources in all code blocks of a module README.md are in the correct format. This should not apply to templates; if it does, it's probably a bug.

- Make regex more specific for registry.coder.com patterns only
- Refactor to add namespace and resourceName fields to coderResourceReadme struct
- Inline path parsing logic into parseCoderResourceReadme
- Update validateModuleSourceURL to use struct fields instead of filePath parameter
- Simplify Terraform block detection logic
- Reduce nesting with early continue statements
- Add comment explaining regex pattern
- Extract registry.coder.com into a constant
- Improve test readability with extracted variables
- Remove redundant checks in tests
- Replace custom contains function with strings.Contains

Co-authored-by: matifali <matifali@users.noreply.github.com>
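
The path-parsing part of the refactor listed in this commit message might look roughly like the Go sketch below. The `registry/<namespace>/modules/<name>/README.md` layout is inferred from the path in the CI error in the PR description; `resourceID` and `parseResourcePath` are hypothetical names and do not reflect the actual `coderResourceReadme` struct.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// resourceID holds the two path-derived fields. The type and field names are
// illustrative only.
type resourceID struct {
	namespace    string
	resourceName string
}

// parseResourcePath (hypothetical) extracts the namespace and resource name
// from a path shaped like "registry/<namespace>/<kind>/<name>/README.md".
func parseResourcePath(filePath string) (resourceID, error) {
	parts := strings.Split(filepath.ToSlash(filePath), "/")
	if len(parts) != 5 || parts[0] != "registry" || parts[4] != "README.md" {
		return resourceID{}, fmt.Errorf("unexpected README path %q", filePath)
	}
	if parts[1] == "" || parts[3] == "" {
		return resourceID{}, fmt.Errorf("empty namespace or resource name in %q", filePath)
	}
	return resourceID{namespace: parts[1], resourceName: parts[3]}, nil
}

func main() {
	id, err := parseResourcePath("registry/umair/modules/digitalocean-region/README.md")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(id.namespace, id.resourceName)
}
```

Parsing once and storing the result on the README struct means later validators can compare against the fields directly instead of re-deriving them from a file path argument.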
@matifali matifali requested a review from Parkreiner September 4, 2025 14:23