Conversation

l0ud0gg (Contributor) commented Mar 4, 2025

The seo jet template is included on all generated pages with a hardcoded index value (true/false). This adds a check that forces the value to false when a no_index feature toggle is set, as an extra measure for sites that wish to remain off search engines.
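
For context, a rough sketch of the kind of check this adds and of the meta tag that the index flag ultimately controls. The toggle key, variable names, and surrounding markup are illustrative only, not the exact core template code:

{* illustrative sketch only: default to indexing, then force noindex when the no_index toggle is set *}
{{ index := true }}
{{ if isset(site.Toggles["no_index"]) && site.Toggles["no_index"] }}
{{ index = false }}
{{ end }}
{* the flag then decides what the seo partial writes into the page head *}
{{ if index }}
<meta name="robots" content="index, follow">
{{ else }}
<meta name="robots" content="noindex, nofollow">
{{ end }}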

l0ud0gg requested a review from sam-shift72 on March 4, 2025 at 21:49
l0ud0gg (Contributor, Author) commented Mar 4, 2025

@sam-shift72 perhaps we could check the same toggle with the go-live button? Happy to name it something different or take a different approach.

{{end}}

{* override default setting to index the page if no_index toggle is set *}
{{index := site.Toggles["no_index"] ? false : index}}

Review thread on the new template lines above. Contributor commented:

Is this an existing toggle?

If not, making it a config (plausibly something we might give site owners control over) and giving it a less obscure name would be good.

If we had a config like this, it would be nice if it controlled robots.txt too, I assume.

l0ud0gg (Author) replied:

Nah, it's a new one. How about user > config: disable_site_indexing? Or site_robots_noindex?

l0ud0gg (Author) added:

Looks like it would be a site API change to check the config when the site is activated and decide whether robots is enabled or not.

Contributor replied:

Yeah, I see a couple of options:

  • Cleanest: use the Site allow_robots field as the source of truth. Update /services/users/v1/sites to return allow_robots as well as the site status, fetch this in Kibble, and put it on the site model; the core template could then simply use site.AllowRobots. Add a separate config that marks the site as "private" so we don't flip allow_robots when going live.
  • Easiest: call this config something really specific like seo_disable_meta_robots and use it directly from the template. Fix the go-live issue separately.

The clean way is probably also a ton of faff that would require coordinating deploys and version bumps, so quick and dirty seems appealing.
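
To make those two options concrete, a rough template-side sketch follows. It assumes allow_robots would be surfaced on the site model as site.AllowRobots and treats the quick config as a boolean toggle; the field and key names are assumptions from this discussion, not existing Kibble fields, and the two lines are alternatives rather than meant to run together:

{* option 1 (cleanest), sketch only: once allow_robots is on the site model, drive the flag from it directly *}
{{ index := site.AllowRobots }}

{* option 2 (easiest), sketch only: keep the current default and override it from a narrowly named toggle *}
{{ index := site.Toggles["seo_disable_meta_robots"] ? false : index }}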

Contributor added:

The go-live issue could be mitigated by leaving the allow_robots setting alone in that API and changing it so that having a sitelock implies robots.txt is always denied.

l0ud0gg (Author) replied:

> The go-live issue could be mitigated by leaving the allow_robots setting alone in that API and changing it so that having a sitelock implies robots.txt is always denied.

This issue really only affects a few sites that want to go live without site lock but not be visible to the public (the site is invite only), so relying on the site lock won't work in this case, and it's nice to keep robots enabled for all the other sites. I think for a quick solution we can create a config for core to write a noindex meta tag on all pages; then, if robots is enabled, the crawler will hit the site and see not to index any pages. It does mean we'd have the site allow_robots field and a config for essentially the same thing, which could be confusing though.
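
A sketch of that quick solution, using the site_robots_noindex name floated earlier and treating it as a boolean toggle (the key name and toggle shape are assumptions): robots.txt keeps allowing crawling, and every generated page carries a noindex meta tag.

{* illustrative sketch only: with the config set, every page tells crawlers not to index it *}
{{ if isset(site.Toggles["site_robots_noindex"]) && site.Toggles["site_robots_noindex"] }}
<meta name="robots" content="noindex">
{{ end }}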

l0ud0gg closed this Jul 21, 2025