fix(install.sh): stage payload as .ps1 file + ssh-keygen -A for hostkey ACLs (continuum's catches)#197

Merged
joelteply merged 9 commits into canary from fix/windows-ps1-file-and-hostkey-acls
Apr 28, 2026

@joelteply
Contributor

Builds on #195. Fixes two Windows install bugs caught by continuum-b69f testing on real Windows MINGW64 (issue #196), and adds a bash-side post-check:

  1. Inline payload mangled by 4-layer quote escaping → silently breaks Start-Transcript. Fix: stage as .ps1 file, run via Start-Process powershell ... -File <path>.
  2. sshd Start-Service fails with WIN32_EXIT_CODE 1067 on every fresh Windows install because OpenSSH host-key files have overly-permissive ACLs. Fix: ssh-keygen -A between capability install and Start-Service (idempotent, generates missing keys + restores correct ACLs).
  3. Bash side re-queries sshd state post-elevation as belt-and-suspenders; surfaces 'partial install' warning when elevated exit=0 but service isn't actually Running.

continuum-b69f verified the .ps1 file approach gets the transcript every time on his real Windows. ssh-keygen -A is the documented standard fix for the OpenSSH-on-Windows post-install ACL bug.
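
A minimal sketch of the staging idea, in shell. The function name `stage_payload`, the file name `elevated-payload.ps1`, and the two-line payload are hypothetical stand-ins; the real payload in install.sh is much longer. The part that matters is the quoted heredoc delimiter, which makes the shell copy the PowerShell text byte for byte instead of expanding or unquoting it.

```shell
#!/bin/sh
# Hypothetical helper: write a PowerShell payload to "$1/elevated-payload.ps1".
# The quoted delimiter ('PS1') disables all shell expansion inside the heredoc,
# so double quotes, dollar signs, and backslashes reach the file untouched.
stage_payload() {
  _dir=$1
  cat > "$_dir/elevated-payload.ps1" <<'PS1'
Start-Transcript -Path (Join-Path ([System.IO.Path]::GetTempPath()) "airc-install-elevated.log")
Write-Host "registry path: HKLM\SYSTEM\CurrentControlSet\Services\hns\State"
PS1
  # Print the staged path so the caller can hand it to powershell -File.
  printf '%s\n' "$_dir/elevated-payload.ps1"
}
```

The staged file would then be launched with `powershell ... -File <path>`, so none of its contents ever pass through a `-Command` string.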

…ey ACLs

Two Windows install bugs found via Mac↔Windows Claude debug loop on
issue #196 (continuum-b69f testing on real Windows MINGW64):

1. **Inline payload mangled by 4-layer quote escaping.** Pre-fix:
   `... -ArgumentList '-NoProfile -Command "$_elevated_payload"'`
   The payload contained many "" (PS strings) and \\ (registry paths);
   bash double-quoted → ps outer -Command → Start-Process ArgumentList
   single-quoted → inner -Command double-quoted. Each layer ate quotes
   differently. PowerShell never parsed the payload, the elevated
   window opened + ran nothing + closed silently. No transcript ever
   written. Joel saw an "OpenSSH installed + started" success message
   contradicted by a missing-transcript warning on the same run.

   Fix: stage payload as a .ps1 file in $CLONE_DIR, run via
   `Start-Process powershell -ArgumentList @(...,'-File','<path>')`.
   No quoting on the boundary; the .ps1 file is plain PowerShell and
   quotes/backslashes work natively.

2. **sshd Start-Service fails with WIN32_EXIT_CODE 1067 ("terminated
   unexpectedly") on every fresh Windows OpenSSH install** because
   host-key files exist with overly-permissive ACLs (Authenticated
   Users / BUILTIN\\Users / Everyone). sshd refuses to load them
   ("sshd: no hostkeys available -- exiting").

   Fix: add `ssh-keygen -A` to the elevated payload between the
   capability install and Start-Service. Idempotent — generates
   missing host keys AND restores correct ACLs (SYSTEM + Admins
   only) on existing ones. continuum-b69f's diagnosis.

3. **Bash side now re-queries sshd state post-elevation** as belt-
   and-suspenders. Previous behavior printed "OpenSSH installed +
   started" if the elevated payload exit was 0, even when no transcript
   was written and sshd wasn't actually running. The silent-success-
   while-broken path was the worst version of this bug. Now: bash
   calls `Get-Service sshd` from non-elevated PS; if state isn't
   "Running" it surfaces a "partial install" warning even when
   elevated exit was 0.
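
The decision in item 3 can be sketched as a small pure function. `sshd_verdict` and its three verdict strings are hypothetical names, not taken from install.sh; the only rule carried over from the text above is that the service state, not the elevated exit code, decides success.

```shell
#!/bin/sh
# Hypothetical helper: map (launcher exit code, sshd state) to a verdict.
# $1 = exit code seen by bash for the elevation launcher
# $2 = Status string reported by Get-Service sshd (may be empty)
sshd_verdict() {
  _rc=$1
  _state=$2
  if [ "$_state" = "Running" ]; then
    echo "ok"
  elif [ "$_rc" -eq 0 ]; then
    echo "partial"    # launcher reported success, but the service is not up
  else
    echo "failed"
  fi
}
```

When the state string comes from `powershell.exe` under Git Bash it will likely carry a trailing carriage return, so a caller would probably need to strip `\r` (for example with `tr -d '\r'`) before comparing.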

Verified by continuum-b69f on real Windows MINGW64: PR #195 (which
this PR builds on) now produces a complete transcript dumped to bash
terminal. Without the ssh-keygen -A addition though, sshd Start-Service
still failed in his run — that's what this PR adds.
…efore UAC

Three real bugs hiding behind one symptom on continuum-b69f's Windows
machine: install reported "OpenSSH installed + started" while sshd was
actually crashloop-stopped with exit 1067 ("no hostkeys available").
Joel called it "amateur try/catch" -- he was right.

1. Em-dash (U+2014) in a string literal mis-parsed under cp1252.

   PowerShell 5.1 reads BOMless .ps1 files as the system codepage
   (cp1252 on most Windows). UTF-8 em-dash is bytes E2 80 94. Byte
   0x94 in cp1252 is RIGHT DOUBLE QUOTATION MARK (U+201D), which
   PowerShell accepts as a string delimiter. Parser sees "...$path "
   ...rest" -- treats the trailing 0x94 as a closing string quote and
   the rest of the file fails to parse. Nothing executes. No log
   written. Elevated window blinks closed silently.

   Fix: heredoc is now ASCII-only AND we prepend a UTF-8 BOM as
   defense-in-depth so future edits don't regress.

2. Global try/catch + $ErrorActionPreference = "Stop" hid the parse
   error completely.

   The parse error happens BEFORE Start-Transcript runs -- nothing in
   the try/catch could catch it because the parser never reaches the
   try at all. The bash side saw "no transcript written" and printed
   the misleading "UAC denied or Start-Process failed" warning.

   Fix: drop both. Each step runs plainly. PowerShell prints native
   errors to the transcript and execution continues. Bash side
   already re-queries Get-Service sshd post-elevation as the source-
   of-truth verdict, so we don't need the script's exit code to lie
   about success.

3. Parse errors didn't surface until after UAC.

   Fix: bash side now runs [Parser]::ParseFile on the staged .ps1
   from a non-elevated process before Start-Process is called. If
   any parse errors exist, we print them and abort -- no UAC prompt,
   no silent close, the user sees exactly what's wrong.

Per Joel: "we prefer parser issues to actually error" -- this is how
they actually error.

Verified locally on continuum-b69f's box: new payload parses clean
(456 tokens, no errors). Will end-to-end-test next.
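
The two defenses from item 1 (ASCII-only payload, UTF-8 BOM) can be checked from the bash side without PowerShell. This is a sketch under assumptions: `count_non_ascii` and `ensure_bom` are hypothetical helper names, and `head -c` is assumed to be available, as it is in Git Bash and on GNU/BSD systems.

```shell
#!/bin/sh
# Count bytes outside 7-bit ASCII: delete every ASCII byte, count what is left.
count_non_ascii() {
  LC_ALL=C tr -d '\000-\177' < "$1" | wc -c | tr -d ' '
}

# Prepend a UTF-8 BOM (EF BB BF, written in octal) unless one is already there.
ensure_bom() {
  _f=$1
  _head=$(head -c 3 "$_f" | od -An -tx1 | tr -d ' \n')
  if [ "$_head" != "efbbbf" ]; then
    { printf '\357\273\277'; cat "$_f"; } > "$_f.bom" && mv "$_f.bom" "$_f"
  fi
}
```

The ASCII check has to run before `ensure_bom`, because the BOM itself is three non-ASCII bytes. A non-zero count on the raw heredoc output would be the cue to abort before any UAC prompt.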
…ot enough)

Previous commit's diagnosis was half-right: yes the host-key step needs
work, but ssh-keygen -A is for *generating missing keys*, not for
fixing ACLs on existing ones. Confirmed by capturing the elevated
transcript on continuum-b69f's box -- ssh-keygen -A produced no output
at all (existing keys were already there, nothing to do), and sshd
still failed Start-Service with exit 1067.

Ran sshd -ddd directly to see the underlying file-open errors:
  Failed to open file: ...ssh_host_rsa_key error:5   (ACCESS_DENIED)
  Failed to open file: ...ssh_host_rsa_key error:13  (ACL secure_permission_check failed)

So sshd-as-LocalSystem can't read the host keys *and* their ACLs flunk
sshd's own security check. Two distinct ACL problems, both fixed by
the same pattern: take ownership, wipe inheritance, grant SYSTEM +
BUILTIN\Administrators full control, no other ACEs.

Tools considered:
- FixHostFilePermissions.ps1: removed from Windows-OpenSSH years ago
- OpenSSHUtils PS module: official, but PSGallery dep + module trust
  prompt = friction we don't want for an install script
- icacls: in-box on every Windows + bulletproof. Picked this.

The new step:
  takeown /F <key>             # become owner
  icacls <key> /reset          # drop explicit ACEs, restore inherited defaults
  icacls <key> /inheritance:r /grant SYSTEM:F /grant Administrators:F

Output is captured per-key in the transcript so any failure is visible.
ssh-keygen -A still runs first (cheap, idempotent) so any *missing*
keys get auto-generated before the ACL fix runs.

Copilot AI review requested due to automatic review settings April 28, 2026 04:55

Copilot AI left a comment

Pull request overview

Improves the Windows Git Bash install path in install.sh by making the UAC-elevated OpenSSH setup more reliable (avoids PowerShell quoting/escaping issues, fixes host key ACL problems that prevent sshd from starting, and adds a post-elevation service-state verification).

Changes:

  • Stage the elevated PowerShell payload as a .ps1 file (with UTF-8 BOM + preflight parse-check) and run it via Start-Process ... -File.
  • Add OpenSSH host-key generation + ACL reset (ssh-keygen -A + icacls) to prevent sshd start failures (1067) on fresh installs.
  • Re-query sshd status after elevation to detect “partial install” even when the elevated launcher returns success.

Comment thread install.sh
Comment on lines +430 to +434
# Run the elevated payload via -File (no quoting hell). Start-
# Process -Wait propagates the elevated process's exit code.
# -ExecutionPolicy Bypass so the elevated PS doesn't refuse
# the unsigned .ps1.
powershell.exe -NoProfile -Command "Start-Process powershell -Verb RunAs -Wait -ArgumentList @('-NoProfile','-ExecutionPolicy','Bypass','-File','$_elevated_ps1_win')" 2>&1 \

Copilot AI Apr 28, 2026

Start-Process -Wait does not propagate the child process exit code back to the current powershell.exe process. As written, _elev_rc will usually stay 0 even if the elevated powershell -File fails, so bash can incorrectly treat the payload as successful. Capture the exit code explicitly (e.g., use Start-Process ... -PassThru, then Wait-Process and exit $p.ExitCode, or avoid Start-Process and invoke elevation in a way that returns the child exit code).

Suggested change
- # Run the elevated payload via -File (no quoting hell). Start-
- # Process -Wait propagates the elevated process's exit code.
- # -ExecutionPolicy Bypass so the elevated PS doesn't refuse
- # the unsigned .ps1.
- powershell.exe -NoProfile -Command "Start-Process powershell -Verb RunAs -Wait -ArgumentList @('-NoProfile','-ExecutionPolicy','Bypass','-File','$_elevated_ps1_win')" 2>&1 \
+ # Run the elevated payload via -File (no quoting hell). Capture
+ # the Process object, wait for it, then explicitly exit this
+ # non-elevated PowerShell with the elevated process's exit code.
+ # -ExecutionPolicy Bypass so the elevated PS doesn't refuse
+ # the unsigned .ps1.
+ powershell.exe -NoProfile -Command "\$p = Start-Process powershell -Verb RunAs -PassThru -ArgumentList @('-NoProfile','-ExecutionPolicy','Bypass','-File','$_elevated_ps1_win'); Wait-Process -InputObject \$p; exit \$p.ExitCode" 2>&1 \

Comment thread install.sh
Comment on lines +391 to +393
_parse_errs=$(powershell.exe -NoProfile -Command "
\$tokens = \$null; \$errors = \$null;
[System.Management.Automation.Language.Parser]::ParseFile('$_elevated_ps1_win', [ref]\$tokens, [ref]\$errors) | Out-Null;

Copilot AI Apr 28, 2026

$_elevated_ps1_win is embedded inside a single-quoted PowerShell string in the ParseFile('...') call. If the Windows path contains a ' (apostrophe) (e.g., a username like O'Connor), the parse-check command will break and may produce misleading errors. Prefer passing the path as an argument (or via an env var) and using -LiteralPath, or escape single quotes before embedding.

Suggested change
- _parse_errs=$(powershell.exe -NoProfile -Command "
- \$tokens = \$null; \$errors = \$null;
- [System.Management.Automation.Language.Parser]::ParseFile('$_elevated_ps1_win', [ref]\$tokens, [ref]\$errors) | Out-Null;
+ _parse_errs=$(AIRC_ELEVATED_PS1_WIN="$_elevated_ps1_win" powershell.exe -NoProfile -Command "
+ \$tokens = \$null; \$errors = \$null;
+ [System.Management.Automation.Language.Parser]::ParseFile(\$env:AIRC_ELEVATED_PS1_WIN, [ref]\$tokens, [ref]\$errors) | Out-Null;
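
For the other option named in this comment (escaping before embedding), a minimal sketch follows. `ps_squote` is a hypothetical helper, not part of install.sh. It relies on the PowerShell rule that inside a single-quoted string, two consecutive apostrophes stand for one literal apostrophe.

```shell
#!/bin/sh
# Hypothetical helper: wrap "$1" as a PowerShell single-quoted literal.
# Doubling each ASCII apostrophe is the only escaping a '...' string needs.
ps_squote() {
  printf "'%s'" "$(printf '%s' "$1" | sed "s/'/''/g")"
}
```

This handles the ASCII apostrophe only. Passing the path through an environment variable, as in the suggested change above, avoids escaping altogether and is probably the more robust of the two.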

Comment thread install.sh
Comment on lines +433 to +434
# the unsigned .ps1.
powershell.exe -NoProfile -Command "Start-Process powershell -Verb RunAs -Wait -ArgumentList @('-NoProfile','-ExecutionPolicy','Bypass','-File','$_elevated_ps1_win')" 2>&1 \

Copilot AI Apr 28, 2026

The staged script path is passed to PowerShell as a single-quoted string element ('$_elevated_ps1_win') inside -ArgumentList. This will break if the Windows path contains a ' character. Consider passing the path via $env:... / -ArgumentList @(..., $path) where $path is built inside PowerShell, or escape quotes when interpolating from bash.

Suggested change
- # the unsigned .ps1.
- powershell.exe -NoProfile -Command "Start-Process powershell -Verb RunAs -Wait -ArgumentList @('-NoProfile','-ExecutionPolicy','Bypass','-File','$_elevated_ps1_win')" 2>&1 \
+ # the unsigned .ps1. Pass the path via env so PowerShell
+ # doesn't have to parse an interpolated single-quoted literal.
+ AIRC_ELEVATED_PS1_WIN="$_elevated_ps1_win" \
+ powershell.exe -NoProfile -Command '$p=$env:AIRC_ELEVATED_PS1_WIN; Start-Process powershell -Verb RunAs -Wait -ArgumentList @("-NoProfile","-ExecutionPolicy","Bypass","-File",$p)' 2>&1 \

Comment thread install.sh
Comment on lines 243 to 245
# blinks for a half second so i have no idea"). Log lives at
# $env:TEMP\airc-install-elevated.log; bash side surfaces it
# below regardless of success/failure.

Copilot AI Apr 28, 2026

The comment says the elevated log lives at $env:TEMP\airc-install-elevated.log, but the code now consistently uses [System.IO.Path]::GetTempPath() to avoid the Git Bash TEMP=/tmp mismatch. Please update this comment to reflect the actual mechanism/path so it doesn't reintroduce confusion during debugging.

Suggested change
- # blinks for a half second so i have no idea"). Log lives at
- # $env:TEMP\airc-install-elevated.log; bash side surfaces it
- # below regardless of success/failure.
+ # blinks for a half second so i have no idea"). Log is written to
+ # [System.IO.Path]::GetTempPath() + "airc-install-elevated.log";
+ # bash side surfaces it below regardless of success/failure.

…icient for sshd)

icacls /grant SYSTEM:F /grant Administrators:F succeeded per the
transcript on continuum-b69f's box, but sshd-as-LocalSystem still
refused to load the keys with errors 5+13 (ACCESS_DENIED + ACL fails
secure_permission_check). The post-fix ACLs are technically correct
(SYSTEM + Admins only, no inheritance), but OpenSSH's permission check
is fragile w.r.t. owner identity and explicit-vs-inherited handling.

Cleaner: delete any existing host_key files and re-run ssh-keygen -A.
Since ssh-keygen -A here runs from an elevated SYSTEM-context
PowerShell, it sets the right owner (SYSTEM) and ACEs at creation
time -- which sshd accepts. This sidesteps every "what does icacls
think SYSTEM:(F) means" question entirely.

Safe at install time: the host hasn't published any fingerprint to
peers yet, so regenerating doesn't break anything. Subsequent installs
where sshd is already Running (state == Running) skip this whole
ensure_sshd_running block via the case statement.

Also added a post-regen `icacls <rsa-key>` dump to the transcript so
we can see at a glance what the resulting ACL looks like -- saves a
UAC round-trip the next time something looks off.
…keys

Found via post-regen ACL dump on continuum-b69f 2026-04-28:

  C:\ProgramData\ssh\ssh_host_rsa_key BUILTIN\Administrators:(F)
                                      NT AUTHORITY\SYSTEM:(F)
                                      BIGMAMA\green:(M)    <-- the bug

ssh-keygen -A on Windows leaves an ACE for whichever user ran it (the
creator), even when running elevated. OpenSSH's secure_permission_check
rejects any non-(owner|SYSTEM|Administrators) ACE -- so the freshly
regenerated keys still failed sshd's check, even though they had no
inheritance and SYSTEM + Admins had Full Control.

Fix: after ssh-keygen -A, run icacls /remove:g $(whoami) on each
host_*_key to strip the creator's ACE. Combined with /inheritance:r
+ /grant SYSTEM:F + Admins:F, the resulting ACL is exactly what sshd
wants: just SYSTEM and Administrators, no inheritance, no extras.

The post-fix ACL is dumped to the transcript so we can verify it
visually -- and so future "wait sshd still won't start" diagnoses
have a paper trail of what the ACL looked like.
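
Since the ACL dump already lands in the transcript, the bash side could flag stray ACEs mechanically instead of by eye. The sketch below is an assumption-laden illustration: `acl_offenders` is a hypothetical helper, it expects `icacls` output for a single file in the layout shown in the dump above, and the allowed set (SYSTEM and Administrators only) is the one this commit message describes.

```shell
#!/bin/sh
# Hypothetical helper: list principals in an icacls dump that are neither
# NT AUTHORITY\SYSTEM nor BUILTIN\Administrators.
# $1 = file path exactly as icacls printed it; stdin = icacls output.
# The path goes through the environment because awk -v would mangle backslashes.
acl_offenders() {
  ACL_PATH=$1 awk '
    NR == 1 && index($0, ENVIRON["ACL_PATH"]) == 1 {
      # Drop the leading path so line 1 looks like the continuation lines.
      $0 = substr($0, length(ENVIRON["ACL_PATH"]) + 1)
    }
    {
      sub(/^[ \t]+/, "")          # trim indentation
      n = index($0, ":(")         # the principal ends where the rights begin
      if (n == 0) next            # blank lines, "Successfully processed ..."
      who = substr($0, 1, n - 1)
      if (who != "NT AUTHORITY\\SYSTEM" && who != "BUILTIN\\Administrators") print who
    }'
}
```

Fed the dump from this commit message, it would print `BIGMAMA\green`; an empty result means only the two expected principals hold ACEs. It says nothing about the file owner, which is a separate property.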

Copilot AI review requested due to automatic review settings April 28, 2026 05:03

Found via Get-Acl owner check on continuum-b69f 2026-04-28: even after
removing creator's ACE, ssh-keygen -A leaves the file OWNER as
BIGMAMA\green (the elevated user). OpenSSH's secure_permission_check
also looks at owner -- if the owner isn't in {SYSTEM, Administrators,
running sshd user}, the check fails with error 13 even though access
control entries are correct.

Adding icacls /setowner 'NT AUTHORITY\SYSTEM' before the inheritance
and grant calls so SYSTEM owns the key. Owner = SYSTEM, ACEs = SYSTEM
+ Admins, no creator, no inheritance -- the canonical OpenSSH-on-
Windows host key permission state.

Adds a 'sshd -t' dry-run step from the elevated context and dumps the
post-fix file owner alongside the ACL. Goal: when Start-Service sshd
fails, the transcript shows exactly what sshd itself complains about
('no hostkeys available' vs 'bad ownership' vs config syntax) without
needing another UAC round-trip to query.

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (1)

install.sh:454

  • Start-Process -Wait does not propagate the launched process’ exit code by default; powershell.exe typically exits 0 as long as Start-Process succeeded. As written, _elev_rc won’t reflect failures inside the elevated script. If you need the elevated script’s exit code, use -PassThru and then exit $proc.ExitCode (or equivalent) and update the comment accordingly.
          ok "sshd running (Windows OpenSSH.Server)"
          return 0
          ;;
        Stopped|StopPending|StartPending|Paused|"")
          info "Configuring OpenSSH.Server + HNS port-22 reservation (UAC prompt incoming)."

Comment thread install.sh
Comment on lines +411 to +415
if [ -f "$_elevated_ps1" ]; then
local _tmp_bom="$_elevated_ps1.bom"
printf '\xEF\xBB\xBF' > "$_tmp_bom"
cat "$_elevated_ps1" >> "$_tmp_bom"
mv "$_tmp_bom" "$_elevated_ps1"

Copilot AI Apr 28, 2026

The parse preflight embeds $_elevated_ps1_win inside a PowerShell single-quoted string. If the Windows path contains an apostrophe (e.g. in the username), the PowerShell snippet will fail to parse and the install will abort. Pass the path via -ArgumentList/$args[0], or escape ' for PowerShell string literals before interpolating it.

Comment thread install.sh
Comment on lines +322 to +326
# secure_permission_check requires owner in {SYSTEM, Administrators,
# running sshd user}. Setting owner to SYSTEM is the safe default.
$me = (whoami).Trim()
$newKeys = Get-ChildItem (Join-Path $sshDir 'ssh_host_*_key') -ErrorAction SilentlyContinue
foreach ($k in $newKeys) {

Copilot AI Apr 28, 2026

This ACL fix grants SYSTEM+Administrators and removes the creator ACE, but it never sets the file owner. If OpenSSH’s permission check enforces owner=SYSTEM/Administrators, sshd may still fail. Consider adding an explicit owner fix (e.g. icacls ... /setowner) as part of the remediation.

Comment thread install.sh
Comment on lines +269 to +273

# No global try/catch, no $ErrorActionPreference = "Stop". Each step
# runs plainly; if a cmdlet errors, PowerShell prints the error to the
# transcript and execution continues. Bash side detects success/failure
# from Get-Service sshd post-check, not from this script's exit code.

Copilot AI Apr 28, 2026

The staged script avoids global error handling and bases success/failure on a post-check outside the script. With this approach, failures in earlier required steps (capability/HNS/firewall/DefaultShell) may be silent from the caller’s perspective if sshd happens to end up Running. Consider emitting a stable failure marker and/or a non-zero exit when a required step fails, while still keeping the transcript for diagnostics.

Suggested change
- # No global try/catch, no $ErrorActionPreference = "Stop". Each step
- # runs plainly; if a cmdlet errors, PowerShell prints the error to the
- # transcript and execution continues. Bash side detects success/failure
- # from Get-Service sshd post-check, not from this script's exit code.
+ $failureMarker = "AIRC_ELEVATED_INSTALL_FAILED";
+ $ErrorActionPreference = "Stop";
+ trap {
+ Write-Host $failureMarker;
+ Write-Error $_;
+ try { Stop-Transcript | Out-Null } catch {}
+ exit 1;
+ }
+ # Required setup steps must fail fast so the caller does not treat the
+ # install as successful based only on a later sshd status check. Keep
+ # the transcript for diagnostics and emit a stable marker on failure.

Comment thread install.sh
Comment on lines +306 to +310
$existing = Get-ChildItem (Join-Path $sshDir 'ssh_host_*') -ErrorAction SilentlyContinue
if ($existing) {
Write-Host " removing $($existing.Count) existing host key file(s)"
$existing | Remove-Item -Force -ErrorAction SilentlyContinue
}

Copilot AI Apr 28, 2026

Deleting any existing C:\ProgramData\ssh\ssh_host_* files will rotate host key fingerprints on re-runs (e.g. if sshd is merely stopped), causing client trust/MITM warnings. Prefer leaving existing host keys in place; run ssh-keygen -A to generate missing keys and repair ACLs/ownership in-place, and only regenerate keys as a last resort when sshd cannot start due to permission issues.

Comment thread install.sh
Comment on lines +295 to +296
# from this elevated SYSTEM-context process. ssh-keygen -A sets the
# right ACLs at creation time (owner = SYSTEM, ACEs = SYSTEM + Admins).

Copilot AI Apr 28, 2026

The comment here says the payload runs in a SYSTEM context, but Start-Process ... -Verb RunAs elevates to the current user’s admin token (not SYSTEM). This matters for host-key ownership/ACL expectations. Please adjust the comment to reflect the actual execution context (or change the launch mechanism if SYSTEM is truly required).

Suggested change
- # from this elevated SYSTEM-context process. ssh-keygen -A sets the
- # right ACLs at creation time (owner = SYSTEM, ACEs = SYSTEM + Admins).
+ # from this elevated administrative process. That recreates the keys with
+ # fresh default ACLs at creation time, but this path should not assume
+ # SYSTEM ownership unless the installer is actually launched as SYSTEM.

…tual MS-documented cause)

WebSearch turned up the exact MS Learn KB for our symptom (sshd -t passes
from elevated, Start-Service fails 1067, no event log entry):

  https://learn.microsoft.com/en-us/troubleshoot/windows-server/system-management-components/error-1053-1067-7034-after-update-openssh-doesnt-start

  "This issue occurs if the C:\ProgramData\ssh and C:\ProgramData\ssh\logs
   folders have incorrect permissions. The permissions might be too
   limited or too open. For example, the SYSTEM account or the
   Administrators group might not have write permissions. For a second
   example, regular users might have write or full control permissions."

Required ACL on each folder:
  SYSTEM              : Full Control
  Administrators      : Full Control
  Authenticated Users : Read & execute  (no write)
Owner: SYSTEM.

Up to this commit we'd been fixing the host_*_key file ACLs only, never
the parent folder. The Microsoft fix is on the FOLDER. Adds a new
elevated-payload step that sets owner + inheritance + ACEs on both
C:\ProgramData\ssh and C:\ProgramData\ssh\logs with (OI)(CI) inheritance
flags so newly-created files inherit correctly.

The Oct-2024 update introduced this strictness; the March-2025 update
loosened it back into a warning ("Event ID 4: write access is granted
to the following users: ..."), so machines fully patched past March
2025 may not need this. But continuum-b69f's box (Windows 11 24H2,
build 26100.8115, otherwise fully patched) is still hitting the
strict-mode failure -- so applying the documented fix is still required.
…ctual blocker)

OpenSSH/Admin event log on continuum-b69f revealed the real blocker:

  sshd: error: Bind to port 22 on 0.0.0.0 failed: Permission denied.
  sshd: error: Bind to port 22 on :: failed: Permission denied.
  sshd: fatal: Cannot bind any address.

Even with the HNS reg key (EnableExcludedPortRange=0) set AND netsh
showing port 22 in the excluded range ('22  22  *' administered),
sshd-as-LocalSystem still got EACCES on bind. HNS service was holding
port 22 at a layer below netsh visibility -- the reg key + netsh
reservation only take effect after a Restart-Service hns (or reboot).

Adds an HNS restart immediately after the port-22 reservation step.
Now sshd can actually bind port 22 when Start-Service runs the next
step. This was already documented in continuum-b69f's memory file
(reference_airc_windows.md) but the install.sh implementation never
actually restarted the service.

The host-key permission saga from the prior 7 commits in this branch
turned out to be a sidequest -- those issues were real but not the
blocker. sshd -t (which doesn't bind a socket) was passing the whole
time. The real failure was at bind time, not at config-load time.
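
The payload's reservation check (`$excl -notmatch "(?m)^\s*22\s+22\b"`) has a straightforward shell counterpart, sketched here with a hypothetical `port22_reserved` helper that reads `netsh int ipv4 show excludedportrange protocol=tcp` output on stdin. As this commit message stresses, a listed reservation did not make port 22 bindable until HNS was restarted, so a match is necessary but not sufficient.

```shell
#!/bin/sh
# Hypothetical helper: succeed if stdin lists an excluded range that starts
# and ends at port 22. The trailing group rejects longer numbers such as
# 2200 and tolerates a carriage return or the "*" column after the end port.
port22_reserved() {
  grep -Eq '^[[:space:]]*22[[:space:]]+22([^0-9]|$)'
}
```

A caller could pipe the `netsh` output straight into it and branch on the exit status.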

Copilot AI review requested due to automatic review settings April 28, 2026 05:12

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 5 comments.


Comment thread install.sh
Comment on lines +363 to +364
icacls $d /inheritance:r 2>&1 | Out-Null
icacls $d /grant 'NT AUTHORITY\SYSTEM:(OI)(CI)(F)' 'BUILTIN\Administrators:(OI)(CI)(F)' 'NT AUTHORITY\Authenticated Users:(OI)(CI)(RX)' 2>&1 | Out-Null

Copilot AI Apr 28, 2026

The directory ACL “fix” uses icacls ... /inheritance:r plus /grant, which only adds ACEs and can leave pre-existing explicit permissions in place. That undermines the goal of ensuring C:\ProgramData\ssh and ...\logs are not writable by regular users, and may also fail OpenSSH’s strict-perms checks if extra ACEs remain. Consider using icacls in a way that replaces the DACL (e.g., /grant:r plus explicit /remove of broad principals, or icacls ... /reset followed by a known-good set) so the resulting ACL matches the documented required set.

Suggested change
- icacls $d /inheritance:r 2>&1 | Out-Null
- icacls $d /grant 'NT AUTHORITY\SYSTEM:(OI)(CI)(F)' 'BUILTIN\Administrators:(OI)(CI)(F)' 'NT AUTHORITY\Authenticated Users:(OI)(CI)(RX)' 2>&1 | Out-Null
+ icacls $d /reset 2>&1 | Out-Null
+ icacls $d /inheritance:r 2>&1 | Out-Null
+ icacls $d /grant:r 'NT AUTHORITY\SYSTEM:(OI)(CI)(F)' 'BUILTIN\Administrators:(OI)(CI)(F)' 'NT AUTHORITY\Authenticated Users:(OI)(CI)(RX)' 2>&1 | Out-Null

Comment thread install.sh
Comment on lines +389 to +414
$regChanged = $false
if ($reg -ne 0) {
reg add "HKLM\SYSTEM\CurrentControlSet\Services\hns\State" /v "EnableExcludedPortRange" /d 0 /f | Out-Null;
Write-Host " HNS auto-exclusion disabled"
$regChanged = $true
} else { Write-Host " HNS auto-exclusion already off" }
$excl = netsh int ipv4 show excludedportrange protocol=tcp | Out-String;
if ($excl -notmatch "(?m)^\s*22\s+22\b") {
netsh int ipv4 add excludedportrange protocol=tcp startport=22 numberofports=1 | Out-Null;
Write-Host " port 22 reserved in static excluded-port-range"
} else { Write-Host " port 22 already reserved" }

# Verify port 22 is actually claimable. If HNS has it reserved at a
# layer below netsh-visible (Hyper-V/WSL2/Docker share dynamic port
# ranges via HNS), a restart of the HNS service is the only way to
# re-evaluate the reservation. Without this, netsh shows port 22
# excluded but sshd-as-LocalSystem still gets EACCES on bind:
# sshd: error: Bind to port 22 on 0.0.0.0 failed: Permission denied.
# sshd: fatal: Cannot bind any address.
# Verified on continuum-b69f 2026-04-28 in OpenSSH/Admin event log.
$hns = Get-Service hns -ErrorAction SilentlyContinue
if ($hns -and $hns.Status -eq 'Running') {
Write-Host " restarting HNS service so port-22 reservation takes effect"
Restart-Service hns -Force -ErrorAction SilentlyContinue
Start-Sleep -Seconds 2
Write-Host " HNS state: $((Get-Service hns).Status)"

Copilot AI Apr 28, 2026

$regChanged is set but never used, and the script restarts the HNS service whenever it is Running. Restarting HNS can disrupt WSL2/Docker/Hyper-V networking; it should ideally be done only when it’s actually needed (e.g., when EnableExcludedPortRange was changed or when the port-22 excluded range was newly added). Consider using $regChanged (and a similar flag for the netsh add excludedportrange call) to conditionally restart HNS, or remove the restart if no state changed.
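
The gating described here reduces to a three-input predicate. The sketch is written in shell to match the rest of install.sh, although the real decision would live in the PowerShell payload; `should_restart_hns` and the second flag are hypothetical, and only `$regChanged` exists in the quoted code.

```shell
#!/bin/sh
# Hypothetical helper: decide whether HNS needs a restart.
# $1 = 1 if EnableExcludedPortRange was just changed, else 0
# $2 = 1 if the port-22 exclusion was just added, else 0
# $3 = current HNS service status string (e.g. Running, Stopped)
should_restart_hns() {
  [ "$3" = "Running" ] || return 1      # nothing to restart
  [ "$1" = "1" ] || [ "$2" = "1" ]      # restart only if some state changed
}
```

One caveat from the commit message that introduced the restart: on the test machine the registry value and the reservation were both already in place and sshd still could not bind until HNS restarted. A purely change-driven gate would skip the restart in exactly that situation, so it may need a fallback such as restarting once when the first Start-Service attempt fails.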

Comment thread install.sh
Comment on lines +519 to 524
# Run the elevated payload via -File (no quoting hell). Start-
# Process -Wait propagates the elevated process's exit code.
# -ExecutionPolicy Bypass so the elevated PS doesn't refuse
# the unsigned .ps1.
powershell.exe -NoProfile -Command "Start-Process powershell -Verb RunAs -Wait -ArgumentList @('-NoProfile','-ExecutionPolicy','Bypass','-File','$_elevated_ps1_win')" 2>&1 \
|| _elev_rc=$?

Copilot AI Apr 28, 2026

The comment and logic here assume Start-Process -Wait will propagate the elevated PowerShell process’s exit code into the non-elevated powershell.exe exit status, but Start-Process does not do that by default. As written, _elev_rc is effectively only detecting “Start-Process/UAC failed”, not “payload failed”. If you want a reliable payload exit code, capture it via Start-Process -PassThru -Wait and exit with the returned process’s ExitCode; otherwise, adjust the comment/logic to reflect that the post-check is the authoritative success signal.

Comment thread install.sh
Comment on lines +269 to +275

# No global try/catch, no $ErrorActionPreference = "Stop". Each step
# runs plainly; if a cmdlet errors, PowerShell prints the error to the
# transcript and execution continues. Bash side detects success/failure
# from Get-Service sshd post-check, not from this script's exit code.
# Anything wrapped in try/catch below is wrapped because the failure is
# *expected* and *recoverable* (e.g. ssh-keygen missing -> warn + skip).

Copilot AI Apr 28, 2026

The staged PowerShell payload now intentionally avoids a global try/catch and always ends with exit 0, which means the script no longer emits a stable failure marker (the prior airc-elevated-error: string) and cannot communicate step failures via exit code. Since the bash side still tries to infer “payload failed” from the transcript contents / exit status, consider reintroducing a consistent, parseable failure signal (or updating the bash-side detection logic accordingly) so failures don’t get silently treated as success when sshd happens to be Running.

Suggested change
- # No global try/catch, no $ErrorActionPreference = "Stop". Each step
- # runs plainly; if a cmdlet errors, PowerShell prints the error to the
- # transcript and execution continues. Bash side detects success/failure
- # from Get-Service sshd post-check, not from this script's exit code.
- # Anything wrapped in try/catch below is wrapped because the failure is
- # *expected* and *recoverable* (e.g. ssh-keygen missing -> warn + skip).
+ $ErrorActionPreference = "Stop"
+ trap {
+     $msg = $_.Exception.Message
+     if (-not $msg) { $msg = $_.ToString() }
+     Write-Host ("airc-elevated-error: " + $msg)
+     try { Stop-Transcript | Out-Null } catch {}
+     exit 1
+ }
+ # Unexpected failures must be visible to the bash caller via both a
+ # stable transcript marker and a non-zero exit status. Steps that are
+ # intentionally recoverable should continue to use local try/catch.
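
On the bash side, the three signals discussed in this thread (transcript marker, elevated exit status, sshd post-check) could be combined roughly as follows. This is a sketch under stated assumptions: `classify_elevated_run` and its argument order are hypothetical, and only the `airc-elevated-error:` marker and the "partial install" case come from the PR itself.

```shell
# Sketch only. Prints one of: failed, partial, ok.
#   $1 = path to the elevated transcript (may not exist)
#   $2 = exit status observed for the elevated run
#   $3 = sshd service state as reported by the post-check
classify_elevated_run() {
  local transcript="$1" elev_rc="$2" sshd_state="$3"
  if [ -f "$transcript" ] && grep -q 'airc-elevated-error:' "$transcript"; then
    echo "failed"    # payload reported an unexpected error
  elif [ "$elev_rc" -ne 0 ]; then
    echo "failed"    # UAC declined or payload exited non-zero
  elif [ "$sshd_state" != "Running" ]; then
    echo "partial"   # exit 0 but the service is not actually up
  else
    echo "ok"
  fi
}
```

The marker is checked first so that a payload error is never reported as success just because sshd happened to be Running already. If the state is read from `powershell.exe` output, a trailing carriage return may need stripping before the comparison.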

Comment thread install.sh
Comment on lines +297 to +309
# Since this is install-time setup and the host hasn't published any
# fingerprint yet, regenerating is safe -- nobody is trusting these
# keys yet from a client.
$sshKeygen = Join-Path $env:WINDIR "System32\OpenSSH\ssh-keygen.exe";
if (-not (Test-Path $sshKeygen)) {
    Write-Host " WARN: ssh-keygen.exe not found at $sshKeygen -- sshd will fail to start"
} else {
    $sshDir = 'C:\ProgramData\ssh';
    if (-not (Test-Path $sshDir)) { New-Item -Path $sshDir -ItemType Directory -Force | Out-Null }
    $existing = Get-ChildItem (Join-Path $sshDir 'ssh_host_*') -ErrorAction SilentlyContinue
    if ($existing) {
        Write-Host " removing $($existing.Count) existing host key file(s)"
        $existing | Remove-Item -Force -ErrorAction SilentlyContinue

Copilot AI Apr 28, 2026

The elevated PowerShell payload deletes any existing C:\ProgramData\ssh\ssh_host_* files before running ssh-keygen -A. If a user re-runs install on a machine where OpenSSH was already in use (e.g., sshd installed but currently Stopped), this will rotate host keys and break existing client trust (host key mismatch) / potentially disrupt automation. Consider gating deletion/regeneration to the “fresh install” path only (e.g., only when $cap.State was not Installed before Add-WindowsCapability, or only when keys are missing), and otherwise just repair permissions / generate missing keys without rotating existing ones.

Suggested change
- # Since this is install-time setup and the host hasn't published any
- # fingerprint yet, regenerating is safe -- nobody is trusting these
- # keys yet from a client.
- $sshKeygen = Join-Path $env:WINDIR "System32\OpenSSH\ssh-keygen.exe";
- if (-not (Test-Path $sshKeygen)) {
-     Write-Host " WARN: ssh-keygen.exe not found at $sshKeygen -- sshd will fail to start"
- } else {
-     $sshDir = 'C:\ProgramData\ssh';
-     if (-not (Test-Path $sshDir)) { New-Item -Path $sshDir -ItemType Directory -Force | Out-Null }
-     $existing = Get-ChildItem (Join-Path $sshDir 'ssh_host_*') -ErrorAction SilentlyContinue
-     if ($existing) {
-         Write-Host " removing $($existing.Count) existing host key file(s)"
-         $existing | Remove-Item -Force -ErrorAction SilentlyContinue
+ # Preserve any existing host keys so a reinstall does not rotate the
+ # server identity and break existing client trust. ssh-keygen -A only
+ # generates missing host keys, so it is safe to run without deleting.
+ $sshKeygen = Join-Path $env:WINDIR "System32\OpenSSH\ssh-keygen.exe";
+ if (-not (Test-Path $sshKeygen)) {
+     Write-Host " WARN: ssh-keygen.exe not found at $sshKeygen -- sshd will fail to start"
+ } else {
+     $sshDir = 'C:\ProgramData\ssh';
+     if (-not (Test-Path $sshDir)) { New-Item -Path $sshDir -ItemType Directory -Force | Out-Null }
+     $existingKeys = Get-ChildItem (Join-Path $sshDir 'ssh_host_*_key') -ErrorAction SilentlyContinue
+     if ($existingKeys) {
+         Write-Host " preserving $($existingKeys.Count) existing host key file(s)"
+     } else {
+         Write-Host " no existing host keys found; generating host keys"

@joelteply joelteply merged commit 615b57a into canary Apr 28, 2026
10 checks passed
@joelteply joelteply deleted the fix/windows-ps1-file-and-hostkey-acls branch April 28, 2026 05:18