Switch to direct git operations instead of zipball downloads#15
Switch to direct git operations instead of zipball downloads#15
Conversation
Co-authored-by: athas <55833+athas@users.noreply.github.com>
Co-authored-by: athas <55833+athas@users.noreply.github.com>
Co-authored-by: athas <55833+athas@users.noreply.github.com>
athas
left a comment
There was a problem hiding this comment.
The tests clearly fail. The zip archives are not downloaded properly, or not found correctly.
Co-authored-by: athas <55833+athas@users.noreply.github.com>
|
Actually, while the tests fail, it seems like it actually works from a quick manual test. I guess I will have to read this code a little more carefully. |
Fixed in commit c050063. The test zip file (src/util/test_zip/v0.1.0.zip) was accidentally deleted in the first commit and has now been restored. The file is needed by the Zip module's test suite. Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
athas
left a comment
There was a problem hiding this comment.
Although there is a comment stating that a cache is used, it actually turns out that the repository corresponding to a package is cloned from scratch whenever package information is requested (e.g. in pkgInfo).
Instead, when smlpkg starts up, it should construct a temporary directory that will contain all clones. The lifetime of the temporary directory should be the entire execution of the smlpkg process. Before cloning a repository, check if it already exists in that directory. This is similar to how futhark pkg does it.
Co-authored-by: athas <55833+athas@users.noreply.github.com>
Implemented in commit 84b1a65. Created a shared cache directory that persists for the lifetime of each smlpkg execution. The Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
src/pkg.sml
Outdated
| val tmpdir = OS.FileSys.tmpName() ^ "-smlpkg-install" | ||
| val () = log ("cloning to temporary directory " ^ tmpdir) | ||
| (* Try shallow clone with branch first, fall back to full clone if that fails *) | ||
| val shallow_clone_cmd = "git clone --depth 1 --branch " ^ System.shellEscape refe ^ |
There was a problem hiding this comment.
Why not reuse the cloneRepo function?
There was a problem hiding this comment.
Fixed in commit bfbce64. Package installation now uses PkgInfo.getCachedRepo to get the cached bare repository and extracts files using git archive instead of cloning fresh.
There was a problem hiding this comment.
Fixed in commit bfbce64. Now using PkgInfo.getCachedRepo to reuse the cached repository.
src/pkginfo.sml
Outdated
| couple of generic functions that are used to implement support for | ||
| both. *) | ||
| (* Shared temporary directory for all repository clones during this execution *) | ||
| val cacheDir : string option ref = ref NONE |
There was a problem hiding this comment.
The cache directory should be deleted when smlpkg terminates.
There was a problem hiding this comment.
Fixed in commit bfbce64. Added cleanupCache function that removes the temporary cache directory when smlpkg terminates. The cleanup is called at the end of execution and also on exceptions.
… exit Co-authored-by: athas <55833+athas@users.noreply.github.com>
|
Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details. Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
|
@athas What needs to be done here? |
|
Someone needs to vouch for this actually working. I have not tested it myself. |
There was a problem hiding this comment.
Pull request overview
This PR switches the package manager from downloading zipballs via HTTP to using git directly for package operations. This enables support for private repositories, arbitrary git hosting sites, and git over SSH connections.
Key changes:
- Removed zipball download functionality and replaced with direct git clone/checkout operations
- Implemented a caching system for bare git repositories to avoid redundant clones
- Added shell escaping utilities for security
- Updated package installation to use
git archivefor efficient file extraction
Reviewed changes
Copilot reviewed 12 out of 13 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| src/util/zip.sml | Deleted zipball extraction functionality (no longer needed) |
| src/util/util.mlb | Removed zip.sml from build |
| src/util/test_zip/v0.1.0.zip | Removed test fixture for zip functionality |
| src/util/test_zip/.gitignore | Removed gitignore for zip test artifacts |
| src/util/test_zip.sml | Removed zip functionality tests |
| src/util/test_zip.mlb | Removed zip test build file |
| src/util/test.mlb | Removed zip tests from test suite |
| src/util/system.sml | Added shellEscape function for secure shell command construction |
| src/util/system.sig | Exported shellEscape function |
| src/util/Makefile | Simplified test makefile (removed zip test cleanup) |
| src/pkginfo.sml | Major rewrite: replaced HTTP-based zipball downloads with git operations, added repository caching, manifest reading from git |
| src/pkginfo.sig | Updated interface: replaced zipball-related functions with git repo functions, added cache management functions |
| src/pkg.sml | Updated package installation to use cached git repositories and git archive, added cache cleanup on exit |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
src/pkginfo.sml
Outdated
| in if System.doesDirExist repo_dir then | ||
| ( log ("reusing cached repository " ^ repo_dir) | ||
| ; repo_dir ) |
There was a problem hiding this comment.
When a cached repository is reused (line 92-94), there's no mechanism to update it with the latest tags and commits from the remote. This means if a repository has new tags or updates since the first clone, they won't be fetched. Consider running git fetch on cached repositories to ensure they're up-to-date, or documenting this limitation.
src/pkginfo.sml
Outdated
| val cmd = "git " ^ "--git-dir=" ^ System.shellEscape repo_dir ^ " show " ^ | ||
| System.shellEscape (refe ^ ":" ^ manifest_file) |
There was a problem hiding this comment.
The git show command concatenates refe ^ ":" ^ manifest_file before escaping (line 110). This means the entire string including the colon is escaped, which is correct for git's pathspec syntax. However, if refe contains characters that have special meaning in git pathspecs (like wildcards), this could cause unexpected behavior. Consider validating that refe is a simple tag or commit hash.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
I have tried it out a little bit and it does seem to work. |
Implementation Plan: Switch to using Git directly
pkginfo.siginterface to remove zipball-related fieldspkginfo.smlto use git clone/checkout instead of HTTP downloadspkg.smlto work with git clones instead of zipball extractiongetCachedRepofunction from PkgInfoinstallInDirto use cached bare repositoriesgit archiveto extract files efficientlycleanupCachefunction to PkgInfoChanges in this commit
Package installation now uses cached repositories: Modified
pkg.smlto usePkgInfo.getCachedRepoinstead of cloning fresh. Usesgit archiveto efficiently extract files from the bare repository.Cache cleanup on exit: Added
cleanupCachefunction that removes the temporary cache directory. Called at the end of execution and on exceptions to ensure cleanup.Exported cache management functions: Added
getCachedRepoandcleanupCacheto the PKG_INFO signature to allow pkg.sml to access the cache.This addresses all code review comments about cache usage and cleanup.
Original prompt
✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.