Skip to content

Conversation

lolzdev
Copy link

@lolzdev lolzdev commented Aug 18, 2025

I refactored a bit the code related to fetching dependencies over git so that it's easier to implement different transports.
With that I implemented the git protocol transport (git://) reserving space for ssh eventually (ssh://).

Copy link
Contributor

@ianprime0509 ianprime0509 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please note that I'm not a member of the Zig core team, so feel free to ignore any or all of this feedback. I am familiar with this code, though, so hopefully some of it will be helpful.

Also, I haven't tested the change extensively, but I did try one example that's relevant to one of my projects and it worked:

ian@ian-laptop:~/src/zig/build$ stage4/bin/zig fetch git+git://g.blicky.net/yxml.git/
N-V-__8AAAVVAQC7bPAomXJZrTzJUIEMXq0LwNXN5mkVw36I

(this server is the only one I know of currently where, for whatever reason, the HTTPS Git server doesn't support the right protocol version, but the Git network protocol apparently works fine)

Comment on lines +656 to +657
// This is a placeholder: git also supports using ssh as a transport
ssh: u8,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to leave this out for now and add it if/when git+ssh support is added, since otherwise it's just dead code.

@@ -811,7 +843,7 @@ pub const Session = struct {
}

const CapabilityIterator = struct {
request: std.http.Client.Request,
request: ?std.http.Client.Request = null,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick, but in my opinion, this would read better as a tagged union of all supported transports, e.g.

resource: union(enum) {
    http: std.http.Client.Request,
    git,
},

(I also changed the name here to resource to be a bit more generic, especially since it isn't being used for anything except to make sure it's deinitialized at the right time)

(same comment for RefIterator and FetchStream)

var body: std.Io.Writer = .fixed(response_buffer);
try Packet.write(.{ .data = "command=fetch\n" }, &body);
var body: std.Io.Writer = .fixed(packet_buffer);
try Packet.write(.{ .data = "command=fetch" }, &body);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note, the pack-protocol documentation says

[...] unless otherwise noted the usual pkt-line LF rules apply: the sender SHOULD include a LF, but the receiver MUST NOT complain if it is not present.

so it should be fine to remove the LFs as you did here, although I haven't tested this change extensively across many different servers to make sure they're all following this.

try Packet.write(.{ .data = object_format_packet }, &body);
}
try Packet.write(.delimiter, &body);
// Our packfile parser supports the OFS_DELTA object type
try Packet.write(.{ .data = "ofs-delta\n" }, &body);
try Packet.write(.{ .data = "thin-pack" }, &body);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason behind adding this capability? The documentation describes it as

Request that a thin pack be sent, which is a pack with deltas
which reference base objects not contained within the pack (but
are known to exist at the receiving end). This can reduce the
network traffic significantly, but it requires the receiving end
to know how to "thicken" these packs by adding the missing bases
to the pack.

which doesn't seem relevant to this implementation, since we don't advertise having any existing refs to the server.

@@ -1099,22 +1178,24 @@ pub const Session = struct {
const input = fs.input;
if (fs.remaining_len == 0) {
while (true) {
switch (Packet.peek(input) catch |err| {
peek: switch (Packet.peek(input) catch |err| {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the change to use a labeled switch here, since it evaluates to void and there are no associated continues. Was there another motivation for doing this that I'm missing?

if (ascii.eqlIgnoreCase(uri.scheme, "git+http") or
ascii.eqlIgnoreCase(uri.scheme, "git+https"))
{
if (ascii.startsWithIgnoreCase(uri.scheme, "git+")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way this is currently implemented, the fetch URL ends up starting with git+git://. In my opinion, it would make more sense as just git:// for the Git network protocol, since Git is effectively the "native" application for the protocol.

A couple examples as prior art:

  • NPM: "<protocol> is one of git, git+ssh, git+http, git+https, or git+file."
  • PIP actually supports both: "The supported schemes are git+file, git+https, git+ssh, git+http, git+git and git."


if (ascii.eqlIgnoreCase(transport_uri.scheme, "git")) {
var buffer: [std.Uri.host_name_max]u8 = undefined;
var socket = std.net.tcpConnectToHost(f.arena.allocator(), transport_uri.getHost(&buffer) catch @panic("handle this"), 9418) catch @panic("handle this");
Copy link
Contributor

@ianprime0509 ianprime0509 Aug 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These errors should be handled gracefully, particularly the connection error:

ian@ian-laptop:~/src/zig/build$ stage4/bin/zig fetch git+git://github.com/ianprime0509/zig-xml
thread 440446 panic: handle this
/home/ian/src/zig/lib/std/posix.zig:1861:23: 0x1a2c743 in openatZ (std.zig)
            .NOENT => return error.FileNotFound,
                      ^
[...]

(the reason for this not working is that GitHub does not support the Git network protocol)


if (ascii.eqlIgnoreCase(transport_uri.scheme, "git")) {
var buffer: [std.Uri.host_name_max]u8 = undefined;
var socket = std.net.tcpConnectToHost(f.arena.allocator(), transport_uri.getHost(&buffer) catch @panic("handle this"), 9418) catch @panic("handle this");
Copy link
Contributor

@ianprime0509 ianprime0509 Aug 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where will socket be closed in this implementation? It's not as convenient as the HTTP case where there's a shared HTTP client/pool for all the fetch jobs, unfortunately; I suspect you'll need to introduce some additional resource tracking here to make sure it gets closed after the fetching process is done (this might mean that the type of the git member of the FetchStream resource I suggested in a previous comment will be std.net.Stream).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants