Watchman #1

Open
dturner-tw wants to merge 8 commits into master from watchman

Conversation

@dturner-tw
Owner

No description provided.

Can we just make libcrypto a hard requirement (libcurl needs it for SSL support anyway, so realistically it will be there) and remove this thing?

Owner Author

OK. (I think the way I'll respond to these is to create new commits to
address each of them, and then do another round of history rewriting before
sending upstream.)

Owner Author

Actually, on reflection, I would rather keep this. If I were writing this for Twitter-internal use, I would happily get rid of it. But there's no way that git can ship with an index format that can only be used if git is built with OpenSSL -- that would be just too picky. Wrapping large parts of read-cache.c in #ifdef seems like a red flag. Am I crazy here?

Signed-off-by: David Turner <dturner@twitter.com>
The index is integrity-checked with SHA1.  Break the code to write the
index while computing the SHA1 out into a separate file, so that it's
easier for other code to use.

Signed-off-by: David Turner <dturner@twitter.com>
Using SHA-1 to check the index integrity takes up about 5% of the
runtime of git status (depending on the repository). We can do better
by using VMAC instead.  VMAC isn't really intended for integrity
checking, but it has the correct properties.  It is also approximately
5 times faster than SHA-1 on my machine.

Signed-off-by: David Turner <dturner@twitter.com>
Create a cache for git's view of the working tree.  This can be used to
reduce the number of syscalls to opendir, readdir, and lstat.

Signed-off-by: David Turner <dturner@twitter.com>
When we first call get_fs_cache_file, set git_fs_cache_file to an
absolute path.  This means that writing the fs_cache in atexit will
work despite intervening chdirs.

Signed-off-by: David Turner <dturner@twitter.com>
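
The idea in the commit above (resolve the cache file name to an absolute path on first use, so a later chdir cannot redirect the atexit write) can be sketched roughly as follows. This is a standalone illustration, not the code from this branch: get_cache_file and cache_file_path are hypothetical stand-ins for get_fs_cache_file and git_fs_cache_file, and the sketch assumes a POSIX system.

```c
#define _POSIX_C_SOURCE 200809L
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for git_fs_cache_file: set once, then reused. */
static char *cache_file_path;

/*
 * Return an absolute path for the cache file.  The path is computed
 * against the working directory of the FIRST call only, so callers
 * (including an atexit handler) get the same file after any chdir.
 * Returns NULL on failure.
 */
const char *get_cache_file(const char *name)
{
	if (!cache_file_path) {
		if (name[0] == '/') {
			/* Already absolute: just keep a private copy. */
			cache_file_path = strdup(name);
		} else {
			char cwd[PATH_MAX];
			size_t n;

			if (!getcwd(cwd, sizeof(cwd)))
				return NULL;
			n = strlen(cwd) + 1 + strlen(name) + 1;
			cache_file_path = malloc(n);
			if (cache_file_path)
				snprintf(cache_file_path, n, "%s/%s", cwd, name);
		}
	}
	return cache_file_path;
}
```

A handler registered with atexit() would then call get_cache_file() again and receive the path computed before any chdir took place.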
Add support for filesystem view caching and updating via Facebook's
Watchman daemon.

Signed-off-by: David Turner <dturner@twitter.com>
Remove str(i?)hash functions from hashmap.c as they are unused

Signed-off-by: David Turner <dturner@twitter.com>
The original FNV hashing algorithm processes one byte at a time.  SSE
lets us process more.  This is a lackadaisically tuned implementation;
in my tests, it was the fastest by a hair.  It is about twice as fast
as the original FNV, for about a 5-10% speedup on git status (with
watchman).

Note that you are unlikely to see speed improvements in test-hashmap,
since it uses very short, very similar strings.

Signed-off-by: David Turner <dturner@twitter.com>
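
For reference, the byte-at-a-time baseline that the commit above compares against looks roughly like this. It is a standalone sketch of 32-bit FNV-1 (multiply by the prime, then XOR in the next byte), assuming that is the variant in question; the name fnv1_32 is made up for illustration, and the SSE version from this branch is not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

#define FNV32_BASE  ((uint32_t)0x811c9dc5)
#define FNV32_PRIME ((uint32_t)0x01000193)

/*
 * 32-bit FNV-1 over an arbitrary buffer.  Each iteration consumes
 * exactly one input byte, which is the serial dependency a wider
 * (e.g. SSE) implementation tries to get around.
 */
uint32_t fnv1_32(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	uint32_t hash = FNV32_BASE;

	while (len--)
		hash = (hash * FNV32_PRIME) ^ *p++;
	return hash;
}
```

Because every step depends on the previous hash value, short inputs leave little room for a wider implementation to pay off, which is consistent with the note about test-hashmap.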
dturner-tw pushed a commit that referenced this pull request Jun 23, 2015
* jc/t9001-modernise:
  t9001: style modernisation phase #5
  t9001: style modernisation phase #4
  t9001: style modernisation phase #3
  t9001: style modernisation phase #2
  t9001: style modernisation phase #1
dturner-tw pushed a commit that referenced this pull request Jun 23, 2015
Signed-off-by: Junio C Hamano <gitster@pobox.com>
dturner-tw pushed a commit that referenced this pull request Jun 25, 2015
The collect_parents() function now is responsible for

 1. parsing the commits given on the command line into a list of
    commits to be merged;

 2. filtering these parents into independent ones; and

 3. optionally calling fmt_merge_msg() via prepare_merge_message()
    to prepare an auto-generated merge log message, using fake
    contents that FETCH_HEAD would have had if these commits were
    fetched from the current repository with "git pull . $args..."

Make "git merge FETCH_HEAD" behave the same as the traditional

    git merge "$(git fmt-merge-msg <.git/FETCH_HEAD)" $commits

invocation of the command in "git pull", where $commits are the ones
that appear in FETCH_HEAD that are not marked as not-for-merge, by
making it do a bit more, specifically:

 - noticing "FETCH_HEAD" is the only "commit" on the command line
   and picking the commits that are not marked as not-for-merge as
   the list of commits to be merged (substitute for step #1 above);

 - letting the resulting list be fed to step #2 above;

 - doing the step #3 above, using the contents of the FETCH_HEAD
   instead of fake contents crafted from the list of commits parsed
   in the step #1 above.

Note that this changes the semantics.  "git merge FETCH_HEAD" has
always behaved as if the first commit in the FETCH_HEAD file were
directly specified on the command line, creating a two-way merge
whose auto-generated merge log said "merge commit xyz".  With this
change, if the previous fetch was to grab multiple branches (e.g.
"git fetch $there topic-a topic-b"), the new world order is to
create an octopus, behaving as if "git pull $there topic-a topic-b"
were run.  This is a deliberate change to make that happen, and
can be seen in the changes to t3033 tests.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
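
The first of the new steps described above (picking the FETCH_HEAD entries that are not marked not-for-merge) can be sketched as below. This is not the code from builtin/merge.c; pick_merge_heads and OID_HEXSZ are hypothetical names, and the sketch assumes each FETCH_HEAD line has the form "<40-hex object name> TAB <optional not-for-merge marker> TAB <description>".

```c
#include <stddef.h>
#include <string.h>

#define OID_HEXSZ 40	/* assumed length of a hex SHA-1 object name */

/*
 * Scan the text of a FETCH_HEAD file and copy the object names of
 * entries that are NOT marked "not-for-merge" into out[].
 * Returns the number of names stored (at most max).
 */
size_t pick_merge_heads(const char *fetch_head,
			char out[][OID_HEXSZ + 1], size_t max)
{
	const char *line = fetch_head;
	size_t n = 0;

	while (*line && n < max) {
		const char *eol = strchr(line, '\n');
		size_t len = eol ? (size_t)(eol - line) : strlen(line);
		const char *tab = memchr(line, '\t', len);

		/* The first field must be exactly one object name. */
		if (tab && tab - line == OID_HEXSZ &&
		    strncmp(tab + 1, "not-for-merge\t", 14) != 0) {
			memcpy(out[n], line, OID_HEXSZ);
			out[n][OID_HEXSZ] = '\0';
			n++;
		}

		/* Advance to the start of the next line, if any. */
		line += len;
		if (*line == '\n')
			line++;
	}
	return n;
}
```

With a FETCH_HEAD left behind by "git fetch $there topic-a topic-b", this would yield two names, which is what leads to an octopus merge under the new behavior.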
dturner-tw pushed a commit that referenced this pull request Sep 3, 2015
When ac49f5c (rerere "remaining", 2011-02-16) split a new helper
function check_one_conflict() out of find_conflict(), so that the
latter could use the helper's return value to update the loop
control variable that indexes into active_cache[], the new helper
incremented the index by one too many when it found a path with only
a stage #1 entry at the very end of active_cache[].

This "strange" return value does not have any effect on the loop
control of the two callers of this function, as they both notice that
active_nr+2 is larger than active_nr just like active_nr+1 is, but
it nevertheless puzzles readers who are trying to figure out what
the function is doing.

In fact, there is no need to do an early return.  The code that
follows after skipping the stage #1 entry is fully prepared to
handle a case where the entry is at the very end of active_cache[].

Spare future readers unnecessary confusion by dropping the early
return.  We skip the stage #1 entry, and if there are stage #2 and
stage #3 entries for the same path, we diagnose the path as
THREE_STAGED (otherwise we say PUNTED), and then we skip all entries
for the same path.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
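
The control flow described above can be modeled on a toy index as follows. This is a simplified sketch, not the code in rerere.c: struct entry and classify_conflict are hypothetical, the array stands in for active_cache[], and the sketch skips every leading stage #1 entry rather than just one.

```c
#include <string.h>

/* Hypothetical, simplified index entry: a path plus its stage number. */
struct entry {
	const char *path;
	int stage;
};

enum { PUNTED = 1, THREE_STAGED = 2 };

/*
 * Classify the conflict for the path starting at e[i] (0 <= i < nr)
 * and return the index of the first entry belonging to the next path.
 * The return value never exceeds nr, even when the path is last.
 */
int classify_conflict(const struct entry *e, int nr, int i, int *type)
{
	const char *path = e[i].path;

	*type = PUNTED;

	/* Skip stage #1 entries for this path; no early return needed. */
	while (i < nr && !strcmp(e[i].path, path) && e[i].stage == 1)
		i++;

	/* Stage #2 followed by stage #3 for the same path. */
	if (i + 1 < nr &&
	    e[i].stage == 2 && e[i + 1].stage == 3 &&
	    !strcmp(e[i].path, path) && !strcmp(e[i + 1].path, path))
		*type = THREE_STAGED;

	/* Skip whatever remains for the same path. */
	while (i < nr && !strcmp(e[i].path, path))
		i++;
	return i;
}
```

A caller can assign the return value straight to its loop variable; for a path with only a stage #1 entry at the very end, the result is exactly nr.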
dturner-tw pushed a commit that referenced this pull request Sep 3, 2015
A conflicted index can have multiple stage #1 entries when dealing
with a criss-cross merge and using the "resolve" merge strategy.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
dturner-tw pushed a commit that referenced this pull request Sep 3, 2015
A conflicted index can have multiple stage #1 entries when dealing
with a criss-cross merge and using the "resolve" merge strategy.

Plug the leak by reading only the first one of the same stage
entries.

Strictly speaking, this fix does change the semantics, in that we
used to use the last stage #1 entry as the common ancestor when
doing the plain-vanilla three-way merge, but with the leak fix, we
will use the first stage #1 entry.  But it is not a grave backward
compatibility breakage.  Either way, we are arbitrarily picking one
of multiple stage #1 entries and using it, ignoring others, and
there is no meaning in the ordering of these stage #1 entries.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
dturner-tw pushed a commit that referenced this pull request Sep 3, 2015
Code clean-up and minor fixes (so far).

* jc/rerere: (21 commits)
  rerere: un-nest merge() further
  rerere: use "struct rerere_id" instead of "char *" for conflict ID
  rerere: call conflict-ids IDs
  rerere: further clarify do_rerere_one_path()
  rerere: further de-dent do_plain_rerere()
  rerere: refactor "replay" part of do_plain_rerere()
  rerere: explain the remainder
  rerere: explain "rerere forget" codepath
  rerere: explain the primary codepath
  rerere: explain MERGE_RR management helpers
  rerere: fix benign off-by-one non-bug and clarify code
  rerere: explain the rerere I/O abstraction
  rerere: do not leak mmfile[] for a path with multiple stage #1 entries
  rerere: stop looping unnecessarily
  rerere: drop want_sp parameter from is_cmarker()
  rerere: report autoupdated paths only after actually updating them
  rerere: write out each record of MERGE_RR in one go
  rerere: lift PATH_MAX limitation
  rerere: plug conflict ID leaks
  rerere: handle conflicts with multiple stage #1 entries
  ...
dturner-tw pushed a commit that referenced this pull request Oct 13, 2015
Code clean-up and minor fixes.

* jc/rerere: (21 commits)
  rerere: un-nest merge() further
  rerere: use "struct rerere_id" instead of "char *" for conflict ID
  rerere: call conflict-ids IDs
  rerere: further clarify do_rerere_one_path()
  rerere: further de-dent do_plain_rerere()
  rerere: refactor "replay" part of do_plain_rerere()
  rerere: explain the remainder
  rerere: explain "rerere forget" codepath
  rerere: explain the primary codepath
  rerere: explain MERGE_RR management helpers
  rerere: fix benign off-by-one non-bug and clarify code
  rerere: explain the rerere I/O abstraction
  rerere: do not leak mmfile[] for a path with multiple stage #1 entries
  rerere: stop looping unnecessarily
  rerere: drop want_sp parameter from is_cmarker()
  rerere: report autoupdated paths only after actually updating them
  rerere: write out each record of MERGE_RR in one go
  rerere: lift PATH_MAX limitation
  rerere: plug conflict ID leaks
  rerere: handle conflicts with multiple stage #1 entries
  ...