Reduce template function instantiations related to array equality #3152
|
Thanks for your pull request, @n8sh! Bugzilla references
Testing this PR locally
If you don't have a local development environment set up, you can use Digger to test this PR: dub run digger -- build "master + druntime#3152" |
|
Pushing to fix a comment x 2 |
7bae49a to 10a3de5
4baf944 to bdb3810
src/core/internal/array/equality.d
Outdated
    // This would improperly allow equality of integers and pointers
    // but the CTFE branch will stop this function from compiling then.
    import core.stdc.string : memcmp;
    return lhs.length == 0 || 0 == memcmp(lhs.ptr, rhs.ptr, lhs.length * T1.sizeof);
not sure you need the length == 0 case.
it's an optimisation, for an uncommon case, therefore a slowdown for the common case.
It's there because I am under the impression that memcmp has undefined behavior for null pointers. If not it can be removed.
@n8sh you are correct.
memcmp on null is undefined
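For reference, the guard under discussion can be sketched as below. This is only an illustration, not the code in this PR: the name scalarArraysEqual and the restriction to integral element types are assumptions made for the example (floating-point elements are left out because a bytewise compare disagrees with == for NaN and -0.0). As far as I understand the C standard (at least through C17), pointer arguments to memcmp must be valid even when the length is zero, which is why empty slices short-circuit before the call.

```d
import core.stdc.string : memcmp;

// Hypothetical helper for illustration; not the druntime implementation.
bool scalarArraysEqual(T)(const T[] lhs, const T[] rhs) @trusted
    if (__traits(isIntegral, T))
{
    if (lhs.length != rhs.length)
        return false;
    // Empty slices may have a null .ptr, so never hand them to memcmp.
    return lhs.length == 0
        || memcmp(lhs.ptr, rhs.ptr, lhs.length * T.sizeof) == 0;
}

unittest
{
    int[] a, b; // both empty, both with .ptr == null
    assert(scalarArraysEqual(a, b));
    assert(scalarArraysEqual([1, 2, 3], [1, 2, 3]));
    assert(!scalarArraysEqual([1, 2, 3], [1, 2, 4]));
    assert(!scalarArraysEqual([1, 2], [1, 2, 3]));
}
```

The extra length check costs one comparison on the common path; whether that is measurable would need a runtime benchmark, which this sketch does not provide.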
|
@n8sh Have you verified that this helps? |
|
@UplinkCoder Yes, I wrote a long program that was nothing but array equality checks, and with the change compiling it takes about 23% as much time and about 22% as much memory. |
src/core/internal/array/equality.d
Outdated
    // exclude opaque structs due to https://issues.dlang.org/show_bug.cgi?id=20959
    if (!(is(T == struct) && !is(typeof(T.sizeof))))
Copied from #3142
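To make the constraint above easier to read, here is a small sketch of what it tests. The names Opaque, Defined and isOpaqueStruct are made up for the example; only the is(...) condition is taken from the snippet. The assumption is that a struct that is declared but never defined has no known size, so typeof(T.sizeof) does not compile for it.

```d
struct Opaque;             // declared only: the compiler knows no size for it
struct Defined { int x; }  // ordinary struct with a known size

// Hypothetical helper name; the condition mirrors the quoted constraint.
enum isOpaqueStruct(T) = is(T == struct) && !is(typeof(T.sizeof));

static assert( isOpaqueStruct!Opaque);
static assert(!isOpaqueStruct!Defined);
static assert(!isOpaqueStruct!int);
```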
|
I've rebased and updated #3142 in the meantime too. I don't really want to rebase yet another time after pieces of it are ripped out into yet another separate PR (and each with its own dlang issue? Is that tracker spamming (ab)used for changelog generation?). |
I am under the impression that any non-trivial PR should have an associated issue. |
|
Anyway, if this PR is merged before that one (which seems possible because this PR isn't waiting on anyone else's work), I'd be fine with rebasing #3142 for you. Regardless of which PR is pulled first the other will need to be rebased anyway. |
|
It'd be nice to have a comparison with your test program, but the non-identical semantics (also wrt. memcmp application) don't allow a direct comparison. Anyway, getting rid of superfluous instantiations was obviously my main goal after seeing the original code, and the .tupleof thing an unexpected hold-up. [Edit: And AFAICT, this PR shouldn't be needed after mine.] |
|
If I remove all of this PR except the addition of the scalar-specific |
|
For the time being, maybe, but my version, AFAICT, doesn't need any of this, so I'd basically just revert this and apply mine instead. In my version, there's no |
With those provisos, yours compiles in 75% as much time and using 60% as much memory as the baseline. |
Right, but doing so improves compilation speed without interfering with any of the stuff you are doing. |
|
Your version doesn't use memcmp for static array element types etc. though, that's why runtime performance would be interesting as well. Please post your test code as gist or so somewhere, so that I can experiment as well. |
Right, and static arrays are handled by the non-scalar version (since
Sure. You'll see that the code does nothing but test whether there is a net benefit in compilation time for the happy path given the additional work done in template resolution. My starting assumption was that array equality checks are primarily performed between arrays of scalars so that is the case I was focused on improving. https://pastebin.com/d1A5f07s |
|
Thx a lot, I'll look into it another time - I hope(d) to have covered your happy path optimization in a not-unnecessarily-complex |
84145b9 to 585fecd
This is achieved by splitting off the all-scalar case
and by moving the .at and trustedCast helper functions out of the body of core.internal.array.equality.__equals
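The second half of that description can be illustrated with a simplified sketch. It is not the actual druntime code: elementwiseEqual is a hypothetical stand-in for the generic path, and at is reduced to its essentials. The idea is that a helper template nested inside a function template belongs to each (T1, T2) instantiation of the outer function, whereas a module-level helper is instantiated once per element type and shared by every caller.

```d
// Module-level helper: one instantiation per element type R,
// regardless of how many (T1, T2) pairs end up calling it.
private ref R at(R)(R[] r, size_t i) @trusted
{
    return r.ptr[i]; // unchecked indexing; callers guarantee i < r.length
}

// Hypothetical stand-in for the generic (non-scalar) comparison.
bool elementwiseEqual(T1, T2)(const T1[] lhs, const T2[] rhs)
{
    if (lhs.length != rhs.length)
        return false;
    foreach (i; 0 .. lhs.length)
    {
        if (at(lhs, i) != at(rhs, i))
            return false;
    }
    return true;
}

unittest
{
    assert(elementwiseEqual([1, 2, 3], [1L, 2L, 3L])); // int[] vs long[]
    assert(!elementwiseEqual("abc", "abd"));
    assert(elementwiseEqual(["a", "b"], ["a", "b"]));
}
```

In this sketch, comparing int[] with long[] and int[] with short[] would create two instantiations of elementwiseEqual but reuse the same at!(const(int)).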