Contributor

@ZERICO2005 ZERICO2005 commented Sep 6, 2025

When __TICE__ is defined, calloc now uses an inlined bzero implementation that uses the $E40000 address to speed up zero filling.

When __TICE__ is undefined, it falls back to the previous memset implementation.
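
As a rough host-side illustration of the idea, the C sketch below zero-fills a fresh allocation by copying from a region that always reads as zero instead of storing zeros one by one. It assumes the fill works by copying from the all-zeros region; the static array `zero_source` merely stands in for the $E40000 address, and `sketch_calloc` and `ZERO_SOURCE_SIZE` are hypothetical names, not the toolchain's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the all-zeros region. On the CE this would be the
 * address $E40000 rather than an array in RAM (assumption made only
 * so the sketch runs on a host machine). */
#define ZERO_SOURCE_SIZE 256
static const unsigned char zero_source[ZERO_SOURCE_SIZE];

/* Hypothetical calloc: allocate, then fill by copying from the zero source. */
void *sketch_calloc(size_t nmemb, size_t size)
{
    /* Reject element counts whose product does not fit in size_t. */
    if (size != 0 && nmemb > SIZE_MAX / size)
        return NULL;

    size_t total = nmemb * size;
    unsigned char *ptr = malloc(total);
    if (ptr == NULL)
        return NULL;

    /* Copy in chunks because the host stand-in is finite. */
    size_t done = 0;
    while (done < total) {
        size_t chunk = total - done;
        if (chunk > ZERO_SOURCE_SIZE)
            chunk = ZERO_SOURCE_SIZE;
        memcpy(ptr + done, zero_source, chunk);
        done += chunk;
    }
    assert(done == total);
    return ptr;
}
```

The chunking loop exists only because the stand-in array is small; it is not a claim about how the assembly version is structured.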

Comment on lines 25 to 27
add hl, bc
or a, a
sbc hl, bc
Member

Seems like the add here should never carry?

Suggested change
- add hl, bc
- or a, a
- sbc hl, bc
+ add hl, bc
+ sbc hl, bc

Contributor

Only if you also move the pop of the size up (the value popped for the parameter is undefined data, because malloc may clobber it).

Contributor Author

@ZERICO2005 ZERICO2005 Sep 6, 2025

Does this rely on the assumption that malloc won't return an address higher than 0xE40000 on the CE, and that the allocation size is less than 0x1C0000 bytes, or something along those lines?

Member

This relies on the seemingly sound assumption that the pointer to the allocated memory plus the allocation size does not overflow.
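
To make the carry argument concrete, here is a small host-side model of the two-instruction form, written under the assumption that the registers are 24 bits wide and that `add hl, bc` leaves the zero flag alone while `sbc hl, bc` sets it from the result. The `cpu_state` struct and `run_short_form` function are hypothetical names used only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MASK24 0xFFFFFFu /* assumed 24-bit register width */

typedef struct {
    uint32_t hl;  /* 24-bit register value */
    bool carry;
    bool zero;
} cpu_state;

/* add hl, bc: 24-bit add, carry out of bit 23, zero flag untouched. */
static void add_hl_bc(cpu_state *s, uint32_t bc)
{
    uint32_t sum = s->hl + bc;
    s->carry = sum > MASK24;
    s->hl = sum & MASK24;
}

/* sbc hl, bc: subtract bc plus the incoming carry, set zero from the result. */
static void sbc_hl_bc(cpu_state *s, uint32_t bc)
{
    uint32_t sub = bc + (s->carry ? 1u : 0u);
    s->carry = sub > s->hl;
    s->hl = (s->hl - sub) & MASK24;
    s->zero = (s->hl == 0);
}

/* Runs "add hl, bc" then "sbc hl, bc" with no "or a, a" in between. */
cpu_state run_short_form(uint32_t hl, uint32_t bc)
{
    assert(hl <= MASK24 && bc <= MASK24);
    cpu_state s = { hl, false, false };
    add_hl_bc(&s, bc);
    sbc_hl_bc(&s, bc);
    return s;
}
```

In this model, hl comes back unchanged and the zero flag reports whether hl was zero, as long as hl + bc stays below 2^24. Once the sum wraps, the leftover carry makes the subtraction come out one too low, which is the case the no-overflow assumption rules out.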

@ZERICO2005 ZERICO2005 changed the title from "optimized the zero filling in calloc (__TICE__ only)" to "changed behaviour of malloc(0) and optimized zero filling in calloc" Sep 6, 2025
@ZERICO2005 ZERICO2005 changed the title from "changed behaviour of malloc(0) and optimized zero filling in calloc" to "changed behaviour of malloc(0) and optimized calloc" Sep 6, 2025
@runer112 runer112 self-requested a review September 6, 2025 22:31
; inlined memset/bzero
; assumes that malloc(0) returns NULL, so we can skip the check for zero size
add hl, bc
cpd
Contributor Author

I assume it is okay for cpd to read nonnull_ptr_from_malloc + size

Contributor Author

ZERICO2005 commented Sep 16, 2025

To reduce the complexity of this PR, I have reverted the "allocator switches to nonzero_fill_calloc when malloc(0) is known to return NULL" commits for now (they essentially added four different calloc variants). That way we can focus on the two important changes in this PR:

  • Changing the behavior of __simple_malloc(0) and __standard_malloc(0) to return NULL.
  • Optimizing calloc zero filling by using the all-zeros address $E40000.
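
A minimal C sketch of how the two changes fit together, assuming the fill routine must never run with a zero count: once a zero-size request yields NULL, a single NULL test in calloc covers both allocation failure and the empty case. The names `zero_is_null_malloc`, `zero_is_null_calloc`, and `fill_zero_nonempty` are hypothetical and do not correspond to the toolchain's __simple_malloc or __standard_malloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator front end: a zero-size request yields NULL. */
void *zero_is_null_malloc(size_t size)
{
    if (size == 0)
        return NULL;
    return malloc(size);
}

/* Fill routine that must never be entered with size == 0,
 * mirroring a block-copy loop that misbehaves on a zero counter. */
static void fill_zero_nonempty(unsigned char *ptr, size_t size)
{
    assert(size != 0);
    memset(ptr, 0, size);
}

void *zero_is_null_calloc(size_t nmemb, size_t size)
{
    /* Reject element counts whose product does not fit in size_t. */
    if (size != 0 && nmemb > (size_t)-1 / size)
        return NULL;

    size_t total = nmemb * size;
    void *ptr = zero_is_null_malloc(total);

    /* One NULL test covers both allocation failure and total == 0. */
    if (ptr == NULL)
        return NULL;

    fill_zero_nonempty(ptr, total);
    return ptr;
}
```

The design choice is that the size check lives in one place, the allocator, so the fill path needs no separate branch for an empty request.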

@ZERICO2005 ZERICO2005 marked this pull request as ready for review September 21, 2025 00:56