dementive

std::unordered_map is used all over in the ClassDB, but there doesn't seem to be any good reason for it; internally Godot would normally use HashMap for something like this.

I tested the compile time of the test project with ClangBuildAnalyzer, and the results show that almost all of the slowest templates to instantiate were unordered_map, even though it is only ever used in the ClassDB. These are the full results I got:

first.txt

The relevant parts that showed me that unordered_map was insanely slow for compile times were here:

**** Time summary:
Compilation (1025 times):
  Parsing (frontend):         2281.7 s
  Codegen & opts (backend):    249.1 s

**** Templates that took longest to instantiate:
 36703 ms: std::unordered_map<godot::StringName, const GDExtensionInstanceBindi... (980 times, avg 37 ms)
 36586 ms: std::__detail::_Map_base<godot::StringName, std::pair<const godot::S... (980 times, avg 37 ms)
 26190 ms: std::vector<godot::Variant>::operator= (981 times, avg 26 ms)
 17916 ms: std::unordered_map<godot::StringName, godot::Object *>::operator[] (980 times, avg 18 ms)
 17818 ms: std::__detail::_Map_base<godot::StringName, std::pair<const godot::S... (980 times, avg 18 ms)
 16274 ms: std::unordered_map<int, int> (980 times, avg 16 ms)
 14134 ms: std::unordered_map<godot::StringName, godot::MethodBind *> (980 times, avg 14 ms)
 13204 ms: std::_Hashtable<godot::StringName, std::pair<const godot::StringName... (980 times, avg 13 ms)
 13012 ms: std::_Hashtable<int, std::pair<const int, int>, std::allocator<std::... (980 times, avg 13 ms)
 12679 ms: std::__detail::_Hashtable_alloc<std::allocator<std::__detail::_Hash_... (980 times, avg 12 ms)
 11412 ms: std::unordered_map<godot::StringName, const GDExtensionInstanceBindi... (980 times, avg 11 ms)
 11370 ms: godot::Vector<godot::StringName>::push_back (1004 times, avg 11 ms)
 10899 ms: std::_Hashtable<godot::StringName, std::pair<const godot::StringName... (980 times, avg 11 ms)
 10616 ms: godot::Vector<godot::StringName>::resize (1004 times, avg 10 ms)
 10515 ms: godot::CowData<godot::StringName>::resize<false> (1004 times, avg 10 ms)
 10465 ms: std::unordered_map<godot::StringName, godot::ClassDB::ClassInfo> (980 times, avg 10 ms)
 10459 ms: std::unordered_map<godot::StringName, godot::Object *> (980 times, avg 10 ms)
 10280 ms: std::unordered_map<godot::StringName, godot::ClassDB::ClassInfo::Vir... (980 times, avg 10 ms)
 10278 ms: std::_Hashtable<godot::StringName, std::pair<const godot::StringName... (980 times, avg 10 ms)
 10017 ms: std::_Hashtable<godot::StringName, std::pair<const godot::StringName... (980 times, avg 10 ms)
  9932 ms: std::__detail::_Hashtable_alloc<std::allocator<std::__detail::_Hash_... (980 times, avg 10 ms)
  9892 ms: std::vector<godot::Variant>::_M_allocate_and_copy<__gnu_cxx::__norma... (981 times, avg 10 ms)
  9091 ms: std::_Hashtable<godot::StringName, std::pair<const godot::StringName... (980 times, avg 9 ms)
  9035 ms: std::_Hashtable<godot::StringName, std::pair<const godot::StringName... (980 times, avg 9 ms)
  8906 ms: std::_Hashtable<godot::StringName, std::pair<const godot::StringName... (980 times, avg 9 ms)
  8783 ms: std::__uninitialized_copy_a<__gnu_cxx::__normal_iterator<const godot... (981 times, avg 8 ms)
  8580 ms: std::vector<godot::PropertyInfo>::reserve (981 times, avg 8 ms)
  8502 ms: std::uninitialized_copy<__gnu_cxx::__normal_iterator<const godot::Va... (981 times, avg 8 ms)
  7976 ms: godot::List<godot::PropertyInfo>::~List (1003 times, avg 7 ms)
  5548 ms: godot::UtilityFunctions::printerr<> (980 times, avg 5 ms)

**** Template sets that took longest to instantiate:
 73027 ms: std::unordered_map<$> (5880 times, avg 12 ms)
 65127 ms: std::_Hashtable<$> (6860 times, avg 9 ms)
 43060 ms: std::__and_<$> (30704 times, avg 1 ms)
 36703 ms: std::unordered_map<godot::StringName, const GDExtensionInstanceBindi... (980 times, avg 37 ms)
 36586 ms: std::__detail::_Map_base<godot::StringName, std::pair<const godot::S... (980 times, avg 37 ms)
 26190 ms: std::vector<godot::Variant>::operator= (981 times, avg 26 ms)
 24566 ms: std::__or_<$> (28793 times, avg 0 ms)
 23527 ms: std::_Hashtable<$>::_Scoped_node::_Scoped_node<$> (1965 times, avg 11 ms)
 22660 ms: std::__detail::_Hashtable_alloc<$>::_M_allocate_node<$> (1967 times, avg 11 ms)
 19562 ms: std::basic_string<$> (3928 times, avg 4 ms)
 19492 ms: std::__detail::_Insert<$> (6663 times, avg 2 ms)
 17916 ms: std::unordered_map<godot::StringName, godot::Object *>::operator[] (980 times, avg 18 ms)
 17818 ms: std::__detail::_Map_base<godot::StringName, std::pair<const godot::S... (980 times, avg 18 ms)
 17686 ms: std::__detail::_Insert_base<$> (5882 times, avg 3 ms)
 15336 ms: std::chrono::duration<$> (11412 times, avg 1 ms)
 14917 ms: std::basic_string<$>::_M_construct<$> (5796 times, avg 2 ms)
 14713 ms: std::pair<$> (5884 times, avg 2 ms)
 14457 ms: std::__uninitialized_copy_a<$> (2951 times, avg 4 ms)
 14220 ms: std::basic_string<$>::basic_string (5432 times, avg 2 ms)
 14086 ms: __gnu_cxx::__stoa<$> (14810 times, avg 0 ms)
 14014 ms: std::is_trivially_destructible<$> (5046 times, avg 2 ms)
 14002 ms: std::uninitialized_copy<$> (2950 times, avg 4 ms)
 12558 ms: std::vector<$>::reserve (1965 times, avg 6 ms)
 12114 ms: std::vector<$>::_M_allocate_and_copy<$> (1965 times, avg 6 ms)
 11370 ms: godot::Vector<$>::push_back (1004 times, avg 11 ms)
 11067 ms: std::vector<$> (3994 times, avg 2 ms)
 10874 ms: std::is_nothrow_default_constructible<$> (10476 times, avg 1 ms)
 10858 ms: std::__detail::_Hashtable_alloc<$> (5880 times, avg 1 ms)
 10765 ms: std::vector<$>::push_back (1970 times, avg 5 ms)
 10621 ms: godot::Vector<$>::resize (1005 times, avg 10 ms)

Almost all of the slowest to instantiate templates are unordered_map or some std hashing stuff.

Not only is it crazy slow to compile, but some benchmarks show that it is also pretty slow at runtime compared to other hash map implementations due to its constraints: https://github.com/boostorg/boost_unordered_benchmarks/tree/boost_unordered_aggregate.

After changing unordered_map to HashMap in ClassDB and then rerunning ClangBuildAnalyzer I got these results:

second.txt

The parts showing compile time differences in using HashMap vs unordered_map were here:

**** Time summary:
Compilation (1025 times):
  Parsing (frontend):         1609.5 s
  Codegen & opts (backend):    196.3 s

**** Templates that took longest to instantiate:
 20143 ms: std::vector<godot::Variant>::operator= (981 times, avg 20 ms)
  8722 ms: godot::Vector<godot::StringName>::push_back (1004 times, avg 8 ms)
  8120 ms: godot::Vector<godot::StringName>::resize (1004 times, avg 8 ms)
  8044 ms: godot::CowData<godot::StringName>::resize<false> (1004 times, avg 8 ms)
  7572 ms: std::vector<godot::Variant>::_M_allocate_and_copy<__gnu_cxx::__norma... (981 times, avg 7 ms)
  6737 ms: std::__uninitialized_copy_a<__gnu_cxx::__normal_iterator<const godot... (981 times, avg 6 ms)
  6626 ms: std::vector<godot::PropertyInfo>::reserve (981 times, avg 6 ms)
  6524 ms: std::uninitialized_copy<__gnu_cxx::__normal_iterator<const godot::Va... (981 times, avg 6 ms)
  6186 ms: godot::HashMap<godot::StringName, const GDExtensionInstanceBindingCa... (980 times, avg 6 ms)
  6073 ms: godot::List<godot::PropertyInfo>::~List (1003 times, avg 6 ms)
  4838 ms: godot::HashMap<godot::StringName, const GDExtensionInstanceBindingCa... (980 times, avg 4 ms)
  4752 ms: godot::HashMap<godot::StringName, godot::Object *>::operator[] (980 times, avg 4 ms)
  4612 ms: godot::HashMap<godot::StringName, godot::Object *>::_insert (980 times, avg 4 ms)
  4391 ms: std::basic_string<char16_t> (982 times, avg 4 ms)
  4326 ms: godot::UtilityFunctions::printerr<> (980 times, avg 4 ms)
  4168 ms: std::vector<GDExtensionClassMethodArgumentMetadata>::push_back (983 times, avg 4 ms)
  4134 ms: std::__uninitialized_copy<false>::__uninit_copy<__gnu_cxx::__normal_... (981 times, avg 4 ms)
  4095 ms: std::basic_string<char32_t>::basic_string (982 times, avg 4 ms)
  4084 ms: std::vector<godot::PropertyInfo>::push_back (981 times, avg 4 ms)
  4032 ms: std::basic_string<char16_t>::basic_string (982 times, avg 4 ms)
  4006 ms: std::vector<GDExtensionClassMethodArgumentMetadata>::emplace_back<GD... (981 times, avg 4 ms)
  3986 ms: std::vector<godot::PropertyInfo>::emplace_back<godot::PropertyInfo> (981 times, avg 4 ms)
  3956 ms: std::__do_uninit_copy<__gnu_cxx::__normal_iterator<const godot::Vari... (981 times, avg 4 ms)
  3884 ms: std::basic_string<char32_t> (982 times, avg 3 ms)
  3846 ms: std::copy<__gnu_cxx::__normal_iterator<const godot::Variant *, std::... (981 times, avg 3 ms)
  3691 ms: std::__detail::__cyl_bessel_j<long double> (1023 times, avg 3 ms)
  3686 ms: std::__detail::__cyl_bessel_j<float> (1023 times, avg 3 ms)
  3584 ms: std::basic_string<wchar_t> (982 times, avg 3 ms)
  3527 ms: std::basic_string<char> (982 times, avg 3 ms)
  3353 ms: godot::List<godot::PropertyInfo>::clear (1003 times, avg 3 ms)

**** Template sets that took longest to instantiate:
 20143 ms: std::vector<godot::Variant>::operator= (981 times, avg 20 ms)
 15387 ms: std::basic_string<$> (3928 times, avg 3 ms)
 13080 ms: std::is_trivially_destructible<$> (6032 times, avg 2 ms)
 12826 ms: std::__and_<$> (13197 times, avg 0 ms)
 11701 ms: std::chrono::duration<$> (10957 times, avg 1 ms)
 11510 ms: std::basic_string<$>::_M_construct<$> (5385 times, avg 2 ms)
 11310 ms: std::__or_<$> (13678 times, avg 0 ms)
 11064 ms: std::__uninitialized_copy_a<$> (2940 times, avg 3 ms)
 11048 ms: std::basic_string<$>::basic_string (5069 times, avg 2 ms)
 10713 ms: std::uninitialized_copy<$> (2932 times, avg 3 ms)
  9650 ms: std::vector<$>::reserve (1965 times, avg 4 ms)
  9501 ms: godot::HashMap<$>::_insert (1964 times, avg 4 ms)
  9372 ms: __gnu_cxx::__stoa<$> (12664 times, avg 0 ms)
  9248 ms: std::vector<$>::_M_allocate_and_copy<$> (1965 times, avg 4 ms)
  8722 ms: godot::Vector<$>::push_back (1004 times, avg 8 ms)
  8620 ms: std::vector<$> (3994 times, avg 2 ms)
  8288 ms: std::vector<$>::push_back (1970 times, avg 4 ms)
  8125 ms: godot::Vector<$>::resize (1005 times, avg 8 ms)
  8056 ms: godot::CowData<$>::resize<$> (1009 times, avg 7 ms)
  7994 ms: std::vector<$>::emplace_back<$> (1963 times, avg 4 ms)
  7378 ms: std::__detail::__cyl_bessel_j<$> (2046 times, avg 3 ms)
  7310 ms: std::_Destroy<$> (3912 times, avg 1 ms)
  6589 ms: std::is_nothrow_default_constructible<$> (7648 times, avg 0 ms)
  6186 ms: godot::HashMap<godot::StringName, const GDExtensionInstanceBindingCa... (980 times, avg 6 ms)
  6073 ms: godot::List<$>::~List (1003 times, avg 6 ms)
  6036 ms: std::__detail::__cyl_bessel_i<$> (2046 times, avg 2 ms)
  5989 ms: std::__uninitialized_copy<$>::__uninit_copy<$> (2479 times, avg 2 ms)
  5598 ms: std::__do_uninit_copy<$> (2244 times, avg 2 ms)
  5574 ms: std::__detail::__bessel_jn<$> (2046 times, avg 2 ms)
  5212 ms: std::is_destructible<$> (2947 times, avg 1 ms)

This shows that the summed compile time dropped from about 2531 s to about 1806 s, so the unordered_map build took roughly 40% longer, and far fewer of the most expensive templates to instantiate are now HashMap compared to how many were unordered_map before. The 40% isn't the clock time though: it is the time ClangBuildAnalyzer reports by adding up the time from every individual file, but the compiler runs a lot of that in parallel, so the real time difference is lower. The clock time (on my computer) of compiling the tests was 128.86 originally and with HashMap it is 114.34, so only about 12% faster in actual time.

The results also show that a lot of the slowest templates to instantiate are now from std::vector and std::string, so in a future PR I think I'll look into moving some of those over to the Godot LocalVector and String/StringName equivalents.

@dementive dementive requested a review from a team as a code owner September 8, 2025 23:30
Member

@Ivorforce Ivorforce left a comment

Those look like some pretty great gains for compile time!

I think I concur with the general idea; if we don't need unordered_map in Godot, we probably don't need it in godot-cpp either. Especially when we can save so much compile time.

However, I think we should use AHashMap instead of HashMap for this. It should be quite a lot faster for lookups (and changes). I think it's not been transferred to godot-cpp yet, so this might be a good opportunity to do so.

@dementive
Author

However, I think we should use AHashMap instead of HashMap for this. It should be quite a lot faster for lookups (and changes). I think it's not been transferred to godot-cpp yet, so this might be a good opportunity to do so.

Sounds good, I totally agree. I use AHashMap all over in the module code that I'm converting, so I'm going to need it soon anyway. I'll add it to this PR and make some tweaks. It should be pretty easy; I probably just have to copy and paste it and then change a few words.

@dsnopek
Collaborator

dsnopek commented Sep 9, 2025

std::unordered_map is used all over in the ClassDB but there doesn't seem to be any good reason for it, internally godot would normally use HashMap for something like this.

I'm not sure how we started using the STL containers in ClassDB (perhaps avoiding using Godot stuff before Godot was ready?) but it's certainly for some historical reason and I don't think it applies anymore. This is something I've been thinking about looking into for a long time - thanks for taking the initiative on it!

@dementive
Author

I'm not sure how we started using the STL containers in ClassDB

I found some uses when doing #1841 where the allocation function from Godot isn't possible to use yet since it's being called during static initialization. I assume there just used to be some lifetime issues that prevented using Godot templates that are (mostly) resolved now.

The one legit use of STL containers now is extremely jank too and it'd be nice to rework it to be more sane imo but I have no idea how to go about doing something like that. Internal classes get registered in a macro that calls another macro by the constructor of an unused inline static variable. Then they get unregistered in the destructor of that unused static variable. Of all the ways to call a function in C++ this might be one of the least sane ways I can think of haha.

@Calinou Calinou added enhancement This is an enhancement on the current functionality topic:buildsystem Related to the buildsystem or CI setup labels Sep 9, 2025
@dementive
Author

@Ivorforce I'm running into a problem when converting AHashMap over.

There is no Memory::alloc_static_zeroed in godot-cpp and there doesn't seem to be a way to make one on the godot-cpp side because there is no calloc exposed to the gdextension API.

Presumably the whole point of calling gdextension_interface_mem_alloc in alloc_static in godot-cpp is so the engine can do the memory tracking stuff like stats and leak checks so it doesn't seem like a good idea to call libc calloc in godot-cpp.

I tried just changing the alloc_static_zeroed usage in AHashMap to alloc_static but that seems to make everything explode. When I did that the program gets stuck in an infinite loop in the while loop in AHashMap::_insert_with_hash.

@Ivorforce
Member

Ivorforce commented Sep 10, 2025

@Ivorforce I'm running into a problem when converting AHashMap over.

There is no Memory::alloc_static_zeroed in godot-cpp and there doesn't seem to be a way to make one on the godot-cpp side because there is no calloc exposed to the gdextension API.

Right, I forgot about this. alloc_static_zeroed was recently added and is not exposed to GDExtension yet.
You can use the following instead; this should be nearly as fast (illustrative code; please don't copy as-is):

// instead of void *mem = Memory::alloc_static_zeroed(size);
void *mem = Memory::alloc_static(size);
memset(mem, 0, size);

@dementive dementive force-pushed the improve-build-time branch 2 times, most recently from 875c3bd to f8f964f Compare September 10, 2025 23:34
@dementive dementive changed the title Replace unordered_map with HashMap to improve build time Replace unordered_map with AHashMap to improve build time Sep 10, 2025
@dementive
Author

memset fixed it. Everything should be good to go with AHashMap now.

Member

@Ivorforce Ivorforce left a comment

Looks good to me.
There is some risk that AHashMap's pointer instability under mutation could cause issues, but I looked over the code and didn't see anything that jumped out to me as risky. I also don't know the exact guarantees of std::unordered_map.

If we do find regressions in this regard, we can switch to HashMap for singular cases.

Comment on lines +166 to +168
+	AHashMap<StringName, Object *>::ConstIterator i = engine_singletons.find(p_class_name);
 	if (i != engine_singletons.end()) {
-		ERR_FAIL_COND((*i).second != p_singleton);
+		ERR_FAIL_COND((*i).value != p_singleton);
Member

		AHashMap<StringName, Object *>::ConstIterator singleton_it = engine_singletons.find(p_class_name);
		if (singleton_it != engine_singletons.end()) {
			ERR_FAIL_COND((*singleton_it).value != p_singleton);

To match the rest of the file, and for readability: i is usually used for integer iterators.
