Skip to content

Discussion: Possible fundamental approaches for improvements in model build time in SpineOpt and SpineInterface #1219

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
DillonJ opened this issue May 23, 2025 · 11 comments

Comments

@DillonJ
Copy link
Contributor

DillonJ commented May 23, 2025

We discussed drastic fundamental approaches to improve the model build time in SpineOpt and SpineInterface.

The purpose of this issue is to facilitate discussions on the way forward and where @manuelma can provide input / thoughts.

The main problems identified so far have been:

  • Globals used in SpineInterface
  • Type instability from functions created on the fly in SpineInterface

Maybe others - please feel free to update issue description.

One potential way forward that was identified was to generate static, explicit function definitions for the SpineInterface convenience functions based on the current SpineOpt template. These days it doesn't change very much very often. When the template changes, these functions can be regenerated and SpineInterface continues to be useful for developers and other tools that use it.

But SpineInterface does other things besides creating the convenience functions, these would need to be seperated into a different package, or into SpineOpt if appropriate/necessary.

Reworking the indices functions might also be part of this - so they are more like conventional static sets.

Optionally, or eventually, maybe this could even be a compiled package and implemented in Rust/C?

@tarskul @Tasqu @datejada @suvayu @manuelma @mihlema @nnhjy any thoughts?

@datejada
Copy link
Member

datejada commented May 23, 2025

I am adding here the reference to issue #1216 with the profiling results.

Part of the issue also comments on possible improvements:

  • @eval must die! @tarskul
  • Further improve the type stability, focusing on the red bars in the profiling (screenshot in the issue)
  • Rethink how we create the indices function, creating it once at the beginning instead of calculating it on the fly.

@tarskul
Copy link
Collaborator

tarskul commented May 23, 2025

@nnhjy had a good idea to preserve the functionality of SpineInterface whilst taking into account the specific structure of SpineOpt. The idea was that you use the current functionality in SpineInterface when you are developing your model and that we add a function that SpineInterface can write these global functions to a file when the model is fully developed.

However, we are not sure whether it is actually feasible to code this. Additionally, this may not be necessary if we are able to get rid of the compilation time in another way.

@DillonJ
Copy link
Contributor Author

DillonJ commented May 23, 2025

Thanks @tarskul. @nnhjy suggestion is what I meant by:

One potential way forward that was identified was to generate static, explicit function definitions for the SpineInterface convenience functions based on the current SpineOpt template. These days it doesn't change very much very often. When the template changes, these functions can be regenerated and SpineInterface continues to be useful for developers and other tools that use it.

I don't see why it wouldn't be feasible - it's just writing text and eventually, it could even be written in a compiled or another compiled language. As I say, another challenge is the other functionality that SpineInterface provides that might need to be seperated out.

@tarskul
Copy link
Collaborator

tarskul commented May 23, 2025

To clarify @DillonJ

The difference between your suggestion and that of @nnhjy is that you initially proposed to systematically manually create these functions, whereas @nnhjy suggested this to be done automatically by SpineInterface. The principle is the same but the implementation is different. Yours will indeed work but requires a significant effort. @nnhjy may require less work but may not be feasible.

@DillonJ
Copy link
Contributor Author

DillonJ commented May 23, 2025

Also to mention here that @eval is used a lot in pre-processing where we create new SpineInterface parameters on-the-fly. We'd need to rework all that.

@DillonJ
Copy link
Contributor Author

DillonJ commented May 23, 2025

I was hardly suggesting we type out hundreds of convenience functions manually! Anyway, it's an idea that has been mentioned before and should be perfectly feasible

@suvayu
Copy link

suvayu commented May 23, 2025

But SpineInterface does other things besides creating the convenience functions, these would need to be seperated into a different package, or into SpineOpt if appropriate/necessary.

Based on my discussion with @tarskul and @Tasqu, I understand a lot of what it does is also provided by spinedb_api now, and this is the way it is for historical reasons. I'll paraphrase someone "it replicates a spinedb like structure in-memory". It seems to me the data (and mental) model for how SpineOpt works needs to change. It made sense in the past, but now the approach should be either:

  • use the DB efficiently without code duplication using spinedb_api (I proposed something using message passing), or
  • keep the access pattern as-is, i.e. get everything all in one go, but change the data model to something more appropriate for memory operations (e.g. matrices, incidentally, that's also how JuMP thinks! 😉)

Reworking the indices functions might also be part of this - so they are more like conventional static sets.

Yes, not just efficient reimplementation of current steps, but also possibly changing the data model so that some of these computations are vectorisable or not required (relates to my second point above).

As for writing stuff in compiled languages, let's leave that can of worms aside.

@ll-ara
Copy link
Contributor

ll-ara commented May 24, 2025

Generally speaking, I think we have come to the right conclusions about how to improve performance.

I think first we focus on targeted optimizations (ie. 'low hanging fruit') as described in #1218 and exemplified in #1217. Once we have addressed most of the 'low hanging fruit' then any systematic issues relating to data structures / data management should become more clear. Best case is that all the targeted optimizations we make will reduce the need for large restructures.
However, I do agree that @eval must die. In addition to performance issues, dynamic assignment of variables/functions makes the code difficult to read/interpret imo. I'm sure that there is a reason that it was implemented this way, but I'm sure we can find a highly viable alternative - as people have discussed above.

At some point I actually think creating a visual diagram/s of data flows / structures / usage would be a useful exercise (if it hasn't been done already). Aside from just generally being good documentation, I think it would aid thinking and communication of these proposed structural changes to the code base.


Optionally, or eventually, maybe this could even be a compiled package and implemented in Rust/C?

Julia is a compiled language (albeit AOT) and both SpineOpt and SpineInterface are written in Julia - am I missing something here?

Yes, good C/Rust code will be a tad faster than good Julia code for a number of reasons that include:

  • Slightly smarter compilers and global optimization
  • Less garbage collection overhead
  • No compiler overhead at runtime
  • More control over types

But these almost never result in a significant difference in execution time. If you are writing code to run on a specific machine/architecture then the story changes a bit, but that isn't the case here. My guess is that for all intensive purposes, optimized SpineOpt/SpineInterface in Julia would be less than 20% slower than a realistic optimized C code alternative. Hence I can't imagine that it would be worth it to transfer a portion of the code base out of Julia and into C/Rust. At the very least, I reckon we leave any discussion of this until after v1.0.0 :)

As for writing stuff in compiled languages, let's leave that can of worms aside.

Yes, I wholeheartedly agree!

@manuelma
Copy link
Collaborator

manuelma commented May 26, 2025

About the static convenience functions—I don't get how that helps since the return value needs to be adapted depending on the contents on the Db. So if we defined the function unit that returns the list of units, since we don't know the list of units beforehand, how can we make it really static. Either we need to eval the function definition at runtime (when reading the Db) or having the function return some global state that gets updated when reading the Db. Again, we're back at the same problem we have now. What am I missing? How do you want to do it instead?

I have opened an issue about removing the global state based on returning the convenience functions as a result of calling using_spinedb—rather than evaling them—so maybe we can consider that idea as well?

#1100

@tarskul
Copy link
Collaborator

tarskul commented May 26, 2025

Whereas SpineInterface can be typed for a generic spine db, writing the convenience functions to a file can be typed for a specific SpineOpt db. Instead of creating a function named 'units' on the fly, we already know the name of the function to be 'units'. We may not know the number of units, but we do know the type. The idea is therefore to improve the type stability.

Though, this may not be necessary if we can make spineinterface type stable for any generic spine db by taking advantage of the supported parameter values.

To further reduce the runtime compilation we indeed need to remove the @eval function by passing the database. The database can be global but we'll have to be careful to do it in a type stable way. There is an example of such use in the julia documentation.

@manuelma
Copy link
Collaborator

To further reduce the runtime compilation we indeed need to remove the @eval function by passing the database

What do you mean by "by passing the database"? Passing the database URL? passing it to what?

Thanks for the clarification. So the static convenience functions are intended to improve type stability, but not to remove the global state, is that correct?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants