
Prior work on the topic #1

@ljleb

Description


Hi! I wanted to reach out because I did some research on this specific idea in the past: I implemented Algorithm 1 from the paper, for the specific case of 1 base model and 1 finetuned model, back in 2023. You can see the code and the commit history here:

https://github.com/ljleb/sd-mecha/blob/833b0e8e418e9719b155cbf9e1fd1697a5dec3d7/sd_mecha/extensions/builtin/merge_methods/svd.py#L10

The oldest implementation I wrote can be found here:

s1dlx/meh#50

And other interactions with the community:

hako-mikan/sd-webui-supermerger#347 (comment)

I think my implementation is more general with respect to the orthogonal term in the case of 1 base model and 1 finetuned model: it allows partially applying the orthogonal map via fractional matrix exponentiation, which can be useful when applying the full map leads to poor performance.
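For context, the partial application described above can be sketched as follows. This is a minimal illustration, not the sd-mecha implementation: the function names are mine, and it assumes the orthogonal map comes from the standard SVD-based orthogonal Procrustes solution, with the fractional power taken through the principal branch of a complex eigendecomposition.

```python
import numpy as np

def procrustes_rotation(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Orthogonal Procrustes: the orthogonal Q minimizing ||Q @ a - b||_F."""
    u, _, vt = np.linalg.svd(b @ a.T)
    return u @ vt

def fractional_orthogonal_power(q: np.ndarray, t: float) -> np.ndarray:
    """Q**t for orthogonal Q, via complex eigendecomposition.

    Eigenvalues of an orthogonal matrix lie on the unit circle, so raising
    them to the power t (principal branch) interpolates the rotation:
    t=0 gives the identity, t=1 gives Q itself. Not well defined when -1
    is an eigenvalue (a 180-degree rotation has no canonical half-way point).
    """
    w, v = np.linalg.eig(q)
    return (v @ np.diag(w.astype(complex) ** t) @ np.linalg.inv(v)).real

# Example: half-applying a 90-degree rotation yields a 45-degree rotation.
q = np.array([[0.0, -1.0], [1.0, 0.0]])
half_q = fractional_orthogonal_power(q, 0.5)
```

In the merge setting, `q` would be the orthogonal map relating the base and finetuned weights, and `t` the fraction of the map to apply before merging.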

My implementation and the algorithms from your work differ when multiple finetuned models are considered, but in the case of 1 base model and 1 finetuned model, I believe they behave identically.

Could you please consider mentioning my prior work on this idea in the related work section of the paper? I'm not sure how citations work when the prior work has no associated paper. Please let me know if I can provide any more context or information!

Edit: I realize that the idea implemented in your paper is in fact not the same as the one I implemented. However, the ideas are closely related: they both use the same orthogonal Procrustes approach to connect two or more models within a more natural manifold. I believe my work is still closely related to your work in a meaningful way.
