Skip to content

viking: Check calls to unimplemented functions#38

Open
MonsterDruide1 wants to merge 2 commits intoopen-ead:masterfrom
MonsterDruide1:upstream-check-unimpl-funcs
Open

viking: Check calls to unimplemented functions#38
MonsterDruide1 wants to merge 2 commits intoopen-ead:masterfrom
MonsterDruide1:upstream-check-unimpl-funcs

Conversation

@MonsterDruide1
Copy link
Contributor

@MonsterDruide1 MonsterDruide1 commented Feb 15, 2026

Upstreaming the proposed changes from LynxDev2#2.

  1. Add new mismatch type InternalError, to be used when something unexpected happens within viking. Should hopefully never be shown to the user, but I would rather display a cryptic mismatch (that can be ignored by marking as "mismatching") rather than silently fail and mark it as matching (existing behaviour, probably still happens for some things)
  2. Parse more things from the ELF, namely .dynsym and .dynstrtab to resolve names of .plt-references
  3. Add functions from .plt to list of functions, to allow them to be "resolved" when comparing. For example, if some code calls cos(...), cos is not listed in the file_list.yml, so should rather be imported from .plt.
  4. Finally, when checking function calls, pay more attention to the failures: If the name within the file list of a called function is empty, but we expect it to be implemented, it's likely the name being missing from the list. If a given name does not occur in the decomp elf, we likely called the wrong function, and the actual target function never got called/emitted. Finally, if the address of the expected function call and the actual call target mismatch, that's the standard "chose the wrong function" scenario.
  5. For games without symbols (like BotW) that do not want to check for the names of plt references to be correct, this behaviour can be disabled in config.toml using check_unimplemented_references = false.

The logic in 4 was mostly in place before as well, but has now been extended with multiple plt_X_to_X calls, which take over when the "standard functions" return no results: Functions in the PLT have a symbol, but it's pointing to 0 instead of the actual function, so name/address have to be mapped via relocations.

Note, this still introduces 390 errors in BotW's run of tools/check. This is because previously, a missing/wrong name on an implemented function has been silently ignored. This means that not only the references to these functions were ignored, but the functions themselves were not checked in the first place either. These need to be fixed before BotW can update to the current version of viking (once this is merged).


This change is Reviewable

@MonsterDruide1
Copy link
Contributor Author

Simple script to fix most mismatching names:

import sys
import re

data = sys.stdin.read()

p = re.compile(r"incorrect reference; expected to see 0x([0-9a-f]+) .*\n.* decomp source code is referencing 0x[0-9a-f]+ .*\(.*\).*\((.*)\)")
matches = p.findall(data)

replaces = {}
for m in matches:
    replaces[int(m[0], 16)] = m[1]

def replace(line):
    if not line.startswith("0x"):
        return line
    ea = line.split(",")[0].strip()
    for addr, name in replaces.items():
        if ea.endswith("71%08x" % addr):
            return ",".join(line.split(",")[:-1]) + "," + name + "\n"
    return line

with open("data/uking_functions.csv", "r") as f:
    lines = f.readlines()

with open("data/uking_functions.csv", "w") as f:
    for line in lines:
        f.write(replace(line))

Run with tools/check 2>&1 | python <scriptname>

However, it messes up some stuff (duplicates functions) and might mess up names of functions that are actually wrongly referenced somewhere, so somebody with more knowledge of BotW specifically should go through these results to verify and fix the refences/names.

@MonsterDruide1 MonsterDruide1 marked this pull request as draft February 15, 2026 01:12
@MonsterDruide1 MonsterDruide1 changed the title viking: Check calls to unimplemented functions [DO NOT MERGE] viking: Check calls to unimplemented functions Feb 15, 2026
@MonsterDruide1
Copy link
Contributor Author

[DO NOT MERGE]: I just noticed that it attempts to introduce the functions from plt when updating the CSV (changing the matching state of some function), which is wrong and causes subsequent runs of tools/check to fail. Seems like it needs a bit more work to keep them separate.

@LynxDev2
Copy link
Contributor

Aren't PLT functions the ones that are already in the csv marked with L (Library)? Not sure if botw does that but iirc, odyssey had those before we moved to the file list. If that's the case, then I don't think adding the PLT functions as Ls would really be an issue

@MonsterDruide1 MonsterDruide1 changed the title [DO NOT MERGE] viking: Check calls to unimplemented functions viking: Check calls to unimplemented functions Feb 15, 2026
@MonsterDruide1 MonsterDruide1 marked this pull request as ready for review February 15, 2026 09:27
@MonsterDruide1
Copy link
Contributor Author

Ah, you're almost right - BotW has one huge function that covers all entries in the .plt, which is technically more wrong than listing each one separately. So we can just "fix" that by splitting that into multiple functions, so ... yup, that's fine.

Copy link
Contributor

@LynxDev2 LynxDev2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually doesn't that mean that combining functions and plt_functions will cause there to be duplicate Library functions? If the library functions get stored into the csv, then what's the point of parsing the PLT section in the first place. For the downstream implementation that operates on the file list, this is a whole different discussion since we currently don't have a good way of storing Library functions anywhere, but for this upstream implementation, I think it would make more sense for projects to run the check version from this commit or a script to add the library functions to the csv once and then just use the info there.

Or am I missing something?

@LynxDev2 reviewed 3 files and all commit messages, and made 5 comments.
Reviewable status: 3 of 5 files reviewed, 4 unresolved discussions (waiting on MonsterDruide1).


viking/src/tools/check.rs line 86 at r2 (raw file):

    let functions = functions.unwrap().context("failed to load function CSV")?;
    let plt_functions = plt_functions.unwrap().context("failed to load plt functions")?;
    let functions = vec![functions, plt_functions].concat();

Nit: afaik this discards the allocation for the original functions unless the compiler is smart enough to optimise that
Suggestion: functions.extend(plt_functions) (also make functions mutable)


viking/src/elf.rs line 102 at r2 (raw file):

        let hash_offset = dyn_info.hash.context("no hash")? as usize;
        let num_syms = u32::from_le_bytes(bytes[hash_offset+4..hash_offset+8].try_into()?) as usize;

So this chunk of the dyn_info goes as: 4 byte hash followed by a 4 byte symbol count integer (u32), right?
Maybe that should be documented in a comment above?


viking/src/checks.rs line 425 at r2 (raw file):

    fn plt_name_to_addr(elf: &elf::OwnedElf, func_name: &str) -> Option<u64> {
        // find plt relocation for given function name
        let (reloc_idx, _) = elf.pltrelocs.iter().enumerate().find(|(_, reloc)| {

Use iter().position() instead


viking/src/checks.rs line 439 at r2 (raw file):

        let name = elf.dynstrtab.get_at(elf.dynsyms.get(reloc.r_sym)?.st_name)?;
        Some(name)
    }

Even though these two functions are only used in this file, I feel like it would make more sense to place them in elf.rs instead

Code quote:

    fn plt_name_to_addr(elf: &elf::OwnedElf, func_name: &str) -> Option<u64> {
        // find plt relocation for given function name
        let (reloc_idx, _) = elf.pltrelocs.iter().enumerate().find(|(_, reloc)| {
            elf.dynsyms.get(reloc.r_sym).map(|s| elf.dynstrtab.get_at(s.st_name)).flatten() == Some(func_name)
        })?;

        let plt_section = elf::find_section(elf, ".plt").ok()?;
        Some(plt_section.sh_addr + (reloc_idx as u64 + 2) * 0x10)
    }

    fn plt_addr_to_name(elf: &elf::OwnedElf, addr: u64) -> Option<&str> {
        let plt_section = elf::find_section(elf, ".plt").ok()?;
        let reloc_idx = (addr - plt_section.sh_addr) / 16 - 2;
        let reloc = elf.pltrelocs.get(reloc_idx as usize)?;
        let name = elf.dynstrtab.get_at(elf.dynsyms.get(reloc.r_sym)?.st_name)?;
        Some(name)
    }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants