viking: Check calls to unimplemented functions#38
viking: Check calls to unimplemented functions#38MonsterDruide1 wants to merge 2 commits intoopen-ead:masterfrom
Conversation
|
Simple script to fix most mismatching names: import sys
import re
data = sys.stdin.read()
p = re.compile(r"incorrect reference; expected to see 0x([0-9a-f]+) .*\n.* decomp source code is referencing 0x[0-9a-f]+ .*\(.*\).*\((.*)\)")
matches = p.findall(data)
replaces = {}
for m in matches:
replaces[int(m[0], 16)] = m[1]
def replace(line):
if not line.startswith("0x"):
return line
ea = line.split(",")[0].strip()
for addr, name in replaces.items():
if ea.endswith("71%08x" % addr):
return ",".join(line.split(",")[:-1]) + "," + name + "\n"
return line
with open("data/uking_functions.csv", "r") as f:
lines = f.readlines()
with open("data/uking_functions.csv", "w") as f:
for line in lines:
f.write(replace(line))Run with However, it messes up some stuff (duplicates functions) and might mess up names of functions that are actually wrongly referenced somewhere, so somebody with more knowledge of BotW specifically should go through these results to verify and fix the refences/names. |
|
|
|
Aren't PLT functions the ones that are already in the csv marked with |
|
Ah, you're almost right - BotW has one huge function that covers all entries in the |
LynxDev2
left a comment
There was a problem hiding this comment.
Actually doesn't that mean that combining functions and plt_functions will cause there to be duplicate Library functions? If the library functions get stored into the csv, then what's the point of parsing the PLT section in the first place. For the downstream implementation that operates on the file list, this is a whole different discussion since we currently don't have a good way of storing Library functions anywhere, but for this upstream implementation, I think it would make more sense for projects to run the check version from this commit or a script to add the library functions to the csv once and then just use the info there.
Or am I missing something?
@LynxDev2 reviewed 3 files and all commit messages, and made 5 comments.
Reviewable status: 3 of 5 files reviewed, 4 unresolved discussions (waiting on MonsterDruide1).
viking/src/tools/check.rs line 86 at r2 (raw file):
let functions = functions.unwrap().context("failed to load function CSV")?; let plt_functions = plt_functions.unwrap().context("failed to load plt functions")?; let functions = vec![functions, plt_functions].concat();
Nit: afaik this discards the allocation for the original functions unless the compiler is smart enough to optimise that
Suggestion: functions.extend(plt_functions) (also make functions mutable)
viking/src/elf.rs line 102 at r2 (raw file):
let hash_offset = dyn_info.hash.context("no hash")? as usize; let num_syms = u32::from_le_bytes(bytes[hash_offset+4..hash_offset+8].try_into()?) as usize;
So this chunk of the dyn_info goes as: 4 byte hash followed by a 4 byte symbol count integer (u32), right?
Maybe that should be documented in a comment above?
viking/src/checks.rs line 425 at r2 (raw file):
fn plt_name_to_addr(elf: &elf::OwnedElf, func_name: &str) -> Option<u64> { // find plt relocation for given function name let (reloc_idx, _) = elf.pltrelocs.iter().enumerate().find(|(_, reloc)| {
Use iter().position() instead
viking/src/checks.rs line 439 at r2 (raw file):
let name = elf.dynstrtab.get_at(elf.dynsyms.get(reloc.r_sym)?.st_name)?; Some(name) }
Even though these two functions are only used in this file, I feel like it would make more sense to place them in elf.rs instead
Code quote:
fn plt_name_to_addr(elf: &elf::OwnedElf, func_name: &str) -> Option<u64> {
// find plt relocation for given function name
let (reloc_idx, _) = elf.pltrelocs.iter().enumerate().find(|(_, reloc)| {
elf.dynsyms.get(reloc.r_sym).map(|s| elf.dynstrtab.get_at(s.st_name)).flatten() == Some(func_name)
})?;
let plt_section = elf::find_section(elf, ".plt").ok()?;
Some(plt_section.sh_addr + (reloc_idx as u64 + 2) * 0x10)
}
fn plt_addr_to_name(elf: &elf::OwnedElf, addr: u64) -> Option<&str> {
let plt_section = elf::find_section(elf, ".plt").ok()?;
let reloc_idx = (addr - plt_section.sh_addr) / 16 - 2;
let reloc = elf.pltrelocs.get(reloc_idx as usize)?;
let name = elf.dynstrtab.get_at(elf.dynsyms.get(reloc.r_sym)?.st_name)?;
Some(name)
}
Upstreaming the proposed changes from LynxDev2#2.
InternalError, to be used when something unexpected happens within viking. Should hopefully never be shown to the user, but I would rather display a cryptic mismatch (that can be ignored by marking as "mismatching") rather than silently fail and mark it as matching (existing behaviour, probably still happens for some things).dynsymand.dynstrtabto resolve names of.plt-references.pltto list of functions, to allow them to be "resolved" when comparing. For example, if some code callscos(...),cosis not listed in thefile_list.yml, so should rather be imported from.plt.pltreferences to be correct, this behaviour can be disabled inconfig.tomlusingcheck_unimplemented_references = false.The logic in 4 was mostly in place before as well, but has now been extended with multiple
plt_X_to_Xcalls, which take over when the "standard functions" return no results: Functions in the PLT have a symbol, but it's pointing to0instead of the actual function, so name/address have to be mapped via relocations.Note, this still introduces 390 errors in BotW's run of
tools/check. This is because previously, a missing/wrong name on an implemented function has been silently ignored. This means that not only the references to these functions were ignored, but the functions themselves were not checked in the first place either. These need to be fixed before BotW can update to the current version ofviking(once this is merged).This change is