_posts/2019-12-23-fixing-opcodes.md: 263 additions, 1 deletion
@@ -805,4 +805,266 @@ Unlikely to be the same thing but decided to humour myself and go check, and it'
#### Building Handler Trees
This shit is actually pretty cursed, so what I'm going to do is paste a heap of code and then make it slightly more digestible. I also don't want to spend more time on this because it hurts the soul and maybe with what we have, we might be able to get somewhere somewhat reliably.
Don't say I didn't warn you. Anyway, `run()` is basically the same thing with a few minor changes.

* We store the EA of the jumptable inside `ZoneDownHandler` so we don't duplicate it in the event that we are inside a case that refers to itself, mainly because it's just more junk to output that we really don't need
* We loop over each `case_info` dictionary that we created before and do things...

... so we'll start from `process_case(...)` and go from there:

```python
def process_case(case, id):
    func = case['func'] = {}
    body = func['body'] = get_bytes_str(case['start_ea'], case['end_ea'])
    # ...
```

`process_case(...)` is pretty self-explanatory: it pretty much just sets up a dictionary and passes the ref through with the start and end EA of the segment of code we'll look at. We also get all the bytes of the case segment as a string, meaning this disassembly:
Nothing too complex, but there's a possible 'improvement' here. Currently all references to data and so on are preserved as-is, so if the executable is rebuilt, it's very likely that some of the bytes in here will change. It's probably a good idea to replace references to data and code with wildcards, so that the processing step can ignore the wildcards completely; if any of the remaining bytes then change, there's either a code change or it's not the same thing. But we can cross that bridge later.
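
For what it's worth, the wildcard idea could look roughly like the sketch below. It assumes the byte strings are space-separated hex pairs (the actual output format of `get_bytes_str` may differ) and uses `??` as a hypothetical wildcard marker. `mask_bytes` and `bodies_match` are made-up names for illustration, not part of the script, and working out which byte offsets belong to a data or code reference is left out entirely.

```python
WILDCARD = "??"  # hypothetical marker for "this byte may change between builds"


def mask_bytes(body, ranges):
    """Replace the bytes covered by (start, end) offset ranges with wildcards.

    `body` is assumed to be space-separated hex pairs, e.g. "E8 10 20 30 40".
    `ranges` holds half-open (start, end) byte offsets of operands to ignore.
    """
    tokens = body.split()
    for start, end in ranges:
        for i in range(start, min(end, len(tokens))):
            tokens[i] = WILDCARD
    return " ".join(tokens)


def bodies_match(pattern, body):
    """Compare two byte strings, skipping every wildcard position in `pattern`."""
    p, b = pattern.split(), body.split()
    if len(p) != len(b):
        return False
    return all(x == WILDCARD or x.upper() == y.upper() for x, y in zip(p, b))


# A `call rel32` followed by `ret`; the 4-byte call target is masked out.
old = mask_bytes("E8 10 20 30 40 C3", [(1, 5)])
print(old)                                     # E8 ?? ?? ?? ?? C3
print(bodies_match(old, "E8 AA BB CC DD C3"))  # True: only the target moved
print(bodies_match(old, "E9 AA BB CC DD C3"))  # False: the opcode changed
```

The length check means that any inserted or removed instruction counts as a mismatch, which is the "there's either a code change or it's not the same thing" case.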

Moving on...

```python
def process_func(func, start_ea, end_ea):
    for head in idautils.Heads(start_ea, end_ea):
        flags = idaapi.getFlags(head)
        if idaapi.isCode(flags):

            mnem = idc.GetMnem(head)

            if mnem == 'call' or mnem == 'jmp':
                op_ea = idc.GetOperandValue(head, 0)
                fn = ida_funcs.get_func(op_ea)

                if fn:
                    fn_info = postprocess_func(fn)

                    if fn_info:
                        func['calls'][get_func_name(op_ea)] = fn_info
```

This is where it starts getting fucked. So, again, this is how it goes:

1. Loop over every instruction in the range `start_ea ... end_ea`
2. Check if it's actually code, though the check is probably redundant in this case; I think it's something I left in from before, it's all a blur now
3. Get the mnemonic by name and check if it's a `call` or `jmp` instruction
4. If it is, we get the first operand value, or the instruction's parameter -- in this case it should be the EA of a function
5. Call `get_func` on it and check if it actually belongs to a function -- it returns `None` if it doesn't
6. Do more shit with that function (see below)
7. Store the result in the dictionary keyed by the function name

Not totally indigestible, but it's pretty gnarly. So let's make it even worse and check out `postprocess_func`!

```python
def postprocess_func(fn, depth=0):
    func = {
        'ea': fn.startEA,
        'rva': ea_to_rva(fn.startEA),
        'body': get_bytes_str(fn.startEA, fn.endEA)
    }

    # total aids
    switch_ea, switch = find_switch(fn.startEA)

    if switch and switch_ea != main_jumptable:
        sw = func['switch'] = {}

        res = idaapi.calc_switch_cases(switch_ea, switch)

        case_ids = []
        for case in res.cases:
            for i in case:
                case_ids.append(int(i))

        sw['cases'] = [i for i in set(case_ids)]

    else:
        func['switch'] = None

    return func
```
998
+
999
+
There's not anything 'new' here but it's pretty gross nonetheless. For the most part though, this is simply an isolated function where we can do everything later without being trapped in 60 layers of indentation. Check if we have a switch in the function, if we do, grab some info about it and then attach it to the `func` dictionary.
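
The case-collecting loop boils down to flattening a list of lists and dropping duplicates. Here's a minimal sketch with plain lists standing in for `res.cases` (the sample values are made up, and `unique_case_ids` is a hypothetical helper name). Sorting the result is an extra step that isn't in the code above. Iterating a `set` doesn't guarantee any particular order, so sorting should keep the exported output stable between runs.

```python
def unique_case_ids(cases):
    """Flatten a list of per-target case lists into sorted, unique ints."""
    return sorted({int(i) for case in cases for i in case})


# Made-up stand-in for `res.cases`: one inner list of case values per target.
sample_cases = [[0, 1], [2], [1, 5, 6]]
print(unique_case_ids(sample_cases))  # [0, 1, 2, 5, 6]
```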

Something we could do here is grab the bytes of each case in the nested switches, so we can then distinguish nested switches at the same time, but we'll come back to this later. I don't want to be battling this stupid shit without the easier stuff working properly first.
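
If we do come back to it, one possible shape is to hash the bytes of every case into a single fingerprint per switch. This is only a sketch under assumptions: `switch_fingerprint` is a hypothetical helper, the case bodies are assumed to be the same kind of byte strings `get_bytes_str` produces, and SHA-1 is an arbitrary choice of hash.

```python
import hashlib


def switch_fingerprint(case_bodies):
    """Hash a dict of {case id: byte string} into one hex digest.

    Cases are visited in sorted order so the result doesn't depend on
    the order in which they were collected.
    """
    digest = hashlib.sha1()
    for case_id in sorted(case_bodies):
        entry = "%d:%s;" % (case_id, case_bodies[case_id])
        digest.update(entry.encode("ascii"))
    return digest.hexdigest()


# Made-up case bodies for two nested switches.
a = switch_fingerprint({0: "90 C3", 1: "31 C0 C3"})
b = switch_fingerprint({1: "31 C0 C3", 0: "90 C3"})
c = switch_fingerprint({0: "90 C3", 1: "31 C9 C3"})
print(a == b)  # True: same cases, different insertion order
print(a == c)  # False: one case body differs
```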

#### I Can't Believe That Writing JSON to the Clipboard Deserves Its Own Section

Now we'll export all this garbage and throw it into the clipboard so you can do things with it. Luckily this is actually pretty easy:
[Wow](https://www.youtube.com/watch?v=TRIwAHX3aHM). At the end of `run()`, just insert `set_clipboard_json(output)` and away you go. You'll get something like this, or maybe better if you're less retarded than I am: