@JillianTo
Contributor

@JillianTo JillianTo commented Mar 9, 2025

Using the decompiled code exported from IDA as an HTML file and the switch addresses that cause errors in XenonRecomp, I wrote a script that reproduces 39 of the 42 function boundaries in UnleashedRecomp. To get the list of errors, I deleted every line in SWA.toml after 'setjmp_address' (keeping the invalid address block, since XenonRecomp wouldn't run without it), then ran XenonRecomp and saved the CLI output to a file. Below is the output of the script:

functions = [
    { address = 0x830B7DD0, size = 0x74 },
    { address = 0x82F098C0, size = 0x19C },
    { address = 0x826ABB70, size = 0x70 },
    { address = 0x8319ED58, size = 0x98 },
    { address = 0x82456DC8, size = 0xD4 },
    { address = 0x82DE36A8, size = 0x5C },
    { address = 0x82F852A0, size = 0xCC },
    { address = 0x82C980E8, size = 0x110 },
    { address = 0x82DE38A0, size = 0x16C },
    { address = 0x82EF5C38, size = 0x64 },
    { address = 0x82F1D668, size = 0x1E8 },
    { address = 0x82EE2D08, size = 0x154 },
    { address = 0x82F08730, size = 0x2B0 },
    { address = 0x82455E70, size = 0x84 },
    { address = 0x82E97E50, size = 0x84 },
    { address = 0x831530C8, size = 0x258 },
    { address = 0x82F13980, size = 0xF4 },
    { address = 0x82DE3708, size = 0x198 },
    { address = 0x82893088, size = 0x45C },
    { address = 0x831539E0, size = 0xD0 },
    { address = 0x82C49540, size = 0x114 },
    { address = 0x82E86770, size = 0x98 },
    { address = 0x83180700, size = 0x74 },
    { address = 0x83168F18, size = 0x254 },
    { address = 0x830DADA0, size = 0x150 },
    { address = 0x82DE3640, size = 0x64 },
    { address = 0x82F25FD8, size = 0x240 },
    { address = 0x82D9AC08, size = 0x78 },
    { address = 0x831487D0, size = 0xD4 },
    { address = 0x83168940, size = 0x100 },
    { address = 0x82CF7080, size = 0x80 },
    { address = 0x8317CD30, size = 0x50 },
    { address = 0x83168B70, size = 0x128 },
    { address = 0x82EF5D78, size = 0x3F8 },
    { address = 0x82DE35D8, size = 0x68 },
    { address = 0x83168A48, size = 0x11C },
    { address = 0x824E7EF0, size = 0x98 },
    { address = 0x8316C678, size = 0x78 },
    { address = 0x82F22908, size = 0x20C }
]

I verified these functions were correct using sort and diff in the Linux terminal. The only difference is the order, and that it is missing the following three functions:

{ address = 0x824E7F28, size = 0x60 }
{ address = 0x8305D168, size = 0x278 }
{ address = 0x831B0BA0, size = 0xA0 }
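
The sort-and-diff check described above can also be sketched in Python, for anyone without a Linux terminal. This is only an illustration: the `parse_functions` helper and the two sample strings are hypothetical, and it assumes entries are written exactly in the `{ address = 0x..., size = 0x... }` form shown above.

```python
import re

# Matches entries written as: { address = 0x830B7DD0, size = 0x74 }
ENTRY_RE = re.compile(
    r"address\s*=\s*0x([0-9A-Fa-f]+)\s*,\s*size\s*=\s*0x([0-9A-Fa-f]+)"
)

def parse_functions(text):
    """Return a set of (address, size) integer pairs found in text."""
    return {(int(a, 16), int(s, 16)) for a, s in ENTRY_RE.findall(text)}

# Hypothetical sample data: a reference list and a script output in another order.
reference = """
{ address = 0x824E7EF0, size = 0x98 },
{ address = 0x824E7F28, size = 0x60 },
{ address = 0x830B7DD0, size = 0x74 },
"""
generated = """
{ address = 0x830B7DD0, size = 0x74 },
{ address = 0x824E7EF0, size = 0x98 },
"""

ref_set = parse_functions(reference)
gen_set = parse_functions(generated)

missing = sorted(ref_set - gen_set)  # in the reference but not generated
extra = sorted(gen_set - ref_set)    # generated but not in the reference

for addr, size in missing:
    print(f"missing: {{ address = 0x{addr:08X}, size = 0x{size:X} }}")
for addr, size in extra:
    print(f"extra:   {{ address = 0x{addr:08X}, size = 0x{size:X} }}")
```

Comparing sets makes the order of the entries irrelevant, which is the same effect as sorting both files before running diff.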

@JillianTo JillianTo changed the title from "Created script for automatically generating function boundries" to "Created script for automatically generating function boundaries" Mar 10, 2025
@masterspike52

masterspike52 commented Mar 11, 2025

image has an error

did you try to run it as an IDAPython script? because if so, I think you're supposed to run it as a regular Python script outside of IDA Pro. you're supposed to take the log from running XenonRecomp and put it in a text file, then use IDA Pro to make an HTML of default.xex, and then run the script with Python, giving it the path to the HTML of the xex, the path to the XenonRecomp log file, and the name of the output .toml file

Edit: I was correct, it worked for the most part. it only missed 2 function boundaries for Destroy All Humans! Path of the Furon. one of them was easy for me to find; the other has been a bit wonky, so I wasn't surprised it couldn't find it.

@zeerowiibu

I got hit by these errors when trying to run it:

F:\Xenon\Recompilation\game\Auto_Function_Parser.py:134: SyntaxWarning: invalid escape sequence '\.'
  elif re.search('^\.text:'+curr_addr+' </span><span class="c[0-9]*">loc_'+curr_addr, line):
F:\Xenon\Recompilation\game\Auto_Function_Parser.py:166: SyntaxWarning: invalid escape sequence '\.'
  elif num_functs > 0 and re.search('<span class="c[0-9]*">\.long </span><span class="c[0-9]*">0$', line):
F:\Xenon\Recompilation\game\Auto_Function_Parser.py:201: SyntaxWarning: invalid escape sequence '\.'
  if re.search('<span class="c[0-9]*">\.section &quot;\.text&quot;', line) != None:

@JillianTo
Contributor Author

JillianTo commented Mar 11, 2025

I didn't test the script in Windows, can you try replacing every instance of \. with . and see if that works?

EDIT: I think your issue might be related to this: https://stackoverflow.com/questions/52335970/how-to-fix-syntaxwarning-invalid-escape-sequence-in-python. I tested my script with Python 3.11, so try using an older version of Python without changing the script.

EDIT2: I pushed an update of the script that I tested with Python 3.12. Let me know if it works.
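
For reference, Python reports this warning when a regular (non-raw) string literal contains a backslash sequence such as `\.` that is not a recognized string escape. A minimal sketch of the usual remedy, a raw string literal, follows. The sample line is made up to resemble the pattern in the warning and is not taken from a real IDA export, and this thread does not say whether the pushed update uses this exact approach.

```python
import re

# In a raw string (r'...'), the backslash reaches the regex engine unchanged,
# so '\.' means "a literal dot" and Python emits no invalid-escape warning.
pattern = re.compile(r'<span class="c[0-9]*">\.long </span><span class="c[0-9]*">0$')

# Hypothetical line, shaped like the HTML the pattern above expects.
sample = '<span class="c3">.long </span><span class="c7">0'

# An equivalent non-raw spelling doubles the backslash instead.
same_pattern = re.compile('<span class="c[0-9]*">\\.long </span><span class="c[0-9]*">0$')

print(bool(pattern.search(sample)))
print(pattern.pattern == same_pattern.pattern)
```

Either spelling keeps the regex meaning intact. Replacing `\.` with a bare `.` would also silence the warning, but the dot would then match any character rather than a literal dot.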

@zeerowiibu

That did the trick, it runs without errors now

@ShadowLuigi

ShadowLuigi commented Mar 12, 2025

Edit 1: I was using an older version of the parser script without noticing. The current version (from 11 hours ago as of me typing this) seems to be working, though I'm still waiting for it to finish with the HTML file, since that is 1 GB in size for whatever reason.

Edit 2: For whatever reason it came out as this.
"Parsing XenonRecomp log...
Parsing IDA HTML...
Searching for needed functions...
0 functions found!
Outputting to formatted file..."

I honestly don't know why it didn't even find the needed functions given they were pretty much in the IDA HTML, especially with them also being listed in the XenonRecomp log.

I tried doing this with the Open Season video game as an example, and I'm stuck on this error, which points to line 122.
"H:\Documents\TimberlineRecompiled>parser.py default.xex.html default.xex.txt OSGAME_NEW.toml
Parsing XenonRecomp log...
Parsing IDA HTML...
Traceback (most recent call last):
  File "H:\Documents\TimberlineRecompiled\parser.py", line 122, in <module>
    if not compare_xref_addr(line, functs[num_functs-1][0]):
                                   ~~~~~~^^^^^^^^^^^^^^
IndexError: list index out of range"

@JillianTo
Contributor Author

compare_xref_addr isn't a function in the newest version of the script, nor is it referred to on line 122. Try deleting your version and downloading it from https://raw.githubusercontent.com/hedge-dev/XenonRecomp/8fc280bed99903d7bfaf1003e18cfec0c627141d/Auto_Function_Parser.py

@blueskythlikesclouds
Member

Would it be possible to integrate your detection here into XenonAnalyse in some way?

@ShadowLuigi

ShadowLuigi commented Mar 12, 2025

compare_xref_addr isn't a function in the newest version of the script, nor is it referred to on line 122. Try deleting your version and downloading it from https://raw.githubusercontent.com/hedge-dev/XenonRecomp/8fc280bed99903d7bfaf1003e18cfec0c627141d/Auto_Function_Parser.py

Tried this, but had the same result. I did update to a different fork of XenonRecomp to try to deal with other issues, but not much changed tbh.

This is the same result as before:

"Parsing XenonRecomp log...
Parsing IDA HTML...
Searching for needed functions...
0 functions found!
Outputting to formatted file..."

The fork is from #22

@JillianTo
Contributor Author

JillianTo commented Mar 13, 2025

Would it be possible to integrate your detection here to XenonAnalyse in some way?

Yeah... although parsing IDA output is a bit easier than plain decompilation, because most subroutine headers tell you what references them, so you don't have to look through every line for references and save them. I took the path of least resistance so I could quickly churn this out (hence using Python).

compare_xref_addr isn't a function in the newest version of the script, nor is it referred to on line 122. Try deleting your version and downloading it from https://raw.githubusercontent.com/hedge-dev/XenonRecomp/8fc280bed99903d7bfaf1003e18cfec0c627141d/Auto_Function_Parser.py

Tried this, but had the same result. Though I did update to a different fork of XenonRecomp to try and deal with other issues, though not much had changed tbh.

This is the result as before.

"Parsing XenonRecomp log...
Parsing IDA HTML...
Searching for needed functions...
0 functions found!
Outputting to formatted file..."

The fork is from this #22

I used this script with Ninja Gaiden 2 using a combination of the simde, Bakugan, and NG2 forks and it found 155 functions, so I don't think that's the issue. Are you using IDA Pro 9.0SP1 to create the HTML? If so, can you try adding "print(switch_addrs)" to line 49 and "print(functs)" to line 203 to see if either of those lists are empty?

@masterspike52

Would it be possible to integrate your detection here to XenonAnalyse in some way?

Yeah... although parsing IDA output is a bit easier than plain decompilation though because most subroutine headers will tell you what references it rather than having to look through every line for references and saving it. I went through the path of least resistance so I could quickly churn this out (hence using Python)

compare_xref_addr isn't a function in the newest version of the script, nor is it referred to on line 122. Try deleting your version and downloading it from https://raw.githubusercontent.com/hedge-dev/XenonRecomp/8fc280bed99903d7bfaf1003e18cfec0c627141d/Auto_Function_Parser.py

Tried this, but had the same result. Though I did update to a different fork of XenonRecomp to try and deal with other issues, though not much had changed tbh.
This is the result as before.

"Parsing XenonRecomp log...
Parsing IDA HTML...
Searching for needed functions...
0 functions found!
Outputting to formatted file..."

The fork is from this #22

I used this script with Ninja Gaiden 2 using a combination of the simde, Bakugan, and NG2 forks and it found 155 functions on the first pass and 12 on the second so I don't think that's the issue. Are you using IDA Pro 9.0SP1 to create the HTML? If so, can you try adding "print(switch_addrs)" to line 49 and "print(functs)" to line 203 to see if either of those lists are empty?

I feel a lot of people's issues come from the expectation of using IDA Pro 9.0 SP1, which most people aren't aware of how to get without paying an absurd amount for it

@JillianTo
Contributor Author

Yes, I do realize the issue with that, but on the other hand, I don't think anyone can make substantial progress on a recomp port if they don't have it.

@masterspike52

or at least not in a short time. i wonder if it can work with ghidra

@ShadowLuigi

The only IDAPro 9.0 I have is a leaked build from August 2024, though it should be working relatively the same, so I don't know what's going on here tbh. I can try to work on this a little further if I can.

@Mystixor
Contributor

Ghidra has the big problem that it lacks a lot of the special instructions used on the Xenon processor, making reverse engineering very difficult.

@masterspike52

This is fair. I wish it had a better plugin for xex files like idapro does

@Organizationguy

Sorry, could you clarify what the CLI output file is that you are referring to?

@JillianTo
Contributor Author

In a Linux terminal, when you run XenonRecomp, append " > out.txt" to the command so that everything it would print to the terminal goes to a file instead
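
The same redirection can be sketched in Python, which may help where the shell syntax differs. The command below is a stand-in (it just runs the Python interpreter), because the exact XenonRecomp arguments depend on your setup. `run_and_log` and `out.txt` are illustrative names, not part of any tool in this thread.

```python
import subprocess
import sys

def run_and_log(command, log_path):
    """Run command and write everything it prints (stdout and stderr) to log_path."""
    with open(log_path, "w", encoding="utf-8") as log:
        result = subprocess.run(
            command,
            stdout=log,
            stderr=subprocess.STDOUT,  # also capture error messages
            text=True,
        )
    return result.returncode

# Stand-in command; replace with your own XenonRecomp invocation.
demo_command = [sys.executable, "-c", "print('example recompiler output')"]
exit_code = run_and_log(demo_command, "out.txt")

with open("out.txt", encoding="utf-8") as log:
    print(log.read())
```

Unlike a plain " > out.txt", this sketch also writes stderr to the file. Remove the `stderr=subprocess.STDOUT` line to match the shell behavior exactly.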

@Organizationguy

Organizationguy commented Apr 3, 2025 via email

@derlineUn

The tool managed to find 50% of the functions that XenonRecomp considers to be wrong, but for some reason it couldn't find the others. When I run the tool again, this time using the new log with the rest of the missing functions, it gives me half of the ones that had already been found before.

@JillianTo
Contributor Author

JillianTo commented Apr 3, 2025

Ok, so this script won't work in windows?

You can try copy pasting the output of XenonRecomp from the Windows prompt into a text file but I haven't verified that

the tool managed to find 50% of the functions that xenonrecomp considers to be wrong, but for some reason it couldn't find the others, and when I run the tool again this time using the new log with the rest of the missing functions it gives me half of the ones that that had already been found before

That's because the leftover 50% are functions nested in the ones found by the script, and the script doesn't handle that
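
To illustrate what "nested" means here: an entry is nested when its address range lies entirely inside another entry's range. Below is a minimal sketch of detecting that, assuming `(address, size)` pairs. The `find_nested` helper is hypothetical and not part of the script. The first two sample entries come from the lists in the first comment, where 0x824E7F28 was one of the missed functions and happens to fit this pattern.

```python
def find_nested(functions):
    """Return (inner, outer) pairs where inner's range lies within outer's.

    functions is a list of (address, size) tuples.
    """
    nested = []
    for inner in functions:
        inner_start, inner_end = inner[0], inner[0] + inner[1]
        for outer in functions:
            if outer == inner:
                continue
            outer_start, outer_end = outer[0], outer[0] + outer[1]
            if outer_start <= inner_start and inner_end <= outer_end:
                nested.append((inner, outer))
    return nested

functions = [
    (0x824E7EF0, 0x98),  # found by the script
    (0x824E7F28, 0x60),  # listed as missed
    (0x830B7DD0, 0x74),  # unrelated, for contrast
]

for inner, outer in find_nested(functions):
    print(f"0x{inner[0]:08X} (size 0x{inner[1]:X}) lies inside "
          f"0x{outer[0]:08X} (size 0x{outer[1]:X})")
```

The pairwise comparison is quadratic, which is fine for a few dozen entries. Sorting by address first would scale better for larger lists.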

@Organizationguy

Organizationguy commented Apr 3, 2025 via email

@masterspike52

What should Xenon recomp be outputting?

When you look at the command window, it should just say what percentage has been recompiled. If it says that an address is going out of bounds from another address or something like that, it means you need to add function boundaries to your config (which is what this script should do). If it says it found a jump table at an address but the switch table file you made with XenonAnalyse doesn't exist, it means you need to go to that address and find its jump table. And if it says something about an unrecognized instruction, you have to edit recompiler.cpp, add those instructions with the right C++ code, and rebuild XenonRecomp.

@Organizationguy

Organizationguy commented Apr 4, 2025 via email

@Organizationguy

I think I'm nearly there, but I'm getting this error.
C:\Users\Nevan\Desktop\function test>python Auto_Function_Parser.py IDA.html XenonRecomp_Log.txt Output.toml
Parsing XenonRecomp log...
Parsing IDA HTML...
Searching for needed functions...
Traceback (most recent call last):
  File "C:\Users\Nevan\Desktop\function test\Auto_Function_Parser.py", line 225, in <module>
    curr_funct = functs[curr_funct_idx]
                 ~~~~~~^^^^^^^^^^^^^^^^
IndexError: list index out of range

How can I fix it?

@AllanCat

Tested TGM ACE TU1 with IDA Pro 9.1. It ran without issue and found 41/42 functions.
This greatly helped with the process, nice work!

Script output:
functions = [
{ address = 0x8229BBB8, size = 0x78 },
{ address = 0x82333720, size = 0xB4 },
{ address = 0x821511C0, size = 0xE4 },
{ address = 0x82326AA0, size = 0x494 },
{ address = 0x820F0D50, size = 0x2BC },
{ address = 0x821068C0, size = 0x1E4 },
{ address = 0x821AD620, size = 0xA8 },
{ address = 0x82077440, size = 0x10C },
{ address = 0x822EF668, size = 0x198 },
{ address = 0x8208ABE8, size = 0x2C4 },
{ address = 0x822EF9D8, size = 0x108 },
{ address = 0x8209F108, size = 0x78 },
{ address = 0x822F1C88, size = 0x104 },
{ address = 0x8230D700, size = 0x1F8 },
{ address = 0x82073388, size = 0x774 },
{ address = 0x82090438, size = 0x148 },
{ address = 0x82075FC0, size = 0x78 },
{ address = 0x821696D8, size = 0xB0 },
{ address = 0x820886D8, size = 0x280 },
{ address = 0x822EF2F0, size = 0x64 },
{ address = 0x8210CFC0, size = 0x9C },
{ address = 0x820F1CD0, size = 0x154 },
{ address = 0x822EF0A8, size = 0xB8 },
{ address = 0x8223FA58, size = 0x7C },
{ address = 0x8215FDB0, size = 0xA4 },
{ address = 0x822EF160, size = 0x80 },
{ address = 0x82117218, size = 0x140 },
{ address = 0x82077D08, size = 0xD4 },
{ address = 0x8209E368, size = 0x60 },
{ address = 0x822EF580, size = 0x58 },
{ address = 0x820907F8, size = 0xC8 },
{ address = 0x8214B588, size = 0x10C },
{ address = 0x82184140, size = 0x618 },
{ address = 0x822EF1E0, size = 0x7C },
{ address = 0x822EF288, size = 0x68 },
{ address = 0x821F75D0, size = 0x148 },
{ address = 0x821A7C10, size = 0x114 },
{ address = 0x822EF800, size = 0x16C },
{ address = 0x822EEFE8, size = 0xBC },
{ address = 0x82077550, size = 0x154 },
{ address = 0x822EF5D8, size = 0x5C }
]

Missed function:
{ address = 0x82184658, size = 0x100 }

@thelastant

Hi, after parsing the .html i've got 0 functions on my log. Only: "functions = ]"
Could someone give me some help?

10 participants