Hi, I am trying to unpack files from Super Smash Bros. Ultimate. Someone on GBAtemp (https://gbatemp.net/threads/smash-ultim ... st-8398077) already wrote an unpacker, but it only gets out about 2 GB of files from the .arc format; there are lots of files still in there. The header is strange: strings seem to be encrypted or compressed, the file header is unknown, and the file type is unknown. But I don't think the archive is encrypted, since that unpacker doesn't decrypt anything. Here is a screenshot of the hex of the start of the file.
Can anyone provide a guide or links on how we can start to unpack this?
Thanks
Edit 1: I have tried QuickBMS with the arc2 script, but it creates lots of DAT files and doesn't extract the music, so arc2 must be the wrong format for this archive.
If those 64-bit fields in the image are correct, it looks like data.arc is 14 GB in size. I see a NUS3 header at 0x38; that one is a file. I think the 2 GB you are talking about is just the 64-bit value at offset 0x10, which is indeed that size.
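To sanity-check fields like these, it helps to pull them out programmatically instead of eyeballing the hex. Below is a minimal sketch; the offsets (a 64-bit value at 0x10, a "NUS3" magic at 0x38) and little-endian byte order are assumptions taken from the discussion and the screenshot, not a confirmed spec, and the synthetic buffer just stands in for the real start of data.arc:

```python
import struct

def parse_arc_header(buf: bytes) -> dict:
    """Pull out the fields discussed above. Offsets (0x10 for the 64-bit
    size field, 0x38 for the first embedded file's magic) are guesses
    based on the hex screenshot, not a documented format."""
    size_field = struct.unpack_from("<Q", buf, 0x10)[0]
    return {
        "magic": buf[0:8],
        "u64_at_0x10": size_field,
        "first_file_magic": buf[0x38:0x3C],
    }

# Synthetic 0x40-byte header standing in for the real start of data.arc:
fake = bytearray(0x40)
struct.pack_into("<Q", fake, 0x10, 2 * 1024**3)  # the ~2 GB value
fake[0x38:0x3C] = b"NUS3"

info = parse_arc_header(bytes(fake))
print(hex(info["u64_at_0x10"]), info["first_file_magic"])
```

Running the same parse over the real file (read the first 0x40 bytes) would confirm or refute the 2 GB reading at 0x10.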
Yes, the fields are correct, it is 14 GB, well spotted! So do you think this 14 GB file only has 2 GB worth of data? Aluigi, if you have a guide on how I can figure this out, it would be helpful. I would like to learn so I can write unpackers for games in the future!
Edit: Some research online suggested it could be using the LZ10 format, but according to what I found the data would then look different, and the header of this file starts at the normal position, so I am not sure LZ10 compression is possible here.
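One quick way to rule LZ10 in or out: Nintendo's LZ10 (the LZ77 "type 0x10" scheme used on GBA/DS-era formats) streams start with the byte 0x10 followed by the 24-bit little-endian decompressed size. A tiny check like the sketch below can be run against the start of data.arc; the sample inputs here are made up for illustration:

```python
def looks_like_lz10(data: bytes) -> bool:
    # Nintendo's LZ10 streams begin with the type byte 0x10, then the
    # 24-bit little-endian decompressed size.
    if len(data) < 4 or data[0] != 0x10:
        return False
    decomp_size = int.from_bytes(data[1:4], "little")
    return decomp_size > 0

print(looks_like_lz10(b"\x10\x40\x00\x00" + b"\x00" * 16))  # plausible LZ10
print(looks_like_lz10(b"\x00\x00\x00\xa0"))                 # not LZ10
```

Since the data.arc header in the screenshot does not start with 0x10, this argues against the whole file being LZ10-compressed.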
Usually with the analysis of big archives it's enough to collect the first and last N megabytes (usually 2 or 10 MB), but in this case we already know that the first file is over 2 GB, so it's quite a problem to download such a huge file just to try to figure out the format.
Anyway, there is an alternative: for example, you can upload just a part of the arc file. The following script for QuickBMS (quickbms_4gb_files.exe) creates two files that you can upload for analysis:
It's even possible that the header is just those 0x38 bytes at the beginning and so there are only 4 files in that arc
Oh no, you are mistaken, the first file is not 2 GB. I mean the program extracts about 2-3 GB of files: 97 WebM files (861 MB) and 1,358 LOPUS files (1.61 GB).
Here are the files extracted with QuickBMS.
Edit: Looks like there's 00 00 00 A0 at the start/end of new files?
I think 2 MB won't be enough since some files are large, so here are the first 20 MB and last 20 MB of the file via the script, and also the first 100 MB and last 100 MB. Uploaded to Mega.
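Aluigi's actual QuickBMS split script isn't reproduced in this dump, but the idea (dump the first and last chunk of a huge archive into two small files for upload) can be sketched in Python; file names `upload.dat`/`upload2.dat` follow the ones mentioned in the thread, everything else is an assumption:

```python
import os

def dump_head_tail(path: str, chunk_mb: int = 20) -> None:
    """Write the first and last chunk_mb megabytes of a big archive to
    upload.dat / upload2.dat for analysis. Rough stand-in for the QuickBMS
    script referenced in the thread, whose exact behaviour isn't shown."""
    chunk = chunk_mb * 1024 * 1024
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        with open("upload.dat", "wb") as out:
            out.write(f.read(min(chunk, size)))
        if size > chunk:
            f.seek(size - chunk)
            with open("upload2.dat", "wb") as out:
                out.write(f.read())
```

For a 14 GB archive this keeps the upload down to a few tens of megabytes while still exposing the header and any trailing index.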
The 2 GB of data extracted by that tool aren't the archived files; they are the data included in the first file (NUS3 type).
Can you tell me what's the exact total size of data.arc? Is it 14'435'753'256 bytes?
upload.dat is of no help, unfortunately; I expected some structures or useful things but I was wrong. There is something that looks like an index at the end of upload2, but it's probably a false positive, something unrelated.
Yes, I think so. Someone had an arc extractor here as well: https://github.com/shinyquagsire23/arcshark. This one works and extracts the files, so maybe it will help with the QuickBMS version.
Some more info according to https://twitter.com/ShinyQuagsire: "From what I can tell, it looks like SSBU has no filenames for files in data.arc. Or rather, it has filenames, but they're all hashed with an inline function before it reaches data.arc. It's interesting though because it looks like they also hash each file/folder name in some tables as well, so there's a level of flexibility beyond just hashing entire paths."
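The hashing scheme the community later settled on for data.arc path lookups is often called "hash40": the CRC32 of the path in the low 32 bits, with the string length in the byte above it. Treat the sketch below as an assumption for illustration; at the time of this thread the exact function wasn't confirmed, and the example path is made up:

```python
import zlib

def hash40(name: str) -> int:
    """Assumed data.arc name hash ("hash40"): length in bits 32-39,
    CRC32 of the ASCII path in bits 0-31."""
    data = name.encode("ascii")
    return (len(data) << 32) | zlib.crc32(data)

print(hex(hash40("fighter/mario/model/body/c00")))  # hypothetical path
```

This also explains why brute-forcing filenames works at all: with a known hash, candidate paths can be checked cheaply against the tables.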
Why don't you use that extractor? Is it no longer compatible with the format?
Anyway, I made a skeleton of a script on the fly based on that main.cpp, but it can't work without testing and tuning. I leave it here just to avoid doing the job from scratch again in case someone wants to return to it:
for i = 0 < bgm_unk_movie_entries
    callfunction entry_triplet 1
next i
for i = 0 < entries
    callfunction entry_pair 1
next i
for i = 0 < entries
    get off4_nums long
next i
for i = 0 < entries_2
    callfunction file_pair 1
next i
for i = 0 < num_files
    callfunction entry_triplet 1
next i
for i = 0 < 0xE
    callfunction big_hash_entry 1
next i
for i = 0 < entries_big
    callfunction big_file_entry 1
    math offset + offset_2
    log "" OFFSET comp_size
next i
for i = 0 < 0x248f73
    callfunction entry_pair 1
next i
for i = 0 < 0x89b11
    callfunction quad_entries 1
next i
for i = 0 < entries_big
    callfunction entry_pair 1
next i
for i = 0 < entries_3
    callfunction entry_pair 1
next i

goto offset_5
callfunction offset5_header 1

for i = 0 < entries
    callfunction entry_pair 1
next i
for i = 0 < 0x247a1
    callfunction entry_pair 1
next i
for i = 0 < entries
    callfunction entry_pair 1
next i
for i = 0 < 0x71a94
    callfunction entry_pair 1
next i
/*
for i = 0 < entries_2
    get entires_5 long
next i
*/
for i = 0 < entries_2
    callfunction entry_pair 1
next i