How much does it take to hack a mobile network?
Is electronic government secure
in the era of WikiLeaks and Anonymous?

Is SCADA hacking a Hollywood fiction
or the nowadays reality?
Internet banking: is there any chance to win
over the fraudsters?

Cyber-crimes, cyber-espionage, cyber-war: where do we draw a borderline?

Pages

Tuesday, June 30, 2015

Best Reverser Write-Up: Analyzing Uncommon Firmware



While developing tasks for PHDays’ contest in reverse engineering, we had a purpose of replicating real problems that RE specialists might face. At the same time we tried to avoid allowing cliche solutions.


Let us define what common reverse engineering tasks look like. Given an executable file for Windows (or Linux, MacOS or any other widely-used operating system). We can run it, watch it in a debugger, and twist it in virtual environments in any way possible. File format is known. The processor’s instruction set is x86, AMD64 or ARM. Library functions and system calls are documented. The equipment can be accessed through the operating system only. Using tools like IDAPro and HеxRays makes analysis of such applications very simple, while debug protection, virtual machines with their own instruction sets, and obfuscation could complicate the task. But large vendors hardly ever use any of those in their programs. So there’s no point in developing a contest aimed at demonstrating skills that are rarely addressed in practice.

However, there’s another area, where reverse engineering became more in-demand, that’s firmware analysis. The input file (firmware) could be presented in any format, can be packed, encrypted. The operating system could be unpopular, or there could be no operating system at all. Parts of the code could not be changed with firmware updates. The processor could be based on any architecture. (For example, IDAPro “knows” not more than 100 different processors.) And of course, there’s no documentation available, debugging or code execution cannot be performed―a firmware is presented, but there’s no device.

Our contest’s participants needed to analyze an executable file and find the correct key and the relative email (any internet user was able to take part in the contest).


Part One: Loader

 At the first stage, the input file is an ELF file compiled with a cross compiler for the PA-RISC architecture. IDA can work with this architecture, but not as good as with x86. Most requests to stack variables are not identified automatically, and you’ll have to do it manually. At least you can see all the library functions (log, printf, memcpy, strlen, fprintf, sscanf, memset, strspn) and even symbolic names for some functions (с32, exk, cry, pad, dec, cen, dde). The program expects two input arguments: an email and key.



It’s not hard to figure out that the key should consist of two parts separated by the “-“ character. The first part should consist of seven MIME64 characters (0-9A-Za-z+/), the second part of 32 hex characters that translate to 16 bytes.



Further we can see calls to c32 functions that result in:

t = c32(-1, argv[1], strlen(argv[1])+1)
k = ~c32(t, argv[2], strlen(argv[2])+1)

Name of the function is a hint: it’s a СRC32 function, which is confirmed by the constant 0xEDB88320.
Next, we call the dde function (short for doDecrypt), and it receives the inverted output of the CRC32 function (encryption key) as the first argument, and the address and the size of the encrypted array as the second and third ones.

Decryption is performed by BTEA (block tiny encryption algorithm) based on the code taken from Wikipedia. We can guess that it’s BTEA from the use of the constant DELTA==0x9E3779B9. It’s also used in other algorithms on which BTEA is based on, but there are not many of them.

The key should be of 128-bit width, but we receive only 32 bits from CRC32. So we get three more DWORDs from the exk function (expand_key) by multiplying the previous value by the same DELTA.

However, the use of BTEA is uncommon. First of all, the algorithm supports a variable-width block size, and we use a block of 12-bytes width (there are processors that have 24-bit width registers and memory, then why should we use only powers of two). And in the second place, we switched encryption and decryption functions.

Since data stream is encrypted, cipher block chaining is applied. Enthropy is calculated for decrypted data in the cen function (calc_enthropy). If its value exceeds 7, the decryption result is considered incorrect and the program will exit.

The encryption key is 32-bit width, so it seems to be easily brute-forced. However, in order to check every key we need to decrypt 80 kilobytes of data, and then calculate enthropy. So brute-forcing the encryption key will take a lot of time.

But after the calculation, we call the pad function (strip_pad), which check and remove PKCS#7 padding. Due to CBC features, we need to decrypt only one block (the last one), extract N byte, check whether its range is between 1 and 12 (inclusive) and each of the last N bytes has value N. This allows reducing the number of operations needed to check one key. But if the last encrypted byte equals 1 (which is true for 1/256 keys), the check should be still performed.

The faster method is to assume that decoded data have a DWORD-aligned length (4 bytes). Then in the last DWORD of the last block there may be only one of three possible values: 0x04040404, 0x08080808 or 0x0C0C0C0C. By using heuristic and brute force methods you can run through all possible keys and find the right one in less than 20 minutes.

If all the checks after the decryption (entropy and the integrity of the padding) are successful, we call the fire_second_proc function, which simulates the launch of the second CPU and the loading of decrypted data of the firmware (modern devices usually have more than one processorwith different architectures).

If the second processor launches, it receives the user’s email and 16 bytes with the second part of the key via the function send_auth_data. At this point we made a mistake: there was the size of the string with the email instead of the size of the second part of the key.


Part Two: Firmware

The analysis of the second part is a little bit more complicated. There was no ELF file, only a memory image—without headings, function names, and other metadata. Type of the processor and load address were unknown as well.

We thought of brute force as the algorithm of determining the processor architecture. Open in IDA, set the following type, and repeat until IDA shows something similar to a code. The brute force should lead to the conclusion that it is big-endian SPARC.

Now we need to determine the load address. The function 0x22E0 is not called, but it contains a lot of code. We can assume that is the entry point of the program, the start function.

In the third instruction of the start function, an unknown library function with one argument == 0x126F0 is called, and the same function is called from the start function four more times, always with arguments with similar values (0x12718, 0x12738, 0x12758, 0x12760). And in the middle of the program, starting from 0x2490, there are five lines with text messages:

00002490             .ascii "Firmware loaded, sending ok back."<0>
000024B8            .ascii "Failed to retrieve email."<0>
000024D8            .ascii "Failed to retrieve codes."<0>
000024F8             .ascii "Gratz!"<0>
00002500             .ascii "Sorry may be next time..."<0>

Assuming that the load address equals 0x126F0-0x2490 == 0x10260, then all the arguments will indicate the lines when calling the library function, and the unknown function turns out to be the printf function (or puts).

After changing the load base, the code will look something like this:



The value of 0x0BA0BAB0, transmitted to the function sub_12194, can be found in the first part of the task, in the function fire_second_proc, and is compared with what we obtain from read_pipe_u32 (). Thus sub_12194 should be called write_pipe_u32.

Similarly, two calls of the library function sub_24064 are memset (someVar, 0, 0x101) for the email and code, while sub_121BC is read_pipe_str (), reversed write_pipe_str () from the first part.

The first function (at offset 0 or address 0x10260) has typical constants of MD5_Init:



 Next to the call to  MD5_Init, it is easy to detect the function MD5_Update () and MD5_Final (), preceded by the call to the library strlen ().



Not too many unknown functions are left in the start() function.



The sub_12480 function reverses the byte array of specified length. In fact, it’s memrev, which receives a code array input of 16 bytes.

Obviously, the sub_24040 function checks whether the code is correct. The arguments transfer the calculated value of MD5(email), the array filled in function sub_12394, and the number 16. It could be a call to memcmp!

The real trick is happening in sub_12394. There is almost no hints there, but the algorithm is described by one phrase—the multiplication of binary matrix of the 128 by the binary vector of 128. The matrix is stored in the firmware at 0x240B8.

Thus, the code is correct if MD5(email) == matrix_mul_vector (matrix, code).


Calculating the Key

To find the correct value of the code, you need to solve a system of binary equations described by the matrix, where the right-hand side are the relevant bits of the MD5(email). If you forgot linear algebra: this is easily solved by Gaussian elimination.

If the right-hand side of the key is known (32 hexadecimal characters), we can try to guess the first seven characters so that the CRC32 calculation result was equal to the value found for the key BTEA. There are about 1024 of such values, and they can be quickly obtained by brute-force, or by converting CRC32 and checking valid characters.

Now you need to put everything together and get the key that will pass all the checks and will be recognized as valid by our verifier :)
We were afraid that no one would be able to solve the task from the beginning to the end. Fortunately, Victor Alyushin showed that our fears were groundless. You can find his write-up on the task at http://nightsite.info/blog/16542-phdays-2015-best-reverser.html. This is the second time Victor Alyushin has won the contest (he was the winner in 2013 as well).

A participant who wished to remain anonymous solved a part of the task and took second place.

Thanks to all participants!

20 comments:

  1. Certainly a fantastic piece of work ... It has relevant information. Thanks for posting this.Your blog is so interesting and very informative.I was in search of the information that can win my heart and then I found your blog and got everything from here. Thanks

    Assignment writing service reviews

    ReplyDelete
  2. This is great piece of information that you have come up with and it is really well explained. Great post on the best reverser write up analyzing and this really touched my heart.Top rated essay writing service

    ReplyDelete
  3. I wondered upon your blog and wanted to say that I have really enjoyed reading your blog posts. Any way I’ll be subscribing to your feed and I hope you post again soon.
    Home Automation in Chennai
    smart home in Chennai
    Home security in Chennai
    Burglar alarm in Chennai
    Door sensors Chennai

    ReplyDelete
  4. Hi buddy, your blog' s design is simple and clean and i like it. Your blog posts about Online Dissertation Help are superb. Please keep them coming. Greets!!
    Do My C Project

    ReplyDelete
  5. My friend recommended this blog and he was totally right keep up the fantastic work!
    assigment help

    ReplyDelete
  6. This is really a great stuff for sharing. Keep it up .Thanks for sharing.
    MBA Report Writing

    ReplyDelete
  7. This is great information for students. This article is very helpful i really like this blog thanks. I also have some information relevant for online dissertation help.
    Content Creation Service

    ReplyDelete
  8. Get the dissertation writing service students look for these days with the prime focus being creating a well researched and lively content on any topic.
    java programming assignments

    ReplyDelete
  9. Well thanks for posting such an outstanding idea. I like this blog & I like the topic and thinking of making it right.
    Law Assignment Help

    ReplyDelete
  10. I appreciate your efforts in preparing this post. I really like your blog articles.
    Psychology Projects

    ReplyDelete
  11. This a good way to appreciate the teacher as they put their efforts to train students. UK dissertation Writers appreciates the teachers.
    Cold Fusion Assignment Help

    ReplyDelete
  12. Great Information,it has lot for stuff which is informative.I will share the post with my friends.
    Coursework Homework Help

    ReplyDelete
  13. Thank you for taking the time to provide us with your valuable information. We strive to provide our candidates with excellent care and we take your comments to heart.As always, we appreciate your confidence and trust in us
    Java Training in Chennai

    ReplyDelete
  14. If you need to hire a real hacker to help spy on your partner's cell phone remotely, change your grades or boost your credit score. Contact this helpline 347.857.7580 or the email address expressfoundations@gmail.com

    ReplyDelete
  15. Thank you for taking the time and sharing this information with us. It was indeed very helpful and insightful while being straight forward and to the point.
    mcdonaldsgutscheine | startlr | saludlimpia

    ReplyDelete
  16. These reverser write up analyze have shown all the relevant data within the write ups and have shared all the updated touching all the nook and corner of the writings. However for students with writing difficulty can visit buy essay online

    ReplyDelete
  17. thank you for publishing this. im also looking for a reverse write up code. is it work in the linux os or not? is anyone have any idea about this?

    ReplyDelete
  18. is the reverse writing code is working correctly? for me there is some issues and its not working properly. can any one help me to solve this?

    ReplyDelete
  19. thank you. actually this is what im seeking for. i was looking for a reverser write-up analyzing code. i appreciate you to post it here.

    ReplyDelete