Targeted Malware Reverse Engineering Workshop follow-up. Part 1
On April 8, 2021, we conducted a webinar with Ivan Kwiatkowski and Denis Legezo, Senior Security Researchers from our Global Research & Analysis Team (GReAT), who gave live workshops on practical disassembling, decrypting and deobfuscating authentic malware cases, moderated by GReAT’s own Dan Demeter.
Ivan demonstrated how to strip the obfuscation from the recently discovered Cycldek-related tool, while Denis presented an exercise on reversing the steganography algorithm of the MontysThree malware. The experts also had a fireside chat with our guest Igor Skochinsky of Hex-Rays.
On top of that, Ivan and Denis introduced the new Targeted Malware Reverse Engineering online self-study course, into which they have squeezed 10 years of their cybersecurity experience. This intermediate-level training is designed for those seeking confidence and practical experience in malware analysis. It includes in-depth analysis of ten fresh real-life targeted malware cases, like MontysThree, LuckyMouse and Lazarus; hands-on learning with an array of reverse engineering tools, including IDA Pro, the Hex-Rays decompiler, Hiew and 010 Editor; and 100 hours of virtual lab practice.
In case you missed the webinar – or if you attended but want to watch it again – you can find the video here: Targeted Malware Reverse Engineering Workshop (brighttalk.com).
We collected so many questions during the webinar (thank you all for your active participation!) that we lacked the time to answer them all online, so we promised to follow up with this blogpost.
Questions on the Cycldek-related tool analysis
- How do you decide whether the Cycldek-actors have adopted the DLL side-loading triad technique, or the actors normally using the DLL side-loading triad have adopted the design considerations from Cycldek?
Ivan: It is precisely because we cannot really differentiate between the two that we have been very careful with the attribution of this specific campaign. The best we can say at the moment is that the threat actor behind it is related to Cycldek.
Denis: Even in our training there is another track with .dll search order hijacking: LuckyMouse. I really would not recommend building attribution on such a technique, because it’s extremely widespread among the Chinese-speaking actors.
- Does the script work automatically, or do you have to add information about the specific code you are working with?
Ivan: The script shown in the webinar was written solely for the specific sample used in the demonstration. I prefer to write small programs addressing very specific issues at first, and only move on to developing generic frameworks when I have to, which is not the case for opaque predicates.
- Is the deobfuscation script for the shellcode publicly available?
Ivan: It is derived from a publicly available script. However, my modifications were not made public; if they were, it would make the training a little too easy, wouldn’t it?
- Decryption/deobfuscation seems to be very labor-intensive. Have you guys experimented with symbolic execution in order to automate the process? Have you built a framework that you use against multiple families and (data and code) obfuscation types, or do you build tools on an ‘as needed’ basis?
Ivan: I have always found it quicker to just write quick scripts to solve the problem instead of spending time on diving into symbolic execution. Same goes for generic frameworks, but who knows? Maybe one day I will need one.
Denis: Decryption/deobfuscation is mostly case-based, I agree, but we also have disassembler plugins to facilitate such tasks. By the way, such a code base and the habits built around it are what raise the barrier to switching disassemblers. We have an internal framework for asm-layer decryption, which you will encounter in the advanced course, but it’s up to the researcher whether to use it or not.
- Any insight into the success rate of this campaign?
Ivan: We were able to identify about a dozen organizations attacked during this campaign. If you want to know more about our findings, please have a look at our blogpost.
- Any hint on the code pattern that helped you connect this with the Cycldek campaign?
Ivan: You can find more about this in our blogpost. Even more details are available through our private reporting service. Generally speaking, we have a tool called KTAE that performs this task, and of course the memory of samples we have worked on in the past.
- About the jump instructions that lead to the same spot – how were they injected there? Manually using a binary editor?
Ivan: The opaque predicates added in the Cycldek shellcode were almost certainly inserted using an automated tool.
- I am one of the people using the assembly view. After the NOPing stage I usually have to suffer through long scrolling. You mentioned there is a way to fix this?
Ivan: Check out this script I published on GitHub a couple of months ago.
- Can xmm* registers and pxor be used as code patterns in Yara signatures?
Ivan: This is in fact one of the signatures I wrote for this piece of malware.
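To make the "jump instructions that lead to the same spot" from the questions above more concrete, here is a minimal Python sketch of one common form of opaque predicate and how it can be neutralized. It is not Ivan's script and says nothing definite about the actual Cycldek shellcode: it assumes the predicate is a short `jz`/`jnz` pair (opcodes 0x74 and 0x75) whose two targets coincide, and the name `patch_opaque_jumps` is hypothetical. A plain byte scan like this can also match operand or data bytes, so a real tool would walk the code with a disassembler instead.

```python
from typing import Tuple

JZ_SHORT, JNZ_SHORT, JMP_SHORT, NOP = 0x74, 0x75, 0xEB, 0x90


def _rel8(byte: int) -> int:
    """Interpret one byte as a signed 8-bit jump displacement."""
    return byte - 0x100 if byte >= 0x80 else byte


def patch_opaque_jumps(code: bytes) -> Tuple[bytes, int]:
    """Replace jz/jnz pairs that share one target with 'jmp target; nop; nop'.

    Assumption: both jumps use the 2-byte short encoding and sit back to back.
    Returns the patched bytes and the number of pairs rewritten.
    """
    out = bytearray(code)
    patched = 0
    i = 0
    while i + 4 <= len(out):
        if {out[i], out[i + 2]} == {JZ_SHORT, JNZ_SHORT}:
            # A short jump lands at (address of next instruction) + displacement.
            target_a = i + 2 + _rel8(out[i + 1])
            target_b = i + 4 + _rel8(out[i + 3])
            if target_a == target_b:
                # One of the two jumps is always taken, so the pair behaves
                # like an unconditional jump to the shared target.
                out[i] = JMP_SHORT      # keeps the first displacement byte
                out[i + 2] = NOP
                out[i + 3] = NOP
                patched += 1
                i += 4
                continue
        i += 1
    return bytes(out), patched
```

For example, the bytes `74 03 75 01` (both jumps landing at offset 5) become `EB 03 90 90`, which a disassembler then shows as a single unconditional jump.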
Questions on analysis of the MontysThree malware’s steganography algorithm
- Do you think there was a practical reason to use steganography as obfuscation, or did the malware developer do it just for fun?
Denis: In my experience, most steps the malefactors take are on purpose, not for fun. With steganography they are trying to fool the network security systems like IDS/IPS: bitmaps are not too suspicious for them. Let me also add that the campaign operators are human, too, so now and again there will be Easter eggs in their products. For example, take a look at the Topinambour track and the phrases used as decryption keys and beaconing.
- What image steganography algorithm have you seen hiding in the wild recently, other than LSB?
Denis: As far as I know, it is LSB all right: Microcin, MontysThree. I would expect some tools to be creating such images for the operators. But take a look at the function we ended on during the short workshop: depending on the decrypted steganography parameters, it could be not just LSB, but the “less significant half a byte” as well.
- Are there any recent malware samples incorporating network steganography in their C&C-channels, the way the DoublePulsar backdoor did using SMB back in 2017?
Denis: I suppose you mean the malformed SMB packets. Yes, the last trick of the kind I saw was the rare use of HTTP statuses as C2 commands. You might be surprised to learn how many of them there are in RFCs and how strange some of them are, like 418 “I’m a teapot”.
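The difference between plain LSB and the “less significant half a byte” variant mentioned above comes down to how many low bits of each carrier byte hold hidden data. The following Python sketch illustrates that idea only; it is not the MontysThree routine. The function name `extract_low_bits`, the most-significant-bit-first ordering and the absence of any decryption step are all assumptions made for illustration.

```python
def extract_low_bits(carrier: bytes, bits_per_byte: int = 1) -> bytes:
    """Rebuild hidden bytes from the low bits of each carrier byte.

    bits_per_byte=1 is classic LSB; bits_per_byte=4 takes the low half-byte.
    Assumption: hidden bits are stored most significant first.
    """
    if bits_per_byte not in (1, 2, 4, 8):
        raise ValueError("bits_per_byte must divide 8 evenly")
    mask = (1 << bits_per_byte) - 1
    hidden = bytearray()
    acc = 0
    nbits = 0
    for value in carrier:
        # Shift the accumulator and append the low bits of this carrier byte.
        acc = (acc << bits_per_byte) | (value & mask)
        nbits += bits_per_byte
        if nbits == 8:
            hidden.append(acc)
            acc = 0
            nbits = 0
    # Leftover bits that do not fill a whole byte are discarded.
    return bytes(hidden)
```

With one bit per byte, eight pixel bytes are needed for each hidden byte; with four bits per byte only two are needed, at the cost of a more visible change to the image.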
Reverse Engineering: how to start a career, working routines, the future of the profession
- How does one get into malware reverse engineering? What are the good resources to study? How can one find interesting malware samples?
Ivan: You can find a solid introduction at https://beginners.re/. Next, check out https://crackmes.one/ which contains many programs designed to be reverse-engineered, so one can finally move on to malware samples. Worry not about finding the “interesting” ones early on; just try to get good at it, document what you do, and in no time you will find yourself with access to all the data you could wish for.
Denis: Do you like meditating on the code and trying to understand it? Then I suppose you already have everything you need. I think you should not bother looking for interesting ones in the beginning (if I get your question right): everything will do. In my experience, the fun ones are written by professional programmers, not malware writers, because they just cannot do away with their habit of structuring the data and code, making it multi-thread safe, etc.
- Now that you are an experienced malware reverse engineer, where did you start from? Did you have a solid math/programming background from which you moved on to malware reverse engineering? Or what would be the typical path?
Ivan: I have a software engineering background, and my math expertise is shaky at best. After having met so many people in this field, I can say confidently that there is no typical path beyond being passionate about the subject.
Denis: Personally I have a math/programming background, but I couldn’t agree more: it’s more about passion than any scientific education.
- If you are reverse engineering malware, do you work as a team?
Ivan: While several researchers can investigate a campaign together, I usually work on samples alone. The time it takes to wrap up a case may vary between a week and several months, depending on the complexity of the investigation!
Denis: Reversing itself is not a task that is easy to distribute or parallelize. In my experience, you would spend more time organizing the process than you would gain from the work of several reversers. Typically, I do this part alone, but research is not limited to binary analysis: the quest, the sharing of previous experiences with the same malware/tools, and so forth is a team game.
- What do you think about AI? Would it help to automate the reverse engineering work?
Ivan: I think at the moment it is still a lot more A than I. I keep hearing sales pitches about how it will revolutionize the infosec industry and I do not want to dismiss them outright. I am sure there are a number of tasks, such as malware classification, where AI could be helpful. Let’s see what the future brings!
Denis: OK, do you use any AI-based code similarity, for example? I do, and you know, my impression so far is that we still need meat-based engineers who understand how it works to use it right.
- How helpful is static analysis, considering the multiple advanced sandboxing solutions available today?
Ivan: Sandboxing and static analysis will always serve complementary purposes. Static analysis is fast and does not require running the sample. It is great to quickly gather information about what a program might do or for triage. Dynamic analysis takes longer, yields more details, but gives malware an opportunity to detect the sandboxed environment. Then, at the very end, you do static analysis again, which involves reverse-engineering the program with a disassembler and takes the longest. All have their uses.
Denis: Sometimes you need static analysis because of the multiple advanced anti-sandboxing tricks out there. You also reveal far more details through static analysis if you want to create better Yara rules or distinguish a specific part of custom code to attribute samples to specific developers. So it is up to you how deep the rabbit hole should be.
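The quick triage that static analysis offers, as described above, can start with something as simple as pulling printable strings out of a sample without ever running it. The following Python sketch is an illustration only, not a tool used by the authors; the function name `printable_strings` and the six-character minimum are assumptions, and only ASCII strings are handled (UTF-16 strings, common in Windows binaries, would need a second pattern).

```python
import re
from typing import List


def printable_strings(data: bytes, min_len: int = 6) -> List[str]:
    """Return runs of printable ASCII characters at least min_len long."""
    # Bytes 0x20 through 0x7e cover space to tilde, the printable ASCII range.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]
```

Calling `printable_strings(open(path, "rb").read())` on a sample often surfaces URLs, file paths and imported API names that hint at what the program might do and that can seed Yara rules later.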
Tips on tools, IDA and other things
- Do you contribute to Lumina server? Does Kaspersky have any similar public servers to help us during our analysis?
Ivan: My understanding is that Lumina is most helpful when used by a critical mass of users. As such, I do not think it would make sense to fragment the community across multiple servers. If you are willing to share metadata about the programs you are working on with third parties, I would recommend simply going with a Hex-Rays instance.
Denis: No, I have not contributed to Lumina so far. I don’t think it is going to be too popular for threat intelligence, but let us wait and see: public Yara repositories are there, so maybe code snippets might also meet the community’s needs.
- What tools and techniques do you recommend for calculating the code similarity of samples? Is this possible with IDA Pro?
Ivan: For this, we have developed a commercial solution called KTAE. That’s what we regularly use internally.
Denis: Personally, I am using our KTAE. As far as I know, creating custom FLIRT signatures right in IDA could partially cover this need.
- Is there any specific reason why you are using IDA under Wine? Does it have anything to do with the type of samples you are analyzing?
Denis: Historically, I have had Windows IDA licenses and a Linux OS, so Wine is my way of running the disassembler. It does not affect your analysis anyway: choose any samples you want under any OS.
- What is your favorite IDA Pro plugin and why?
Ivan: One of the internal plugins developed by Kaspersky. Other than that, I use x64dbgida regularly and have heard great things about Labeless.
Denis: For sure our internal plugins. And it’s not because of the authorship; they just perfectly meet our needs.
- Do you have a plan to create/open an API so we can create our own processor modules for decompilers (like SLEIGH in Ghidra)? The goal being to analyze VM-based obfuscation.
Igor: Unlikely to happen in the near future, but that’s something we’re definitely keeping in mind.
If you have any more questions about Ivan’s workshop on the Cycldek-related tool or about the Targeted Malware Reverse Engineering online course, please feel free to drop us a line in the comments box below or contact us on Twitter: @JusticeRage, @legezo and @IgorSkochinsky. We will answer the rest of the questions in our next blogpost – stay tuned!