Comments (5)

re-fox commented on June 12, 2024

There are certainly pros and cons for function ID. Looking forward to a future where Capa has multiple backends, I'd guess that each disassembly library offers a different level of function ID support. Something like pure capstone has absolutely no idea, whereas IDA has a huge backing library of FLIRT signatures. Capa, however, is equipped to carry the weight of function ID where the disassembler is ignorant.

I would have no expectation for Capa/Feye/Community to maintain a large library of signatures, but for more common functions, it could be useful. Exhaustive support is where it becomes cumbersome.

One solution is to build support for known functions, where results are not rendered in the report if they match on a known function. One downside: if the known-function rule triggers on a false positive, any real capability in that function is hidden, which introduces a false negative.

Rendering the results and giving them a function/ prefix or namespace, like what @williballenthin was mentioning, is a similar option.
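A minimal sketch of the two options above, purely for illustration. The `Match` type, its `function` field, and the helper names are hypothetical and not part of Capa's API; the set of known-function addresses is assumed to come from some function ID step.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Match:
    rule: str       # rule name as shown in the report
    namespace: str  # rule namespace
    function: int   # address of the function the rule matched in (hypothetical field)


def suppress_known(matches, known_functions):
    """Option 1: do not render matches that fall inside a known function."""
    return [m for m in matches if m.function not in known_functions]


def prefix_known(matches, known_functions):
    """Option 2: keep every match, but move known-function hits under function/."""
    return [
        replace(m, namespace="function/" + m.namespace)
        if m.function in known_functions
        else m
        for m in matches
    ]


matches = [
    Match("encode data using XOR", "data-manipulation/encoding/xor", 0x401000),
    Match("terminate process", "host-interaction/process/terminate", 0x402000),
]
# Assumption: 0x402000 was identified as a known function by some function ID step.
known_functions = {0x402000}

visible = suppress_known(matches, known_functions)
tagged = prefix_known(matches, known_functions)
```

With the first option, a false positive in `known_functions` silently hides a real capability. With the second, the match stays in the report and only moves to a different namespace.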

Another idea could be to divide the rules into different categories (this could be managed in the rule meta):

  • Categorization (for signatures against certain malware families, highly specific)
  • Techniques (a majority of existing rules would sit in here)
  • Informational (metadata, language, compiler, etc.)
  • Library (known library functions - could be filtered out in output)

This is a hack of the existing namespacing that Capa already supports. Introducing malware categorization could be out of scope though. Yara may be better suited for that.
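The category split proposed above could be sketched roughly like this. The `category` meta key, the `Category` enum, and the sample rule entries are assumptions made for illustration; they are not an existing Capa rule field. Rules without a category default to "technique" here, on the assumption that most existing rules would land there.

```python
from enum import Enum


class Category(Enum):
    CATEGORIZATION = "categorization"  # signatures for specific malware families
    TECHNIQUE = "technique"            # most existing rules
    INFORMATIONAL = "informational"    # metadata, language, compiler
    LIBRARY = "library"                # known library functions


# Hypothetical rule records; only the parts relevant to filtering are shown.
rules = [
    {"name": "encode data using XOR", "meta": {"category": "technique"}},
    {"name": "contains PDB path", "meta": {"category": "informational"}},
    {"name": "__security_init_cookie", "meta": {"category": "library"}},
    {"name": "read file", "meta": {}},  # no category set, treated as a technique
]


def rule_category(rule):
    """Read the (hypothetical) category from the rule meta, defaulting to TECHNIQUE."""
    return Category(rule["meta"].get("category", Category.TECHNIQUE.value))


def visible_rules(rules, hidden=frozenset({Category.LIBRARY})):
    """Return the rules whose category is not filtered out of the output."""
    return [r for r in rules if rule_category(r) not in hidden]


shown = [r["name"] for r in visible_rules(rules)]
```

Passing a different `hidden` set, for example one that also contains `Category.INFORMATIONAL`, would trim the report further without touching the rules themselves.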

With that said, I do think there would be some utility in having support for the “most common” (that could be up for debate) functions. Something like __security_init_cookie comes to mind. I could be wrong, but I can't imagine that those are updated on an aggressive cadence.

I'm of the mindset that rules and scanning are "cheap" and the more context the better.

williballenthin commented on June 12, 2024

this is closed in #446

mr-tz commented on June 12, 2024

Some form of library detection in the backend often provides results that are much easier to process.

In the example below (43c486...), 20 capabilities are recognized in functions that IDA marks as library code.

vivisect (no library detection)

+------------------------------------------------------+------------------------------------------------------+
| CAPABILITY                                           | NAMESPACE                                            |
|------------------------------------------------------+------------------------------------------------------|
| execute anti-VM instructions (2 matches)             | anti-analysis/anti-vm/vm-detection                   |
| encode data using XOR                                | data-manipulation/encoding/xor                       |
| hash data using FNV (2 matches)                      | data-manipulation/hashing/fnv                        |
| contains PDB path                                    | executable/pe/pdb                                    |
| contain a thread local storage (.tls) section        | executable/pe/section/tls                            |
| accept command line arguments                        | host-interaction/cli                                 |
| query environment variable (2 matches)               | host-interaction/environment-variable                |
| set environment variable (4 matches)                 | host-interaction/environment-variable                |
| enumerate files via kernel32 functions (2 matches)   | host-interaction/file-system/files/list              |
| read file (4 matches)                                | host-interaction/file-system/read                    |
| write file (8 matches)                               | host-interaction/file-system/write                   |
| print debug messages (5 matches)                     | host-interaction/log/debug/write-event               |
| allocate thread local storage (3 matches)            | host-interaction/process                             |
| get process heap flags                               | host-interaction/process                             |
| get thread local storage value (3 matches)           | host-interaction/process                             |
| set thread local storage value (3 matches)           | host-interaction/process                             |
| terminate process (3 matches)                        | host-interaction/process/terminate                   |
| terminate process via fastfail (6 matches)           | host-interaction/process/terminate                   |
| create thread (3 matches)                            | host-interaction/thread/create                       |
| access PEB ldr_data                                  | linking/runtime-linking                              |
| link function at runtime (3 matches)                 | linking/runtime-linking                              |
| link many functions at runtime                       | linking/runtime-linking                              |
| linked against C++ regex library                     | linking/static/cppregex                              |
| parse PE header (5 matches)                          | load-code/pe                                         |
+------------------------------------------------------+------------------------------------------------------+

IDA Pro (ignores FLIRT-identified functions)

[image: 2020-12-01_16-03-47]

mr-tz commented on June 12, 2024

see library function ID efforts, especially #433 and #435

mr-tz commented on June 12, 2024

AWESOME!
