
binary2strings's People

Contributors

glmcdona


binary2strings's Issues

Unable to install

Hello,

When trying to install binary2strings via pip, I get the following error:

$ pip install binary2strings

Collecting binary2strings
  Downloading binary2strings-0.1.5.tar.gz (4.4 kB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [6 lines of output]
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-drqgf3ft/binary2strings_a67e0f0eeb3f43679878c54527d48f08/setup.py", line 99, in <module>
          with open("readme.md", "r") as fh:
      FileNotFoundError: [Errno 2] No such file or directory: 'readme.md'
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

I forked this repo and changed readme.md to README.md, but now I get the following error when attempting to install via setup.py:

$ python3 setup.py install

/home/user/comp_id_poc/binary2strings/setup.py:9: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives
  from distutils.version import LooseVersion
running install
/home/user/comp_id_poc/poc/lib/python3.10/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
/home/user/comp_id_poc/poc/lib/python3.10/site-packages/setuptools/command/easy_install.py:158: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
running bdist_egg
running egg_info
writing src/binary2strings.egg-info/PKG-INFO
writing dependency_links to src/binary2strings.egg-info/dependency_links.txt
writing top-level names to src/binary2strings.egg-info/top_level.txt
reading manifest file 'src/binary2strings.egg-info/SOURCES.txt'
adding license file 'LICENSE'
writing manifest file 'src/binary2strings.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
running build_ext
-- The C compiler identification is GNU 11.2.0
-- The CXX compiler identification is GNU 11.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- pybind11 v2.9.2 
-- Found PythonInterp: /home/user/comp_id_poc/poc/bin/python3 (found version "3.10.6") 
-- Found PythonLibs: /usr/lib/x86_64-linux-gnu/libpython3.10.so
-- Performing Test HAS_FLTO
-- Performing Test HAS_FLTO - Success
-- Configuring done
-- Generating done
-- Build files have been written to: /home/user/comp_id_poc/binary2strings/build/temp.linux-x86_64-3.10
[ 50%] Building CXX object CMakeFiles/binary2strings.dir/src/binary2strings/extracted_string.cpp.o
[ 50%] Building CXX object CMakeFiles/binary2strings.dir/src/binary2strings/binary2strings.cpp.o
/home/user/comp_id_poc/binary2strings/src/binary2strings/extracted_string.cpp:6:6: error: ‘wstring_convert’ in namespace ‘std’ does not name a template type
    6 | std::wstring_convert<std::codecvt_utf8<wchar_t>> _converter;
      |      ^~~~~~~~~~~~~~~
/home/user/comp_id_poc/binary2strings/src/binary2strings/extracted_string.cpp: In constructor ‘extracted_string::extracted_string(const wchar_t*, size_t, STRING_TYPE, int, int)’:
/home/user/comp_id_poc/binary2strings/src/binary2strings/extracted_string.cpp:31:20: error: ‘_converter’ was not declared in this scope
   31 |         m_string = _converter.to_bytes(string, string + size_in_bytes / 2);
      |                    ^~~~~~~~~~
/home/user/comp_id_poc/binary2strings/src/binary2strings/extracted_string.cpp: In member function ‘float extracted_string::get_proba_interesting()’:
/home/user/comp_id_poc/binary2strings/src/binary2strings/extracted_string.cpp:91:31: error: ‘exp’ was not declared in this scope
   91 |         return 1.0f / (1.0f + exp(-score));
      |                               ^~~
/home/user/comp_id_poc/binary2strings/src/binary2strings/binary2strings.cpp:100:11: error: expected constructor, destructor, or type conversion before ‘(’ token
  100 | __declspec(safebuffers) extracted_string* try_extract_string(const unsigned char* buffer, size_t buffer_size, long offset, size_t min_chars)
      |           ^
gmake[2]: *** [CMakeFiles/binary2strings.dir/build.make:90: CMakeFiles/binary2strings.dir/src/binary2strings/extracted_string.cpp.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
gmake[2]: *** [CMakeFiles/binary2strings.dir/build.make:76: CMakeFiles/binary2strings.dir/src/binary2strings/binary2strings.cpp.o] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:100: CMakeFiles/binary2strings.dir/all] Error 2
gmake: *** [Makefile:91: all] Error 2
Traceback (most recent call last):
  File "/home/user/comp_id_poc/binary2strings/setup.py", line 101, in <module>
    setup(
  File "/home/user/comp_id_poc/poc/lib/python3.10/site-packages/setuptools/__init__.py", line 153, in setup
    return distutils.core.setup(**attrs)
  File "/usr/lib/python3.10/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/usr/lib/python3.10/distutils/dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "/usr/lib/python3.10/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/home/user/comp_id_poc/poc/lib/python3.10/site-packages/setuptools/command/install.py", line 74, in run
    self.do_egg_install()
  File "/home/user/comp_id_poc/poc/lib/python3.10/site-packages/setuptools/command/install.py", line 116, in do_egg_install
    self.run_command('bdist_egg')
  File "/usr/lib/python3.10/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/usr/lib/python3.10/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/home/user/comp_id_poc/poc/lib/python3.10/site-packages/setuptools/command/bdist_egg.py", line 164, in run
    cmd = self.call_command('install_lib', warn_dir=0)
  File "/home/user/comp_id_poc/poc/lib/python3.10/site-packages/setuptools/command/bdist_egg.py", line 150, in call_command
    self.run_command(cmdname)
  File "/usr/lib/python3.10/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/usr/lib/python3.10/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/home/user/comp_id_poc/poc/lib/python3.10/site-packages/setuptools/command/install_lib.py", line 23, in run
    self.build()
  File "/usr/lib/python3.10/distutils/command/install_lib.py", line 109, in build
    self.run_command('build_ext')
  File "/usr/lib/python3.10/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/usr/lib/python3.10/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/home/user/comp_id_poc/binary2strings/setup.py", line 38, in run
    self.build_extension(ext)
  File "/home/user/comp_id_poc/binary2strings/setup.py", line 68, in build_extension
    subprocess.check_call(['cmake', '--build', '.'] + build_args,
  File "/usr/lib/python3.10/subprocess.py", line 369, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '--build', '.', '--config', 'Release', '--', '-j2']' returned non-zero exit status 2.

This is a bit above my skill level at this point. Any ideas? Thanks!
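The first traceback above shows `setup.py` opening `readme.md` by its exact lowercase name. The log does not say whether the file was shipped under a different case or left out of the sdist entirely, so the following is only a sketch of a more tolerant lookup. The helper names `find_readme` and `read_long_description` are hypothetical and are not taken from the project's `setup.py`.

```python
from pathlib import Path


def find_readme(project_dir):
    """Return the first file named readme.md in any letter case, or None."""
    for entry in sorted(Path(project_dir).iterdir()):
        if entry.is_file() and entry.name.lower() == "readme.md":
            return entry
    return None


def read_long_description(project_dir="."):
    """Read the README as UTF-8, or return an empty string if it is missing."""
    readme = find_readme(project_dir)
    if readme is None:
        # Assumption: a missing README should not abort metadata generation.
        return ""
    return readme.read_text(encoding="utf-8")
```

Falling back to an empty description keeps `egg_info` from failing when the README is absent from the source archive, at the cost of an empty project description.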

pip install failed on Ubuntu 22.04.2 LTS / Debian 12.0

Since the 0.1.10 release, installs have been failing when I run pip3 install. The initial failures were on an Ubuntu 22.04.2 LTS box with Python 3.10.6 / pip3 23.1.2 installed within a venv. I've since replicated it on Debian 12 with Python 3.11.2 and on Kali 2023.2, also with Python 3.11.2 / pip3 23.0.1. The build fails both within a venv and with a user install, with the following trace:

$ python3 -m venv test       
$ source test/bin/activate
                                                                                                                     
┌──(test)
└─$ pip3 install binary2strings
Collecting binary2strings
  Downloading binary2strings-0.1.10.tar.gz (9.3 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: binary2strings
  Building wheel for binary2strings (pyproject.toml) ... error
  error: subprocess-exited-with-error
  
  × Building wheel for binary2strings (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [14 lines of output]
      running bdist_wheel
      running build
      running build_ext
      x86_64-linux-gnu-gcc -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -I/home/rokkr/Scripts/test-binary-strings/test/include -I/usr/include/python3.11 -c flagcheck.cpp -o flagcheck.o -std=c++17
      building 'binary2strings' extension
      creating build
      creating build/temp.linux-x86_64-cpython-311
      creating build/temp.linux-x86_64-cpython-311/src
      x86_64-linux-gnu-gcc -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -DVERSION_INFO=0.1.10 -Isrc -I/tmp/pip-build-env-fv5_42_5/overlay/lib/python3.11/site-packages/pybind11/include -I/home/rokkr/Scripts/test-binary-strings/test/include -I/usr/include/python3.11 -c src/binary2strings.cpp -o build/temp.linux-x86_64-cpython-311/src/binary2strings.o -std=c++17 -fvisibility=hidden -g0
      src/binary2strings.cpp:1:10: fatal error: binary2strings.hpp: No such file or directory
          1 | #include "binary2strings.hpp"
            |          ^~~~~~~~~~~~~~~~~~~~
      compilation terminated.
      error: command '/usr/bin/x86_64-linux-gnu-gcc' failed with exit code 1
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for binary2strings
Failed to build binary2strings
ERROR: Could not build wheels for binary2strings, which is required to install pyproject.toml-based projects
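The compiler output above stops at a missing `binary2strings.hpp`, which would be consistent with the header not being included in the source archive, although the log alone does not prove that. One way to check is to list the header files inside the downloaded sdist. The sketch below uses only the standard `tarfile` module; the function name `headers_in_sdist` is hypothetical.

```python
import tarfile


def headers_in_sdist(sdist_path, suffixes=(".h", ".hpp")):
    """List archive members whose names end with a C/C++ header suffix."""
    with tarfile.open(sdist_path, "r:gz") as archive:
        return [name for name in archive.getnames() if name.endswith(suffixes)]
```

An empty result for the 0.1.10 tarball would suggest the headers were never packaged, which in setuptools projects is usually addressed through the manifest or the extension's declared dependencies.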

Windows VS2022 installation

Hi, @glmcdona

I find this repo interesting and would like to try it as an alternative to strings.exe on Windows.

1. pip failed

  • System: Windows 10
  • Python: Python 3.10, installed via miniconda
  • C++ Compiler / IDE: I have Visual Studio 2019, 2022 both installed.

The output complains that cl.exe failed.

2. C++ source modification suggestion

https://github.com/glmcdona/binary2strings/blob/main/src/extracted_string.cpp#L11

Line 11 in extracted_string.cpp converts NULL to std::string. However, this causes a crash, as you can see by trying this minimal example on linux-x64 with clang:

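#include <cstddef>
#include <iostream>
#include <string>
int main()
{
    // Constructing std::string from a null pointer is undefined behavior;
    // in practice this line throws or crashes at runtime.
    std::string m = (std::string)NULL;
    std::cout << "m is " << m << std::endl;
    return 0;
}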

3. Build source and install on Windows

OK, based on the above two considerations:

  • I commented out line 11 in extracted_string.cpp
  • In a plain cmd.exe window, python setup.py install still fails
  • In the VS2022 x64 native command prompt, python setup.py install succeeds.

I would suggest fixing the pip install binary2strings experience on Windows. Thank you.
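The difference between the two prompts above is plausibly that the VS2022 native prompt puts the MSVC tools on `PATH`, while a plain cmd.exe does not. A small sketch for checking this before building, using the standard `shutil.which`; the helper name `tool_on_path` is hypothetical:

```python
import shutil


def tool_on_path(name="cl"):
    """Return True if an executable with this name is found on PATH."""
    return shutil.which(name) is not None


if __name__ == "__main__":
    # Assumption: the build needs cl.exe to be reachable from the current shell.
    print("cl found on PATH:", tool_on_path("cl"))
```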

Crashes And Recognized

My parameter settings are min_chars = 4 and only_interesting = false.

There are two issues.

First, when I input a binary string of 0000000000, it crashes on "proba_interesting_avg = proba_interesting_avg_vect[i];".

Second, when I set the buffer very short, such as "ok" which only has 2 characters, it cannot be recognized.

Installation failed on Win 10 VS 2019 Py 3.8.3

Hi. I just tried to install and it failed via both the pip route and the setup.py route. Python 3.8.3, Win 10.

The setup.py install log from the VS2019 Developer Command Prompt is attached. This used a direct download of the main branch.

Developer Command Prompt for VS 2019.txt

I spotted the releases area for v0.1.6 and tried that as well with the same setup. It also failed, but I figured out that it isn't happy with my 32-bit Python.
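For anyone hitting the same problem, the interpreter's bitness can be confirmed from Python itself. This is a generic standard-library check, not something specific to binary2strings; the function name `python_bitness` is hypothetical.

```python
import struct
import sys


def python_bitness():
    """Return the pointer width of the running interpreter in bits."""
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}, "
          f"{python_bitness()}-bit")
```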

python 3.12/11 installation

When trying to pip install flare-floss in Python 3.12 (Windows 11), I get the following error:

PS C:\Users\user> py -m pip install flare-floss
Collecting flare-floss
  Using cached flare_floss-3.0.1-py3-none-any.whl.metadata (5.5 kB)
Collecting tabulate==0.9.0 (from flare-floss)
  Using cached tabulate-0.9.0-py3-none-any.whl.metadata (34 kB)
Collecting vivisect==1.1.1 (from flare-floss)
  Using cached vivisect-1.1.1-py3-none-any.whl.metadata (2.8 kB)
Collecting viv-utils==0.7.9 (from viv-utils[flirt]==0.7.9->flare-floss)
  Using cached viv_utils-0.7.9-py2.py3-none-any.whl.metadata (1.2 kB)
Collecting pydantic==1.10.9 (from flare-floss)
  Using cached pydantic-1.10.9-py3-none-any.whl.metadata (147 kB)
Collecting tqdm==4.65.0 (from flare-floss)
  Using cached tqdm-4.65.0-py3-none-any.whl.metadata (56 kB)
Collecting networkx==3.1 (from flare-floss)
  Using cached networkx-3.1-py3-none-any.whl.metadata (5.3 kB)
Collecting halo==0.0.31 (from flare-floss)
  Using cached halo-0.0.31.tar.gz (11 kB)
  Preparing metadata (setup.py) ... done
Collecting rich==13.4.2 (from flare-floss)
  Using cached rich-13.4.2-py3-none-any.whl.metadata (18 kB)
Requirement already satisfied: pefile>=2022.5.30 in c:\users\user\appdata\local\programs\python\python312\lib\site-packages (from flare-floss) (2023.2.7)
Collecting binary2strings==0.1.13 (from flare-floss)
  Using cached binary2strings-0.1.13.tar.gz (59 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [21 lines of output]
      Traceback (most recent call last):
        File "C:\Users\user\AppData\Local\Programs\Python\Python312\Lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 353, in <module>
          main()
        File "C:\Users\user\AppData\Local\Programs\Python\Python312\Lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "C:\Users\user\AppData\Local\Programs\Python\Python312\Lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 118, in get_requires_for_build_wheel
          return hook(config_settings)
                 ^^^^^^^^^^^^^^^^^^^^^
        File "C:\Users\user\AppData\Local\Temp\pip-build-env-_0up553l\overlay\Lib\site-packages\setuptools\build_meta.py", line 325, in get_requires_for_build_wheel
          return self._get_build_requires(config_settings, requirements=['wheel'])
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "C:\Users\user\AppData\Local\Temp\pip-build-env-_0up553l\overlay\Lib\site-packages\setuptools\build_meta.py", line 295, in _get_build_requires
          self.run_setup()
        File "C:\Users\user\AppData\Local\Temp\pip-build-env-_0up553l\overlay\Lib\site-packages\setuptools\build_meta.py", line 311, in run_setup
          exec(code, locals())
        File "<string>", line 30, in <module>
        File "C:\Users\user\AppData\Local\Programs\Python\Python312\Lib\encodings\cp1255.py", line 23, in decode
          return codecs.charmap_decode(input,self.errors,decoding_table)[0]
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      UnicodeDecodeError: 'charmap' codec can't decode byte 0x8c in position 2769: character maps to <undefined>
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

The error originates from binary2strings, which is a dependency of the flare-floss package.
I tried pip install binary2strings directly and got the same error.

I tried it on two computers, with Windows 11 and Windows 10, on both Python 3.12 and Python 3.11, with the same result.
I used cmd, PowerShell, and the PyCharm terminal (also PowerShell, in a venv).

I prefer not to downgrade further, so I'm hoping someone has a solution.
Thanks in advance!
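The traceback above ends in `cp1255.py`, which indicates that line 30 of `setup.py` read a file using the Windows locale's default code page rather than UTF-8, and that the file contains a byte (0x8c) that cp1255 cannot map. A minimal sketch of the usual remedy, which is to pass the encoding explicitly; the helper name `read_text_utf8` is hypothetical and not taken from the project:

```python
def read_text_utf8(path):
    """Read a text file as UTF-8 regardless of the system locale."""
    with open(path, "r", encoding="utf-8") as fh:
        return fh.read()
```

Until the package itself is changed, setting the environment variable `PYTHONUTF8=1` before running pip enables Python's UTF-8 mode and may work around the error, since it changes the default encoding used by `open()`.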

Too many junk strings

First of all this is great. A problem I've experimented with and wanted to solve for a while.
So thanks a lot for your work.

I do notice there are still a lot of junk strings that get extracted.
If I do an extra filtering pass I fix a lot of them:

  1. Decode all strings to ASCII in Python. Wrapping the decode in a bare except (or, for purists, one specific to the decoding error) catches and thus ignores every string with chars that are >127. This filters out many bad ones, although obviously it is not so good if you actually want strings in languages other than English.
  2. Filter out all strings that are a run of the same character.
  3. Filter all that don't have at least one English vowel (AEIOU upper and lower case). But like #1, not ideal for non-English strings.
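The three passes listed above could be sketched roughly as follows. This is an illustrative post-filter, not part of binary2strings. The helper names `looks_like_english` and `filter_strings` are hypothetical, and the sketch assumes the extracted strings are already Python `str` objects, so the ASCII check is done by encoding rather than decoding.

```python
VOWELS = set("aeiouAEIOU")


def looks_like_english(s):
    """Apply the three heuristic passes; return True if the string survives."""
    # Pass 1: reject anything containing characters above code point 127.
    try:
        s.encode("ascii")
    except UnicodeEncodeError:
        return False
    # Pass 2: reject runs of a single repeated character (and empty strings).
    if len(set(s)) <= 1:
        return False
    # Pass 3: require at least one English vowel.
    return any(ch in VOWELS for ch in s)


def filter_strings(strings):
    """Keep only the strings that pass all three heuristics."""
    return [s for s in strings if looks_like_english(s)]
```

As noted in the list, passes 1 and 3 will discard legitimate non-English strings, so this only makes sense when English output is all that is wanted.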

Even after this additional filtering, several bogus strings remain.
Q: Does the model need a larger corpus of bad vs. good strings to train on?

I know this is a pretty loaded subject. While English is relatively easy because it has only ~127 codepoints, it's another story to try to encapsulate the UTF ranges.
I might have a better solution that I've experimented with, based on someone else's work that has largely gone unnoticed.
It's based on a statistical DB of n-grams, built from a huge corpus collected over several languages and character sets.
It's a mostly pre-ML solution that a lot of commercial text extraction, browser language detection, etc., use.
ML could perhaps be applied to it as well, for a similar or hybrid solution.

This would add to your tool multi-character set detection (not just UTF), better initial filtering (a lot of bogus strings will have a low statistical matching score). At the same time it could detect what languages are being used.
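To make the n-gram idea concrete, here is a toy sketch of statistical scoring with character bigrams. It is not the database referred to above, and the tiny training corpus, the function names `train_bigrams` and `bigram_score`, and the add-one smoothing are all assumptions chosen for illustration. A real system would use per-language tables built from a very large corpus.

```python
import math
from collections import Counter


def train_bigrams(corpus):
    """Count lowercase character bigrams across a list of sample texts."""
    counts = Counter()
    for text in corpus:
        t = text.lower()
        counts.update(t[i:i + 2] for i in range(len(t) - 1))
    return counts


def bigram_score(s, counts, alphabet_size=128):
    """Average log-probability of the string's bigrams (higher is more text-like)."""
    t = s.lower()
    grams = [t[i:i + 2] for i in range(len(t) - 1)]
    if not grams:
        return float("-inf")
    total = sum(counts.values())
    vocab = alphabet_size ** 2  # assumed number of possible bigrams
    # Add-one smoothing so unseen bigrams get a small nonzero probability.
    return sum(math.log((counts[g] + 1) / (total + vocab)) for g in grams) / len(grams)
```

Strings whose score falls below some threshold would be dropped as likely junk; training one table per language and comparing scores is one way such a scheme could also guess the language.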
