Comments (11)
Hi!
Thanks for reporting the issue!
Which version of MSVC are you using?
I couldn't reproduce your issue, but I only tested with recent versions of MSVC.
However, while troubleshooting the problem, I found a few lines of code where path encoding wasn't correctly handled.
It might be related to your issue.
I just pushed a commit to the develop branch that should fix those lines (d789fa6).
Hopefully, it should fix your issue, too.
from bit7z.
Appreciate the quick response! I'm using VS 2022.
Just tried the new code, still getting the same crash with BitFileCompressor compress().
BitFileExtractor extract() can't handle a UTF-8 character in the zip filename either. That is the more critical one for me at the moment, because I can work around the BitFileCompressor side if I need to by creating a zip file with a temporary filename and then renaming it outside of bit7z. But trying to do something similar on the extract side is much more problematic (I might not have permission to rename the zip file, etc.).
This is where it is crashing when I try extract():
My build settings:
C:\bit7z\build>cmake ../ -DCMAKE_BUILD_TYPE=Debug
-- Selecting Windows SDK version 10.0.19041.0 to target Windows 10.0.19045.
-- Standard filesystem: YES
-- Target Version: 4.0.0
-- Compiler ID: MSVC
-- Compiler Version: 19.35.32215.0
-- Architecture: x64
-- Build Type: Debug
-- Language Standard for bit7z: C++17
-- Auto format detection: OFF
-- Regex matching extraction: OFF
-- Use std::byte: OFF
-- Use native string: OFF
-- Generate Position Independent Code: OFF
-- Disable Zip ASCII password check: OFF
-- 7-zip version: 23.01
-- Build tests: OFF
-- Build docs: OFF
-- Auto prefix long paths: OFF
-- Use the default codepage: OFF
-- Path sanitization: OFF
-- Code analysis: OFF
-- Static runtime: OFF
-- CPM: Adding package 7-zip@23.01 (v23.01)
-- 7-zip source code available at C:/bit7z/build/_deps/7-zip-src
-- Configuring done
-- Generating done
-- Build files have been written to: C:/bit7z/build
I'm using VS 2022.
Uhm, this is really strange, as I can't replicate the issue...
Just tried the new code, still getting the same crash with BitFileCompressor compress().
I just pushed some other minor fixes that should make the encoding handling even more robust.
However, at this point, I don't think the problem is in bit7z...
BitFileExtractor extract() can't handle a UTF-8 character in the zip filename either. That is the more critical one for me at the moment, because I can work around the BitFileCompressor side if I need to by creating a zip file with a temporary filename and then renaming it outside of bit7z. But trying to do something similar on the extract side is much more problematic (I might not have permission to rename the zip file, etc.).
I see. Do you have this issue only with zip files or also with other formats like 7z?
Interesting! Bit7z internally uses MSVC's implementation of std::filesystem. In this particular case, it uses the function std::filesystem::u8path, as it expects to construct a path from a UTF-8 string.
Now, MSVC's fs::path internally uses wide strings, so it needs to convert a UTF-8 string to a wide string.
The error message "No mapping for the Unicode character exists in the target multi-byte code page." makes me think that the original string you pass to bit7z is not actually UTF-8 encoded and hence contains some invalid UTF-8 sequences, causing the exception.
One possible way to avoid the throwing of the exception would be to use bit7z's own string conversion functions like this:
return fs::path{ bit7z::widen( str ) }; // instead of fs::u8path
However, if the original string is not UTF-8 encoded, this might not result in a correct path encoding.
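One way to test the suspicion that the string contains invalid UTF-8 sequences is to validate the bytes before handing them to bit7z. The following is a minimal sketch, not part of bit7z: is_valid_utf8 is a hypothetical helper name, and the check is a plain, portable well-formedness test (lead and continuation bytes, overlong forms, surrogates, range), not what MSVC does internally.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a bit7z API): returns true if `s` is well-formed UTF-8.
bool is_valid_utf8( const std::string& s ) {
    const auto* p = reinterpret_cast< const unsigned char* >( s.data() );
    const std::size_t n = s.size();
    // Smallest code point that legitimately needs a sequence of the given length.
    static const unsigned long min_cp[] = { 0, 0, 0x80, 0x800, 0x10000 };

    std::size_t i = 0;
    while ( i < n ) {
        const unsigned char c = p[ i ];
        std::size_t len = 0;
        unsigned long cp = 0;
        if ( c < 0x80 ) { len = 1; cp = c; }
        else if ( ( c & 0xE0 ) == 0xC0 ) { len = 2; cp = c & 0x1F; }
        else if ( ( c & 0xF0 ) == 0xE0 ) { len = 3; cp = c & 0x0F; }
        else if ( ( c & 0xF8 ) == 0xF0 ) { len = 4; cp = c & 0x07; }
        else { return false; } // stray continuation byte or invalid lead byte (e.g. 0xF9)

        if ( i + len > n ) { return false; } // sequence truncated at end of string

        for ( std::size_t k = 1; k < len; ++k ) {
            const unsigned char cc = p[ i + k ];
            if ( ( cc & 0xC0 ) != 0x80 ) { return false; } // not a continuation byte
            cp = ( cp << 6 ) | ( cc & 0x3F );
        }

        if ( len > 1 && cp < min_cp[ len ] ) { return false; }  // overlong encoding
        if ( cp >= 0xD800 && cp <= 0xDFFF ) { return false; }   // UTF-16 surrogate
        if ( cp > 0x10FFFF ) { return false; }                  // beyond Unicode range
        i += len;
    }
    return true;
}
```

Calling is_valid_utf8 on the archive path right before compress() or extract() would show whether the bytes reaching bit7z are really UTF-8. A path such as "SottoMen\xC3\xB9.zip" passes the check, while the single-byte form "SottoMen\xF9.zip" does not.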
Anyway, thank you for all the details you provided! 🙏
My zip filename seems to be properly UTF-8 encoded (based on the fact that it appears correctly in the debugger's watch window when I use the ",s8" modifier).
I see that you have some automated tests for archived files whose names contain Unicode characters, but are there any tests for the actual archive filename itself? (for Windows)
My zip filename seems to be properly UTF-8 encoded (based on the fact that it appears correctly in the debugger's watch window when I use the ",s8" modifier).
Uhm ok, strange! I'll need to investigate this further, although without replicating it, it's a bit difficult to understand what the actual problem is!
I see that you have some automated tests for archived files whose names contain Unicode characters, but are there any tests for the actual archive filename itself? (for Windows)
Actually, yeah, here and here the automated tests do exactly that: reading an archive with Unicode characters in its name!
Uhm ok, strange! I'll need to investigate this further, although without replicating it, it's a bit difficult to understand what the actual problem is!
It's easy to replicate. Just use a UTF-8 encoded std::string for the archive filename, in Windows.
It's easy to replicate. Just use a UTF-8 encoded std::string for the archive filename, in Windows.
Yeah, the problem is that when I try to replicate it, it works!
Just for context, I'm testing everything in a clean installation of Windows 10 inside a VM with the default regional and language settings:
The only things installed are 7-zip, Visual Studio 2022, CMake, and Git.
I cloned bit7z in the Documents folder, and created a CMake project in a separate folder called bit7z-test:
cmake_minimum_required( VERSION 3.8.2 FATAL_ERROR )
project( bit7ztest )
set( CMAKE_CXX_STANDARD 11 )
set( CMAKE_CXX_STANDARD_REQUIRED ON )
add_executable( bit7ztest main.cpp )
add_subdirectory( ${PROJECT_SOURCE_DIR}/../bit7z/ ${CMAKE_CURRENT_BINARY_DIR}/bit7z )
target_link_libraries( bit7ztest PRIVATE bit7z )
target_compile_options( bit7ztest PRIVATE /utf-8 )
The test files (to be compressed) are in the test_unicode folder in the project's folder:
I tested the compression, and it works just fine:
#include <iostream>
#include <direct.h>   // _chdir
#include <windows.h>  // GetACP
#include <bit7z/bit7zlibrary.hpp>
#include <bit7z/bitexception.hpp>
#include <bit7z/bitfilecompressor.hpp>
using namespace bit7z;
auto main() -> int {
    // Print the active ANSI code page, to rule out an unusual system locale.
    std::cout << "Current Code Page: " << GetACP() << std::endl;
    _chdir(R"(C:\Users\Riccardo\Documents\bit7z-test\)");
    try {
        Bit7zLibrary lib{ R"(C:\Program Files\7-Zip\7z.dll)" };
        BitFileCompressor compressor{ lib, BitFormat::Zip };
        // The archive name is a UTF-8 literal containing a non-ASCII character.
        compressor.compressDirectory( "test_unicode/", u8"SottoMenù.zip" );
    } catch (const bit7z::BitException& ex) {
        std::cerr << ex.what() << std::endl;
    }
    return 0;
}
I tested extraction too, from the same archive created in the previous test:
#include <iostream>
#include <direct.h>   // _chdir
#include <windows.h>  // GetACP
#include <bit7z/bit7zlibrary.hpp>
#include <bit7z/bitexception.hpp>
#include <bit7z/bitfileextractor.hpp>
using namespace bit7z;
auto main() -> int {
    // Print the active ANSI code page, to rule out an unusual system locale.
    std::cout << "Current Code Page: " << GetACP() << std::endl;
    _chdir(R"(C:\Users\Riccardo\Documents\bit7z-test\)");
    try {
        Bit7zLibrary lib{ R"(C:\Program Files\7-Zip\7z.dll)" };
        BitFileExtractor extractor{ lib, BitFormat::Zip };
        // Open the archive through its UTF-8 encoded name.
        extractor.extract( u8"SottoMenù.zip", "test_extract/" );
    } catch (const bit7z::BitException& ex) {
        std::cerr << ex.what() << std::endl;
    }
    return 0;
}
Again, it worked without issues:
All the tests were performed using the develop branch of bit7z.
The only way I found to replicate your issue is by forcing the encoding of the string literal to be Windows-1252:
compressor.compressDirectory( "test_unicode/", "SottoMen\xF9.zip" );
But we already confirmed that you're using UTF-8 strings, so I'll need to investigate the issue further!
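To make the difference between the two literals concrete: in Windows-1252, "ù" is the single byte 0xF9, while in UTF-8 it is the two bytes 0xC3 0xB9. The sketch below re-encodes such a string as UTF-8. It is only an illustration under stated assumptions: latin1_to_utf8 is a hypothetical helper, not a bit7z function, and it treats the input as ISO-8859-1, which matches Windows-1252 for bytes 0xA0 to 0xFF (so it covers 0xF9) but not for 0x80 to 0x9F.

```cpp
#include <string>

// Hypothetical helper (not a bit7z API): re-encodes ISO-8859-1 bytes as UTF-8.
// Assumption: the input contains no bytes in 0x80..0x9F, where Windows-1252
// and ISO-8859-1 disagree.
std::string latin1_to_utf8( const std::string& in ) {
    std::string out;
    out.reserve( in.size() * 2 ); // worst case: every byte becomes two bytes
    for ( const char ch : in ) {
        const auto c = static_cast< unsigned char >( ch );
        if ( c < 0x80 ) {
            // ASCII is encoded identically in both encodings.
            out.push_back( static_cast< char >( c ) );
        } else {
            // U+0080..U+00FF map to a two-byte UTF-8 sequence: 110xxxxx 10xxxxxx.
            out.push_back( static_cast< char >( 0xC0 | ( c >> 6 ) ) );
            out.push_back( static_cast< char >( 0x80 | ( c & 0x3F ) ) );
        }
    }
    return out;
}
```

With this helper, latin1_to_utf8( "SottoMen\xF9.zip" ) yields the same bytes as the literal "SottoMen\xC3\xB9.zip", which is the form bit7z expects when it is not built to use the default code page.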
So, I've released a new maintenance version, v4.0.2, that contains all the fixes to the UTF-8 support I made on the develop branch.
I don't know if you tested all these changes, but they could fix your issue.
If the issue persists, could you provide a more detailed stack trace of where the exception happens? At least on bit7z's side.
Also, are you using MSVC's /utf-8 option, or the /source-charset and /execution-charset options?
Any details you can provide will be really useful; I'm continuing to try to replicate it, but I haven't been lucky so far, so I really appreciate any help you can provide!
I'm still testing but it appears 4.0.2 fixed my issue (4.0.1 definitely still had the issue).
Thanks! I'll let you know if there are any issues but it's looking good.
I'm still testing but it appears 4.0.2 fixed my issue (4.0.1 definitely still had the issue).
Thanks!
Great, you're welcome!
I'll let you know if there are any issues but it's looking good.
Thanks!