tsolomko / swcompression
A Swift framework for working with compression, archives and containers.
License: MIT License
Extracting a tarball containing symlinks results in symlinks with absolute paths. For example:
lrwxr-xr-x 1 me staff 44B Oct 20 13:52 linked.zip@ -> /Users/me/symlink-test/subdirectory/file.zip
drwxr-xr-x 3 me staff 96B Oct 20 13:52 subdirectory/
This makes the directory non-portable, as relocating symlink-test will break the target path.
The solution for this is to see if the source and destination share a prefix, and if so, rewrite the destination to be relative. This will give a relative link, like so:
lrwxr-xr-x 1 me staff 21B Oct 20 14:40 linked.zip@ -> subdirectory/file.zip
drwxr-xr-x 4 me staff 128B Oct 20 13:55 subdirectory/
This also matches the results when using the Mac's Archive Utility to extract the files.
I have a fix coded up, but need to write some tests before I can open a PR.
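The prefix-sharing idea described above can be sketched roughly as follows. This is not the fix mentioned in the report, only an illustration under assumptions: the function name relativeSymlinkDestination and its parameters are hypothetical, both paths are assumed to be absolute and already normalized, and the comparison is done purely on path strings without touching the filesystem.

```swift
import Foundation

/// Rewrites an absolute symlink destination relative to the directory that
/// contains the link. Returns nil when the paths share no leading component,
/// in which case the absolute destination is best left unchanged.
/// Hypothetical helper for illustration; not part of SWCompression.
func relativeSymlinkDestination(linkPath: String, targetPath: String) -> String? {
    // Only absolute paths can be compared by prefix.
    guard linkPath.hasPrefix("/"), targetPath.hasPrefix("/") else { return nil }

    // Components of the directory holding the link, and of the target itself.
    let linkDir = linkPath.split(separator: "/").dropLast().map { String($0) }
    let target = targetPath.split(separator: "/").map { String($0) }

    // Count the shared leading components.
    var shared = 0
    while shared < linkDir.count, shared < target.count, linkDir[shared] == target[shared] {
        shared += 1
    }
    guard shared > 0 else { return nil }

    // Climb out of the unshared part of the link's directory, then descend.
    let ups = Array(repeating: "..", count: linkDir.count - shared)
    let downs = Array(target[shared...])
    let parts = ups + downs
    return parts.isEmpty ? "." : parts.joined(separator: "/")
}
```

With the paths from the listing above, the link /Users/me/symlink-test/linked.zip and the target /Users/me/symlink-test/subdirectory/file.zip produce subdirectory/file.zip.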
Originally posted by wanderingbug April 10, 2024
Any chance of updating to include LZFSE? Apple hasn't updated their repo in 8 years and I would love a way to decompress on an EC2 instance instead of Apple hardware.
Decompressing big files takes a long time. Can you add a progress callback to show that everything is okay?
Hello
I’m getting an LZMAError.notEnoughToRepeat error when uncompressing the 3rd sample file (sample-3.7z) downloaded from the website:
https://getsamplefiles.com/sample-archive-files/7z
The error was thrown after a long wait.
The 7z file can be uncompressed correctly on my Mac using The Unarchiver.
The code I use:
let data = try Data(contentsOf: fileURL)
let entries = try SevenZipContainer.open(container: data)
Thank you!
I’m wondering if it’ll be possible to add the LZ4 and ZStandard compression algorithms to this library in the future. Personally I would find it useful, as both have outstanding decompression times, and Zstd has a great compression ratio.
A question: does your library currently support iOS 9?
A simple issue with a simple fix. As of a recent Swift release, the compiler now diagnoses when redundant public modifiers exist on a type. This is causing a warning to be raised in SWCompression.
The warning is in file BZip2+BlockSize.swift. The fix would be to remove the public modifier on the extension (line 8). This would make the warning go away. The actual behaviour of the code would be unchanged.
Hi,
I'm new to SWCompression, so I'm just trying to get it setup with a Swift Package Manager project and I'm running into an issue when compiling.
Showing All Messages
: terminated(1): /Applications/Xcode-beta.app/Contents/Developer/usr/bin/git -C /Users/js/code/SPMTestProj/SPMTest/DerivedData/SPMTest/SourcePackages/checkouts/SWCompression submodule update --init --recursive output:
Submodule 'Tests/Test Files' (https://github.com/tsolomko/SWCompression-Test-Files.git) registered for path 'Tests/Test Files'
Cloning into '/Users/js/code/SPMTestProj/SPMTest/DerivedData/SPMTest/SourcePackages/checkouts/SWCompression/Tests/Test Files'...
git: 'lfs' is not a git command. See 'git --help'.
The most similar command is
log
error: external filter 'git lfs smudge %f' failed 1
error: external filter 'git lfs smudge %f' failed
fatal: 7z/SWCompressionSourceCode.7z: smudge filter lfs failed
Unable to checkout '543e5bc90d065c9cac22e501cff97d5c4489ce45' in submodule path 'Tests/Test Files'
A couple of notes:
Thanks for any help you can provide.
Hi @tsolomko,
I would like to know if it is possible to add GNU tar support to SWCompression.
I would like this feature because I am trying to build a deb package that will be installed by apt. apt is able to read the first tar file, but the second tar causes apt to crash with this error:
...
corrupted filesystem tarfile in package archive: unsupported PAX tar header type 'x'
termux-create-package fixed the "unsupported PAX tar header type" issue by using the GNU tar format.
NFPM also uses the GNU format for building debs.
Jeff
In all your tests you work with objects in memory.
What about big files, for example 6 or 7 GB?
First, thanks for the excellent library!
I am seeing a sporadic but persistent crash (in production) in TarContainer.open() on the following line:
let entryData = data.subdata(in: dataStartIndex..<dataEndIndex)
It seems that the range supplied to data.subdata() is out-of-range.
I don't know how this is possible, as the tarfiles are under my control and all have been tested to work. Crash stack trace is below.
Would you accept a PR that tested dataStartIndex and dataEndIndex against data.startIndex and data.endIndex and, if out of range, either:
libswiftFoundation.dylib Foundation.Data._Representation.subscript.getter : (Swift.Range<Swift.Int>) -> Foundation.Data
libswiftFoundation.dylib Foundation.Data.subdata(in: Swift.Range<Swift.Int>) -> Foundation.Data
MyApp function signature specialization <Arg[1] = Dead> of static SWCompression.TarContainer.open(container: Foundation.Data) throws -> [SWCompression.TarEntry]
MyApp static SWCompression.TarContainer.open(container: Foundation.Data) throws -> [SWCompression.TarEntry]
MyApp MyApp.FileHelper.(_installContent in _0551FF1AC8624441A93DA7C5AC53D8BC)(fromDownloadLocation: Foundation.URL, filename: Swift.String, progress: __C.NSProgress?) -> () ArchiveManager.swift:16
MyApp partial apply forwarder for closure #1 () -> () in MyApp.FileHelper.installContent(fromDownloadLocation: Foundation.URL, filename: Swift.String, progress: __C.NSProgress?) -> () FileHelper.swift:107
MyApp reabstraction thunk helper from @escaping @callee_guaranteed () -> () to @escaping @callee_unowned @convention(block) () -> () <compiler-generated>:0
libdispatch.dylib _dispatch_call_block_and_release
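The range check proposed in this report could look something like the sketch below. It is only an illustration: checkedSubdata and SketchTarError are hypothetical names, not SWCompression API, and throwing is just one possible way to handle an out-of-range request.

```swift
import Foundation

/// Hypothetical error type for this sketch; not part of SWCompression.
enum SketchTarError: Error {
    case rangeOutOfBounds
}

/// Returns the requested bytes only when the range lies inside `data`,
/// and throws instead of trapping otherwise.
/// Indices are compared against startIndex and endIndex rather than 0 and
/// count, because a Data slice keeps the indices of its parent.
func checkedSubdata(of data: Data, from start: Int, to end: Int) throws -> Data {
    guard start >= data.startIndex, end <= data.endIndex, start <= end else {
        throw SketchTarError.rangeOutOfBounds
    }
    return data.subdata(in: start..<end)
}
```

A caller would then see a catchable error for a truncated or partially downloaded tarball rather than a crash inside Data.subdata(in:).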
SWCompression 4.5.5. Ubuntu 18.04.
I tested this on Swift 5.3 Release:
$ which swift
/home/xander/swift-5.3-RELEASE-ubuntu18.04/usr/bin/swift
$ swift --version
Swift version 5.3 (swift-5.3-RELEASE)
Target: x86_64-unknown-linux-gnu
and Swift for Tensorflow 0.10 (based on Swift dev trunk):
$ which swift
/home/xander/swift-tensorflow-RELEASE-0.10-cuda10.2-cudnn7-ubuntu18.04/usr/bin/swift
$ swift --version
Swift version 5.3-dev (LLVM 55d27a5828, Swift 6a5d84ec08)
Target: x86_64-unknown-linux-gnu
Error:
$ swift build
warning: dependency 'Metaprogramming' is not used by any target
.build/checkouts/SWCompression/Sources/ZIP/ByteReader+Zip.swift:36:86: error: cannot find 'kCFStringEncodingDOSLatinUS' in scope
static let cp437Encoding: CFStringEncoding = UInt32(truncatingIfNeeded: UInt(kCFStringEncodingDOSLatinUS))
^~~~~~~~~~~~~~~~~~~~~~~~~~~
[73/91] Compiling a_bitstr.c
I used GzipArchive.archive to compress a short text like "123".
When I then use GzipArchive.unarchive to decompress the data, it throws a "wrongMagic" error because the input is too short: it fails the check bitReader.bitsLeft >= 33 * 8 in GzipArchive.swift, line 81.
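For comparison, the fixed framing of a gzip member in RFC 1952 is a 10-byte header plus an 8-byte trailer (CRC32 and ISIZE), so an archive of a three-character string can plausibly be shorter than 33 bytes. A looser pre-check might be sketched like this; looksLikeGzip and gzipMinimumFramingSize are hypothetical names and this is not how SWCompression validates its input.

```swift
import Foundation

/// Fixed framing of a gzip member per RFC 1952:
/// 10-byte header plus 8-byte trailer (CRC32 and ISIZE).
let gzipMinimumFramingSize = 18

/// Returns true when `data` is long enough to hold the fixed gzip framing
/// and starts with the gzip magic bytes 0x1f 0x8b.
/// Hypothetical helper for illustration only.
func looksLikeGzip(_ data: Data) -> Bool {
    guard data.count >= gzipMinimumFramingSize else { return false }
    return data[data.startIndex] == 0x1f && data[data.startIndex + 1] == 0x8b
}
```

The check only rejects inputs that cannot possibly be gzip; a full parser still has to validate the remaining header fields and the trailer.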
I'm trying to build a library that uses SWCompression for watchOS, but it fails with:
error: The package product 'BitByteData' requires minimum platform version 5.0 for the watchOS platform, but this target supports 4.0 (in target 'SWCompression' from project 'SWCompression')
I noticed that the Package.swift of BitByteData's released version (2.0.2) does not define any .platforms; perhaps cutting a new release for this would fix it.
Hi,
Will there be a release that supports Xcode 11? Right now I'm pointing my Cartfile to "develop" just to get the latest codebase that was built with Swift 5.x.
Thanks!
I've been using your code in production for my project Saily, and found this issue in Bugsnag backend.
0 libswiftFoundation.dylib Data.subscript.getter
1 chromatic LittleEndianByteReader.byte() (LittleEndianByteReader.swift:33:20)
2 chromatic specialized static LZMA2.decompress(data:) (LZMA2.swift:26:54)
3 chromatic static LZMA2.decompress(data:) (<compiler-generated>)
4 chromatic RepositoryCenter.downloadUpdatePackage(withBaseUrl:suffix:) (RepositoryCenter+Downloader.swift:131:44)
5 chromatic closure #1 in closure #6 in RepositoryCenter.asyncUpdate(target:onComplete:) (RepositoryCenter+Internal.swift:304:39)
6 chromatic partial apply for closure #1 in closure #6 in RepositoryCenter.asyncUpdate(target:onComplete:) (<compiler-generated>)
7 chromatic thunk for @escaping @callee_guaranteed () -> () (<compiler-generated>)
8 libdispatch.dylib __dispatch_call_block_and_release
9 ... libdispatch etc, etc
Currently I cannot provide you with sample data for it, but I will get back to you if I find anything new.
Here is the full report in case you need it.
bugsnag_error_stacktrace_EXC_BREAKPOINT_event.txt
In my project, line 33 seems to be return data[offset].
public func byte() -> UInt8 {
    defer { offset += 1 }
    return data[offset] // <--
}
And in SWCompression, (LZMA2.swift:26:54)
public static func decompress(data: Data) throws -> Data {
    let byteReader = LittleEndianByteReader(data: data)
    // *
    return try decompress(byteReader, byteReader.byte()) // <--
}
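Since the trace ends in the very first byte() call, one plausible trigger is an empty or truncated Data, for instance from a failed download. A bounds-checked reader along the following lines would turn that into a thrown error. This is only a sketch: SafeByteReader and SketchReaderError are hypothetical and are not types from BitByteData or SWCompression.

```swift
import Foundation

/// Hypothetical error for this sketch; not a BitByteData or SWCompression type.
enum SketchReaderError: Error {
    case endOfData
}

/// A minimal bounds-checked byte reader, written only to illustrate the idea.
struct SafeByteReader {
    let data: Data
    private(set) var offset: Int

    init(data: Data) {
        self.data = data
        self.offset = data.startIndex
    }

    /// Reads one byte, throwing instead of trapping when no bytes remain.
    mutating func byte() throws -> UInt8 {
        guard offset < data.endIndex else { throw SketchReaderError.endOfData }
        defer { offset += 1 }
        return data[offset]
    }
}
```

On the calling side, a cheaper workaround is to check that the downloaded data is not empty before handing it to the decompressor.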
Hi,
Are there any plans to support the LZW algorithm?
Thanks!
Kind regards,
Remco Poelstra
Hi, thanks for providing the lib and especially for the support of LZ4. I'm trying to use this to decompress lz4 files using the following code:
do {
    if let data = try? Data(contentsOf: sourceFilePath, options: .mappedIfSafe) {
        print("data loaded")
        let decompressedData = try LZ4.multiDecompress(data: data)
        print("decompressed")
        let combinedData = decompressedData.reduce(Data()) { (result, data) in
            var mutableResult = result
            mutableResult.append(data)
            return mutableResult
        }
        FileManager.default.createFile(atPath: extractedFilePathName.path, contents: combinedData)
        print("file written... in theory")
    } else {
        print("could not load")
    }
} catch {
    print("ran in error")
    print(error)
}
It works. But compared to running lz4 via command line, it is really slow. My input has a size of around 30MB, extracted 126MB. Using the terminal it takes a second. Using the code above it takes about a minute. After "data loaded" and before "decompressed", the CPU goes up to 100% and memory goes up by roughly 125MB which is expected.
Is it expected to take so long to decompress?
I have also tried this code on your test archive SWCompressionSourceCode.tar.lz4, which I found in the test files repo. There the code just returns "corrupted". Am I loading the file wrong?
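As a side note on the snippet above, the reduce call builds a new Data value for every chunk. According to the report the time is spent between "data loaded" and "decompressed", so this is unlikely to be the main cost, but the combining step can be written so that the buffer is allocated once. The helper name concatenate is hypothetical.

```swift
import Foundation

/// Concatenates chunks into one Data, reserving capacity up front so the
/// buffer does not have to grow repeatedly while appending.
func concatenate(_ chunks: [Data]) -> Data {
    var combined = Data()
    combined.reserveCapacity(chunks.reduce(0) { $0 + $1.count })
    for chunk in chunks {
        combined.append(chunk)
    }
    return combined
}
```

It would also be worth confirming that the measurement was taken with a release build, since unoptimized Swift code can be dramatically slower.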
I have found decompressing an xz file with the SWCompression package to be extremely time-consuming when running in an iOS environment. For example, I have a compressed file with a size of 45.7 MB that takes over 15 minutes to decompress, whereas decompressing this file on a Mac using standard decompression tools takes a matter of seconds. Is there a way to speed up the process within my iOS app?
For some reason, the version of LZ4 you have implemented does not seem to be compatible with Apple's LZ4_RAW or the reference implementation at https://github.com/lz4/lz4. Is that possible, or am I using it wrong?
I tested it with the develop branch and the tag 4.8.5
> wget https://github.com/richgel999/miniz/releases/download/3.0.1/miniz-3.0.1.zip
> ./swcomp zip -e foo miniz-3.0.1.zip -v
d = directory, f = file, l = symbolic link
f: miniz.c
f: miniz.h
f: examples/example1.c
Error: The operation could not be completed. The file doesn’t exist.
> ls foo
miniz.c miniz.h (There should be a directory examples)
> wget https://www.sqlite.org/2023/sqlite-amalgamation-3410200.zip
> ./swcomp zip -e foo sqlite-amalgamation-3410200.zip
... Successful ....
Is there any reason why the deployment target for iOS has to be 10.1? I can build the library just fine down to at least 8.0
Just parking this issue here, as it might help people who might be looking for a cross-platform solution for unarchiving/archiving large files.
My specific use case is needing to unarchive ZIP files that are several GBs compressed and several more GBs uncompressed.
After unzipping I need to get the file name (mainly the file suffix) so it can be stored in the sandbox directory.
I am working on a tool for my team to prep files for upload to cloud services as an archive. One of the steps is to turn bundle file types (.aplibrary, .fcpbundle, etc) into a single file for upload so the bundle flag doesn't get broken. However, some of these bundles are huge and even compressed are too large to upload.
I would like to be able to specify a max file size and have the tool create as many files as needed to archive the original in chunks of up to that size.
Currently I am trying to use Process to run the built-in /usr/bin/zip, which can work but is annoying and not very Swifty, i.e. /usr/bin/zip -r -s 150g "(fileURL.path).zip", fileURL.lastPathComponent
It looks like the built-in zip in macOS 13 & 14 is Info-ZIP https://infozip.sourceforge.net/
The output is a set of files like:
filename.z01
filename.z02
filename.z03
filename.zip
The Unarchiver (https://theunarchiver.com/) knows how to unzip these even though the built in Archive Utility doesn't.
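The size-limiting step on its own can be sketched in a few lines. Note the assumptions: this cuts a byte buffer into raw consecutive parts, much like the split command line tool, and does not produce the Info-ZIP .z01/.z02 split format, so the parts would have to be concatenated again before unarchiving. The name splitIntoParts is hypothetical, and for multi-gigabyte inputs the same loop would need to read from a file handle instead of holding everything in memory.

```swift
import Foundation

/// Splits `data` into consecutive parts of at most `maxSize` bytes each.
/// Hypothetical helper; raw splitting only, not a spanned ZIP archive.
func splitIntoParts(_ data: Data, maxSize: Int) -> [Data] {
    precondition(maxSize > 0, "maxSize must be positive")
    var parts: [Data] = []
    var start = data.startIndex
    while start < data.endIndex {
        // Compare the remaining length first to avoid overflowing start + maxSize.
        let remaining = data.endIndex - start
        let end = remaining > maxSize ? start + maxSize : data.endIndex
        parts.append(data.subdata(in: start..<end))
        start = end
    }
    return parts
}
```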
I'm using SevenZipEntryInfo to get the list of files in a 7z archive. Is there a way to unzip a specific file individually, rather than unzipping all files using the open method?
Hello
I’m getting a BZip2Error.wrongCRC error using BZip2 with the following file:
problem.dat.zip
The file needs decompressing first before using:
let uncompressed = try Data(contentsOf: URL(fileURLWithPath: "problem.dat"))
let compressed = BZip2.compress(data: uncompressed)
let backAgain = try BZip2.decompress(data: compressed)
Cheers.
Swift runtime failure: arithmetic overflow
Line 214 of BZip2.swift
The Packages file causing issues can be downloaded here
It appears the lfs issue is still there.
The main reason why I'm looking into this library is to run untar on iOS.
This is shown here:
SWCompression/Sources/swcomp/Containers/TarCommand.swift
Lines 61 to 62 in b5c6296
However, swcomp is not exposed when importing SWCompression, so the write method which I need is inaccessible.
Could swcomp be exposed, or could the write method be added to the Container protocol (signature example below)?
public protocol Container {
    ...
    /// Write the entries on the filesystem
    static func write(entries: [Entry], to url: URL) throws
}
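Until something like that exists, the writing step can be done on the caller's side. The sketch below uses a hypothetical SketchEntry type as a stand-in for whatever the container returns; it is not SWCompression's entry type and does not reproduce what swcomp does. It handles only regular files and directories and does not sanitize names, so entries containing ".." could escape the destination.

```swift
import Foundation

/// Hypothetical stand-in for a container entry; not SWCompression's type.
struct SketchEntry {
    enum Kind { case regular, directory }
    let name: String   // path relative to the extraction root
    let kind: Kind
    let data: Data?
}

/// Writes the entries below `root`, creating intermediate directories first.
func write(entries: [SketchEntry], to root: URL) throws {
    let fileManager = FileManager.default
    for entry in entries {
        let destination = root.appendingPathComponent(entry.name)
        switch entry.kind {
        case .directory:
            try fileManager.createDirectory(at: destination, withIntermediateDirectories: true)
        case .regular:
            // Archives may list a file before (or without) its parent directory.
            try fileManager.createDirectory(at: destination.deletingLastPathComponent(),
                                            withIntermediateDirectories: true)
            try (entry.data ?? Data()).write(to: destination)
        }
    }
}
```

Symbolic links, permissions and modification times are left out and would need extra handling in real use.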
We have implemented support for BSD AR file format at: https://github.com/Lessica/SWCompression/tree/feature-bsd-ar
The reason we are not opening a pull request is that we noticed you use a git submodule for test files, and your pipeline seems a little heavy to configure. During our development we simplified this process so that all our AR file tests work locally.
We would like you to merge the code yourself if you are OK with it. We estimate that more than one repo will need to be updated for this merge.
The package structure is a little different from upstream; the following are the changes we made.
Sources/SWCompression/AR
.
├── ArContainer.swift
├── ArEntry.swift
├── ArEntryInfo.swift
├── ArError.swift
├── ArHeader.swift
├── ArParser.swift
├── ArReader.swift
├── ArWriter.swift
├── Data+Ar.swift
└── LittleEndianByteReader+Ar.swift
These are the sources for the AR API, which supports reading and writing.
Sources/SWCompression/Common/Extensions.swift
This is an extension for our own use.
SWCompression/TAR/TarReader.swift
This is a bug fix for Apple APIs returning a value that does not match their documentation: if nil is returned, do not throw an error.
Tests/SWCompressionTests/
.
├── ArCreateTests.swift
├── ArReaderTests.swift
├── ArTests.swift
└── ArWriterTests.swift
Tests/SWCompressionTests/Test Files/AR
These are the tests we made.
The following files are changed but not required for the merge; generally speaking, we moved the Unicode test string payloads into files.
Tests/SWCompressionTests/Constants.swift
SevenZipTests.swift
TarReaderTests.swift
TarTests.swift
ZipTests.swift
...
With the above changes applied, we can add another check mark to the README:
| | Zlib | GZip | XZ | ZIP | TAR | 7-Zip | AR |
| ----- | ---- | ---- | --- | --- | --- | ----- | --- |
| Read | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Write | ✅ | ✅ | TBD | TBD | ✅ | TBD | ✅ |
Have a nice day~
Hardware Model: iPhone13,3
Process: chromatic
Identifier: wiki.qaq.chromatic.release
Version: 1641794667
Role: Foreground
OS Version: iOS 14.4
Exception Type: EXC_BREAKPOINT
Exception Subtype: KERN_INVALID_ADDRESS
0 chromatic MsbBitReader.bit() (chromatic)
1 chromatic DecodingTree.findNextSymbol() (chromatic)
2 chromatic specialized static BZip2.decode(_:_:) (chromatic)
3 chromatic specialized static BZip2.decompress(_:) (chromatic)
4 chromatic static BZip2.decompress(data:) (chromatic)
5 chromatic RepositoryCenter.downloadUpdatePackage(withBaseUrl:suffix:) (chromatic)
I have checked my build and it should not be crashing in production. I have no idea what caused the crash. It affected 1 user more than 10 times in a short period.
Hi @tsolomko!
I was wondering if it is possible to create a TAR archive from a directory using SWCompression. The example below shows how this could be done using tar:
$ tar -cf archive.tar archive-folder
Right now I have this code:
// Both lines to add the folder.
let entry = TarContainer.Entry(info: .init(name: "folder-name", type: .directory), data: nil)
let entry = TarContainer.Entry(info: .init(name: "/path/to/folder-name", type: .directory), data: nil)
let data = try TarContainer.create(from: [entry])
Unfortunately, when I unarchive the tar.gz, folder-name is an invalid directory.
Jeff
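For reference, tar -cf archive.tar archive-folder stores the folder itself plus one member for every file and subdirectory beneath it, each named relative to the working directory. Collecting that list is independent of the archive library and might be sketched as below. The function archiveMembers is hypothetical, and this does not diagnose the invalid directory reported above; each member would still have to be turned into an entry like the ones in the snippet, with the file contents as data for regular files.

```swift
import Foundation

/// Lists the folder and everything below it as (name, isDirectory) pairs,
/// with names prefixed by the folder's own name, as tar stores them.
/// Hypothetical helper for illustration only.
func archiveMembers(of directory: URL) throws -> [(name: String, isDirectory: Bool)] {
    let fileManager = FileManager.default
    let rootName = directory.lastPathComponent
    var members = [(name: rootName, isDirectory: true)]
    // subpathsOfDirectory returns paths relative to `directory`, recursively.
    for subpath in try fileManager.subpathsOfDirectory(atPath: directory.path).sorted() {
        var isDirectory: ObjCBool = false
        _ = fileManager.fileExists(atPath: directory.appendingPathComponent(subpath).path,
                                   isDirectory: &isDirectory)
        members.append((name: rootName + "/" + subpath, isDirectory: isDirectory.boolValue))
    }
    return members
}
```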
As of Carthage 0.37 (January 2021) the recommended way to build frameworks with Carthage and Xcode 12+ is to build a platform-independent XCFramework package (see Carthage README.md).
Our team is working on a macOS project that uses SWCompression, and we are currently trying to migrate to this new way of building our Carthage dependencies in order to get our project ready for its next release.
However, when I run carthage bootstrap --use-xcframeworks --platform macOS --no-use-binaries (using Xcode 12.4 and Carthage 0.37), the build job for the SWCompression project fails with an exit code of 65. Looking at the build log, it seems that the SWCompression project is unable to locate and include the BitByteData framework that has been built as an XCFramework package:
/Users/[...]/src/Carthage/Checkouts/SWCompression/Sources/ZIP/ZipEndOfCentralDirectory.swift:7:8: error: no such module 'BitByteData'
import BitByteData
^
** ARCHIVE FAILED **
The following build commands failed:
CompileSwift normal x86_64
CompileSwiftSources normal x86_64 com.apple.xcode.tools.swift.compiler
CompileSwiftSources normal arm64 com.apple.xcode.tools.swift.compiler
CompileSwift normal arm64
(4 failures)
Carthage has checked out the BitByteData project and successfully managed to build it, so it is definitely there, it's just not in the path and/or format that SWCompression expects it to be.
According to the Migration Steps in the Carthage README.md migrating the SWCompression project to use this new format should only require a single step:
For a framework target: In the Build Phases tab, in a Link Binary with Libraries phase, drag and drop each XCFramework you use from the Carthage/Build folder on disk.
I have a question related to a previously created issue and wanted to follow up. (#25)
Has there been any progress on resolving this issue regarding memory spikes when working with larger files?
For example, I'm currently trying to use the library to create a tar file, and we are forced to pass each file's entire contents at once (loading the whole file into memory), which causes rapid spikes in memory as Tar Entry objects are created. This could be fixed by reading a file in chunks while creating the tar file.
Any feedback would be greatly appreciated.
Thanks a lot!
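The reading half of that suggestion can be sketched with Foundation alone. The function forEachChunk is a hypothetical helper; how a TAR writer would consume the chunks is left open here, since the API described in this report accepts a file's complete data.

```swift
import Foundation

/// Reads the file at `url` in pieces of at most `chunkSize` bytes and passes
/// each piece to `body`, so only one chunk is held in memory at a time.
/// Hypothetical helper for illustration only.
func forEachChunk(ofFileAt url: URL, chunkSize: Int, _ body: (Data) throws -> Void) throws {
    precondition(chunkSize > 0, "chunkSize must be positive")
    let handle = try FileHandle(forReadingFrom: url)
    defer { handle.closeFile() }
    while true {
        let chunk = handle.readData(ofLength: chunkSize)
        if chunk.isEmpty { break }   // an empty read signals end of file
        try body(chunk)
    }
}
```

The same loop is a natural place to report progress, for example by summing the chunk sizes against the file's total length.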