bit7z version 4.0.x Compilation options <p

PS. <div class="highlight highlight-source-c++ notranslate position-r

[Possible Bug]: BitArchiveReader::items is slow? about bit7z HOT 6 CLOSED

kleuter commented on June 12, 2024

[Possible Bug]: BitArchiveReader::items is slow?

from bit7z.

Comments (6)

rikyoz commented on June 12, 2024

Hi!

Calling BitArchiveReader::items takes good 6 seconds while official 7zFM.exe opens it in no time. Ideas?

Part of the reason is that items() is a vector of whole item objects (a.k.a. BitArchiveItemInfo).
This means that each BitArchiveItemInfo object stores all the properties about that item that bit7z could get via 7-Zip; these properties are stored in a map BitProperty (the property) -> BitPropVariant (the value of the property).

Unfortunately, 7-Zip doesn't provide a method to know which item properties are present or not (as far as I know). So what bit7z does is to try to get all of them (~95 iirc); if it fails, the property is not present, otherwise it is stored in the map.

Now, if you do all this for ~350000 items, it can have a significant overhead, obviously.
Bit7z does this because it doesn't know how the user wants to use the vector returned by items().
Specifically, the user might want to use the content of the vector after the BitArchiveReader went out of scope, and the archive was closed. So bit7z has to store every information about the items.

However, there's an alternative to items(): you can directly iterate over a BitArchiveReader. This way, you pretty much avoid the overhead of reading and storing all the properties of all the items.
BitArchiveReader has the begin()/cbegin()/end()/cend() methods that allow, for example, to iterate over it with a for each loop.
An iterator of BitArchiveReader points to a BitArchiveItemOffset object, which is basically just an index of an item within the archive.
BitArchiveItemOffset has the same exact interface of BitArchiveItemInfo, but doesn't store any property, it only stores the index of the item. When you ask for something like the path() of the item, bit7z will directly ask 7-Zip for its value.

For example:

BitArchiveReader reader{ lib, "qt-everywhere-src-6.7.0.zip", BitFormat::Zip };
const auto items = reader.items();
for ( const auto& item : items ) {
    std::cout << item.index() << ") " << item.path() << '\n';
}

becomes

BitArchiveReader reader{ lib, "qt-everywhere-src-6.7.0.zip", BitFormat::Zip };
for ( const auto& item : reader ) {
    std::cout << item.index() << ") " << item.path() << '\n';
}

Sorry for the wall-of-text, I just wanted to give some insight on why items() works like that and might have a significant overhead in some cases like yours.

from bit7z.

rikyoz commented on June 12, 2024

PS.
for ( uint32_t i = 0; i < itemsCount(); ++i )
I don't think it's a good idea to call itemsCount over 350.000 times (for this particular archive).
const uint32_t numberOfItems = itemsCount();
would be better, though it doesn't help anyways in the matter.

Nice catch!
I think the compiler might be able to optimise it out of the loop condition, but I'm not sure if it can in this particular case.
I'll fix it in the next minor release!
Thanks for pointing it out!

from bit7z.

rikyoz commented on June 12, 2024

I've done some quick benchmarks (without any rigorous statistical analysis):

// Optimizing std::cout on MSVC
setvbuf(stdout, nullptr, _IOLBF, 4096);
std::ios::sync_with_stdio(false);

// Printing the content of the archive
BitArchiveReader reader{ lib, "qt-everywhere-src-6.7.0.zip", BitFormat::Zip };
const auto start_items = std::chrono::high_resolution_clock::now();
const auto items = reader.items();
for ( const auto& item : items ) {
    std::cout << item.path() << '\n';
}
const auto end_items = std::chrono::high_resolution_clock::now();

const auto start_it = std::chrono::high_resolution_clock::now();
for ( const auto& item : reader ) {
    std::cout << item.path() << '\n';
}
const auto end_it = std::chrono::high_resolution_clock::now();

std::cout << std::endl;

std::cout << "Time taken\n";
std::cout << "  using items(): " << std::chrono::duration_cast<std::chrono::seconds>(end_items - start_items) << '\n';
std::cout << "  using iterators: " << std::chrono::duration_cast<std::chrono::seconds>(end_it - start_it) << '\n';

Results (release build, MSVC 2022):

Time taken
  using items(): 7s
  using iterators: 3s

Obviously, the more information you print, the slower the loop, in any case.
For example, if you want to imitate the l subcommand of 7z.exe

void print_item(const BitArchiveItem& item) {
    const auto date = item.itemProperty(BitProperty::MTime);
    const auto attr = item.isDir() ? "D...." : ".....";
    std::cout << date.getTimePoint() << ' ' <<
                 attr << ' ' <<
                 std::setw(8) << item.size() << ' ' <<
                 std::setw(8) << item.packSize() << ' ' <<
                 item.path() << '\n';
}
// use print_item(item); in the for loops

The results are:

Time taken
  using items(): 11s
  using iterators: 6s

In general, the version using iterators is always faster than the one using the items() function.
In my tests, iterators were consistently ~4s faster.

The 7z.exe is significantly slower. However, the comparison is a bit unfair, as it does more things, like parsing the CLI arguments, open the archive (in my program I didn't include this), etc.

>cmake -E time 7z l qt-everywhere-src-6.7.0.zip
<7z output>
Elapsed time (seconds): 67.6411

My guess is that 7zFMT.exe is even faster simply because it doesn't have to immediately display all the paths.
For example, if we non-recursively print only the items inside the root folder:

Time taken
  using items(): 4s
  using iterators: 0s

from bit7z.

kleuter commented on June 12, 2024

thanks a lot for such a detailed explanation, it works comparable now!

from bit7z.

rikyoz commented on June 12, 2024

You're welcome!

from bit7z.

rikyoz commented on June 12, 2024

Just for reference, besides calling itemsCount() only once, I found out that the items() method could be further optimized.

The biggest improvement was achieved by calling std::vector::reserve, which is possible since we already know the number of items that the vector will contain.

Another (smaller) improvement was achieved by moving the BitArchiveItemInfo objects that are pushed into the vector instead of copying them.

From my tests, after all the changes, itemsCount() is about 25% faster on average than before.
For example, with the same test program I used for this thread to emulate the 7z.exe program, items() is now only ~2s slower than the iterator version:

 Time taken
  using items(): 8s
  using iterators: 6s

The changes are available in the latest v4.0.7.

from bit7z.

[Possible Bug]: BitArchiveReader::items is slow? about bit7z HOT 6 CLOSED

Comments (6)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent