The field total_combinations in the class RanCat (<a href="https://github.com/mattjega


total_combinations is unpopulated when file is loaded

ksreenivasan commented on June 2, 2024

total_combinations is unpopulated when file is loaded


Comments (2)

mattjegan commented on June 2, 2024

Hey, thanks for posting. The design motivation was that we don't want to iterate over the whole file when we load it, since that slows everything down for large files. You're right that we could calculate it straight away, and in fact we used to.

The current flow is something like this:

1. "Load" the file (simply set it to open in the Handler, and keep track of which line we're up to).
2. If the number of seen keys falls between 50% and 100% of the total combinations (the total we know of so far, since we haven't read all the lines yet and so don't know the full length of the list), we read in another read_size lines (defaults to 1000).
3. As those new lines get read in, we update the total combinations.
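
The flow above can be sketched roughly as follows. This is only an illustration of the idea, not rancat's actual code: `LazySource`, `known_combinations`, and `refill_if_needed` are hypothetical names, and the sketch assumes that every source is topped up whenever the 50% threshold is reached.

```python
import io
from itertools import islice
from math import prod


class LazySource:
    """Hypothetical stand-in for a file handler; not rancat's real class."""

    def __init__(self, stream, read_size=1000):
        self.stream = stream        # an open text stream, read lazily
        self.read_size = read_size  # lines pulled in per refill
        self.lines = []             # lines read so far
        self.exhausted = False      # True once a read comes back short

    def read_more(self):
        """Read up to read_size further lines; return how many were added."""
        chunk = [line.rstrip("\n") for line in islice(self.stream, self.read_size)]
        self.lines.extend(chunk)
        if len(chunk) < self.read_size:
            self.exhausted = True
        return len(chunk)


def known_combinations(sources):
    """Combinations over the lines read so far, not over the whole files."""
    return prod(len(s.lines) for s in sources)


def refill_if_needed(sources, seen_count, threshold=0.5):
    """Read another chunk once seen_count reaches threshold of the known total.

    Returns the updated known total (assumed policy: top up every source).
    """
    total = known_combinations(sources)
    if total == 0 or seen_count >= threshold * total:
        for s in sources:
            if not s.exhausted:
                s.read_more()
    return known_combinations(sources)
```

With this shape, the known total is zero right after "loading" and only grows as chunks are read, which matches the behaviour reported in this issue.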

The default of 1000 lines per read seems good to me, as 2 files with over 1000 lines each can serve about 500,000 calls to .next() before requiring more file access.
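
As a quick check of that figure, here is the arithmetic written out. It assumes the first 1000 lines of each file are already loaded, that every call to .next() yields a combination not seen before, and that the refill triggers at 50% of the known total; the variable names are illustrative only.

```python
read_size = 1000   # lines loaded per file so far (the default chunk size)
num_files = 2

# Every line of one file can pair with every line of the other.
known = read_size ** num_files    # 1,000,000 known combinations

# More lines are read once half of the known combinations have been seen.
calls_before_refill = known // 2  # 500,000 calls to .next()
```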

Perhaps I need to document this feature a little more?


ksreenivasan commented on June 2, 2024

@mattjegan That makes sense.
Yeah, a little more documentation will go a long way. I was able to make sense of most things by looking at the code, but it would be faster to understand with documentation as well.

