
Comments (13)

RobAustin commented on July 19, 2024

we have raised the following issue to investigate this - https://higherfrequencytrading.atlassian.net/browse/HCOLL-135


peter-lawrey commented on July 19, 2024

While we routinely test a map with 1024 to 8192 segments, having more than
this appears to make it slower rather than faster. YMMV.

The issue you are seeing is likely to be a result of not testing this
scenario. Can you provide such a test to match your requirements?

Note: regardless of the number of threads you have, only as many of them as you have CPUs can actually run at once. Ideally, you should spend only a small portion of your time accessing a single Map, and more of your time in business logic and whatever else your program is doing.

BTW, there is another, perhaps more extreme, solution. You can create an off-heap reference to each of the values in the map and hold these references. This works fine provided you don't delete any of these keys. You can add "row level" locking by adding a lock field to each entry (or possibly more than one lock field). This provides as fine-grained control as you might need. Also, we support CAS and atomic add operations, which avoid the need for locking at all.
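To make the pattern Peter describes concrete, here is a minimal sketch, assuming a SharedHashMap with a generated value type as used elsewhere in this thread. The RateValue interface, its field names, and the lock/counter method naming (compareAndSwapLock, addAtomicUpdateCount) are illustrative assumptions about what the data-value generator produces, not a confirmed API.

import net.openhft.collections.SharedHashMap;
import net.openhft.collections.SharedHashMapBuilder;
import net.openhft.lang.model.DataValueClasses;

import java.io.File;
import java.io.IOException;

public class OffHeapReferenceSketch {

    // Hypothetical off-heap value interface; with generatedValueType(true) the
    // builder backs it with a flyweight over the shared memory. The lock and
    // updateCount fields give "row level" locking and a lock-free counter.
    public interface RateValue {
        void setMid(double mid);
        double getMid();

        void setLock(int lock);
        int getLock();
        boolean compareAndSwapLock(int expected, int value); // assumed CAS naming

        void setUpdateCount(int count);
        int getUpdateCount();
        int addAtomicUpdateCount(int toAdd);                 // assumed atomic-add naming
    }

    public static void main(String[] args) throws IOException {
        SharedHashMap<String, RateValue> map =
                SharedHashMapBuilder.of(String.class, RateValue.class)
                        .generatedValueType(true)
                        .entries(2048)
                        .file(new File("/run/shm/OffHeapReferenceSketch"))
                        .create();

        // Acquire once and keep the off-heap reference for the lifetime of the
        // actor/thread. This stays valid only while the entry is never removed.
        RateValue spot = DataValueClasses.newDirectReference(RateValue.class);
        map.acquireUsing("EURUSD_SPOT", spot);

        // Per-entry ("row level") lock via the dedicated lock field.
        if (spot.compareAndSwapLock(0, 1)) {
            try {
                spot.setMid(1.2645);
            } finally {
                spot.setLock(0);
            }
        }

        // Or avoid locking entirely for simple counters.
        spot.addAtomicUpdateCount(1);

        map.close();
    }
}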

On 13 September 2014 10:08, andycrutchlow [email protected] wrote:

Note: I'm referring to ChronicleMap / ChronicleMapBuilder, but I'm actually using SharedHashMap and SharedHashMapBuilder.
My actual issue is that setting actualEntriesPerSegment to a value less than 8 (i.e. 1 to 7) results in IllegalArgumentException .. Segment is full...
Is this an intentional restriction?
It seems to occur as a consequence of the following (for values 1 to 7 the integer division yields 0, so no bit-set space is allocated):

long sizeOfBitSets() {
    return align64(entriesPerSegment / 8);
}

I ask because I would like to be able to set actualEntriesPerSegment to 1. My reason is that, given the recent change which fixed a previous problem, concurrent reads/writes to different map entries that happen to be in the same segment are effectively queued/blocked behind each other, reducing throughput. So I think a solution would be to have each entry in its own segment.
I appreciate that were I populating the map with millions of entries, a dedicated segment and associated Bytes object per entry would probably not be practical. But in the use case I am exploring I will have thousands or tens of thousands of entries, and the accessing and mutating of individual entries tends to come in bursts clustered around 'similar' key values whose hash values make it highly likely they will be located in the same segment. I understand that I could further randomize keys to reduce this likelihood, but I would rather not have to and would prefer to keep using the natural keys; besides, that would only reduce the likelihood, not eradicate it.


Reply to this email directly or view it on GitHub
#44.

from hugecollections-old.

andycrutchlow commented on July 19, 2024

For a test case it's really just the same code I provided before and you subsequently enhanced...

https://github.com/OpenHFT/HugeCollections/tree/master/collections/src/test/java/net/openhft/collections/dvgthreadbug

TestSharedHashMap.java

To save pasting it in again, this is what I changed:

shm = SharedHashMapBuilder.of(String.class, TestDataValue.class)
        .generatedValueType(true)
        .entrySize(1024) // Without this I get "Segment is full or has no ranges of 2 continuous free blocks"; with it (trying to pad) I get "Segment is full, no free entries found"
        .actualEntriesPerSegment(1)
        .file(new File("/run/shm/TestOpenHFT"))
        .create();

We are in danger of getting into an in-depth architectural discussion, but anyway, here goes.
I am delegating thread management/scheduling to Akka, which I trust to do a reasonable job of mapping threads to CPUs. Individual actors operate on their respective, exclusive chunk of shared memory, i.e. an entry hosted in the SharedHashMap. Each entry is a fairly complex object, pre-allocated and ready for all computation, with placeholders for all the data an actor will need during its lifetime while performing business logic. So it is all off-heap, with no ongoing object creation by the actor. In effect I am doing as per your suggested extreme solution and holding on to references hosted in the map; once steady state is reached there should be very little getting or acquiring through the map, and there will be no deleting of entries.
Other dependent actors in the JVM can read from a parent actor's entry, i.e. access it through the shared map (retaining references in the process) using appropriate read-lock barriers, triggered by actor-to-actor messages. Other external processes can similarly read from shared memory via lookup and 'harvest' the current state of an actor's data.
The problem is that, at a top level, events/messages tend to arrive in bursts clustered around similar keys. For example, in FX market data, when a SPOT rate message arrives the related outright rate messages typically arrive too, e.g. 1-week, 2-week, 1-month, etc. When this happens, each update/execution of business logic (reading/writing into its exclusive shared-memory entry) performed by an actor dedicated to a specific rate, while autonomous, can be held up/blocked just because another actor's entry happens to be in the same segment and it too recently received an event triggering its business logic and (assuming there is a spare CPU, in conjunction with Akka thread scheduling) is running concurrently.


andycrutchlow commented on July 19, 2024

To return to the actual issue as raised: setting SharedHashMapBuilder.actualEntriesPerSegment(1) results in exceptions being thrown as soon as the map is added to or accessed, e.g. when acquireUsing is called. Not sure if I made that totally clear in the first description.

Is it possible to know how likely it is that this issue will be addressed at some point in the future, or indeed whether it is seen as an issue at all? There is no rush or urgency to this; I'd just like to know sooner rather than later. If not, I'll proceed down the path of randomizing keys before supplying them to the map, so that my (domain-specific) set of keys is spread throughout the segments and I get better read/write concurrency into and out of existing map entries.

Alternatively, perhaps there is another way, through the existing parameters available in SharedHashMapBuilder, of achieving a number of segments broadly equal to the number of entries, with each segment containing few (ideally 1) entries. I have no issue with predetermining and fixing this up front, as in my case I can predetermine the valid key combinations, number of entries, etc.


peter-lawrey commented on July 19, 2024

Is it possible to know how likely it is that this issue will be addressed at some point in the future, or indeed whether it is seen as an issue at all?

I see it as an issue and it should be addressed. At a minimum it should
reject the option if it cannot be supported.

The problem with actualEntriesPerSegment(1) is that you can only have 1 entry per segment. If you attempt to add a second entry which happens to fall into the same segment, it will fail.

You are better off making the number of segments a power of two around the number of entries. This way each segment will typically have 1 entry, but should there be more, it won't fail.

SharedHashMap is designed to be a pay-for-what-you-eat data structure. It
tries to only use as much CPU cache or main memory as you need and no more.
e.g. if you create a 1 TB volume on Linux but only use 1 GB, it will use a
little over 1 GB of disk and memory.
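A minimal sketch of that suggestion, using the SharedHashMapBuilder API shown elsewhere in this thread; the RateValue interface and the file path are illustrative stand-ins, not part of the original discussion.

import net.openhft.collections.SharedHashMap;
import net.openhft.collections.SharedHashMapBuilder;

import java.io.File;
import java.io.IOException;

public class SegmentsNearEntriesSketch {

    // Stand-in for the off-heap value interface used elsewhere in this thread.
    public interface RateValue {
        void setMid(double mid);
        double getMid();
    }

    public static void main(String[] args) throws IOException {
        int expectedEntries = 2048;
        // Power of two in [expectedEntries, 2 * expectedEntries): each segment then
        // typically holds one entry, but can still accept more without failing.
        int segments = Integer.highestOneBit(expectedEntries * 2 - 1);

        SharedHashMap<String, RateValue> map =
                SharedHashMapBuilder.of(String.class, RateValue.class)
                        .generatedValueType(true)
                        .entries(expectedEntries)
                        .actualSegments(segments)
                        .file(new File("/run/shm/SegmentsNearEntries"))
                        .create();
        map.close();
    }
}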


leventov commented on July 19, 2024

@andycrutchlow Some work related to this has now been done, and any number for either actualSegments or actualEntriesPerSegment, even a non-power-of-2, is expected to be used precisely. Check out the next Chronicle Map alpha release (or wait for the 2.0 final release).


andycrutchlow commented on July 19, 2024

Just got round to trying this.
I have the latest Java-Lang and HugeCollections.

With the following setup:

SharedHashMap<String, PricingStackInterface> shm =
        SharedHashMapBuilder.of(String.class, PricingStackInterface.class)
                .generatedValueType(true)
                .entries(2048)
                .entrySize(13020)
                .actualEntriesPerSegment(1) // HERE
                //.actualSegments(2048)
                .file(new File(sharedMemoryPath))
                .create();

First time I add an entry I get the exception below.

Caused by: java.lang.IllegalArgumentException: Segment is full or has no ranges of 2 continuous free blocks
    at net.openhft.collections.AbstractVanillaSharedHashMap$Segment.alloc(VanillaSharedHashMap.java:971)
    at net.openhft.collections.AbstractVanillaSharedHashMap$Segment.putEntry(VanillaSharedHashMap.java:928)
    at net.openhft.collections.AbstractVanillaSharedHashMap$Segment.acquire(VanillaSharedHashMap.java:798)
    at net.openhft.collections.AbstractVanillaSharedHashMap.lookupUsing(VanillaSharedHashMap.java:421)
    at net.openhft.collections.AbstractVanillaSharedHashMap.acquireUsing(VanillaSharedHashMap.java:410)

Ideally I want to configure it so I have 1 entry per segment. My entry size is 13020 bytes and I want 2048 entries.

Am I using incorrect settings to achieve this?

Should I be using Chronicle Map? I seem to recall that I'd have to migrate to Java 1.8, and ideally I want to stay on 1.7 for now.


leventov commented on July 19, 2024

Yes, you should use ChronicleMap, HugeCollections are not updated anymore.

No, ChronicleMap doesn't require Java 8, it works with Java 7.

1 entry per segment is pointless. We cannot force entries to be held in different segments; they will cluster randomly, with 2-4 entries in some segments while others stay empty. Any entries-per-segment config below 50-100 doesn't really work, I think. Why do you think you need 1 entry per segment?

Also, I wouldn't be so confident about the entry size; if both key and value sizes are constant, use constantKeySizeBySample() and constantValueSizeBySample().
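A minimal sketch of the sample-based sizing Roman mentions, assuming the Chronicle Map 2.x builder and that every key and value really does serialize to the same size; the key/value strings and the createPersistedTo call are illustrative assumptions.

import net.openhft.chronicle.map.ChronicleMap;
import net.openhft.chronicle.map.ChronicleMapBuilder;

import java.io.File;
import java.io.IOException;

public class ConstantSizeSketch {
    public static void main(String[] args) throws IOException {
        // The builder measures these samples once and assumes every key and
        // every value serializes to exactly that many bytes.
        ChronicleMap<String, String> rates =
                ChronicleMapBuilder.of(String.class, String.class)
                        .constantKeySizeBySample("EURUSD_1W")
                        .constantValueSizeBySample("1.234567")
                        .entries(2048)
                        .createPersistedTo(new File("/run/shm/ConstantSizeSketch"));

        rates.put("EURUSD_2W", "1.234789");
        System.out.println(rates.get("EURUSD_2W"));
        rates.close();
    }
}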


RobAustin commented on July 19, 2024

I suggest you use the last released version, as we are in the middle of refactoring our API at the moment and the snapshot is likely to be in flux over the next few days. The documentation of the last released version is here:

https://github.com/OpenHFT/Chronicle-Map/blob/chronicle-map-2.0.5a/README.md
https://github.com/OpenHFT/Chronicle-Map/tree/chronicle-map-2.0.5a


peter-lawrey commented on July 19, 2024

At the risk of repeating what has been said already...

Segment is full or has no ranges of 2 continuous free blocks

The problem is that while you have specified an entrySize(13020), you are writing an entry which is a bit bigger than this. This is why it is reporting that it needs "2 continuous free blocks". As you are trying to keep the number of entries per segment to a minimum, the segment fills up fast.

I seem to recall that i'd have to migrate
to java 1.8 and ideally i want to stay 1.7 for now.

We support Java 7 best. Java 8 will still work but we don't optimise for
lambdas or streams yet.

I suggest trying Chronicle Map 2.0.5a. The latest version of Chronicle Map supports concurrent readers, meaning there is less advantage in having more segments.
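A minimal sketch combining those two points (a padded entry size and Chronicle Map 2.0.5a), assuming the 2.x builder accepts a data-value interface directly, as its README examples suggest; the PricingStack interface and the 16 KiB figure are illustrative guesses, not calculated requirements.

import net.openhft.chronicle.map.ChronicleMap;
import net.openhft.chronicle.map.ChronicleMapBuilder;

import java.io.File;
import java.io.IOException;

public class PaddedEntrySketch {

    // Stand-in for PricingStackInterface; Chronicle Map stores interface values
    // off heap as generated flyweights.
    public interface PricingStack {
        void setMid(double mid);
        double getMid();
    }

    public static void main(String[] args) throws IOException {
        ChronicleMap<String, PricingStack> map =
                ChronicleMapBuilder.of(String.class, PricingStack.class)
                        .entries(2048)
                        // The whole entry (key + metadata + value) must fit in one block,
                        // so leave headroom over the ~13020-byte value. 16 KiB is a guess,
                        // not a calculated minimum.
                        .entrySize(16 * 1024)
                        .createPersistedTo(new File("/run/shm/PaddedEntrySketch"));
        map.close();
    }
}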


andycrutchlow commented on July 19, 2024

The reason I want 1 entry per segment is that read/write concurrency is restricted to per-segment, I think. By virtue of the natural keys I'm using, there seems to be a high likelihood that entries I wish to read/write at the same time occupy the same segment, so throughput is not good.

andycrutchlow commented on July 19, 2024

So Peter, are you saying that the error is actually that the segment is not big enough to hold even 1 entry at the size I have specified, so if I pad it a bit more it will work?

leventov commented on July 19, 2024

There is no good way to implement what you are asking for with HugeCollections, really. You can specify entriesPerSegment(10) and accept a 10x memory overhead, but you still won't be sure that, through bad luck in the hashing, you won't get 11 entries clustered into a single segment, causing an error. Also, as Peter noted, your actual entry size is not 13020.

ChronicleMap now has read-write locks, which can be very beneficial if you have highly concurrent read access. Also remember that your number of segments typically shouldn't be larger than double the number of concurrent accessor threads. So to justify about 1000 segments you should have about 500 concurrent accessors. Do you really have that many?
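As a rough illustration of that rule of thumb (an assumption about how one might apply it, not an API requirement): size segments from the number of threads that actually hit the map at once, not from the number of entries.

public class SegmentsFromAccessorsSketch {
    public static void main(String[] args) {
        // e.g. one busy accessor per core; in an actor system this is the number of
        // actors that can touch the map simultaneously, not the total number of actors.
        int concurrentAccessors = Runtime.getRuntime().availableProcessors();

        // Rule of thumb from this thread: segments should not exceed about
        // 2x the concurrent accessors; round down to a power of two.
        int segments = Integer.highestOneBit(concurrentAccessors * 2);

        System.out.println("Pass " + segments + " to actualSegments(...)");
    }
}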

