
Comments (5)

peter-lawrey avatar peter-lawrey commented on July 19, 2024

The problem you have is that you are always removing entries from the early
segments. This means the later segments are full while the early segments
are relatively empty. Do this long enough and you keep only the keys which
happen to fall into the last segments.

You would get a similar problem with ConcurrentHashMap, though it would not
be as obvious.

You can remove this issue by having only one segment, so there is no
partitioning, but then you lose any concurrency.
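The skew described above can be demonstrated with a simplified model (this is my sketch, not the actual SharedHashMap code): keys hash to a fixed segment, but the keySet() iterator walks segments in order, so removing "the next key" always drains the earliest non-empty segment while later segments only ever grow.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified model of a segmented map (NOT the real SharedHashMap internals):
// inserts are spread uniformly over segments by hash, but removals follow
// iterator order, i.e. always hit the earliest non-empty segment.
public class SegmentSkewDemo {

    /** Returns the total map size at the moment some segment overflows, or -1. */
    static long runUntilSegmentFull(int segments, int perSegment, int maxKeys) {
        @SuppressWarnings("unchecked")
        Deque<Integer>[] seg = new Deque[segments];
        for (int i = 0; i < segments; i++) seg[i] = new ArrayDeque<>();

        long size = 0;
        long removeThreshold = (long) segments * perSegment * 3 / 4; // as in the repro
        for (int key = 0; key < maxKeys; key++) {
            if (size > removeThreshold) {
                // mimic iterator-order removal: take from the first non-empty segment
                for (Deque<Integer> s : seg) {
                    if (!s.isEmpty()) { s.pollFirst(); size--; break; }
                }
            }
            int target = Math.floorMod(Integer.hashCode(key), segments);
            if (seg[target].size() == perSegment) return size; // segment "is full"
            seg[target].addLast(key);
            size++;
        }
        return -1;
    }

    public static void main(String[] args) {
        long capacity = 8 * 16;
        long sizeWhenFull = runUntilSegmentFull(8, 16, 10_000);
        System.out.println("a segment overflowed at total size " + sizeWhenFull
                + " of nominal capacity " + capacity);
    }
}
```

With 8 segments of 16 slots each, a late segment overflows while the map holds only about 75% of its nominal capacity, matching the exception reported below.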
On 12/06/2014 7:43 PM, "mspychalla" [email protected] wrote:

The following code reproduces an issue where the map reports being full
despite still having 25% of its capacity available.

import java.io.File;
import java.io.IOException;
import java.util.Iterator;

import net.openhft.collections.SharedHashMap;
import net.openhft.collections.SharedHashMapBuilder;

public class SharedHashMapBug {

    public static void main(String[] args) {
        SharedHashMap<String, byte[]> sharedHashMap = null;
        Iterator<String> it = null;

        int numEntries = 2048;
        int entrySize = 32768;
        int removeThreshold = (int) (numEntries * 0.75);

        SharedHashMapBuilder builder = new SharedHashMapBuilder();
        builder.entries(numEntries);
        builder.entrySize(entrySize);
        String shmPath = System.getProperty("java.io.tmpdir")
                + System.getProperty("file.separator") + "SharedHashMap";

        File file = new File(shmPath);
        file.delete();

        try {
            sharedHashMap = builder.create(new File(shmPath), String.class, byte[].class);
        } catch (IOException e) {
            e.printStackTrace();
        }

        if (sharedHashMap != null) {
            byte[] value = new byte[30000];

            int index = 0;
            while (index < (numEntries * 1024)) {
                String key = String.valueOf(index);

                if (sharedHashMap.longSize() > removeThreshold) {
                    if ((it == null) || (!it.hasNext())) {
                        it = sharedHashMap.keySet().iterator();
                    }

                    if (it.hasNext()) {
                        String removalKey = it.next();
                        byte[] removalValue = sharedHashMap.remove(removalKey);

                        if (removalValue == null) {
                            System.out.println("no entry to remove for " + removalKey);
                        }
                    }
                }

                //System.out.println("put(" + key + ", " + value + " );");
                sharedHashMap.put(key, value);

                ++index;
            }
        }
    }
}

Here is the output when run:

Exception in thread "main" java.lang.IllegalStateException: VanillaShortShortMultiMap is full
    at net.openhft.collections.VanillaShortShortMultiMap.nextPos(VanillaShortShortMultiMap.java:199)
    at net.openhft.collections.AbstractVanillaSharedHashMap$Segment.put(VanillaSharedHashMap.java:823)
    at net.openhft.collections.AbstractVanillaSharedHashMap.put0(VanillaSharedHashMap.java:348)
    at net.openhft.collections.AbstractVanillaSharedHashMap.put(VanillaSharedHashMap.java:330)
    at SharedHashMapBug.main(SharedHashMapBug.java:62)

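One way to avoid draining only the early segments (a suggestion, not something from the library or this thread) is to remove keys in insertion order rather than keySet() iteration order, so removals follow the same hash distribution as the inserts. The sketch below demonstrates the pattern against a plain HashMap stand-in; with SharedHashMap the remove() call would be the same.

```java
import java.util.HashMap;
import java.util.Map;

// FIFO eviction sketch: once past the threshold, remove the oldest surviving
// key instead of whatever the iterator yields next. Because the evicted keys
// were generated the same way as the inserted keys, removals spread across
// segments exactly like the inserts do.
public class FifoRemovalSketch {

    /** Inserts keys 0..total-1, evicting the oldest key once past the threshold. */
    static int run(int total, int removeThreshold) {
        Map<String, byte[]> map = new HashMap<>();
        byte[] value = new byte[16];
        for (int index = 0; index < total; index++) {
            if (map.size() > removeThreshold) {
                // the oldest key still present; same distribution as the inserts
                map.remove(String.valueOf(index - removeThreshold - 1));
            }
            map.put(String.valueOf(index), value);
        }
        return map.size();
    }

    public static void main(String[] args) {
        // 1536 is 75% of 2048, as in the repro above
        System.out.println("final size: " + run(100_000, 1536));
    }
}
```

The map size stays bounded just above the threshold for the whole run, with no dependence on iterator order.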

from hugecollections-old.

leventov avatar leventov commented on July 19, 2024

I would add that you can make SHM have only one segment with this call: builder.actualSegments(1);. I tried running this test with one segment, but it took a really long time (I didn't wait for it to finish).


RuedigerMoeller avatar RuedigerMoeller commented on July 19, 2024

The problem with one segment is that iteration performance degrades to the point of being unusable, as iteration involves locking the full segment and capturing all segment entries in an ArrayDeque.
I ran into the error message purely by adding entries.


peter-lawrey avatar peter-lawrey commented on July 19, 2024

Perhaps you could describe what you are trying to do, as you shouldn't be
iterating over large collections. No matter what you do, this isn't going to
be fast.


RuedigerMoeller avatar RuedigerMoeller commented on July 19, 2024

I am evaluating the library for use in a distributed real-time data grid. Where data is not indexed, I need a "full table scan" initially when a new subscriber comes in (afterwards it keeps up to date by listening to change broadcasts).
With sharding to 11 nodes we can currently iterate/test 75 million records per second. We can't index everything, so we fall back to a full scan for non-critical cases.

I now get better performance using many segments. If one follows the advice above (a single segment), iteration performance becomes unusable for large tables, as creating the values() set takes several seconds when the map has 5 million entries. The configuration of SharedHashMap, in my opinion, needs more documentation. As the underlying algorithms are not easy to read from the code, it is not clear what the implications are of increasing or decreasing the number of segments, or of changing the segment size and entry size.

I initially ran into the exception above; later on I got "VanillaMultiMap is full". Now I am running with:

        map = new SharedHashMapBuilder()
                .entrySize(128)
                .minSegments(10000)
                .actualEntriesPerSegment(10 * 1000)
                .create(new File(finam), String.class, byte[].class);

which lets me go up to 5 million records with reasonable iteration performance (it could be better, but first I want to get things running).

You probably lose many potential users due to the lack of documentation; in my experience, most people give up pretty quickly when they run into exceptions with simple initial test cases (even though it's just a matter of misconfiguration).
-ruediger
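A back-of-envelope check of the configuration above may explain why it works (this is my reading of the builder parameters; the exact semantics may differ between versions of the library):

```java
// Nominal capacity implied by the configuration quoted above:
// minSegments(10000) x actualEntriesPerSegment(10*1000).
public class CapacityCheck {
    public static void main(String[] args) {
        long segments = 10_000;      // minSegments(10000)
        long perSegment = 10_000;    // actualEntriesPerSegment(10*1000)
        long capacity = segments * perSegment;
        long stored = 5_000_000;     // "up to 5 million records"
        System.out.println("nominal capacity: " + capacity);
        System.out.println("average segment fill: "
                + (100.0 * stored / capacity) + "%");
    }
}
```

At 5 million records the segments average only about 5% full (roughly 500 of 10,000 slots each), so the headroom against per-segment skew is enormous, which would explain why the "is full" exceptions disappear with this configuration.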

