Chronicle Map not thread-safe in certain circumstances about hugecollections-old HOT 14 CLOSED

peter-lawrey commented on July 19, 2024

Chronicle Map not thread-safe in certain circumstances

from hugecollections-old.

Comments (14)

peter-lawrey commented on July 19, 2024

Once you have a reference to an off heap entry it works much the same as if
you have an on heap object. The entry may have come from a thread safe
collection but that doesn't make the entry also inherently thread safe.
To make your code thread safe you need to add entry/record level locking.
See IntValue or JavaBeanInterface for examples. One you have added a lock
field, you need to busyLock () before accessing it and finally unlock () it.
On 11/09/2014 8:04 PM, "andycrutchlow" [email protected] wrote:

When concurrent threads reading/writing to DIFFERENT map entries through
standard DataValueClass generated getter/setters (marshalling to
corresponding UNSAFE methods) and if these map entries phyically reside in
the SAME SEGMENT then there is possibility of corruption of data.
Problem seems since version 3.0.2 collections i.e. I cannot reliably
reproduce on that version..but I can pretty consistantly reproduce on
latest 3.2.1 collections and 6.4.6 lang on two completely different
platforms i.e. ubuntu 12.04 AMD Athlon(tm) 64 X2 Dual Core Processor 3600+
× 2 and on MacOSX 10.7.5 2.2GHz Core i7 using the same test program below :
NOTE : ignore the junit test result .. its the logging that indicates the
actual problem.

package com.nimrod.test;

import static org.junit.Assert.*;

import java.io.File;
import java.io.IOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;

import net.openhft.collections.SharedHashMap;
import net.openhft.collections.SharedHashMapBuilder;
import net.openhft.lang.model.DataValueClasses;

import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class TestSharedHashMap {
final static Logger logger =
LoggerFactory.getLogger(TestSharedHashMap.class);
SharedHashMap shm;
TestDataValue data1 =
DataValueClasses.newDirectReference(TestDataValue.class);
TestDataValue data2 =
DataValueClasses.newDirectReference(TestDataValue.class);
TestDataValue data3 =
DataValueClasses.newDirectReference(TestDataValue.class);
TestDataValue data4 =
DataValueClasses.newDirectReference(TestDataValue.class);
TestDataValue data5 =
DataValueClasses.newDirectReference(TestDataValue.class);
TestDataValue data6 =
DataValueClasses.newDirectReference(TestDataValue.class);
TestDataValue data7 =
DataValueClasses.newDirectReference(TestDataValue.class);
TestDataValue data8 =
DataValueClasses.newDirectReference(TestDataValue.class);

ThreadPoolExecutor threads = (ThreadPoolExecutor)Executors.newFixedThreadPool(2);

final CountDownLatch endGate = new CountDownLatch(2);

@before
public void setUp() throws Exception {
try {
shm = new SharedHashMapBuilder()
.generatedValueType(true)
.entries(2048)
.entrySize(256)
.file(new File("/run/shm/TestOpenHFT")).kClass(String.class).vClass(TestDataValue.class).create();

//For older versions
// shm = new SharedHashMapBuilder()
// .generatedValueType(true)
// //.entries(2048)
// .entrySize(1024)
// .create(new
File("/run/shm/TestOpenHFT"),String.class,TestDataValue.class);
    //Map entries are loaded into various segments in mappeddatastore
    shm.acquireUsing("1111.2222.A0", data1);

    shm.acquireUsing("1111.2222.B0", data2);

    shm.acquireUsing("1111.2222.A1", data3);

    shm.acquireUsing("1111.2222.B1", data4);

    shm.acquireUsing("1111.2222.A2", data5);

    shm.acquireUsing("1111.2222.B2", data6);

    //data7 typically points to data in same segment as data1 points to i.e. segmentnum=0
    //To see segments i added line
    //LOG.info("key="+key+" segmentnum="+segmentNum);
    //in method : V lookupUsing(K key, V value, boolean create) in class VanillaSharedHashMap

    shm.acquireUsing("1111.2222.A3", data7);

    shm.acquireUsing("1111.2222.B3", data8);
} catch(Exception e) {
    endGate.countDown();
    endGate.countDown();
    e.printStackTrace();
}
}

@after
public void tearDown() {
try {
endGate.await();
shm.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}

@test
public void test() {
//Simple case just to prove all is OK
data1.setLongString("AAAAAAAAAAAAAAAAAAAA");
assertEquals("AAAAAAAAAAAAAAAAAAAA", data1.getLongString());
data7.setLongString("BBBBBBBBBBBBBBBBBBBB");
assertEquals("BBBBBBBBBBBBBBBBBBBB", data7.getLongString());
logger.info("Start threads to test concurrent read/write in same segment .. this should usually fail");
//..change data7 to another e.g. data2 and should work
threads.execute(new TestTask(data1,"AAAAAAAAAAAAAAAAAAAA"));
threads.execute(new TestTask(data7,"BBBBBBBBBBBBBBBBBBBB"));
}

class TestTask implements Runnable {
TestDataValue data;
String msg;
TestTask(TestDataValue data, String msg) {
this.data = data;
this.msg = msg;
}
public void run() {
for(int i=0;i<100;i++) {
data.setLongString(msg);
logger.info("test "+i+" "+data.getLongString());
try {
assertEquals(msg, data.getLongString());
} catch (AssertionError e) {
logger.error("AssertionError",e);
break;
}
}
endGate.countDown();
}

}

}

—
Reply to this email directly or view it on GitHub
#43.

from hugecollections-old.

andycrutchlow commented on July 19, 2024

If the two threads where operating on the same off heap entry potentially concurrently then absolutely locking would be needed. But the case i describe above has two threads accessing/mutating two completely different off heap entries with two different physical memory locations. thread 1 has aquired entry referenced by key "1111.2222.A0" and is using datavalue class data1 which is pointing to it. Meanwhile thread 2 has aquired entry referenced by key "1111.2222.A3" and is using datavalue class data7 which is pointing to it. If i were to use buyLock on either/both i would be locking 2 completely different entries.
The issue seems to be around segments. By virtue of key hashing data1 and data7 are pointing to sharedmemory locations in the same segment...all the other keys in the above test program result in entries in different segments and if the above test is run with any other than data7 it works as expected. To demonstrate change line :
threads.execute(new TestTask(data7,"BBBBBBBBBBBBBBBBBBBB"));
to be for data2 or data3 etc ... but NOT data1 as that would require locks.

from hugecollections-old.

peter-lawrey commented on July 19, 2024

In that case, it sounds like a bug or at least a usability issue. Entries
in the same segment are consecutive in memory and it is possible to one to
overwrite another if you write more data than you should. The protection
around this is not ideal.
I will investigate this today.
On 12/09/2014 7:34 AM, "andycrutchlow" [email protected] wrote:

If the two threads where operating on the same off heap entry potentially
concurrently then absolutely locking would be needed. But the case i
describe above has two threads accessing/mutating two completely different
off heap entries with two different physical memory locations. thread 1 has
aquired entry referenced by key "1111.2222.A0" and is using datavalue class
data1 which is pointing to it. Meanwhile thread 2 has aquired entry
referenced by key "1111.2222.A3" and is using datavalue class data7 which
is pointing to it. If i were to use buyLock on either/both i would be
locking 2 completely different entries.
The issue seems to be around segments. By virtue of key hashing data1 and
data7 are pointing to sharedmemory locations in the same segment...all the
other keys in the above test program result in entries in different
segments and if the above test is run with any other than data7 it works as
expected. To demonstrate change line :
threads.execute(new TestTask(data7,"BBBBBBBBBBBBBBBBBBBB"));
to be for data2 or data3 etc ... but NOT data1 as that would require locks.

—
Reply to this email directly or view it on GitHub
#43 (comment)
.

from hugecollections-old.

andycrutchlow commented on July 19, 2024

I pretty sure i am not writing more data than i should ... the string field defined as max 256 ... i should have included the interface :
package com.nimrod.test;
import net.openhft.lang.model.constraints.MaxSize;
public interface TestDataValue {
void setLongString(@maxsize(256) String longString);
String getLongString();
}
Also...as i mentioned, if I use an earlier version of collections and java-lang e.g. 3.0d and 6.3.3 respectively, its works as expected...so that would suggest something has changed.

from hugecollections-old.

RobAustin commented on July 19, 2024

We have raised the coresponsing JIRA - https://higherfrequencytrading.atlassian.net/browse/HCOLL-131

from hugecollections-old.

danielshaya commented on July 19, 2024

Hi Andy

Thanks for your feedback, we are very pleased you are using SharedHashMap
and are grateful that you are contributing to our efforts to improve the
quality of OpenHFT software.

It would enormously helpful to us if you could spend a few minutes
answering the following questions:

We understand that people might have privacy concerns and that you might
not want to answer all the questions. Please don't worry if that's the
case. Also the more information we can gather the better but feel free to
write one liners if you want!

What's the name of your company?
How many employees are there at the company?
Please describe the project in which OpenHFT software is being used
and how it has helped.
Is it in evaluation or production?
Would you consider paying for a support model
http://openhft.net/support/ which could include priority support,
training, bespoke software changes.
How can we improve your experience with OpenHFT software?

Thank you

Daniel

On Fri, Sep 12, 2014 at 1:34 PM, andycrutchlow [email protected]
wrote:

If the two threads where operating on the same off heap entry potentially
concurrently then absolutely locking would be needed. But the case i
describe above has two threads accessing/mutating two completely different
off heap entries with two different physical memory locations. thread 1 has
aquired entry referenced by key "1111.2222.A0" and is using datavalue class
data1 which is pointing to it. Meanwhile thread 2 has aquired entry
referenced by key "1111.2222.A3" and is using datavalue class data7 which
is pointing to it. If i were to use buyLock on either/both i would be
locking 2 completely different entries.
The issue seems to be around segments. By virtue of key hashing data1 and
data7 are pointing to sharedmemory locations in the same segment...all the
other keys in the above test program result in entries in different
segments and if the above test is run with any other than data7 it works as
expected. To demonstrate change line :
threads.execute(new TestTask(data7,"BBBBBBBBBBBBBBBBBBBB"));
to be for data2 or data3 etc ... but NOT data1 as that would require locks.

—
Reply to this email directly or view it on GitHub
#43 (comment)
.

from hugecollections-old.

andycrutchlow commented on July 19, 2024

happy to provide more info...but not on an issue thread. contact me directly through my email instead.

from hugecollections-old.

danielshaya commented on July 19, 2024

Of course - please send your preferred contact address to
[email protected].

Regards

On Fri, Sep 12, 2014 at 4:05 PM, andycrutchlow [email protected]
wrote:

happy to provide more info...but not on an issue thread. contact me
directly through my email instead.

—
Reply to this email directly or view it on GitHub
#43 (comment)
.

from hugecollections-old.

peter-lawrey commented on July 19, 2024

There was a thread safety issue around both readUTF and writeUTF as you suspected. I have added a test here

https://github.com/OpenHFT/HugeCollections/tree/master/collections/src/test/java/net/openhft/collections/dvgthreadbug

You can see I have included the generated code for the interface. You can dump this generated code with

-Ddvg.dumpcode=true

Can you confirm that the latest snap shot fixes this issue?

from hugecollections-old.

andycrutchlow commented on July 19, 2024

got latest snapshot and yes I can confirm the fix has addressed the issue. thanks for fast turnaround.

from hugecollections-old.

peter-lawrey commented on July 19, 2024

Having a good unit test really helps.

On 12 September 2014 22:29, andycrutchlow [email protected] wrote:

got latest snapshot and yes I can confirm the fix has addressed the issue.
thanks for fast turnaround.

—
Reply to this email directly or view it on GitHub
#43 (comment)
.

from hugecollections-old.

peter-lawrey commented on July 19, 2024

Note: I added to the test a check that one million get/sets for each thread
produced less then 2 MB of garbage.

On 12 September 2014 22:32, Peter Lawrey [email protected] wrote:

Having a good unit test really helps.

On 12 September 2014 22:29, andycrutchlow [email protected]
wrote:

got latest snapshot and yes I can confirm the fix has addressed the
issue. thanks for fast turnaround.

—
Reply to this email directly or view it on GitHub
#43 (comment)
.

from hugecollections-old.

andycrutchlow commented on July 19, 2024

nice..and improved a few other things in the test code also i see.

On Fri, Sep 12, 2014 at 5:33 PM, Peter Lawrey [email protected]
wrote:

Note: I added to the test a check that one million get/sets for each
thread
produced less then 2 MB of garbage.

On 12 September 2014 22:32, Peter Lawrey [email protected] wrote:

Having a good unit test really helps.

On 12 September 2014 22:29, andycrutchlow [email protected]
wrote:

got latest snapshot and yes I can confirm the fix has addressed the
issue. thanks for fast turnaround.

—
Reply to this email directly or view it on GitHub
<
https://github.com/OpenHFT/HugeCollections/issues/43#issuecomment-55463241>

.

—
Reply to this email directly or view it on GitHub
#43 (comment)
.

from hugecollections-old.

peter-lawrey commented on July 19, 2024

Note, if you dump the code and include it in your source, it is easier to debug.

from hugecollections-old.

Chronicle Map not thread-safe in certain circumstances about hugecollections-old HOT 14 CLOSED

Comments (14)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent