jankotek / jdbm3
Embedded Key Value Java Database
All methods in this class should be inlined. We should document it and add assertions around those methods.
Is there a proper method of checking whether a named collection already exists in a database? I tried calling DB.getHashMap("name_that_doesnt_exist") thinking it might return null if I haven't previously called DB.createHashMap("name_that_doesnt_exist"). Instead, the database throws an IllegalArgumentException. Is/Will there be a way to check this other than catching the exception?
Sorry if this isn't really an "issue", I couldn't find a mailing list so I posted here.
JDBM has many options which are hard-coded in source code. This includes: page size, btree size, debug, cache sizes...
We should put all those options into a single file called 'CompileConfig.java'. This would be a central place where advanced users could modify JDBM for their needs.
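A sketch of what such a file could look like (the constant names and default values here are illustrative, except the 4KB page size mentioned elsewhere in these issues):

```java
// Hypothetical sketch of CompileConfig.java: all hard-coded tuning
// constants gathered in one place. Values are illustrative defaults.
final class CompileConfig {

    private CompileConfig() {} // constants only, never instantiated

    /** Page size in bytes; both logical and physical pages currently use 4KB. */
    static final int PAGE_SIZE = 4096;

    /** Maximum number of keys per BTree node (illustrative). */
    static final int BTREE_NODE_SIZE = 32;

    /** Default MRU cache size (illustrative). */
    static final int DEFAULT_CACHE_SIZE = 2048;

    /** Compile-time switch for debug/trace statements. */
    static final boolean DEBUG = false;
}
```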
@see http://pastebin.com/RuXFsxD1 (Base class TestBean2.java)
@see http://pastebin.com/Yb2Hd1Zu (Derived class DerivedBean.java)
@see http://pastebin.com/EKuHNiTi (Unit test)
The unit test doesn't pass, because fields from the TestBean2 base class come back null. The JDBM file doesn't contain inherited field definitions.
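The direction of a fix can be sketched without JDBM itself: a serializer has to walk the whole class hierarchy, since Class.getDeclaredFields() returns only the fields declared directly on the class and skips superclasses. The helper and bean classes below are hypothetical stand-ins, not the pastebin code:

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: collect fields including inherited ones, which is
// what the serializer needs to do to round-trip a derived bean.
class FieldWalker {
    static List<Field> allFields(Class<?> clazz) {
        List<Field> out = new ArrayList<>();
        // walk up the hierarchy, stopping before Object
        for (Class<?> c = clazz; c != null && c != Object.class; c = c.getSuperclass())
            out.addAll(Arrays.asList(c.getDeclaredFields()));
        return out;
    }
}

class BaseBean { String name; }                    // stand-in for TestBean2
class DerivedBean extends BaseBean { int extra; }  // stand-in for DerivedBean
```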
DynamoDB is a simple key-value store. How much work would it be for me to modify JDBM3 to use DynamoDB with key-values of <block-id, block-contents> in place of filesystem calls, so I could use JDBM3 in a multi-user environment?
DynamoDB doesn't support transactions, but it does support conditional updates. I'm OK with disabling transaction support and caching. But I'm wondering what would happen during a B-tree leaf node split that required the parent to be split as well, if the server crashed half-way through the split. Would it leave the B-tree in an unusable state, or would whatever was just added simply be lost?
I'm sure glad to see that you're doing this. I want to be able to store sorted collections in DynamoDB, and I was afraid that I was going to have to implement the b-tree logic myself. I know that DynamoDB supports sorted collections using two-part keys, but there are performance implications, so I'd rather not use them if I can avoid it.
Sorry for posting this as an issue, but I haven't found a forum for jdbm 3...
I'd like to open a map as soon as my program starts, by calling a single method that will open that map if it exists already in the given db file, or will create the map if it does not exist. However, if I try to do this using getTreeMap (for example), the initial call fails since the map doesn't exist; and if I try to do it using createTreeMap, calls after the first fail, as the map already exists.
So far, the only way I can see to do this is to use a try statement, which seems extremely ugly to me.
Am I missing something obvious in the API?
Thanks,
Ken (email: ykkenmcd at gmail.com)
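Until the API grows an existence check, the try/catch can at least be hidden in a small helper. getOrCreate() below is a hypothetical utility, not part of JDBM; it assumes the behaviour described above, where the getter throws IllegalArgumentException for an unknown collection name:

```java
import java.util.function.Supplier;

// Hypothetical helper: try the "get" supplier first, fall back to the
// "create" supplier when the collection does not exist yet.
class Collections2 {
    static <T> T getOrCreate(Supplier<T> get, Supplier<T> create) {
        try {
            return get.get();
        } catch (IllegalArgumentException mapDoesNotExist) {
            return create.get();
        }
    }
}

// usage sketch against a JDBM DB instance:
// SortedMap<Long, String> m = Collections2.getOrCreate(
//         () -> db.getTreeMap("data"), () -> db.createTreeMap("data"));
```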
Iterations=0 Objects=0, elapsed=1ms
Iterations=5000 Objects=4666, elapsed=3049ms
java.io.IOError: java.io.IOException: Cannot allocate memory
at net.kotek.jdbm.DBStore.commit(DBStore.java:573)
at net.kotek.jdbm.DBCache.commit(DBCache.java:343)
at net.kotek.jdbm.BTreeBench.doTest(BTreeBench.java:59)
at net.kotek.jdbm.BTreeBench.test_001(BTreeBench.java:42)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at com.intellij.junit3.JUnit3IdeaTestRunner.doRun(JUnit3IdeaTestRunner.java:139)
at com.intellij.junit3.JUnit3IdeaTestRunner.startRunnerWithArgs(JUnit3IdeaTestRunner.java:52)
at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:202)
at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:63)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)
Caused by: java.io.IOException: Cannot allocate memory
at java.nio.MappedByteBuffer.force0(Native Method)
at java.nio.MappedByteBuffer.force(MappedByteBuffer.java:154)
at net.kotek.jdbm.StorageDiskMapped.sync(StorageDiskMapped.java:176)
at net.kotek.jdbm.RecordFile.sync(RecordFile.java:376)
at net.kotek.jdbm.TransactionManager.synchronizeLogFromMemory(TransactionManager.java:123)
at net.kotek.jdbm.TransactionManager.start(TransactionManager.java:221)
at net.kotek.jdbm.RecordFile.commit(RecordFile.java:237)
at net.kotek.jdbm.PageManager.commit(PageManager.java:203)
at net.kotek.jdbm.DBStore.commit(DBStore.java:564)
... 24 more
If I reduce the test to doTest(db, tree, 601); it passes but doTest(db, tree, 611); does not. Do you have any suggestions?
The machine has 16 GB of memory, 14 GB of swap and 180 GB of free disk. Running Centos 6.2.
This class is bloated and most methods should be pushed down to managers. We should document formats and put assertions around those methods.
JDBM should have a website with documentation and samples. We should use Maven2 to generate it. The base address should be http://jdbm.kotek.net, and there should be a link from the GitHub page.
This looks like a promising library. Could you add a mailing group for this tool? Personally, I would rather ask questions about the library than dig into the source code each time to find things out.
A suggestion: when transactions are disabled and StorageDiskMapped is in use, please expose the sync() method so that users can call it to prevent data loss.
In Alpha2 using this code:
import java.util.Map;
import net.kotek.jdbm.DB;
import net.kotek.jdbm.DBMaker;

public class DbOp implements Runnable
{
    private static DB db;
    private static String mapName = "map";

    public static void main(String[] args) throws InterruptedException
    {
        db = new DBMaker("db").disableTransactions().disableCache().build();
        if( db.getHashMap(mapName) == null ) db.createHashMap(mapName);
        Thread thread = new Thread(new DbOp());
        thread.start();
        new DbOp().run();
        thread.join();
        db.close();
    }

    public void run()
    {
        Map<Integer, String> map = db.getHashMap(mapName);
        for(int i = 1; i < 10000; i++)
            map.put(i, map.get(i-1) + i );
    }
}
This produces an error: double get for block 0
We should benchmark JDBM against LevelDB. If there is a performance gap in some areas, we should fix it or document it. A performance problem may indicate a design fault or suboptimal code. We should solve this before the feature freeze.
Test will be separate project at https://github.com/jankotek/JDBM3-performance-test
JDBM2 had Secondary Collections: a view over an existing collection with alternative data. This made the API bloated and complicated.
We should introduce collection binding (as a set of static methods). It should take two existing collections, attach a listener to the first, and modify the second accordingly. This should be significantly simpler than Secondary Collections while keeping 80% of the functionality. Users will also have better control over their data.
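The proposed binding could be sketched like this (a hypothetical API on plain java.util maps, not JDBM code; a real version would also handle removes and updates):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BiConsumer;
import java.util.function.Function;

// A primary map that notifies listeners on put, plus a static bind() that
// keeps a secondary index (derived key -> set of primary keys) in sync.
class ListenableMap<K, V> extends HashMap<K, V> {
    private final List<BiConsumer<K, V>> putListeners = new ArrayList<>();

    void addPutListener(BiConsumer<K, V> listener) { putListeners.add(listener); }

    @Override public V put(K key, V value) {
        V old = super.put(key, value);
        for (BiConsumer<K, V> l : putListeners) l.accept(key, value);
        return old;
    }

    /** Bind a secondary index to the primary map: every put updates it. */
    static <K, V, S> void bind(ListenableMap<K, V> primary,
                               Map<S, Set<K>> secondary,
                               Function<V, S> secondaryKey) {
        primary.addPutListener((k, v) ->
                secondary.computeIfAbsent(secondaryKey.apply(v), s -> new HashSet<>()).add(k));
    }
}
```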
The zero page stores information about the store. Currently its format definition is spread across the store code. We should document it, add assertions, and bring it into a single place.
After a massive delete, a lot of free records are generated. This slows down future updates.
When a record is updated or inserted, this class makes a naive linear scan across the table of free records. This is very slow; currently it is a major bottleneck in JDBM3. There is a workaround with auto-defragmentation, but that will go away after the rewrite.
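One possible replacement for the linear scan (a sketch, not JDBM code): index free records by size in a TreeMap, so a best-fit lookup becomes an O(log n) ceiling search instead of an O(n) scan of the whole table:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical free-record index keyed by record size.
class FreeRecordIndex {
    private final TreeMap<Integer, Deque<Long>> bySize = new TreeMap<>();

    void free(long recid, int size) {
        bySize.computeIfAbsent(size, s -> new ArrayDeque<>()).push(recid);
    }

    /** Return a free record of at least `size` bytes, or -1 if none fits. */
    long allocate(int size) {
        Map.Entry<Integer, Deque<Long>> e = bySize.ceilingEntry(size);
        if (e == null) return -1;
        long recid = e.getValue().pop();
        if (e.getValue().isEmpty()) bySize.remove(e.getKey()); // drop empty bucket
        return recid;
    }
}
```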
Hi there,
Would it be possible to get a NavigableMap interface to the TreeMaps instead of the SortedMap interface?
We are writing a thesis about data modelling in Java and we are considering using JDBM3 for caching the data. The problem is that we need the added functionality of the NavigableMap interface, especially the subMap(K fromKey, boolean fromInclusive, K toKey, boolean toInclusive) method, where we can choose whether or not the intervals should be open.
Since we are using keys with no natural successor method it is not possible for us to use the subMap method in the SortedMap interface to get all the intervals we need.
How difficult would it be to provide the NavigableMap interface instead of the SortedMap interface?
Should we try to modify the code ourselves, or do you know of any issues that make this difficult, so that it would be better to look for alternatives?
Best regards
Tobias and Martin.
Final fields, variables, methods and classes may bring some performance gain. We target various platforms (Sun JRE, Android), so let's not guess and just make everything final.
Using Alpha2
DB db = new DBMaker("db").build();
db.createTreeMap("treemap");
db.getCollections();
With a mapped ByteBuffer we could avoid some byte[] copying and improve performance. We need to verify whether using ByteBuffer all the way down to DataInputOutput actually improves performance.
JDBM will probably be used in enterprise environments. We need a way to collect debug info, warn users and report errors. So we should start using java.util.logging.Logger.
There are a few traps an inexperienced user could fall into, for example data import with transactions enabled. This would generate traffic in the support group which could be avoided. So JDBM should sense such situations and log a warning: "Looks like you are doing batch-import. Try 'DB.disableTransactions()' to speed this up."
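The sensing heuristic could be as simple as counting inserts since the last commit (a sketch; the class name and threshold are illustrative, not JDBM internals):

```java
// Hypothetical heuristic: report once when many inserts happen inside a
// single transaction, which usually means a batch import with transactions on.
class BatchImportSensor {
    static final int THRESHOLD = 10_000; // illustrative
    private long insertsSinceCommit;
    private boolean warned;

    /** Returns true exactly once, when the pattern looks like a batch import. */
    boolean onInsert() {
        if (++insertsSinceCommit > THRESHOLD && !warned) {
            warned = true;
            return true; // caller logs the disableTransactions() hint
        }
        return false;
    }

    void onCommit() { insertsSinceCommit = 0; } // a commit resets the counter
}
```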
Debug and trace statements should be controlled by a static final boolean field, so JDBM can be compiled without debug code.
JDBM3 uses two different page types: logical and physical.
Logical pages store meta-info such as free record locations. They hold sets of individually updated numbers; reads/updates typically touch 4 bytes.
Physical pages store the actual data; reads/updates are typically larger.
Both page types currently share the same page size: 4KB. This is probably the best size for physical records, but it may not be best for logical pages.
We need performance tests showing how performance depends on the logical page size. If these tests show that logical and physical pages should have different sizes, we need a major redesign of the page manager.
I would like to use JDBM3 as a dependency in my Java application, but it cannot be found in any of the global Maven repositories.
JDBM currently uses some private APIs from the 'com.sun' package. This is needed for reasonable performance. We should provide an alternative to those APIs so JDBM runs on Android.
Related compiler warnings:
[WARNING] /home/jan/IdeaProjects/JDBM3/src/main/java/net/kotek/jdbm/StorageDiskMapped.java:[3,15] sun.misc.Cleaner is Sun proprietary API and may be removed in a future release
[WARNING] /home/jan/IdeaProjects/JDBM3/src/main/java/net/kotek/jdbm/SerialClassInfo.java:[342,23] sun.reflect.ReflectionFactory is Sun proprietary API and may be removed in a future release
[WARNING] /home/jan/IdeaProjects/JDBM3/src/main/java/net/kotek/jdbm/SerialClassInfo.java:[343,31] sun.reflect.ReflectionFactory is Sun proprietary API and may be removed in a future release
[WARNING] /home/jan/IdeaProjects/JDBM3/src/main/java/net/kotek/jdbm/StorageDiskMapped.java:[118,12] sun.misc.Cleaner is Sun proprietary API and may be removed in a future release
[WARNING] /home/jan/IdeaProjects/JDBM3/src/main/java/net/kotek/jdbm/StorageDiskMapped.java:[118,42] sun.nio.ch.DirectBuffer is Sun proprietary API and may be removed in a future release
The example below fails on commit (but if the commit chunk size equals the data size of 100000, it instead fails on put at i = 32597):
build = new DBMaker("test.tree.dbm").setMRUCacheSize(100).build();
Map<String, String> treeMap = build.createTreeMap("treeMap");
for (int i = 0; i < 100000; i++) {
    System.out.println(i);
    treeMap.put(i + "asdddfffffffffffffffffffdgf" + i + "sddfdfsfddddddddddddddddd" + i, "dsfgfg.dfcdfsgfgfffffffffffffffffdddddddddd");
    if (i % 10000 == 0) {
        build.commit();
    }
}
build.commit();
build.close();
java.lang.StackOverflowError
at java.lang.StringCoding$StringEncoder.encode(StringCoding.java:240)
at java.lang.StringCoding.encode(StringCoding.java:272)
at java.lang.StringCoding.encode(StringCoding.java:284)
at java.lang.String.getBytes(String.java:986)
at net.kotek.jdbm.BTreeNode.writeKeys(BTreeNode.java:1237)
at net.kotek.jdbm.BTreeNode.serialize(BTreeNode.java:983)
at net.kotek.jdbm.BTreeNode.serialize(BTreeNode.java:43)
at net.kotek.jdbm.DBStore.update2(DBStore.java:326)
at net.kotek.jdbm.DBStore.update(DBStore.java:306)
at net.kotek.jdbm.DBCache.purgeEntry(DBCache.java:478)
at net.kotek.jdbm.DBCache.cachePut(DBCache.java:417)
at net.kotek.jdbm.DBCache.insert(DBCache.java:156)
at net.kotek.jdbm.BTreeNode.writeValues(BTreeNode.java:1028)
at net.kotek.jdbm.BTreeNode.serialize(BTreeNode.java:986)
at net.kotek.jdbm.BTreeNode.serialize(BTreeNode.java:43)
at net.kotek.jdbm.DBStore.update2(DBStore.java:326)
at net.kotek.jdbm.DBStore.update(DBStore.java:306)
at net.kotek.jdbm.DBCache.purgeEntry(DBCache.java:478)
at net.kotek.jdbm.DBCache.cachePut(DBCache.java:417)
at net.kotek.jdbm.DBCache.insert(DBCache.java:156)
... (the cycle BTreeNode.writeValues -> BTreeNode.serialize -> DBStore.update2 -> DBStore.update -> DBCache.purgeEntry -> DBCache.cachePut -> DBCache.insert repeats until the stack overflows)
int size = 100000;
int commitSize = 100000;
DB build = new DBMaker("test.hash.dbm").setMRUCacheSize(100).build();
Map<String, String> hashMap = build.createHashMap("hashMap");
for (int i = 0; i < size; i++) {
    hashMap.put(i + "asdddfdgf" + i + "sddfdfsf" + i, "dsfgfg.dfcdfsgfg");
    if (i % commitSize == 0) {
        build.commit();
    }
}
build.commit();
build.calculateStatistics();
build.close();
java.lang.Error: double get for block -1
at net.kotek.jdbm.RecordFile.get(RecordFile.java:138)
at net.kotek.jdbm.PageManager.getNext(PageManager.java:163)
at net.kotek.jdbm.DBStore.calculateStatistics(DBStore.java:678)
at net.kotek.jdbm.DBCache.calculateStatistics(DBCache.java:360)
at util.JDBMTest.testH(JDBMTest.java:52)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at com.intellij.junit3.JUnit3IdeaTestRunner.doRun(JUnit3IdeaTestRunner.java:139)
at com.intellij.junit3.JUnit3IdeaTestRunner.startRunnerWithArgs(JUnit3IdeaTestRunner.java:52)
at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:202)
at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:63)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)
int size = 100000;
int commitSize = 100000;
DB build = new DBMaker("test.hash.dbm").setMRUCacheSize(100).build();
Map<String, String> hashMap = build.createHashMap("hashMap");
for (int i = 0; i < size; i++) {
    hashMap.put(i + "asdddfdgf" + i + "sddfdfsf" + i, "dsfgfg.dfcdfsgfg");
    if (i % commitSize == 0) {
        build.commit();
    }
}
build.commit();
build.close();
build.calculateStatistics();
java.lang.NullPointerException
at net.kotek.jdbm.DBCache.calculateStatistics(DBCache.java:360)
JDBM has a record size limit of 8 MB. We should provide optional compression when the serialized data is over a certain size (1 MB by default).
It should be implemented as a recursive SerializationHeader.
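Threshold-based compression can be sketched with java.util.zip (a hypothetical helper, not the planned recursive SerializationHeader; a one-byte header tags each record so reads know which path to take):

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical record compressor: records above the threshold are deflated.
class RecordCompressor {
    static final int THRESHOLD = 1 << 20;   // 1 MB, per the issue
    static final byte RAW = 0, DEFLATED = 1;

    static byte[] pack(byte[] data) {
        if (data.length < THRESHOLD) return tag(RAW, data);
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(DEFLATED);
        byte[] buf = new byte[8192];
        while (!deflater.finished()) out.write(buf, 0, deflater.deflate(buf));
        deflater.end();
        return out.toByteArray();
    }

    static byte[] unpack(byte[] record) {
        if (record[0] == RAW) return Arrays.copyOfRange(record, 1, record.length);
        try {
            Inflater inflater = new Inflater();
            inflater.setInput(record, 1, record.length - 1);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            while (!inflater.finished()) out.write(buf, 0, inflater.inflate(buf));
            inflater.end();
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new RuntimeException("corrupted compressed record", e);
        }
    }

    private static byte[] tag(byte header, byte[] data) {
        byte[] out = new byte[data.length + 1];
        out[0] = header;
        System.arraycopy(data, 0, out, 1, data.length);
        return out;
    }
}
```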
Inserting a moderate number of small objects causes a StackOverflowError.
The problem persists even with the cache disabled.
Tested using JDBM-3.0-alpha-1
import net.kotek.jdbm.*;
import java.util.SortedMap;

public class App
{
    public static void main( String[] args )
    {
        String fileName = "helloWorld";
        DB db = new DBMaker(fileName).build();
        SortedMap<Integer, float[]> map = db.createTreeMap("mapName");
        float[] featureVector = new float[100];
        for (int i = 0; i < 100000; i++) {
            map.put(i, featureVector);
        }
        db.commit();
        db.close();
    }
}
...
at net.kotek.jdbm.DBCache.cachePut(DBCache.java:417)
at net.kotek.jdbm.DBCache.insert(DBCache.java:156)
at net.kotek.jdbm.BTreeNode.writeValues(BTreeNode.java:1028)
at net.kotek.jdbm.BTreeNode.serialize(BTreeNode.java:986)
at net.kotek.jdbm.BTreeNode.serialize(BTreeNode.java:43)
at net.kotek.jdbm.DBStore.update2(DBStore.java:326)
at net.kotek.jdbm.DBStore.update(DBStore.java:306)
at net.kotek.jdbm.DBCache.purgeEntry(DBCache.java:478)
... (the cycle DBCache.cachePut -> DBCache.insert -> BTreeNode.writeValues -> BTreeNode.serialize -> DBStore.update2 -> DBStore.update -> DBCache.purgeEntry repeats until the stack overflows)
We need to check the store format and the various packed fields. The store format will be frozen at Beta1.
Documenting it is probably the best way to check it for design issues.
Store should be part of the public API. It should be an abstract class with static factory methods (makeStoreMemory, makeStoreRAF...). DBConfig should take a Store as a constructor argument.
This should prevent users from forking JDBM.
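A sketch of what the public Store API could look like (the factory names come from this issue; the signatures and the in-memory stub are illustrative assumptions):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical public Store API: abstract class plus static factories.
abstract class Store {
    abstract long insert(byte[] data);
    abstract byte[] fetch(long recid);
    abstract void update(long recid, byte[] data);
    abstract void delete(long recid);

    /** In-memory store stub, for tests and illustration. */
    static Store makeStoreMemory() {
        return new Store() {
            private final Map<Long, byte[]> records = new HashMap<>();
            private long nextRecid = 1;
            long insert(byte[] data) { records.put(nextRecid, data); return nextRecid++; }
            byte[] fetch(long recid) { return records.get(recid); }
            void update(long recid, byte[] data) { records.put(recid, data); }
            void delete(long recid) { records.remove(recid); }
        };
    }
}
```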
JDBM tests currently run for a couple of minutes. But we need a better test suite to exercise JDBM before each release. It should run for approximately a week and should require 4 GB RAM and 100 GB HDD space.
I would like to keep the tests fast, but this test suite should also be part of JDBM. So JDBM will have two modes:
There will be a static final boolean field CompileConfig.TESTED_RELEASE.
When set to false, only basic unit tests will run, for a couple of minutes. At initialization JDBM will also log a warning that this release is a dev version and did not pass acceptance tests.
When set to true, the full battery of acceptance tests will run. This will require significant resources and will take several days before all tests pass. At startup JDBM will not display any warning, as this release has presumably passed all tests.
In git sources and during development this flag will be off. Before a release I will flip it on locally and build the release jar.
We need to document the serialization format. This way we can check it for design issues.
JDBM uses a low-level key-value store. The key is a long pointer to a physical location in the store; the value is an object serializable to byte[].
Access to this interface is necessary if a client wants to implement a new collection without modifying JDBM.
This interface was public in previous releases. I wanted to make JDBM3 simpler, so it was removed from the public API. But now I want to avoid forks and fragmentation, so it should be public again.
Introduce a size field into HTree. This adds some overhead to inserts, but should decrease confusion. java.util.HashMap contains this field.
When a free phys record is released, it should check its neighbors. If those are free, it should merge them and create only a single free record. If a phys page contains only free records, it should be freed.
We need to set the minimal record size to 6 and add a reverse pointer into free phys records.
JDBM2 had collection listeners which would notify about changes. This was made private, but it should be part of the public interface again. There should be a static method which registers such a listener on an existing JDBM collection.
JDBM currently has some assertions. This is not enough; we should be much more paranoid and have MUCH more assert statements. Each method should verify all its arguments.
Java's assert is not good enough here. JDBM tries to minimize its jar file, and assert code and messages are always linked into class files.
Better is a static final boolean field which controls assertions. If this field is off, javac treats the assertion code as unreachable and removes it. Something like:
if(CompileConfig.ASSERT && recid<0)
throw new IllegalArgumentException("recid is negative: "+recid);
By default the flag will be on. We will always release two jar files: one with assertions included for debugging, and a second without assertions for production. Both releases will have to pass the acceptance tests.
Some records are discarded when saving many values in one commit while the soft cache is enabled. There is no problem with the cache disabled or with the MRU cache. In my configuration transactions are disabled; I don't know whether the problem occurs with transactions enabled.
This problem doesn't occur in JDBM2.
Are any key-set or key-list implementations planned?
When a phys record is released, we should trim it. Free records should not stretch across multiple physical pages; pages between the head and the tail should be freed.
After this we can decrease the size field in free-phys-man to 2 bytes and fit a free-phys-record into a single long.
We need to delete a structure (HashMap, TreeMap, or another...) from a JDBM database. We also need to retrieve a list of the names of all existing structures in the database.
Is it possible to do this with the current API?
testHcreate 11245 ms.
testTcreate 2831 ms.
-rw-r--r-- 1 terry LD\domain users  62M Feb  3 12:20 test.hash.dbm.d.0
-rw-r--r-- 1 terry LD\domain users 654K Feb  3 12:20 test.hash.dbm.i.0
-rw-r--r-- 1 terry LD\domain users   2B Feb  3 12:20 test.hash.dbm.t
-rw-r--r-- 1 terry LD\domain users  20M Feb  3 12:20 test.tree.dbm.d.0
-rw-r--r-- 1 terry LD\domain users 166K Feb  3 12:20 test.tree.dbm.i.0
-rw-r--r-- 1 terry LD\domain users   2B Feb  3 12:20 test.tree.dbm.t
int size = 300000;
int commitSize = 50000;
DB build = new DBMaker("test.hash.dbm").setMRUCacheSize(100).build();
Map<String, String> hashMap = build.createHashMap("hashMap");
final Stopwatch start = Stopwatch.start();
for (int i = 0; i < size; i++) {
    hashMap.put(i + "asdddfdgf" + i + "sddfdfsf" + i, "dsfgfg.dfcdfsgfg" + i);
    if (i % commitSize == 0) {
        build.commit();
    }
}
build.commit();
build.close();
System.out.println("testHcreate " + start.newLoop() + " ms. ");

build = new DBMaker("test.tree.dbm").setMRUCacheSize(100).build();
Map<String, String> treeMap = build.createTreeMap("treeMap");
for (int i = 0; i < size; i++) {
    treeMap.put(i + "asdddfdgf" + i + "sddfdfsf" + i, "dsfgfg.dfcdfsgfg" + i);
    if (i % commitSize == 0) {
        build.commit();
    }
}
build.commit();
build.close();
System.out.println("testTcreate " + start.newLoop() + " ms. ");
ApacheDS is a major user of JDBM1 and they maintain their own fork. We should find out if there are any patches we could integrate. They may also have some comments.
JDBM2/3 already integrates fix for BTree space leak from Apache fork.
Their fork is also currently being pushed as an Apache Foundation subproject. Maybe we could integrate both projects somehow.
Dear Team,
I found that JDBM gives good performance.
Is it possible to simultaneously access data with two different threads, one thread inserting data into the database and the other thread retrieving data and updating what it retrieved?
Simply put, I need to access the same database from two different threads at the same time.
I need a sample for this...
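The two-thread pattern looks like this. A ConcurrentHashMap stands in for the JDBM map so the example is self-contained; whether JDBM3's collections support this level of concurrency should be checked against its documentation, and commits would still need coordination between the threads:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// One writer thread inserts; a second thread reads and updates concurrently.
class TwoThreadDemo {
    static void run(Map<Integer, String> map) {
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 1000; i++) map.put(i, "value" + i);
        });
        Thread updater = new Thread(() -> {
            for (int i = 0; i < 1000; i++)
                map.computeIfPresent(i, (k, v) -> v + "-updated"); // read + update
        });
        writer.start();
        updater.start();
        try {
            writer.join();
            updater.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```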
JDBM should be able to read data from a ZIP file or from resources in JAR files. Basic support is already implemented, but we need more tests, documentation and a public API.
We also need an API for creating such a store.
JDBM currently uses ReadWriteLock and plain synchronized. This is not enough for our needs. We should use tri-state locks: they should behave similarly to ReadWriteLock, but read-for-write should not block other read operations, while still blocking other read-for-write operations. Example usage:
public V ConcurrentMap.putIfAbsent(K key, V value) {
    lock.readForWriteLock().lock();
    try {
        V oldValue = get(key);
        if (oldValue != null)
            return oldValue;
        lock.writeLock().lock();
        try {
            return put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    } finally {
        lock.readForWriteLock().unlock();
    }
}
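One possible construction of such a lock (a sketch, not JDBM code): read-for-write is a plain read lock plus an exclusive ticket, so it runs concurrently with readers but excludes other read-for-write holders and writers. Note that ReentrantReadWriteLock cannot upgrade read to write directly, so the upgrade releases the read lock first; holding the exclusive ticket guarantees no other writer slips in during that window:

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical tri-state lock built from stdlib primitives.
class TriStateLock {
    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
    private final ReentrantLock exclusive = new ReentrantLock(); // one read-for-write holder at a time

    void readLock()           { rw.readLock().lock(); }
    void readUnlock()         { rw.readLock().unlock(); }

    // concurrent with plain readers, exclusive among read-for-write/writers
    void readForWriteLock()   { exclusive.lock(); rw.readLock().lock(); }
    void readForWriteUnlock() { rw.readLock().unlock(); exclusive.unlock(); }

    /** Upgrade read-for-write to a full write lock. */
    void upgradeToWrite()     { rw.readLock().unlock(); rw.writeLock().lock(); }
    /** Downgrade back so readForWriteUnlock() can be called afterwards. */
    void downgradeFromWrite() { rw.readLock().lock(); rw.writeLock().unlock(); }
}
```

With this construction, the putIfAbsent sketch above would call upgradeToWrite()/downgradeFromWrite() around the put instead of taking the write lock while still holding the read lock.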
I was getting/putting and committing a bunch of objects to the database. Got an exception eventually:
Exception in thread "main" java.io.IOError: java.io.IOException: Serializer readed more bytes than is record size.
at net.kotek.jdbm.BTreeSortedMap.put(BTreeSortedMap.java:257)
at PersistenceTest.main(PersistenceTest.java:61)
Caused by: java.io.IOException: Serializer readed more bytes than is record size.
at net.kotek.jdbm.BTreeLazyRecord.fastDeser(BTreeLazyRecord.java:73)
at net.kotek.jdbm.BTreeNode.readValues(BTreeNode.java:1002)
at net.kotek.jdbm.BTreeNode.deserialize(BTreeNode.java:943)
at net.kotek.jdbm.BTreeNode.deserialize(BTreeNode.java:43)
at net.kotek.jdbm.DBStore.fetch2(DBStore.java:386)
at net.kotek.jdbm.DBStore.fetch(DBStore.java:356)
at net.kotek.jdbm.DBCache.fetch(DBCache.java:269)
at net.kotek.jdbm.BTreeNode.loadNode(BTreeNode.java:829)
at net.kotek.jdbm.BTreeNode.insert(BTreeNode.java:388)
at net.kotek.jdbm.BTreeNode.insert(BTreeNode.java:389)
at net.kotek.jdbm.BTree.insert(BTree.java:281)
at net.kotek.jdbm.BTreeSortedMap.put(BTreeSortedMap.java:255)
... 1 more
Some beans with many null-valued fields can't be deserialized by JDBM.
@see http://pastebin.com/RuXFsxD1 (Bean with some null values fields)
@see http://pastebin.com/v7LAaKCR (Unit test, crashing on second execution when database already exists)