theseion / fuel
Fuel, the Smalltalk object serializer
Home Page: https://theseion.github.io/Fuel
License: MIT License
Eliot wrote:
First, it may be more flexible to implement fuelCluster as fuelClusterIn: anFLAnalyser so that if one is trying to override certain parts of the mapping framework an implementation can access the analyser to find existing clusters, e.g.
MyClass>>fuelClusterIn: anFLAnalyser
^self shouldBeInASpecialCluster
ifTrue: [anFLAnalyser clusterWithId: MySpecialCluster id]
ifFalse: [super fuelClusterIn: anFLAnalyser]
This makes it easier to find a specific unique cluster to handle a group of
objects specially.
------
Now, I much prefer the method name #fuelClusterFor: anFLAnalyzer
----
Original issue reported on code.google.com by marianopeck
on 19 Jun 2011 at 1:06
Just try to serialize a Character with a value bigger than 255. As this test:
testCharacter
self assertSerializationIdentityOf: $a.
"If the code is less than 255 the same instance is used. But if it is bigger, new ones are created. For more info read Character class comment and Character class >> value."
self assertSerializationEqualityOf: (Character value: 12345). "Japanese Hiragana 'A' "
Characters with a code of 255 or less are shared, so the same instances are reused.
For bigger codes, new instances are created. So
FLCharacterCluster >> serialize: aCharacter on: aWriteStream
aWriteStream nextPut: aCharacter codePoint
is incorrect, since writing a single byte only works for code points up to 255.
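A minimal workspace sketch of one way around this, assuming a fixed four-byte, most-significant-byte-first encoding of the code point. The width and byte order are illustrative assumptions, not what FLCharacterCluster actually does:

```smalltalk
| codePoint stream bytes decoded |
codePoint := (Character value: 12345) codePoint.
"Write the code point as four bytes, most significant byte first."
stream := WriteStream on: (ByteArray new: 4).
3 to: 0 by: -1 do: [:shift |
    stream nextPut: ((codePoint bitShift: -8 * shift) bitAnd: 16rFF)].
bytes := stream contents.
"Read the four bytes back into an integer and rebuild the character."
decoded := bytes inject: 0 into: [:acc :byte | (acc bitShift: 8) + byte].
(Character value: decoded) = (Character value: 12345)
```

A fixed width costs three extra bytes for every small character, so a real cluster would probably keep the one-byte path for codes up to 255 and use the wide path only for the rest.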
Original issue reported on code.google.com by marianopeck
on 20 Jun 2011 at 3:02
Try using an explicit Stack instead of recursion while visiting the graph. Times
may improve, or at least profiling with MessageTally will be easier.
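As a rough illustration of the idea (not Fuel's actual analyser), the workspace sketch below walks an object graph with an OrderedCollection used as a stack and an IdentitySet of visited objects. The sample graph is made up for the example:

```smalltalk
| root stack visited |
root := Array with: 3 @ 4 with: 'abc' with: (OrderedCollection with: 1 / 2).
stack := OrderedCollection with: root.
visited := IdentitySet new.
[stack isEmpty] whileFalse: [
    | current |
    current := stack removeLast.
    (visited includes: current) ifFalse: [
        visited add: current.
        "Push the named instance variables."
        1 to: current class instSize do: [:i |
            stack addLast: (current instVarAt: i)].
        "Push the indexed slots, but only for pointer objects (not strings or byte arrays)."
        current class isPointers ifTrue: [
            1 to: current basicSize do: [:i |
                stack addLast: (current basicAt: i)]]]].
visited size
```

Because the pending work lives in a collection rather than on the call stack, deep graphs do not produce deep call chains, which is what should make MessageTally output easier to read.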
Original issue reported on code.google.com by [email protected]
on 15 May 2011 at 7:52
New versions should be faster
Original issue reported on code.google.com by [email protected]
on 13 May 2011 at 7:42
Fuel should be configurable in all those places where certain actions could
differ. We can have a class FuelSettings. I do not recommend calling it
FuelConfiguration, to avoid confusion with the Metacello configuration.
This class can have a lot of class-side variables, or... be a singleton. And it can
be a kind of builder. The values I use in the following example are the ones I think
should be the defaults.
FuelSettings
supportCycles: true;
serializationType: #binary; "or.. serializationType: #text;"
stringsCompression: false;
doubleBuffer: true;
classSerializationType: #internal "or... external"
Then, it is a design problem how to take such settings into account in the Fuel
core. For example, do we use a flag and an if? Or a strategy pattern?
For more details, check the thread with subject "Fuel deberia poder ser configurado"
Original issue reported on code.google.com by marianopeck
on 24 Mar 2011 at 9:24
so that we can use Fuel in headless mode.
one more thing to put in FuelSettings...
Original issue reported on code.google.com by marianopeck
on 28 Apr 2011 at 8:16
After Fuel.MartinDias.233 it is broken
Original issue reported on code.google.com by [email protected]
on 30 May 2011 at 7:06
It would be cool if Fuel could talk to an abstract stream and not be coupled to the
current/old Pharo/Squeak streams. Fuel should talk to an AbstractFuelStream.
This class should define the API needed by Fuel, with all methods implemented
as ^ self subclassResponsibility.
There should then be different implementations: NormalFuelStream and
XstreamFuelStream.
We can start with NormalFuelStream. This can be seen as the "adapter/strategy
design pattern".
BTW... we should measure the impact of this change on the benchmarks.
Once this issue is done, we can test Xstream and see if it works and if it
improves performance.
Original issue reported on code.google.com by marianopeck
on 24 Mar 2011 at 9:33
It would be cool to focus on optimizing the classes with the most instances.
SpaceTally new printSpaceAnalysis
Original issue reported on code.google.com by marianopeck
on 18 May 2011 at 3:18
The S&M framework DOES distinguish between serialization and encoding. This is
interesting for a couple of reasons:
- It is better from an OOP point of view
- It makes it easy to port to other Smalltalk dialects
"Another note: S&M makes a distinction between serialization and
encoding. Storing objects first takes a directed graph of objects and
serializes it into a sequence of objects. The encoder is then
responsible for transforming the sequence of objects into a sequence
of bytes. Loading is the reverse - a decoder turns the bytes back into
objects and a materializer turns the sequence of objects back into a
graph. That's kind of an interesting contrast with the Fuel strategy
of separating encoding of instances from encoding of references."
"Well, there are a few reasons for this design:
- It makes the code a lot cleaner. I've been working on this on and
off for a few years now and have gone through several complete
rewrites. I had an all-in-one design, an all-in-the-model design, even
a serializer-combinator design. Once I arrived at this, the code
became much cleaner and more flexible. It's much easier for the
application to customize serialization than my previous designs which
were more entangled with the object model they were serializing.
- It's more portable. MC2 is meant to run on several platforms. I've
done partial ports to VisualWorks and Gemstone (never finished), which
taught me a lot about how to (a) have the serialization code run on
multiple platforms, (b) make it possible for the serialized objects to
be decoded on another platform. Separating the serialization from the
encoding really helps here, because all the really platform-specific
stuff is at the encoding level. So it's not so much about swapping out
encoders to have different representations at the byte level as having
a common representation between dialects.
- The last reason is a bit more subtle. I've found this business of
drawing serialization boundaries is a really important semantic
exercise for state management. For example, in Altitude, I found
that once I had a good serialization defined, it also served really
well as a deep equality test. At first, I just serialized the two
objects and compared the bytes, but later I optimized this by
implementing an encoder/decoder that would use the serializer to walk
the object graph and compare it with another object graph using a
materializer. That was a simple equality test, but it could be used
for other purposes - say generating a diff, measuring similarity, or
creating a non-byte-oriented hash of the object graph. I don't really
know what's possible here, but it's something I'm going to explore."
For more details, read the thread with subject "[Pharo-project] serializing a
class"
Original issue reported on code.google.com by marianopeck
on 24 Mar 2011 at 9:51
Tests and implementation needed
Original issue reported on code.google.com by [email protected]
on 19 Apr 2011 at 2:36
Would be nice to have a Fuel logo.
Original issue reported on code.google.com by [email protected]
on 17 Apr 2011 at 2:48
Henry said:
1)
You shouldn't need more than one Cluster class for variableWord classes either.
FLLargeNegativeIntegerCluster, FLLargePositiveIntegerCluster, FLFloatCluster
etc. could all be removed in favor of a FLVariableWordClusterSerializer,
which has the actual class it is serializing/materializing as an instvar.
serialize:
store variablesize
store named vars
store variable words
materialize:
read variablesize.
create new instance of that size.
restore named vars.
read variable words into instance in a single operation.
There's lots of mucking about to get byte ordering consistent when storing to
file, and then restoring it when loading, look at nextWordsInto: , and
restoreEndianness.
If stored/loaded in the platform order, you wouldn't need the restoreEndianness
step if platform didn't change.
The way it is now, (ie. always saved in big-endian) you almost always have to
do the step, since intel cpu's are little endian.
The following writes out a variableWord object in platform-specific byte order:
wideStr := 'Hello widestring (€) primitive!'.
FileStream newFileNamed: 'test.bin' do: [:fileStream |
fileStream binary;
nextPutAll: wideStr].
FileStream fileNamed: 'test.bin' do: [:fileStream |
fileStream binary.
fileStream contents]
Not sure how well it works with the buffered stream though, since nextPutAll:
has different behaviour for streams using internal collections, you'd have to
test if it's worth it. Either way, there shouldn't be a need for more than one
class with serialize/materialize pseudocode as described above.
Also: For variableBytes classes, if you rewrite:
WriteStream
nextPutAll: aCollection
| newEnd |
collection class instSpec == aCollection class instSpec ifFalse:
[^ super nextPutAll: aCollection ].
newEnd := position + aCollection size.
newEnd > writeLimit ifTrue:
[self growTo: newEnd + 10].
collection replaceFrom: position+1 to: newEnd with: aCollection startingAt: 1.
position := newEnd.
^ aCollection
You can now pass all variableByte classes (Like ByteString) directly to a
stream with a ByteArray collection, and the replaceFrom:to:with:startingAt:
primitive will work correctly, just like the file primitive does
This means you don't need special Serializers for these either, using f.ex.
clunky nextStringPutAll: methods with manual asByteArray conversions.
byteStr := 'Hello primitive!'.
FileStream newFileNamed: 'test.bin' do: [:fileStream |
fileStream binary;
nextPutAll: byteStr].
internalStream := #[] writeStream.
internalStream nextPutAll: byteStr.
FileStream fileNamed: 'test.bin' do: [:fileStream |
fileStream binary.
fileStream contents = internalStream contents]
TLDR; Try not to use iteration reading in single elements of a variable object,
but rather use the batch processing methods provided by streams, which are
already (well, in some cases at least) optimized. Fix bugs in those existing
ones instead :)
Cheers,
Henry
------
For more details, read the thread with subject: "FUEL improvements [WAS] Re:
Increasing the performances of a Seaside application"
Original issue reported on code.google.com by marianopeck
on 19 Jun 2011 at 1:11
The Moose export/import extension shows that, when serializing large object
graphs, a progress bar would be really nice to give feedback to the Moose user
while they wait, in some cases for more than a minute.
Original issue reported on code.google.com by [email protected]
on 29 Mar 2011 at 3:47
We should create more benchmarks for ALL the special objects that get special
handling (almost every one that implements #fuelSerializer).
Original issue reported on code.google.com by marianopeck
on 1 Jun 2011 at 3:20
Anyway, this feature must be optional.
Original issue reported on code.google.com by [email protected]
on 18 Apr 2011 at 10:40
testCreateHierarchyWithExistingClassesAndSubclasses
"Tests materialization of a hierarchy of classes that are not defined in the
image and classes that indeed are defined. In addition, the serialized class
has instances that remain in the image (they are not serialized). This
demonstrates that the instance variable 'subclasses' should be properly
serialized and materialized"
That test demonstrates one of these problems. Another problem arises when, at
materialization time, one of the subclasses is no longer present. That should be
another test case :)
Original issue reported on code.google.com by marianopeck
on 6 Apr 2011 at 12:48
S&M does:
serializeMethodContext: aContext
self serialize: aContext with:
[encoder encodeContext: aContext.
self serialize: aContext sender.
self serialize: aContext pc.
self serialize: aContext stackPtr.
self serialize: aContext closure.
self serialize: aContext receiver.
self writePointersFrom: aContext ]
encodeContext: aContext
self writeType: 21.
self serializer serialize: aContext method
decodeContext
| method |
method := materializer materialize.
^ MethodContext newForMethod: method.
materializeContext: aContext
| sender pc stackp closure receiver |
self addArrow: aContext.
sender := self materialize.
pc := self materialize.
stackp := self materialize.
closure := self materialize.
receiver := self materialize.
aContext
setSender: sender
receiver: receiver
method: aContext method
arguments: #() ;
pc: pc ;
stackp: stackp ;
closure: closure.
self readPointersInto: aContext.
^ aContext
Original issue reported on code.google.com by marianopeck
on 20 May 2011 at 9:03
Replace 'materialization' term in Fuel for something more common in the
serialization vocabulary. The term 'deserialization' is a good candidate.
This includes Fuel implementation, web pages, and paper.
Original issue reported on code.google.com by [email protected]
on 17 Apr 2011 at 12:53
we can get an example from squeakdbx.org
and I have a user/pass for such pier so that you can take a look.
In addition we can ask RMOD the image
Original issue reported on code.google.com by marianopeck
on 18 May 2011 at 3:12
We want to prune (replace by nil) or graft (replace by another subgraph) the
graph at some objects.
The first criteria we will support are "the object is included in this
collection" or "the object is instance of this class".
But another interesting criteria we should explore is "depth is less than X"
i.e. "distance from root is less than X".
Original issue reported on code.google.com by [email protected]
on 16 Jun 2011 at 9:59
The file size is an important metric to measure.
Original issue reported on code.google.com by [email protected]
on 16 Jun 2011 at 5:57
Do not use ClassBuilder at all: no putting the class in Smalltalk globals, no
notifications, nothing. All this should be the default behavior.
Original issue reported on code.google.com by marianopeck
on 18 May 2011 at 3:14
Fuel should let me declare that certain instance variables of a class should be
transient (not serialized). For example, in the Java world, we can do
something along the lines of "public transient int age".
In GemStone, they have DbTransient and they can do "aClass
makeInstancesDbTransient". But I don't want all instances of a class to be
transient. Instead, CERTAIN instance variables of a certain class.
It might also be useful to have a FLTransient class that would function like
MReference - a ValueHolder (TransientValue) that always serializes as nil.
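A small workspace sketch of the per-variable idea, using only reflection. The transientNames list is a hypothetical stand-in for whatever a hook such as #fuelTransientInstanceVariables (the selector mentioned in one of the threads) would answer; it is not an existing Fuel API:

```smalltalk
| point transientNames pairs |
point := 3 @ 4.
"Assumption: the class declares #y as transient."
transientNames := #(#y).
"Collect name -> value pairs, answering nil for transient variables."
pairs := point class allInstVarNames collect: [:name |
    name -> ((transientNames includes: name asSymbol)
        ifTrue: [nil]
        ifFalse: [point instVarNamed: name])].
pairs
```

Here the transient variable is written as nil, which matches the FLTransient suggestion above. Skipping the slot entirely would save space but needs the materializer to know which slots were omitted.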
For more details read the thread with subject
- "and S$M ?? Fwd: Does GemStone support CERTAIN transient instance variables?"
- "Does GemStone support CERTAIN transient instance variables?"
- "#fuelTransientInstanceVariables was: Re: [Moose-dev] Re: Binary
import/export in Moose"
Original issue reported on code.google.com by marianopeck
on 24 Mar 2011 at 9:47
we have for methods but not for classes. The problem right now is that they
require special management to serialize, which is different from the rest :(
Original issue reported on code.google.com by marianopeck
on 18 May 2011 at 3:16
We are using 2 or 4 bytes depending on the size of the serialized object.
However, it should be possible to use 1 and 3 as well. There is a trade off
between the IFs and the bytes to be written, but it is worth giving it a
try. To really benchmark it, the benchmark must have a large set of integers of
different sizes, so that they get encoded in 1, 2, 3 and 4 bytes.
So, we should change this:
for: anSmallInteger
"Returns the right subclass instance"
((anSmallInteger bitAnd: 16rFFFF) = anSmallInteger)
ifTrue: [ ^ FLUInt16Cluster instance ].
anSmallInteger > 0
ifTrue: [ ^ FLPositiveSmallIntegerCluster instance ]
ifFalse: [ ^ FLNegativeSmallIntegerCluster instance ]
to handle FLUInt8 and FLUInt24 ...
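The width selection could look roughly like the workspace sketch below. It assumes non-negative values that fit in 32 bits and only computes the byte count; mapping each count to a cluster such as the FLUInt8 or FLUInt24 mentioned above is left out:

```smalltalk
| byteCountFor widths |
"Answer how many bytes are needed for a non-negative integer below 2^32."
byteCountFor := [:n |
    n < 16r100
        ifTrue: [1]
        ifFalse: [n < 16r10000
            ifTrue: [2]
            ifFalse: [n < 16r1000000
                ifTrue: [3]
                ifFalse: [4]]]].
widths := #(200 60000 70000 20000000) collect: [:each | byteCountFor value: each].
widths
```

Each extra branch is one more comparison per integer, which is the IFs-versus-bytes trade-off the benchmark would have to quantify.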
Original issue reported on code.google.com by marianopeck
on 19 Jun 2011 at 9:06
Once issue 3 is done...we should test XStream, but simply. Just using XStream.
Then, in a future, we can try to use the marshaling feature of XStreams
Original issue reported on code.google.com by marianopeck
on 24 Mar 2011 at 9:35
We could do something like the VM does. Instead of putting a reference to a
SmallInteger, we directly encode the SmallInteger in the reference. This makes
the code a little dirtier and maybe adds some IFs... but we should check, because
the performance could be much better.
It is worth giving it a try.
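A sketch of one possible tagging scheme, assuming the low bit of a reference word marks an immediate integer and all other references are indexes into the table of serialized objects. The bit layout is an assumption for illustration, not a description of Fuel's format:

```smalltalk
| encodeInteger encodeIndex isImmediate decode tagged plain |
"Low bit 1: the word carries the integer itself. Low bit 0: it is an object index."
encodeInteger := [:n | (n bitShift: 1) bitOr: 1].
encodeIndex := [:i | i bitShift: 1].
isImmediate := [:ref | (ref bitAnd: 1) = 1].
decode := [:ref | ref bitShift: -1].
tagged := encodeInteger value: -5.
plain := encodeIndex value: 42.
Array
    with: (isImmediate value: tagged)
    with: (decode value: tagged)
    with: (isImmediate value: plain)
    with: (decode value: plain)
```

The price is one bit of range in every reference plus a test on each read, so only a benchmark can tell whether it pays off.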
Original issue reported on code.google.com by marianopeck
on 19 Jun 2011 at 9:01
From Eliot:
" Lastly, the class-side cluster ids are a bit of a pain. It would be nice
to know a) are these byte values or general integer values, i.e. can there be
more than 256 types of cluster?, and b) is there any meaning to the ids? For
example, are clusters ordered by id, or is this just an integer tag? Also,
some class-side code to assign an unused id would be nice.
You might think of virtualizing the id scheme. For example, if FLCluster maintained a weak array of all its subclasses then the id of a cluster could be the index in the array, and the array could be cleaned up occasionally. Then each fuel serialization could start with the list of cluster class names and ids, so that specific values of ids are specific to a particular serialization."
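The last suggestion can be sketched as a per-serialization table in which a cluster's id is simply its position in a list of names. The cluster names below are used only as sample symbols, and the idFor block is a hypothetical helper:

```smalltalk
| clusterNames idFor first second again |
clusterNames := OrderedCollection new.
"Answer the id of a cluster name, assigning the next free index on first use."
idFor := [:name | | index |
    index := clusterNames indexOf: name.
    index = 0 ifTrue: [
        clusterNames addLast: name.
        index := clusterNames size].
    index].
first := idFor value: #FLObjectCluster.
second := idFor value: #FLCharacterCluster.
again := idFor value: #FLObjectCluster.
"The list would be written at the start of the stream so a reader can map ids back to names."
clusterNames asArray
```

With this scheme ids carry no meaning outside one serialized stream, which removes the need to reserve fixed class-side ids.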
Original issue reported on code.google.com by [email protected]
on 17 Jun 2011 at 7:52
what can be serialized/unserialized?: most of the serialization frameworks
have certain classes that cannot be serialized, and an error is thrown at
runtime if you try to do that. In Java this is done using the Serializable
interface, and you CANNOT serialize classes like Thread or Socket. In
Python pickle, the same:
http://docs.python.org/library/pickle.html#what-can-be-pickled-and-unpickled
So, I would love that Fuel let me override a method #isFuelSerializable in a
class to answer ^ false, and throws an error if I try to do that.
Original issue reported on code.google.com by marianopeck
on 1 Jun 2011 at 2:21
MyClass >> initializeFromFruelMaterializacion: fuelMaterializer
self doZaraza.
The cluster serializers should in fact send such a message, so each cluster
chooses whether to do it or not.
Pay attention to the possible overhead of sending a message to each
materialized object.
Original issue reported on code.google.com by marianopeck
on 20 May 2011 at 5:47
In the Fuel extension for Moose, this improved serialization performance a lot.
Actually, this makes up for a shortcoming of the current file streams in
Pharo/Squeak, which do no write buffering.
There is no need for it on deserialization (reading), because the file streams
already implement read buffering.
Code snippet:
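serializeBuffered: anObject
"Serialize into an in-memory buffer first, then write it to the file in one go.
aStream is assumed to be the target file stream, e.g. an instance variable."
| bufferStream |
bufferStream := ByteArray new writeStream.
(FLSerializer on: bufferStream)
serialize: anObject.
aStream binary nextPutAll: bufferStream contents.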
Original issue reported on code.google.com by [email protected]
on 17 Apr 2011 at 2:42
Useful when you want to store a graph of objects into a no-sql database where
you want to store the graph as a blob.
It was a request in:
http://forum.world.st/ESUG-SummerTalk-Fuel-binary-object-serializer-tp3548159p3549353.html
Original issue reported on code.google.com by marianopeck
on 29 May 2011 at 1:23
Since that group was created for the Moose extension, I think there is no need
to include benchmarks.
Original issue reported on code.google.com by [email protected]
on 5 Apr 2011 at 3:01
what can be serialized/unserialized?: most of the serialization frameworks
have certain classes that cannot be serialized, and an error is thrown at
runtime if you try to do that. In Java this is done using the Serializable
interface, and you CANNOT serialize classes like Thread or Socket. In Python
pickle, the same:
http://docs.python.org/library/pickle.html#what-can-be-pickled-and-unpickled
So, I would love that Fuel let me override a method #isFuelSerializable in a
class to answer ^ false, and throws an error if I try to do that. Or similar...
Original issue reported on code.google.com by marianopeck
on 24 Mar 2011 at 9:58
Yanni Chiu said in the mailing list: "====
Yes, security. Here's my first post again, with different formatting:
In another use case, I'd like to serialize from one image, and deserialize in
another image - *under end user control*. [e.g. web app]
The issue here is that "nasty" code could be introduced:
- capture the Fuel output
- deserialize, add nasty code, re-serialize
- then send onward for import to image.
Would it be possible to have some sort of "virus" filter?
====
So a simple "safe-mode" option on de-serialization would probably be
sufficient."
For more details, read
http://forum.world.st/ESUG-SummerTalk-Fuel-binary-object-serializer-tp3548159p3550017.html
Original issue reported on code.google.com by marianopeck
on 29 May 2011 at 1:21
Apart from measuring time, we should also measure memory usage. This would be
really good for the journal version of the paper also.
Original issue reported on code.google.com by marianopeck
on 16 Jun 2011 at 2:36
It would be cool if Fuel supports partial loading. So that it can materialize
only one/few objects of a full serialized subgraph.
Oracle Coherence has the nice feature of being indexed which allows for
extracting values without deserializing the whole object graph (partial
loading).
Original issue reported on code.google.com by marianopeck
on 24 Mar 2011 at 9:53
Tests have to be updated since the code changed. 2 test errors.
Original issue reported on code.google.com by [email protected]
on 24 May 2011 at 6:42
The user should be able to say that a certain class is a Singleton, so that
Fuel avoids creating duplicates.
settings
when: [:obj | obj isKindOf: MyClass ]
materializeWithSelector: #uniqueInstance.
settings
when: [:obj | obj == MyClass uniqueInstance ]
materializeWithSelector: #uniqueInstance.
settings
when: [:obj | obj == MyClass uniqueInstance ]
materializeSending: #uniqueInstance to: MyClass.
MyClass >> fuelSerializer
^ FLUserDefinedSingletonClusterSerializer instanceWithSelector: #uniqueInstance
Original issue reported on code.google.com by marianopeck
on 20 May 2011 at 10:54
ImageSegment does that and I think Fuel should do it. We need to write a test
that demonstrates the problem first.
Read this thread:
http://forum.world.st/Why-Sets-need-to-be-rehashed-when-loading-them-from-an-ImageSegment-td3206850.html
Original issue reported on code.google.com by marianopeck
on 6 Apr 2011 at 12:45
Text or binary: There are serializers like "pickle" in Python or Google
Protocol Buffers, that let you choose between text and binary representation.
For debugging or while developing one can just use text based, which is easy to
see, inspect and modify, and then, at production time, you can switch to a
binary format. This is really cool and I would love to have it in Fuel.
Maybe we can do it after issue 3
Original issue reported on code.google.com by marianopeck
on 24 Mar 2011 at 9:56
All the feedback I gave to Martin must be integrated. There are some cleanups
to do.
Original issue reported on code.google.com by marianopeck
on 18 May 2011 at 3:12
Several times, an object wants to be serialized differently. For example, it
may want to be replaced by a kind of proxy, or change some values, etc.
Fuel should support a way to customize that for each class. ReferenceStream
supports this with the method #objectForDataStream: refStrm
This message is sent when the object is about to be written to the stream, at
serialization time. The idea is that an object may not serialize itself as it
is, but as something different. Most of the time it is a DiskProxy, but not
necessarily.
This doesn't fit the idea of Fuel, where objects do not decide
themselves how to be serialized/materialized but instead delegate to a cluster.
So Fuel should support a class implementing #fuelObjectToSerialize, which
answers the object to be serialized. This method is called at serialization
time, and the answer to that message is the object to be serialized.
There are two possibilities: 1) use the default FLObjectCluster or 2) create a
specific cluster
If 1) we must implement
Object >> #fuelObjectToSerialize
^ self
and then someone can implement
MyClass >> fuelObjectToSerialize
^ self proxyForMe: self
If we do 2), it is good because it adds no overhead, but the user will
also have to do:
MyClass >> fuelCluster
^ FLCustomObjectCluster instance
And FLCustomObjectCluster can be a subclass of FLObjectCluster.
Let's do an example. There is an object A whose #fuelObjectToSerialize answers
A'. A points to B and A' has a reference to C. The serializer should continue with
C, and B will never be traversed nor serialized. C will be traversed and written
into the stream.
Then it is necessary to implement #materializeFromObject: anObject
MyClass >> materializeFromObject: anObject
^ self getOriginalObjectFromProxy: anObject
When this is done, notify Esteban Lorenzano
Original issue reported on code.google.com by marianopeck
on 16 Jun 2011 at 1:01
Once issue 1 is finished, FuelSettings should, in the future, be integrated
with the Pharo Settings framework.
Settings info:
- http://book.pharo-project.org/book/CustomizingPharo/DeclaringSetting
- https://gforge.inria.fr/frs/download.php/28243/Settings.pdf
Original issue reported on code.google.com by marianopeck
on 24 Mar 2011 at 9:26
S&M has special management for Dictionaries. Maybe we should do the same:
serializeDictionary: aDictionary
self serialize: aDictionary with:
[encoder encodeDictionary: aDictionary.
aDictionary keys asArray sort do:
[ :key |
self serialize: key.
self serialize: (aDictionary at: key) ] ]
materializeDictionary: aDictionary
| size |
self addArrow: aDictionary.
size := self materialize.
size timesRepeat:
[ aDictionary
at: self materialize
put: self materialize ].
^ aDictionary
Original issue reported on code.google.com by marianopeck
on 18 May 2011 at 3:17
So that to be faster.
Original issue reported on code.google.com by marianopeck
on 20 May 2011 at 10:52
In order to have more accuracy on the materialization, the method #trailer
should be kept.
Original issue reported on code.google.com by [email protected]
on 17 Apr 2011 at 1:35
that means removing #classAndTraitCollection
Original issue reported on code.google.com by marianopeck
on 20 May 2011 at 7:02
The code for Serialization and Materialization should be in different packages.
This is useful, for example, for the Pharo minimal kernel, where they want to
load binary code. So, having the minimal code to load is useful.
If there are things shared between both Serialization and Materialization,
then they should be put in a FuelCore or FuelShared package.
Finally, ConfigurationOfFuel should be updated to create 2 new possible
groups: "serializer" and "materializer", for those that only need one of them and not
Fuel completely.
Original issue reported on code.google.com by marianopeck
on 24 Mar 2011 at 9:40