Comments (16)

RobAustin commented on July 1, 2024

Are you using this method in Lang directly, or via an HFT collections class such as SharedHashMap?

On 27 May 2014, at 16:38, RuedigerMoeller wrote:

Hi Peter,

As we have hundreds of data structures, we are going down the serialization route. I need a way to efficiently plug in a custom serializer.

Checking AbstractBytes:

public void writeObject(@Nullable Object obj) {
    if (obj == null) {
        writeByte(NULL);
        return;
    }

    Class<?> clazz = obj.getClass();
    final BytesMarshallerFactory bytesMarshallerFactory = bytesMarshallerFactory();
    BytesMarshaller em = bytesMarshallerFactory.acquireMarshaller(clazz, false);
    if (em == NoMarshaller.INSTANCE && autoGenerateMarshaller(obj))
        em = bytesMarshallerFactory.acquireMarshaller(clazz, true);

    if (em != NoMarshaller.INSTANCE) {
        if (em instanceof CompactBytesMarshaller) {
            writeByte(((CompactBytesMarshaller) em).code());
            em.write(this, obj);
            return;
        }
        writeByte(ENUMED);
        writeEnum(clazz);
        em.write(this, obj);
        return;
    }
    writeByte(SERIALIZED);
    // TODO this is the lame implementation, but it works.
    try {
        ObjectOutputStream oos = new ObjectOutputStream(this.outputStream());
        oos.writeObject(obj);
    } catch (IOException e) {
        throw new IllegalStateException(e);
    }
    checkEndOfBuffer();
}

Is there a way to shortcut this routine, e.g. to also avoid

Class<?> clazz = obj.getClass();
final BytesMarshallerFactory bytesMarshallerFactory = bytesMarshallerFactory();
BytesMarshaller em = bytesMarshallerFactory.acquireMarshaller(clazz, false);

as this lookup can cost around 10% in the case of smallish objects? If we can agree on some 'pluggable' interface, I can do the work and contribute. Or is there another way to customize serialization?

regards,
Rüdiger


from hugecollections-old.

RuedigerMoeller commented on July 1, 2024

I want to use SharedHashMap. fast-serialization uses some trickery to avoid hash lookups and operations that potentially hurt locality, such as object.getClass() and instanceof, so the serialization performance of small objects may suffer if serialization is only called as a last resort. I therefore need a hook which kicks in earlier. The difference can be quite noticeable when putting smallish objects using serialization.

leventov commented on July 1, 2024

@RuedigerMoeller If I understand your idea right: there is BytesMarshallable, and there is a fast path for writing it both in VSHM and in AbstractBytes.writeInstance().

RuedigerMoeller commented on July 1, 2024

I disagree. BytesMarshallable requires a lot of changes to existing code. Think of a system with hundreds of data structures. Nobody is willing to pay the price for custom, hand-written serialization, so I need to use object serialization. I have a very well-performing implementation of generic object serialization which I want to plug in. BytesMarshallable does not cut it. And even if I patch out the ObjectSerialization, the path still is:

if (BytesMarshallable.class.isAssignableFrom(objClass)) {
    ((BytesMarshallable) obj).writeMarshallable(this);
} else if (Externalizable.class.isAssignableFrom(objClass)) {
    ((Externalizable) obj).writeExternal(this);
} else if (CharSequence.class.isAssignableFrom(objClass)) {
    writeUTFΔ((CharSequence) obj);
} else {
    writeObject(obj);
}

I mean, the instanceof chain adds serious overhead when serializing small objects (which FST does in the region of 100 nanos if used and tuned right). Additionally, the chain diverts String objects away from serialization.
I would basically need a plug-in point to completely replace the decision tree for en/decoding. Anyway, I can fork or write a wrapper. This is just some input from someone evaluating this.

leventov commented on July 1, 2024

@RuedigerMoeller Do you mean that adding methods like customKeySerialization(BiConsumer<Bytes, K> serializer), and a corresponding one for values, to the SharedHashMapBuilder API would be useful?
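
For illustration only, here is a minimal sketch of what such a hook might look like. It is not the SharedHashMapBuilder API: the class `SerializerHookSketch`, the extra reader argument, and the helper `utf8StringKeys()` are all hypothetical, and `java.nio.ByteBuffer` stands in for `Bytes`.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.function.BiConsumer;
import java.util.function.Function;

// Hypothetical sketch, not the real SharedHashMapBuilder. ByteBuffer stands in for Bytes.
final class SerializerHookSketch<K> {
    private BiConsumer<ByteBuffer, K> keyWriter;
    private Function<ByteBuffer, K> keyReader;

    // Assumed shape of the hook: the caller supplies both directions up front,
    // so no per-class marshaller lookup or instanceof chain is needed later.
    SerializerHookSketch<K> customKeySerialization(BiConsumer<ByteBuffer, K> writer,
                                                   Function<ByteBuffer, K> reader) {
        this.keyWriter = writer;
        this.keyReader = reader;
        return this;
    }

    // Encodes a key with the plugged-in writer. The fixed 256-byte scratch
    // buffer is a simplification; larger keys would overflow it.
    byte[] encodeKey(K key) {
        ByteBuffer buf = ByteBuffer.allocate(256);
        keyWriter.accept(buf, key);
        buf.flip();
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return out;
    }

    // Decodes a key with the plugged-in reader.
    K decodeKey(byte[] bytes) {
        return keyReader.apply(ByteBuffer.wrap(bytes));
    }

    // Example configuration: length-prefixed UTF-8 strings.
    static SerializerHookSketch<String> utf8StringKeys() {
        return new SerializerHookSketch<String>().customKeySerialization(
                (buf, s) -> {
                    byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
                    buf.putInt(utf8.length).put(utf8);
                },
                buf -> {
                    byte[] utf8 = new byte[buf.getInt()];
                    buf.get(utf8);
                    return new String(utf8, StandardCharsets.UTF_8);
                });
    }
}
```

Because the key type is fixed when the hook is configured, the write path in this sketch needs neither getClass() nor a registry lookup.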

RobAustin commented on July 1, 2024

@RuedigerMoeller

You may find this interface useful:

net.openhft.collections.ReplicatedSharedHashMap.EntryExternalizable

It's implemented by:

net.openhft.collections.VanillaSharedReplicatedHashMap

peter-lawrey commented on July 1, 2024

To make it completely pluggable you can avoid using writeObject()
altogether. You can instead use the OutputStream/InputStream, or write your
own serializer/deserializer which writes/reads the data however you wish.
writeObject is provided as a convenience, but if it doesn't do what you
need, don't call it.
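
As a rough sketch of that approach, the following writes and reads a small value by hand through plain `java.io` streams, with no marshaller lookup involved. The `Quote` type, its fields, and the `StreamCodecSketch` class are made up for illustration; in the real library the streams would presumably come from the Bytes instance rather than from byte arrays.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical hand-written serializer that bypasses writeObject() entirely.
final class StreamCodecSketch {
    // Made-up value type used only for this example.
    static final class Quote {
        final long timestamp;
        final double price;
        final int size;

        Quote(long timestamp, double price, int size) {
            this.timestamp = timestamp;
            this.price = price;
            this.size = size;
        }
    }

    // Writes the fields in a fixed order; there is no type tag and no class lookup.
    static void write(OutputStream out, Quote q) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.writeLong(q.timestamp);
        data.writeDouble(q.price);
        data.writeInt(q.size);
        data.flush();
    }

    // Reads the fields back in the same order they were written.
    static Quote read(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        long timestamp = data.readLong();
        double price = data.readDouble();
        int size = data.readInt();
        return new Quote(timestamp, price, size);
    }

    // Convenience round trip through an in-memory buffer.
    static Quote roundTrip(Quote q) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            write(sink, q);
            return read(new ByteArrayInputStream(sink.toByteArray()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```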

leventov commented on July 1, 2024

@BoundedBuffer ReplicatedSharedHashMap.EntryExternalizable sits between memory and the wire; @RuedigerMoeller is talking about serialization between native Java and memory.

leventov commented on July 1, 2024

@peter-lawrey the problem is that VSHM does call writeObject() inside.

peter-lawrey commented on July 1, 2024

I was thinking of Chronicle, where it is entirely a choice. ;)

If you want to avoid looking up a marshaller for each class, what is the
alternative you want to use? Can you use a mutable wrapper?

SharedHashMap<String, MyBytesMarshallableRef> map = sharedMap; // the shared map
MyBytesMarshallableRef ref = new MyBytesMarshallableRef();

ref.value = myRandomType;

map.put(key, ref);

if (map.getUsing(key, ref) != null) {
    // the key was found and ref has been populated
}

RobAustin commented on July 1, 2024

Yes, agreed: VSHM does call writeObject(), but EntryExternalizable does not call writeObject().

RuedigerMoeller commented on July 1, 2024

Correct me if I am overlooking something. My intention is to completely replace the Bytes<=>Object transformation (e.g. avoid the per-class marshaller lookup and the instanceof chain).

@peter

If you want to avoid looking up a marshaller for each class, what is the
alternative you want to use? Can you use a mutable wrapper?

Just provide a delegation mechanism. The lookup can be avoided because frequently all values have the same type, so a custom serializer could cache a marshaller. It may seem ridiculous, but hash lookups always add to cache pollution. As encoding is the main performance bottleneck for off-heap storage (if one has to deal with arbitrary serializable classes), a lot of trickery can be done to speed it up (e.g. pre-known objects which are encoded as a short, partial/lazy decoding, etc.).
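
The caching idea can be sketched roughly as follows. This is an illustration under stated assumptions, not FST or HugeCollections code: `CachingWriterSketch`, `register`, and the `registryLookups` counter are invented names, `java.nio.ByteBuffer` stands in for `Bytes`, and the class is not thread-safe.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical writer that remembers the marshaller of the last class it saw,
// so a run of same-typed values pays for only one hash lookup.
final class CachingWriterSketch {
    private final Map<Class<?>, BiConsumer<ByteBuffer, Object>> registry = new HashMap<>();
    private Class<?> lastClass;
    private BiConsumer<ByteBuffer, Object> lastWriter;

    // Counts registry hits; present only to make the effect of the cache visible.
    int registryLookups;

    void register(Class<?> type, BiConsumer<ByteBuffer, Object> writer) {
        registry.put(type, writer);
    }

    void write(ByteBuffer out, Object value) {
        Class<?> type = value.getClass();
        if (type != lastClass) {
            // Slow path: consult the registry only when the type changes.
            BiConsumer<ByteBuffer, Object> writer = registry.get(type);
            registryLookups++;
            if (writer == null) {
                throw new IllegalArgumentException("no writer registered for " + type);
            }
            lastClass = type;
            lastWriter = writer;
        }
        // Fast path: reuse the cached writer.
        lastWriter.accept(out, value);
    }
}
```

Note that this sketch still calls getClass() on every write; it only removes the repeated hash lookup. A serializer bound to a single known value type could skip the class check as well.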

@BoundedBuffer - I am not too deep into the HFT classes, so I am not aware of the role of EntryExternalizable. I'll have to figure it out ;)

@leventov

you mean adding methods like customKeySerialization(BiConsumer<Bytes, K> serializer) and for value accordingly to SharedHashMapBuilder API would be useful?

Yep, something along these lines. Does this exist?

I thought about it overnight, and maybe I am better off completely wrapping the map and just putting byte arrays or Bytes from the wrapper (unfortunately, each library has its own flavour of Bytez abstraction ...). On the other hand, your shared map could get a significant speed boost if custom serialization were pluggable.
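
A minimal sketch of the wrapper variant, assuming the underlying map can store byte arrays as values. All names here are hypothetical, and a `ConcurrentHashMap` stands in for the shared map; the encoder and decoder are where a custom serializer would be plugged in.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical wrapper: values are encoded to byte[] before they reach the
// backing map, so the map's own object serialization is never involved.
final class ByteArrayMapWrapperSketch<K, V> {
    private final Map<K, byte[]> backing;
    private final Function<V, byte[]> encoder;
    private final Function<byte[], V> decoder;

    ByteArrayMapWrapperSketch(Map<K, byte[]> backing,
                              Function<V, byte[]> encoder,
                              Function<byte[], V> decoder) {
        this.backing = backing;
        this.encoder = encoder;
        this.decoder = decoder;
    }

    void put(K key, V value) {
        backing.put(key, encoder.apply(value));
    }

    // Returns null when the key is absent, mirroring Map.get.
    V get(K key) {
        byte[] raw = backing.get(key);
        return raw == null ? null : decoder.apply(raw);
    }

    // Example wiring with UTF-8 strings; ConcurrentHashMap is only a stand-in.
    static ByteArrayMapWrapperSketch<String, String> utf8Strings() {
        return new ByteArrayMapWrapperSketch<String, String>(
                new ConcurrentHashMap<String, byte[]>(),
                s -> s.getBytes(StandardCharsets.UTF_8),
                raw -> new String(raw, StandardCharsets.UTF_8));
    }
}
```

The trade-off is an extra byte[] allocation and copy per operation, which a hook inside the map itself could avoid.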

BTW thanks for quick feedback :-)

leventov commented on July 1, 2024

@RuedigerMoeller

Does this exist ?

Not yet. We will consider adding such a thing, thanks for the idea.

RobAustin commented on July 1, 2024

@RuedigerMoeller
What are your timescales for this? We've added a task on our internal JIRA system to add this functionality:

HCOLL-91 SHM key/value serializer abstraction (for configuration and speed)

RuedigerMoeller commented on July 1, 2024

Awesome! If it comes within, say, 3 months, it's OK for me.

RobAustin commented on July 1, 2024

OK - We'll aim for that.
