
Comments (13)

rdingwall commented on August 11, 2024

In your example, yes, you are buffering the entire contents into your MemoryStream, which then has to be rewound to the start before WCF will start reading from it.

The problem happens because of the design of the WCF streaming API:

  • When writing a FileStream, or making an HttpRequest, you get access to the output stream, and write to it yourself. This is the model protobuf-net-data (and most other libraries that deal with streams) supports.
  • In WCF, if you are sending a stream to the client, you have to return an instance of a stream, and WCF will read it until the stream ends. You cannot directly write to the output stream yourself.

It's a fairly common problem in WCF, usually solved by writing a custom Stream with a Read implementation that serializes a little bit at a time (and has to do some buffer management for when the serialized size doesn't match the requested amount, etc.). There is some discussion of this problem, with workarounds, here: http://stackoverflow.com/questions/2726527/wcf-and-streaming-requests-and-responses
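A minimal sketch of that kind of pull-based stream, for illustration only. `PullStream` and `Demo.Rows` are hypothetical names and not part of protobuf-net-data or WCF; the sketch assumes each row has already been turned into a `byte[]` by some lazy source, and only shows the leftover-buffer handling inside `Read`.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical read-only stream: produces bytes only when the consumer asks for them.
public sealed class PullStream : Stream
{
    private readonly IEnumerator<byte[]> chunks;  // each chunk is one serialized row
    private byte[] current = new byte[0];         // bytes serialized but not yet handed out
    private int offsetInCurrent;

    public PullStream(IEnumerable<byte[]> serializedRows)
    {
        chunks = serializedRows.GetEnumerator();
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int written = 0;
        while (written < count)
        {
            if (offsetInCurrent == current.Length)
            {
                // Leftover is used up: pull the next row, or stop at end of data.
                if (!chunks.MoveNext()) break;
                current = chunks.Current;
                offsetInCurrent = 0;
                continue;
            }
            // Copy as much of the current row as fits into the caller's buffer.
            int n = Math.Min(count - written, current.Length - offsetInCurrent);
            Buffer.BlockCopy(current, offsetInCurrent, buffer, offset + written, n);
            offsetInCurrent += n;
            written += n;
        }
        return written; // 0 tells the reader the stream has ended
    }

    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }

    protected override void Dispose(bool disposing)
    {
        if (disposing) chunks.Dispose();
        base.Dispose(disposing);
    }
}

public static class Demo
{
    // Stand-in for row-at-a-time serialization; nothing runs until the stream is read.
    public static IEnumerable<byte[]> Rows(int n)
    {
        for (int i = 0; i < n; i++)
            yield return Encoding.UTF8.GetBytes("row " + i + "\n");
    }
}
```

Returning `new PullStream(Demo.Rows(n))` from a service method would mean only one row's bytes are held at a time, instead of the whole result sitting in a MemoryStream.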

Unfortunately this would be impossible with the current version of protobuf-net-data because ProtoDataWriter writes the entire data reader at once. We need more overloads that only write one row at a time. This would probably be a good thing to have, not sure when I would have a chance to do it though...

from protobuf-net-data.

ChubbyArse commented on August 11, 2024

OK thanks Richard, does this also sound like the technique you've described?

http://code.msdn.microsoft.com/windowsdesktop/Custom-WCF-Streaming-436861e6

In your estimation, how long would it take to implement something like this in protobuf-net-data? I ask as I would probably start this, but if it's going to take more than a few days for someone who hasn't touched this assembly before, then I shan't even start!

Obviously, if I did code a solution for this I would feed it back in.


rdingwall commented on August 11, 2024

Yep pretty much, except that one is alternating between two sources (that memStreamReadStatus). We would just be reading from one source which would be much simpler. And single threaded.

I have written a very similar thing internally at a client before (from another serializer) so I can more-or-less see the code in my head already... but not sure exactly when I might get a chance to do it. If you can wait I can probably get something into NuGet in the next week.

If that's too long, you're more than welcome to give it a shot! Just note unit tests will be key for verifying the read stream is pulling back everything verbatim.


ChubbyArse commented on August 11, 2024

Thanks for the response, Richard. If you think you can get something up and running without too much trouble in a week or so, then I won't attempt to make a start. My project is already tight (aren't they always) and doesn't leave much room for prototyping or any other flights of fancy!

My plan was to incur the memory footprint in the interim and get something like you are suggesting written in once the pressure was off. Would your solution mean that the data would stream from the database, through the WCF service, without de-serialisation until it reached the consumer and they read the stream back into a reader?

Anything you produce, I'd be happy to test and provide feedback.

Cheers


rdingwall commented on August 11, 2024

Have done some work towards this in the protodatastream branch, but hit a snag where we can't force protobuf-net's ProtoWriter to flush between rows. My current implementation doesn't flush till you've written all the rows... buffering the entire thing in memory which is not what we want.

I've emailed Marc, hopefully he can make the Flush method public or propose some other workaround.


ChubbyArse commented on August 11, 2024

Thanks for your work on this - hopefully Marc will be able to accommodate the change.

It would be pretty neat to stream directly from the database through a WCF service - wouldn't it?


rdingwall commented on August 11, 2024

Hey I've got something working, if you grab the code here...

https://github.com/rdingwall/protobuf-net-data/tree/protodatastream

... and compile the .NET 2.0 version, you'll find a new ProtoDataStream class:

var reader = command.ExecuteReader();
var stream = new ProtoDataStream(reader);
return new GetReportDataResponse { DataReaderStream = stream };

Which should hopefully let you deserialize back into an IDataReader on the other side! Let me know how it goes and I'll get a new version out into NuGet later this week.


ChubbyArse commented on August 11, 2024

Got it and moved the new objects into .Net40 (as we are working against a proto-buf 4.0 build).

All existing unit tests are working fine. Next step, I'll build a test that streams data from the service for 5 minutes, through the client into a file and monitor the memory usage. I'll also switch back to the serialised implementation and post back my findings here.

I'll take the final build once you've checked into the main branch then.

Thanks for your work on this, Richard - hopefully it's functionality you can promote!

Alex


ChubbyArse commented on August 11, 2024

Hi Richard, This is looking really good - really encouraging results here.

I've set up a test WCF service that streams out 5 million rows and this looks fine. Performance Monitor indicates that the host process's memory footprint doesn't grow (stays around 34MB) while streaming out the request.

However, the client is using the ProtoDataReader and is incurring a growing memory footprint: the process grows from 15MB standing to 300MB after processing the first 5 million rows.

I then request the 5 million row stream again, and the memory footprint grows to around 350MB (ish) and then drops to 200MB. I suspect this is the GC kicking in and clearing down objects left lying around?

I'm using this code in the client - so I'm not holding anything from the reader in memory:

[code]
// rowCounter and i are declared by the surrounding test loop.
using (IDataReader reader = new ProtoDataReader(response.DataStream))
{
    while (reader.Read())
    {
        rowCounter++;
        if (rowCounter % 1000000 == 0)
        {
            Console.WriteLine("Processed {0} records on iteration {1}", rowCounter, i);
        }
    }
}
[/code]

Do you suspect there may be some areas that could be cleared up as we go, without relying on the GC?


rdingwall commented on August 11, 2024

Cool good to hear it's working okay. Memory usage sounds about normal for this sort of thing (in .NET) - if you could force a GC.Collect() a few times (not just once) after you finish streaming the memory should drop back down in perfmon. If it doesn't drop down... then we should investigate further :)
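One way to try that check, as a rough sketch rather than a definitive diagnostic. `MemoryCheck.CollectAndMeasure` is a hypothetical helper name; it uses the standard `GC.Collect`, `GC.WaitForPendingFinalizers` and `GC.GetTotalMemory` calls. Note that it reports the managed heap size, which is not the same number as the process footprint shown in perfmon, so treat it as a complementary reading.

```csharp
using System;

public static class MemoryCheck
{
    // Forces several full collections, then returns the managed heap size in bytes.
    public static long CollectAndMeasure(int passes)
    {
        for (int i = 0; i < passes; i++)
        {
            GC.Collect();                  // collect all generations
            GC.WaitForPendingFinalizers(); // let finalizers release what they hold
        }
        // true = wait for a collection before measuring
        return GC.GetTotalMemory(true);
    }
}
```

Calling `MemoryCheck.CollectAndMeasure(3)` before and after a streaming run, and comparing the two values, should show whether the managed heap falls back to roughly its starting size once the reader has been disposed.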


rdingwall commented on August 11, 2024

Have just pushed a new version into NuGet including ProtoDataStream.

http://nuget.org/packages/protobuf-net-data/2.0.6.611

Let me know if you have any issues!

Cheers,

Rich


ChubbyArse commented on August 11, 2024

Hi Richard, have the streaming changes for WCF been included in the .Net40 source now for build 2.0.6.614??

Thanks


rdingwall commented on August 11, 2024

Yes, ProtoDataStream was released in 2.0.6.611.

