Comments (13)
In your example, yes: you are buffering the entire contents into your MemoryStream, which then needs to be rewound to the start before WCF will start reading from it.
The problem happens because of the design of the WCF streaming API:
- When writing to a FileStream, or making an HttpRequest, you get access to the output stream and write to it yourself. This is the model protobuf-net-data (and most other libraries that deal with streams) supports.
- In WCF, if you are sending a stream to the client, you have to return an instance of a stream, and WCF will read from it until the stream ends. You cannot write to the output stream directly yourself.
It's a fairly common problem in WCF, usually solved by writing a custom Stream whose Read implementation serializes a little at a time (and does a bit of buffer management for when the serialized size doesn't match the requested amount, etc.). There is some discussion of this problem, with workarounds, here: http://stackoverflow.com/questions/2726527/wcf-and-streaming-requests-and-responses
Unfortunately this would be impossible with the current version of protobuf-net-data because ProtoDataWriter writes the entire data reader at once. We need more overloads that only write one row at a time. This would probably be a good thing to have, not sure when I would have a chance to do it though...
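To make the custom-Stream idea above concrete, here is a minimal sketch of a read-only stream that pulls data on demand. It is not protobuf-net-data code: `PullStream` and its `IEnumerable<byte[]>` source are hypothetical names for illustration, and the sketch assumes each chunk is one already-serialized, non-null item (for example, one row).

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper for illustration; not part of protobuf-net-data or WCF.
public sealed class PullStream : Stream
{
    private readonly IEnumerator<byte[]> chunks; // assumed: one serialized item per chunk
    private byte[] current = new byte[0];        // chunk currently being drained
    private int offsetInCurrent;                 // bytes of 'current' already handed out

    public PullStream(IEnumerable<byte[]> source)
    {
        if (source == null) throw new ArgumentNullException("source");
        chunks = source.GetEnumerator();
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int written = 0;
        while (written < count)
        {
            if (offsetInCurrent == current.Length)
            {
                // Current chunk is exhausted: produce the next one only now.
                if (!chunks.MoveNext()) break;
                current = chunks.Current;
                offsetInCurrent = 0;
                continue;
            }
            // Copy as much as fits; any leftover stays in 'current' for the next Read.
            int n = Math.Min(count - written, current.Length - offsetInCurrent);
            Buffer.BlockCopy(current, offsetInCurrent, buffer, offset + written, n);
            offsetInCurrent += n;
            written += n;
        }
        return written; // 0 tells the caller the stream has ended
    }

    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }

    protected override void Dispose(bool disposing)
    {
        if (disposing) chunks.Dispose();
        base.Dispose(disposing);
    }
}
```

If the source is a lazy iterator (a method using `yield return`), only one chunk plus the caller's buffer is held in memory at a time, which is the whole point of returning a stream like this from a WCF operation.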
from protobuf-net-data.
OK thanks Richard, does this also sound like the technique you've described?
http://code.msdn.microsoft.com/windowsdesktop/Custom-WCF-Streaming-436861e6
In your estimation, how long would it take to implement something like this in protobuf-net-data? I ask because I would probably start this, but if it's going to take more than a few days for someone who hasn't touched this assembly before, then I shan't even start!
Obviously, if I did code a solution for this I would feed it back in.
from protobuf-net-data.
Yep pretty much, except that one is alternating between two sources (that memStreamReadStatus). We would just be reading from one source which would be much simpler. And single threaded.
I have written a very similar thing internally at a client before (from another serializer) so I can more-or-less see the code in my head already... but not sure exactly when I might get a chance to do it. If you can wait I can probably get something into NuGet in the next week.
If that's too long, you're more than welcome to give it a shot! Just note unit tests will be key for verifying the read stream is pulling back everything verbatim.
from protobuf-net-data.
Thanks for the response, Richard. If you think you can get something up and running without too much trouble in a week or so, then I won't attempt to make a start. My project is already tight (aren't they always) and doesn't leave much room for prototyping or any other flights of fancy!
My plan was to incur the memory footprint in the interim and get something like you are suggesting written in once the pressure was off. Would your solution mean that the data would stream from the database, through the WCF service, without de-serialisation until it reached the consumer and they read the stream back into a reader?
Anything you produce, I'd be happy to test and provide feedback.
Cheers
from protobuf-net-data.
Have done some work towards this in the protodatastream branch, but hit a snag where we can't force protobuf-net's ProtoWriter to flush between rows. My current implementation doesn't flush till you've written all the rows... buffering the entire thing in memory which is not what we want.
I've emailed Marc, hopefully he can make the Flush method public or propose some other workaround.
from protobuf-net-data.
Thanks for your work on this - hopefully Marc will be able to accommodate the change.
It would be pretty neat to stream directly from the database through a WCF service - wouldn't it?
from protobuf-net-data.
Hey I've got something working, if you grab the code here...
https://github.com/rdingwall/protobuf-net-data/tree/protodatastream
... and compile the .NET 2.0 version, you'll find a new ProtoDataStream class:
var reader = command.ExecuteReader();
var stream = new ProtoDataStream(reader);
return new GetReportDataResponse { DataReaderStream = stream };
Which should hopefully let you deserialize back into an IDataReader on the other side! Let me know how it goes and I'll get a new version out into NuGet later this week.
from protobuf-net-data.
Got it and moved the new objects into .Net40 (as we are working against a protobuf-net .NET 4.0 build).
All existing unit tests are working fine. Next step, I'll build a test that streams data from the service for 5 minutes, through the client into a file and monitor the memory usage. I'll also switch back to the serialised implementation and post back my findings here.
I'll take the final build once you've checked into the main branch then.
Thanks for your work on this, Richard - hopefully it's functionality you can promote!
Alex
from protobuf-net-data.
Hi Richard, This is looking really good - really encouraging results here.
I've set up a test WCF service that streams out 5 million rows and this looks fine. Performance Monitor indicates that the host process's memory footprint doesn't grow (stays around 34MB) while streaming out the request.
However, the client is using the ProtoDataReader and is incurring a growing memory footprint: the process grows from 15MB standing to 300MB after processing the first 5 million rows.
I then request the 5 million row stream again, and the memory footprint grows to around 350MB (ish) and then drops to 200MB. I suspect this is the GC kicking in and clearing down objects left lying around?
I'm using this code in the client - so I'm not holding anything from the reader in memory:
[code]
IDataReader reader = new ProtoDataReader(response.DataStream);
while (reader.Read())
{
    rowCounter++;
    if (rowCounter % 1000000 == 0)
    {
        Console.WriteLine("Processed {0} records on iteration {1}", rowCounter, i);
    }
}
reader.Close();
reader.Dispose();
[/code]
Do you suspect there may be some areas that could be cleared up as we go, without relying on the GC?
from protobuf-net-data.
Cool good to hear it's working okay. Memory usage sounds about normal for this sort of thing (in .NET) - if you could force a GC.Collect() a few times (not just once) after you finish streaming the memory should drop back down in perfmon. If it doesn't drop down... then we should investigate further :)
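For reference, forcing a few collections after streaming could look something like the sketch below. `GcCheck.ForceFullCollection` is a hypothetical helper name; the `GC` calls are standard .NET. Note that `GC.GetTotalMemory` reports managed heap size, which is not the same number perfmon shows for the process, so treat it as a rough cross-check only.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class GcCheck
{
    // Runs several full collections and returns the managed heap size afterwards.
    public static long ForceFullCollection(int passes)
    {
        for (int i = 0; i < passes; i++)
        {
            GC.Collect();                  // collect all generations
            GC.WaitForPendingFinalizers(); // let finalizers release what they hold
        }
        GC.Collect();                      // pick up objects freed by finalizers
        return GC.GetTotalMemory(false);   // bytes currently thought to be allocated
    }
}
```

Calling `GcCheck.ForceFullCollection(3)` after `reader.Dispose()` and comparing the result across iterations should show whether the managed heap keeps growing or settles back down.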
from protobuf-net-data.
Have just pushed a new version into NuGet including ProtoDataStream.
http://nuget.org/packages/protobuf-net-data/2.0.6.611
Let me know if you have any issues!
Cheers,
Rich
from protobuf-net-data.
Hi Richard, have the streaming changes for WCF been included in the .Net40 source now for build 2.0.6.614??
Thanks
from protobuf-net-data.
Yes, ProtoDataStream was released in 2.0.6.611.
from protobuf-net-data.