Comments (4)
The bad news: there are some XLS writers in the wild that store the underlying data in ways which require seeking to the end of the file first, so there's no general way of doing this without writing to a temporary file first (I'll explain why at the end).
The good news: someone already made a module that writes to a temporary file for you (excel-stream).
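For illustration, the temporary-file approach can be sketched in a few lines of Python. This is only a minimal sketch of the general idea, not how the module mentioned above is implemented; the helper name `spool_to_tempfile` and the chunk size are assumptions made up for this example.

```python
import io
import shutil
import tempfile


def spool_to_tempfile(stream, chunk_size=64 * 1024):
    """Copy a (possibly non-seekable) binary stream into a temporary file.

    Hypothetical helper: returns an open, seekable file object positioned
    at the start. The temporary file is deleted when the object is closed.
    """
    tmp = tempfile.TemporaryFile()
    # Copy in fixed-size chunks so the whole input is never held in memory.
    shutil.copyfileobj(stream, tmp, chunk_size)
    tmp.seek(0)
    return tmp
```

Something like `spool_to_tempfile(sys.stdin.buffer)` would then give a parser a handle it can seek freely, including to the end of the file.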
Why you can't generally parse XLS in a streaming manner: XLS files consist of binary sub-files within a larger container (the "Compound File Binary File Format"; https://github.com/SheetJS/js-cfb is a library that implements just the container parsing). The container is a series of blocks. The header block points to a directory that records where each sub-file starts, and the blocks of a sub-file are chained together by next-block addresses. There is no requirement that the blocks belonging to a file appear in order, and there is no requirement that the entire directory appear before the actual data. Some clever XLS writers exploit this to allow for streaming (in the obvious way: the writer writes the data before the directory, and the receiver can calculate what the directory looks like based on the data it already received).
As a concrete example: http://svn.apache.org/repos/asf/poi/trunk/test-data/spreadsheet/45290.xls is from the POI test suite. In this case, the directory appears near the end of the file, so you'd basically have to read the entire file to parse it properly.
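To check where the directory sits in a given file, the container header can be inspected directly. The sketch below assumes the header layout from the published CFB specification as I understand it (8-byte signature, sector-size exponent at byte offset 30, first directory sector number at byte offset 48); the function name `directory_offset` is made up for this example. It only locates the first directory block; following the rest of the chain needs the allocation information, which can itself live anywhere in the file.

```python
import struct

# Magic bytes at the start of every CFB container.
CFB_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")


def directory_offset(header: bytes) -> int:
    """Return the byte offset of the first directory block (hypothetical helper).

    `header` must hold at least the first 52 bytes of the file.
    """
    if len(header) < 52:
        raise ValueError("need at least 52 bytes of header")
    if header[:8] != CFB_SIGNATURE:
        raise ValueError("not a CFB container")
    # Block size is stored as a power of two (little-endian uint16 at offset 30).
    (sector_shift,) = struct.unpack_from("<H", header, 30)
    # Number of the first directory block (little-endian uint32 at offset 48).
    (first_dir_sector,) = struct.unpack_from("<I", header, 48)
    # Block 0 starts one block-size past the beginning of the file,
    # because the header occupies that first slot.
    return (first_dir_sector + 1) << sector_shift
```

Comparing the returned offset with the file size shows how much of the file has to arrive before the directory can even begin to be read.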
from j.
@SheetJSDev ahhhhhhh, thanks for the explanation!
XLS is frikkin crazy. I'll definitely check out the excel-stream module; it looks like what I need.
from j.
"so there's no general way of doing this without writing to a temporary file first"
Why can't you buffer up stdin and then read it?
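A minimal sketch of the buffering idea, assuming the whole file fits in memory; the helper name `buffer_stream` is made up for this example.

```python
import io


def buffer_stream(stream):
    """Read a binary stream to EOF and return a seekable in-memory copy.

    Hypothetical helper: trades memory for not touching the disk at all.
    """
    return io.BytesIO(stream.read())
```

Calling `buffer_stream(sys.stdin.buffer)` would give a parser a seekable object without a temporary file, at the cost of holding the entire input in memory.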
from j.
@sindresorhus good point :)
from j.