excel-stream's Introduction

excel-stream

A stream that converts Excel spreadsheets into arrays of JSON objects.

Examples

// stream rows from the first sheet on the file
var excel = require('excel-stream')
var fs = require('fs')

fs.createReadStream('accounts.xlsx')
  .pipe(excel())  // same as excel({sheetIndex: 0})
  .on('data', console.log)

// stream rows from the sheet named 'Your sheet name'
var excel = require('excel-stream')
var fs = require('fs')

fs.createReadStream('accounts.xlsx')
  .pipe(excel({
     sheet: 'Your sheet name'
  }))
  .on('data', console.log)

stream options

The options object may have the same properties as csv-stream and these two additional properties:

  • sheet: the name of the sheet you want to stream. Case sensitive.
  • sheetIndex: the sheet number you want to stream (0-based).

Usage

npm install -g excel-stream
excel-stream < accounts.xlsx > account.json

options

newline delimited json:

excel-stream --newlines
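
To illustrate the difference between the two output shapes, here is a minimal sketch in plain Node.js. It does not call excel-stream; the helper names `toJsonArray` and `toNdjson` are hypothetical, and the exact whitespace the CLI emits is not specified here, so treat the formatting as an assumption.

```javascript
// Two rows, as they might be emitted by the stream (illustrative data only).
const rows = [
  { foo: 1, bar: 2, baz: 3 },
  { foo: 4, bar: 5, baz: 6 }
]

// One JSON array holding every row: the whole document must be parsed at once.
function toJsonArray (rows) {
  return JSON.stringify(rows)
}

// Newline-delimited JSON: one complete JSON document per line,
// so a consumer can parse the output line by line.
function toNdjson (rows) {
  return rows.map(function (row) {
    return JSON.stringify(row)
  }).join('\n') + '\n'
}

console.log(toJsonArray(rows))
process.stdout.write(toNdjson(rows))
```

Newline-delimited output is usually the easier form to pipe into other line-oriented tools.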

formats

each row becomes a javascript object, so input like

foo, bar, baz
  1,   2,   3
  4,   5,   6

will become

[{
  foo: 1,
  bar: 2,
  baz: 3
}, {
  foo: 4,
  bar: 5,
  baz: 6
}]
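
The mapping above can be sketched in a few lines of plain JavaScript. This is not the library's code; `rowsToObjects` is a hypothetical helper that assumes the first row holds the column names.

```javascript
// Turn a table (first row = header) into an array of objects keyed by column name.
function rowsToObjects (table) {
  const header = table[0].map(function (name) {
    return String(name).trim()
  })
  return table.slice(1).map(function (cells) {
    const obj = {}
    header.forEach(function (key, i) {
      obj[key] = cells[i]
    })
    return obj
  })
}

const table = [
  ['foo', 'bar', 'baz'],
  [1, 2, 3],
  [4, 5, 6]
]

console.log(rowsToObjects(table))
```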

Don't Look Now

So, Excel isn't really a streamable format. But it's easy to work with streams because everything is a stream. This writes the input to a tmp file, pipes that file through the unfortunately named j, and then pipes the result into csv-stream.

License

MIT

excel-stream's People

Contributors

anandthakker, dominictarr, gtramontina, guumaster, max-mapper

excel-stream's Issues

Error on creating tmp file

The library (or one of its dependencies) assumes that the /home/<userdir>/tmp directory exists. Otherwise the following error occurs:

stream.js:94
      throw er; // Unhandled stream error in pipe.
            ^
Error: ENOENT, open '/home/<username>/tmp/_1415470413656'

Shouldn't it write to /tmp by default?
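
As a possible workaround, here is a minimal sketch that resolves the system temp directory with Node's `os.tmpdir()` and creates it if it is missing. The helper `ensureTempDir` is hypothetical and is not part of excel-stream. On POSIX systems `os.tmpdir()` honours the TMPDIR, TMP and TEMP environment variables and falls back to /tmp.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Resolve the temp directory and make sure it exists before writing into it.
// `recursive: true` makes mkdirSync a no-op when the directory is already there.
function ensureTempDir (dir) {
  const target = dir || os.tmpdir()
  fs.mkdirSync(target, { recursive: true })
  return target
}

const dir = ensureTempDir()
const file = path.join(dir, '_' + Date.now())

fs.writeFileSync(file, 'placeholder')
console.log('wrote', file)
fs.unlinkSync(file) // clean up the scratch file
```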

EACCES error

I'm running this with Meteor, hosted on Meteor's Galaxy service.
I don't have access to change permissions on the filesystem.

It seems to be reading from a tmp file, and when it does this, it produces an EACCES error. Is there any option to assemble the file in memory?

excel-stream converts strings to ints

I have an Excel spreadsheet with a phone number column. I have set the format of the data to text, because I expect it as a string (there are leading zeros). No matter what I try, excel-stream still parses these into integers. I have tried putting them into international format with a leading '+', but the '+' is stripped out and the resulting value is converted to an int.

Is there something I'm doing wrong here?
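
The behaviour matches the coercion expression quoted in the empty-value issues further down this page (`isNaN(value) ? value : +value`), which turns any numeric-looking string into a number. A minimal sketch of that effect follows; `coerceRow` and its `textColumns` parameter are hypothetical and only illustrate one possible workaround, not an existing option.

```javascript
// The coercion expression as quoted in the issues on this page.
function coerce (value) {
  return isNaN(value) ? value : +value
}

// Hypothetical workaround: leave the listed columns as strings.
function coerceRow (row, textColumns) {
  const out = {}
  Object.keys(row).forEach(function (key) {
    out[key] = textColumns.indexOf(key) !== -1 ? row[key] : coerce(row[key])
  })
  return out
}

console.log(coerce('0123456789'))     // 123456789 (leading zero lost)
console.log(coerce('+441234567890'))  // 441234567890 (plus sign lost)
console.log(coerceRow({ name: 'Ann', phone: '0123456789', age: '30' }, ['phone']))
```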

Large files not assembled perfectly with GridFS stream

A very large file (100MB) produces some incorrect values. It looks like the values in half of a row are scrambled, but not all values are affected.
I'm streaming from Mongo GridFS and not the filesystem. Is there any reason why the file might get scrambled with GridFS?

I can't share the file to reproduce the issue, so I'm not sure how to reproduce.

Comma in cell value causes issues

When my cell value included a comma (','), the row value I got inside the 'data' event handler was truncated at the comma and was suffixed with a double quote ('"').

So, for example, a value of abcd, efg would result in abcd".
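
The report does not establish the cause, but the symptom is typical of CSV text being split without regard to quoting. The sketch below only illustrates that general class of problem; both functions are hypothetical and neither is taken from excel-stream or csv-stream.

```javascript
// Naive splitting: every comma is treated as a field separator.
function splitNaive (line) {
  return line.split(',')
}

// Quote-aware splitting: commas inside double quotes belong to the cell,
// and a doubled quote ("") inside a quoted cell is a literal quote.
function splitQuoted (line) {
  const cells = []
  let cell = ''
  let inQuotes = false
  for (let i = 0; i < line.length; i++) {
    const ch = line[i]
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        cell += '"'
        i++
      } else if (ch === '"') {
        inQuotes = false
      } else {
        cell += ch
      }
    } else if (ch === '"') {
      inQuotes = true
    } else if (ch === ',') {
      cells.push(cell)
      cell = ''
    } else {
      cell += ch
    }
  }
  cells.push(cell)
  return cells
}

const line = '1,"abcd, efg",2'
console.log(splitNaive(line))  // the quoted cell is broken into two fragments
console.log(splitQuoted(line)) // the quoted cell survives intact
```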

Nondeterministic collisions on temp file in highly parallel environments

The generation of temporary filenames in index.js is insufficient to allow this library to be used in highly parallel environments where it is reasonable to expect that multiple processes may start on the same millisecond.

This happens regularly in our environment because two separate excel-stream processes generate the same temp file name and collide with each other:

Uncaught exception:

[Error: j: error parsing [...]\AppData\Local\Temp\_1474854568720: Error: Corrupted zip or bug : unexpected signature (\x2F\x2E\x72\x65, expected \x50\x4B\x03\x04)
]
Error: j: error parsing [...]\AppData\Local\Temp\_1474854568720: Error: Corrupted zip or bug : unexpected signature (\x2F\x2E\x72\x65, expected \x50\x4B\x03\x04)

    at [...]\node_modules\excel-stream\index.js:49:34
    at ConcatStream.<anonymous> ([...]\node_modules\excel-stream\node_modules\concat-stream\index.js:36:43)
    at emitNone (events.js:72:20)
    at ConcatStream.emit (events.js:166:7)
    at finishMaybe ([...]\node_modules\excel-stream\node_modules\concat-stream\node_modules\readable-stream\lib\_stream_writable.js:475:14)
    at afterWrite ([...]\node_modules\excel-stream\node_modules\concat-stream\node_modules\readable-stream\lib\_stream_writable.js:361:3)
    at nextTickCallbackWithManyArgs (node.js:464:18)
    at process._tickDomainCallback (node.js:403:17)

Temporary filename generation should be handled in a secure and standard way, such as that provided by the 'tmp' library.
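
For illustration, here is a minimal sketch of collision-resistant temp file creation using only Node's standard library rather than the 'tmp' package. The helpers `uniqueTempPath` and `createExclusive` are hypothetical. The name combines the process id with random bytes, and the `'wx'` flag makes the open fail with EEXIST instead of silently sharing a file if a collision still happens.

```javascript
const crypto = require('crypto')
const fs = require('fs')
const os = require('os')
const path = require('path')

// Build a name from the pid plus 8 random bytes instead of a millisecond timestamp.
function uniqueTempPath (dir) {
  const name = '_' + process.pid + '_' + crypto.randomBytes(8).toString('hex')
  return path.join(dir || os.tmpdir(), name)
}

// Create the file exclusively: 'wx' throws EEXIST if the path already exists.
function createExclusive (file) {
  fs.closeSync(fs.openSync(file, 'wx'))
  return file
}

const a = createExclusive(uniqueTempPath())
const b = createExclusive(uniqueTempPath())
console.log(a !== b)

// Clean up the scratch files.
fs.unlinkSync(a)
fs.unlinkSync(b)
```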

Fails silently on Node v0.10.31 + OS X / Linux

When the error described in SheetJS/j#4 happens, it's swallowed and you just get empty, apparently successful output.

E.g.:

$ j accounts.xlsx 
node v0.10.31 is known to crash on OSX and Linux, refusing to proceed.
see https://github.com/SheetJS/j/issues/4 for the relevant discussion.
see https://github.com/joyent/node/issues/8208 for the relevant node issue
node v0.10.31 is known to crash on OSX and Linux, refusing to proceed.
see https://github.com/SheetJS/j/issues/4 for the relevant discussion.
see https://github.com/joyent/node/issues/8208 for the relevant node issue

.../node_modules/j/j.js:243
        throw "node v0.10.31 bug";
        ^
node v0.10.31 bug

But

$ cat accounts.xlsx | excel-stream 
[

]
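
One way to surface such failures, assuming the converter is run as a child process, is to check its exit status and stderr instead of only reading stdout. The sketch below uses a throwaway Node child process as a stand-in for the failing converter; `runOrThrow` is a hypothetical helper, not part of excel-stream or j.

```javascript
const { spawnSync } = require('child_process')

// Run a command and throw if it could not start or exited with a non-zero status.
function runOrThrow (command, args) {
  const result = spawnSync(command, args, { encoding: 'utf8' })
  if (result.error) throw result.error
  if (result.status !== 0) {
    throw new Error(command + ' exited with code ' + result.status + ': ' + result.stderr.trim())
  }
  return result.stdout
}

// Stand-in for a converter that prints an error and exits with a failure code.
try {
  runOrThrow(process.execPath, ['-e', 'console.error("boom"); process.exit(2)'])
} catch (err) {
  console.error('conversion failed:', err.message)
}
```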

Multiple sheets

Hi,
Does your brilliant lib handle multiple sheets perchance?

Empty value in excel is converted into 0

Every empty value in Excel is converted to 0 due to the fact that:

isNaN('') => false

and line 41: _data[k.trim()] = isNaN(value) ? value : +value

turns an empty value into 0.

I suggest the following check to decide whether a value is a number:

(isNaN(value) || value === '') ? value : +value;

You can find a working sample here.

empty value will be converted to 0 which is unexpected

As the following code shows, isNaN("") is false, so _data[k.trim()] gets the value 0.

_data[k.trim()] = isNaN(value) ? value : +value

Wouldn't the result "" be more reasonable? For example:

_data[k.trim()] = (value === "" || isNaN(value)) ? value : +value
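
A runnable sketch of the proposed check, written as a standalone function (`coerceCell` is a hypothetical name, not the library's). It also guards against whitespace-only strings, which `isNaN` likewise treats as numeric because `Number('  ')` is 0.

```javascript
// Convert numeric-looking strings to numbers, but keep empty or
// whitespace-only strings as they are instead of turning them into 0.
function coerceCell (value) {
  if (typeof value === 'string' && value.trim() === '') return value
  return isNaN(value) ? value : +value
}

console.log(isNaN(''))          // false, which is why '' used to become 0
console.log(coerceCell(''))     // '' is preserved
console.log(coerceCell('42'))   // 42
console.log(coerceCell('abc'))  // 'abc'
```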
