rtf-parse's Issues

Tokenizer.process is called recursively

We need to refactor this method so that it no longer calls itself recursively. As it stands, it throws a call stack size error when working with big RTF files.
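One possible direction is to drive the tokenizer with a loop instead of recursion, so the stack depth stays constant however large the input is. The following is a minimal sketch, not the library's actual Tokenizer: the function name `tokenizeIteratively`, the token shapes, and the simplified grammar (no `\'hh` hex escapes, no `\*`, no binary data) are all assumptions made for illustration.

```javascript
// Hypothetical sketch: a loop-based tokenizer. Not rtf-parse's real API.
function tokenizeIteratively( rtf ) {
    const tokens = [];
    // Sticky pattern: group start, group end, escaped character (\\, \{, \}),
    // command word with optional numeric parameter and one delimiting space,
    // or a run of plain text. This grammar is deliberately simplified.
    const pattern = /(\{)|(\})|\\([\\{}])|\\([a-zA-Z]+-?\d*) ?|([^\\{}]+)/y;
    let position = 0;

    while ( position < rtf.length ) {
        pattern.lastIndex = position;
        const match = pattern.exec( rtf );

        if ( !match ) {
            throw new Error( `Unexpected input at offset ${ position }` );
        }

        if ( match[ 1 ] ) {
            tokens.push( { type: 'groupStart' } );
        } else if ( match[ 2 ] ) {
            tokens.push( { type: 'groupEnd' } );
        } else if ( match[ 3 ] ) {
            tokens.push( { type: 'escape', value: match[ 3 ] } );
        } else if ( match[ 4 ] ) {
            tokens.push( { type: 'command', value: match[ 4 ] } );
        } else {
            tokens.push( { type: 'text', value: match[ 5 ] } );
        }

        // Every alternative consumes at least one character, so the loop always advances.
        position = pattern.lastIndex;
    }

    return tokens;
}
```

Because each iteration handles exactly one token and then continues from `lastIndex`, the input size only affects the length of the `tokens` array, not the call stack.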

Merge subsequent text entries

When creating the model it would be nice to join all the subsequent text models, even if they were split by a \r\n line separator.
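A rough sketch of what such a merge step could look like. The entry shape (`{ type, value }`) and the function name `mergeAdjacentText` are hypothetical, chosen only for this example; whether the line separator itself is kept or dropped is left open here.

```javascript
// Hypothetical sketch: entries are plain objects such as { type: 'text', value: '...' }.
function mergeAdjacentText( entries ) {
    const merged = [];

    for ( const entry of entries ) {
        const previous = merged[ merged.length - 1 ];

        if ( entry.type === 'text' && previous && previous.type === 'text' ) {
            // Store a new object so that the input entries are not mutated.
            merged[ merged.length - 1 ] = { type: 'text', value: previous.value + entry.value };
        } else {
            merged.push( entry );
        }
    }

    return merged;
}
```

Any non-text entry between two text entries acts as a boundary, so only directly adjacent text entries are joined.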

Paragraph support

Paragraphs in RTF are somewhat funny, as they're enclosed between the \pard and \par commands. A paragraph is probably also enclosed in a group.

{\pard This is my fancy text.\par}
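To make the shape above concrete, here is a deliberately naive sketch that pulls the text out of flat paragraphs like the one shown. It assumes the paragraph contains no nested commands or groups, and `extractParagraphs` is a hypothetical helper, not part of the library; real support would have to work on tokens rather than on a regular expression.

```javascript
// Hypothetical sketch: handles only flat "\pard text\par" spans with no nested commands or groups.
function extractParagraphs( rtf ) {
    // \b after "pard" and "par" keeps "\par" from matching the start of "\pard".
    const pattern = /\\pard\b ?([^\\{}]*)\\par\b/g;

    return [ ...rtf.matchAll( pattern ) ].map( match => match[ 1 ] );
}
```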

Event for model instances

We need an event to be fired when a model entry is added.

Currently we only have the Tokenizer.matched event, which is triggered for Token instances. We need the same thing for models.

Actually it's possible that this event will be helpful for #5.
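A minimal sketch of how such an event could be wired, using Node's built-in `EventEmitter`. The `ModelBuilder` class and the `modelAdded` event name are assumptions made for this example and do not reflect existing library code.

```javascript
const { EventEmitter } = require( 'events' );

// Hypothetical builder: only the event wiring matters here.
class ModelBuilder extends EventEmitter {
    constructor() {
        super();
        this.entries = [];
    }

    append( entry ) {
        this.entries.push( entry );
        // 'modelAdded' is an assumed name, mirroring the token-level "matched" event.
        this.emit( 'modelAdded', entry );

        return entry;
    }
}
```

A listener registered with `builder.on( 'modelAdded', entry => { ... } )` would then see each entry as soon as it is appended.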

Provide unified names for commands

Currently the command model operates on the "raw" command value picked up while parsing. This means that the same command might appear as \rtf1 as well as "\rtf1 " (with a space at the end). Instead, the model should have a property like name, where the value would be unified to simply rtf1.
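A small sketch of the normalization described above. `unifiedCommandName` is a hypothetical helper; it assumes the raw value starts with a backslash and may end with delimiting whitespace.

```javascript
// Hypothetical helper: strips the leading backslash and any trailing whitespace.
function unifiedCommandName( rawValue ) {
    return rawValue.replace( /^\\/, '' ).replace( /\s+$/, '' );
}
```

With this, both raw variants of the command yield the same name, so callers can compare against `'rtf1'` without caring how the command was delimited in the source.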

Better bmp handling

While #12 added initial handling for images, the support is not yet fully featured.

One thing I saw missing is support for bmp images.

To reproduce, create an RTF file using WordPad on Windows 10 and paste an image from the clipboard. Then use example/images.js to extract images from this file.

The produced BMPs contain some correct information, because a graphics app can open and display them; however, the image size is incorrect.

Implement RTF model

Now that the PoC for token parsing is ready, we need to create an RTF model based on the tokens. This is what people using the lib will actually work with.

In this project the RTF model would mean pretty much the same as what the DOM means for HTML.

Expose model classes

All the Model subclasses should be exposed so that they can be used conveniently with the Model.getChildren() method and so on.

Implement missing applyToModel methods

At least the token.Escape class still has no applyToModel method implemented.

For a brief moment I was thinking about softening the "virtual" implementation in the base Token interface, but it's actually better when it pops up as soon as a dev forgets to implement one.
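For illustration, the "virtual" pattern mentioned above might look roughly like this. The class bodies are hypothetical stand-ins, not the library's actual source; the point is only that the base method throws until a subclass overrides it.

```javascript
// Hypothetical stand-ins for the base token class and one subclass.
class Token {
    applyToModel() {
        // Fails loudly so that a missing override is noticed on first use.
        throw new Error( `${ this.constructor.name }.applyToModel() is not implemented` );
    }
}

// No override yet, so calling applyToModel() on an Escape instance throws.
class Escape extends Token {}
```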

Refactor tokens

It became clear that token classes need to be separate classes from the AST model types. So it makes perfect sense to create a dedicated namespace for tokens to keep things clear.

Error on building with webpack - Uglify Error

Hello, first of all thanks for this great job.

I'm using this plugin in a web-based application, and at the end I'm trying to produce a build with

"npm run build"

but I'm getting this error:
ERROR in 0.a9abdcd49066c2a9cfb7.chunk.js from UglifyJs
Unexpected token: name (Parser) [0.a9abdcd49066c2a9cfb7.chunk.js:96957,7]

The problem is in node_modules/rtf-parse/Parser.js: it is written in ES6, and UglifyJS can't handle ES6.

I'm using this config with webpack, but it doesn't help.


module: {
    loaders: [{
      test: /\.js$/, // Transform all .js files required somewhere with Babel
      loader: 'babel-loader',
      exclude: /node_modules\/(?!(rtf\-parse)\/).*/,
      query: options.babelQuery,
    }, {

Can you please publish releases that are converted to ES5, or do you have an idea how I can fix my problem?

Provide an option to early return

A great addition would be the possibility to return early from parsing.

There are a couple of use cases for this:

  • You might simply want to parse up to the point where you find the interesting data; no further parsing is needed.
  • You want to parse just a part of the RTF, and return as fast as possible.

I'll explain the second case:

Say I have loaded a 30 MB RTF file with a picture at its very end. I can be smart and just find the {\pict position in the string, then start parsing from there until I reach the matching GroupEnd, so that I have all the picture data loaded. I'm not interested in parsing the whole file, so once I have this information I want to abort parsing.
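The second case can be sketched without any parser support at all, which shows what an early return would save. The helper below is hypothetical and simplified: it assumes the picture is stored as text (it does not account for raw binary data, which could contain brace bytes) and that the first `{\pict` occurrence is the one of interest.

```javascript
// Hypothetical sketch: returns the first "{\pict ...}" group as a substring, or null.
function extractFirstPictGroup( rtf ) {
    const start = rtf.indexOf( '{\\pict' );

    if ( start === -1 ) {
        return null;
    }

    let depth = 0;

    for ( let i = start; i < rtf.length; i++ ) {
        const char = rtf[ i ];

        if ( char === '\\' ) {
            // Skip the character after a backslash so escaped braces (\{ and \}) are not counted.
            i++;
        } else if ( char === '{' ) {
            depth++;
        } else if ( char === '}' ) {
            depth--;

            if ( depth === 0 ) {
                // Matching group end found: stop scanning here instead of reading the rest.
                return rtf.slice( start, i + 1 );
            }
        }
    }

    // The group was never closed.
    return null;
}
```

The returned substring could then be handed to the parser on its own, which is roughly the workflow an early-return option would make unnecessary.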

Methods for traversing model

Now that we have some fundamentals in place, we can focus on methods that will help us get the actual work done.

We need some convenient ways to work with models, e.g. parentModel.find( curModel => curModel.value !== 'foo' ).

It should also allow for nested search.
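A minimal sketch of such a `find` method with an optional nested search. `ExampleModel`, its `children` array, and the `deep` flag are assumptions made for this example rather than the library's actual design.

```javascript
// Hypothetical stand-in for a model node with child models.
class ExampleModel {
    constructor( value, children = [] ) {
        this.value = value;
        this.children = children;
    }

    // Returns the first matching descendant in depth-first order, or null.
    // With deep === false only direct children are checked.
    find( predicate, deep = false ) {
        for ( const child of this.children ) {
            if ( predicate( child ) ) {
                return child;
            }

            if ( deep ) {
                const nested = child.find( predicate, true );

                if ( nested ) {
                    return nested;
                }
            }
        }

        return null;
    }
}
```

Keeping the nested search opt-in means the shallow call stays cheap, while `parentModel.find( callback, true )` walks the whole subtree.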
