XLParser

A C# Excel formula parser with the following properties:

  • High compatibility
    XLParser has been tested on over a million real-world formulas and has a 99.9% successful parse rate.
  • Compact parse trees
    XLParser was designed for and is used in research on Excel spreadsheets and refactoring, both of which are easier when parse trees are smaller.
  • Compact grammar
    Our grammar contains fewer than 100 tokens and production rules, and is thus easy to implement in another language or parser generator.

Quickstart

You can parse formulas and view parse trees in the online demo.

or

Use the NuGet package.

or

  1. Download the latest release
  2. Extract somewhere convenient
  3. Build the Irony GrammarExplorer (see the Debugging section below) and open it
  4. Click on the ... button at the top and select Add Grammar
  5. Point to the XLParser.dll file in the folder you extracted and click OK
  6. You can now parse formulas and see the trees in the Test tab

Background

XLParser is the reference implementation of the Excel grammar published in the paper "A Grammar for Spreadsheet Formulas Evaluated on Two Large Datasets" by Efthimia Aivaloglou, David Hoepelman and Felienne Hermans.

XLParser parses Excel formulas and is intended to facilitate the analysis of spreadsheet formulas; for that purpose it produces compact parse trees. XLParser has a 99.99% success rate on the Enron and EUSES datasets. Note, however, that XLParser is not very restrictive and might parse formulas that Excel would reject as invalid. Keep this in mind when parsing user input with XLParser.

XLParser is based on the C# Irony parser framework.

Using XLParser

Building

Open the XLParser.sln file in src/ in Visual Studio 2015 or higher and press build. The dependencies are already included in compiled form in this repository.

Using in your project

The easiest way to add the dependency to your project is with NuGet.

The ExcelFormulaParser class is your main entry point. You can parse a formula through ExcelFormulaParser.Parse("yourformula").

ExcelFormulaParser has several useful methods that operate directly on the parse tree, such as AllNodes to traverse the whole tree or GetFunction to get the function name of a node that represents a function call. You can also Print any node.

FormulaAnalyzer contains some example functionality for analyzing the parse tree.
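
The entry points described above can be combined as in the following minimal sketch. It assumes the XLParser NuGet package (and the Irony library it depends on) is referenced, that AllNodes, GetFunction and Print are usable as extension methods on Irony's ParseTreeNode, and that an IsFunction helper is available to filter function-call nodes; IsFunction is not mentioned above and is an assumption, so check the ExcelFormulaParser class if it does not compile. The formula itself is only an illustration.

```csharp
using System;
using System.Linq;
using Irony.Parsing;
using XLParser;

class ParseExample
{
    static void Main()
    {
        // Parse a formula; the result is the root node of the parse tree.
        ParseTreeNode root = ExcelFormulaParser.Parse("SUM(A1:B2)+IF(C1>0,1,2)");

        // Print turns a node (and its subtree) back into readable text.
        Console.WriteLine(root.Print());

        // Walk every node in the tree and report the name of each function call.
        // IsFunction is an assumed helper; GetFunction should only be called on
        // nodes that actually represent a function call.
        foreach (ParseTreeNode node in root.AllNodes().Where(n => n.IsFunction()))
        {
            Console.WriteLine(node.GetFunction());
        }
    }
}
```

Because XLParser is not very restrictive, a successful parse does not guarantee that Excel itself would accept the formula, so validate user input separately if that matters for your use case.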

Debugging

Irony, the parser framework XLParser uses, includes a tool called the "grammar explorer". This is a great way to play around with the grammar and parse trees. To use this tool, you first need to build it once by opening the Irony solution (lib/Irony/Irony_All.2012.sln) and building it with release configuration. After that you can use the binary in lib/Irony/Irony.GrammarExplorer/bin/Release/Irony.GrammarExplorer.exe.

To load the XLParser grammar, first make sure you have built XLParser. Then open the GrammarExplorer and add the grammar (... button) from src/XLParser/bin/Debug/XLParser.dll.

In Visual Studio you can see the printed version of any node during debugging by adding yournode.Print(),ac in the watch window.

Documentation

See the doc folder for further documentation.

License

All files of XLParser are released under the Mozilla Public License 2.0.

Roughly, this means that you can make any alterations you like and can use this library in any project, even closed-source and statically linked, as long as you publish any modifications to the library.
