mathmltools's Introduction

MathML Tools

MathML Tools is an open-source project for processing content MathML in Java. It provides tools to load, store, validate, and automatically repair and enhance documents for the MathML 3.0 standard. It also provides Java adapters that convert LaTeX to MathML, full compatibility with our gold standard (MathMLBen), programming-language-independent libraries of useful XPath and XQuery strings, and distance measures for comparing two MathML documents.

Install Instructions

Since this is an open API, there is nothing to install. If you want to use the API in your own project, you can use Maven Central as explained in the Maven Central section below. If you want to download the sources, run the tests, and change the code, follow the guide in the Local Install section below.

Maven Central

The project is structured into specialized packages (Maven modules) that you can include in your projects separately. For example, if you just want to process MathML documents, the core module (mathml-core) is sufficient. We use Maven for our build process, and the entire project is available on Maven Central. Note that the specialized modules automatically import the core module. For example, if you wish to use our similarity module in your project, you only need to add the following snippet to the dependencies in your pom:

<dependency>
    <groupId>com.formulasearchengine.mathmltools</groupId>
    <artifactId>mathml-similarity</artifactId>
    <version>2.0.1</version>
</dependency>

Local Install

To download the project and run the tests you need git and mvn installed. First download the sources into a directory.

mkdir mathtools
cd mathtools
git clone https://github.com/ag-gipp/MathMLTools.git .

Now you can install the project locally via maven, which automatically runs the tests.

mvn clean install

Don't panic if you see error messages. We have written tests that expect exceptions. If the process finishes without errors, you should see something like this at the end of the log.

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] MathML Tools ....................................... SUCCESS [  3.234 s]
[INFO] MathML Libraries ................................... SUCCESS [  1.493 s]
[INFO] MathML Utilities ................................... SUCCESS [  6.425 s]
[INFO] MathML Core ........................................ SUCCESS [02:31 min]
[INFO] MathML Converters .................................. SUCCESS [ 41.476 s]
[INFO] MathML Similarity .................................. SUCCESS [  7.930 s]
[INFO] MathML Gold Standard ............................... SUCCESS [  0.710 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------

Project Structure

  • MathML Converters (mathml-converters): Collection of tools for converting LaTeX to MathML. Also includes the canonicalization tool.

  • MathML Core (mathml-core): To load, store, check validity, repair and manipulate MathML documents.

  • MathML Gold Standard (mathml-gold): Process the MathMLBen gold standard within Java.

  • MathML Libraries (mathml-libs): Collection of XPath and XQuery strings for content MathML (includes Java POJOs).

  • MathML Similarity (mathml-similarity): Collection of distance measurements for MathML documents.

  • MathML Utilities (mathml-utils): Useful utility functions and definitions (always included in the other modules above).
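
As a rough illustration of how an XPath string for content MathML can be used from plain Java, here is a minimal sketch that relies only on the JDK's javax.xml.xpath API. The class name, the method name and the XPath expression are assumptions made for this example; they are not taken from mathml-libs.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ContentIdentifierDemo {

    // Illustrative XPath (an assumption, not a string shipped with mathml-libs):
    // selects every content identifier <ci>, whatever prefix the document uses.
    static final String ALL_CI = "//*[local-name()='ci']";

    // Parses the MathML string and returns the text of all <ci> elements.
    static List<String> identifiers(String mathml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(mathml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList hits = (NodeList) xpath.evaluate(ALL_CI, doc, XPathConstants.NODESET);

        List<String> names = new ArrayList<>();
        for (int i = 0; i < hits.getLength(); i++) {
            names.add(hits.item(i).getTextContent().trim());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        String mathml = "<math xmlns=\"http://www.w3.org/1998/Math/MathML\">"
                + "<apply><plus/><ci>a</ci><ci>b</ci></apply></math>";
        System.out.println(identifiers(mathml)); // prints [a, b]
    }
}
```

Because the expression is a plain string, the same XPath can be reused from any language with an XPath 1.0 engine, which is the point of keeping such strings in a language-independent library.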

mathmltools's People

Contributors

andreg-p, dependabot[bot], fhamborg, jimmyli97, physikerwelt, snyk-bot, vstange

mathmltools's Issues

Update CAS Translator Support

The CAS translator is outdated now. Once there is a new stable version ready, we need to update the reflection calls for proper usage in converters.

Refactor MathMLTools to an open API for MML tasks

MathMLTools will be the parent repository (and parent maven project) for all MathML related tasks.

It is planned as a multi-module Maven project. The following tasks have to be done:

  • build parent pom in MathMLTools repo
  • move MathMLConverters and MathMLSim to this repo
  • move converters from mathosphere/pomlp to MathMLConverters
  • refactor MathMLTools itself into other submodules (MathMLTools in MathMLTools is confusing)
  • Create a README to give an overview of the submodules
  • Resolve the snapshot issue by creating a deployment system for Maven Central (see #12)
  • Update related projects (known dependencies: vmext-demi). Update: not needed, since the refactored project uses a new version 1.0.0, so there will be no conflicts.

Restrict Releases to Master-Branch Updates

@physikerwelt Could it be that you forgot to update the POM versions (or we forgot to specify that in Travis)? vmext-demo depends on v. 2.0.2
https://github.com/ag-gipp/vmext-demo/blob/9a27896a1242d823d13f1161f49066c801c218ad/pom.xml#L61-L65

but all poms are still on 2.0.2-SNAPSHOT:
https://github.com/ag-gipp/MathMLTools/blob/77c69b6366a5b8720796d1cd9d155ba26c2f1b20/pom.xml#L9

Also, the releases are already on 2.0.4: https://github.com/ag-gipp/MathMLTools/tree/2.0.4
as well as maven central: https://search.maven.org/artifact/com.formulasearchengine.mathmltools/mathmltools/2.0.4/jar

Something we missed in .travis.yml?

Add javascript or xpath libraries

I was wondering if we should add libraries that can be used independently of Java and Maven. The idea came up since we are facing the same or similar issues with MML in VMEXT (see gipplab/vmext#28).

I'm not sure if it is worth providing a complete JavaScript-based API for our MathMLTools project (maybe in the distant future). But I think a library of useful XPath commands would be a good idea, perhaps as a YAML file accessible from one of our servers.

What do you think @physikerwelt @vstange?

EnrichedMathMLTransformer should accept any mathml input

  • create test case for EnrichedMathMLTransformer with the following input
<math xmlns="http://www.w3.org/1998/Math/MathML" display="block" alttext="a">
  <semantics>
    <mi>a</mi>
    <annotation encoding="application/x-tex">a</annotation>
  </semantics>
</math>
  • update endpoint /math/mathoid in vmext-demo and recheck

LaTeXML service endpoint does not work anymore

Calling the LaTeXML server does not work anymore, so multiple tests fail with 404 and other errors. A simple fix (changing the URL) does not work either, since the new server endpoints seem to be different.

Currently, I am trying to figure out why curl requests work but RestTemplate calls do not. One possible problem is that RestTemplate is old and will become deprecated in newer versions of org.springframework.web. A better alternative should be WebClient (introduced with Spring 5.0).

Set up auto deploy for refactored project

I don't know how to set up auto deployment for this multi-module project now. Is it enough to enter all settings in the parent pom? Do we need to add deployments to each module?

Heuristic for scanning formulas is faulty

The scanFormulaNode method cannot recognize formulas that contain no "apply" or "mrow" nodes. The mistake stems from my false assumption that there is always a surrounding element.

Instead, the method should examine the semantic elements more closely.
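
One way such a heuristic could look is sketched below. It is only an illustration under stated assumptions: findFormulaRoot and the helper methods are hypothetical names, not the project's scanFormulaNode, and the sketch relies on the MathML rule that the first child of a semantics element is the annotated expression. It accepts any first element, for example a bare mi, instead of requiring "apply" or "mrow".

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class FormulaRootDemo {

    // Returns the first child of parent that is an element, or null if none.
    static Element firstElementChild(Node parent) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                return (Element) n;
            }
        }
        return null;
    }

    // Hypothetical heuristic: if <math> wraps its content in <semantics>,
    // the formula root is the first element inside <semantics>; otherwise
    // it is the first element inside <math>. No element name is required.
    static Element findFormulaRoot(Element math) {
        Element child = firstElementChild(math);
        if (child != null && "semantics".equals(child.getLocalName())) {
            return firstElementChild(child);
        }
        return child;
    }

    // Namespace-aware parse, so getLocalName() works with or without prefixes.
    static Element parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<math xmlns=\"http://www.w3.org/1998/Math/MathML\">"
                + "<semantics><mi>a</mi>"
                + "<annotation encoding=\"application/x-tex\">a</annotation>"
                + "</semantics></math>";
        System.out.println(findFormulaRoot(parse(xml)).getLocalName()); // prints mi
    }
}
```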

CAS Translator creates linked libs folder

The CAS translator tries to load resources from the libs folder.
In its current state, the folder is needed as a subfolder of the project.
As a workaround, I created a linked folder to the original folder which solves the problem. However, I should implement another solution for this.

Problems with namespace prefixes

Currently, one needs to know before parsing an expression whether the MML string contains prefixes or not. This is very inconvenient and should be fixed.

A nice feature would be to add and remove namespace prefixes on the fly.
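
Until prefixes can be rewritten on the fly, one possible way to avoid depending on them is to parse namespace-aware and look elements up by namespace URI and local name. The sketch below uses only the JDK's DOM API; the class and method names are assumptions made for this example. It does not add or remove prefixes, it only shows that the same lookup works for both spellings.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class PrefixAgnosticDemo {

    static final String MATHML_NS = "http://www.w3.org/1998/Math/MathML";

    // Counts MathML elements with the given local name, with or without a prefix.
    static int countElements(String xml, String localName) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for the *NS lookup below
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagNameNS(MATHML_NS, localName).getLength();
    }

    public static void main(String[] args) throws Exception {
        String plain = "<math xmlns=\"" + MATHML_NS + "\"><mi>a</mi></math>";
        String prefixed = "<m:math xmlns:m=\"" + MATHML_NS + "\"><m:mi>a</m:mi></m:math>";
        System.out.println(countElements(plain, "mi"));    // prints 1
        System.out.println(countElements(prefixed, "mi")); // prints 1
    }
}
```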

List of Minor Features

Provide the following features to make the API easier to use:
Converters:

  • Scripts for downloading and installing the converter sources (because of license problems, we will not provide the tools directly)

Gold:

  • Easier switch between local and remote mode (explain config file)
  • Download entire MathMLBen and store once (provide function)

Sim:

  • Add tree edit distance (RTED) from mathosphere

XMLHelper string2Doc ignores parameter

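The string2Doc method in XMLHelper ignores its namespaceAwareness parameter. The active code path never reads it, and the earlier implementation that passed it to getDocumentBuilder is commented out.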

    /**
     * Helper program: Transforms a String to a XML Document.
     *
     * @param inputXMLString     the input xml string
     * @param namespaceAwareness the namespace awareness
     * @return parsed document
     */
    public static Document string2Doc(String inputXMLString, boolean namespaceAwareness) {
        try {
            return XmlDocumentReader.parse(inputXMLString, false);
        } catch (IOException | SAXException e) {
            log.error("Cannot parse XML string.", e);
            return null;
        }
//        try {
//            DocumentBuilder builder = getDocumentBuilder(namespaceAwareness);
//            InputSource is = new InputSource(new StringReader(inputXMLString));
//            is.setEncoding("UTF-8");
//            return builder.parse(is);
//        } catch (SAXException | ParserConfigurationException | IOException e) {
//            log.error("cannot parse following content\n\n" + inputXMLString);
//            e.printStackTrace();
//            return null;
//        }
    }
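
For comparison, here is a minimal sketch of a string2Doc variant that honors the flag. It uses only the JDK's JAXP classes, in the spirit of the commented-out code above, and is not the project's implementation: the class name is made up, the sketch declares its exceptions instead of logging them, and whether XmlDocumentReader can be configured the same way is not shown in this issue.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class String2DocDemo {

    // Sketch only: parses the string and passes the flag on to the JAXP factory.
    static Document string2Doc(String inputXMLString, boolean namespaceAwareness)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(namespaceAwareness);
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(inputXMLString)));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<m:math xmlns:m=\"http://www.w3.org/1998/Math/MathML\">"
                + "<m:mi>a</m:mi></m:math>";

        // Namespace-aware: the root element reports the MathML namespace URI.
        System.out.println(string2Doc(xml, true).getDocumentElement().getNamespaceURI());

        // Not namespace-aware: the root element has no namespace URI (null).
        System.out.println(string2Doc(xml, false).getDocumentElement().getNamespaceURI());
    }
}
```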
