
smop's Introduction

SMOP is the Small Matlab and Octave to Python compiler.

SMOP translates matlab to python. Despite obvious similarities between matlab and numeric python, there are enough differences to make manual translation infeasible in real life. SMOP generates human-readable python, which also appears to be faster than octave. Just how fast? Timing results for "Moving furniture" are shown in Table 1. For this program, translation to python gave about a twofold speedup, and compiling the SMOP run-time library runtime.py to C with cython gave a further twofold speedup. This pseudo-benchmark measures scalar performance, and my interpretation is that scalar computations are of less interest to the octave team.

========================================  ==================
octave-3.8.1                              190 ms
smop+python-2.7                           80 ms
smop+python-2.7+cython-0.20.1             40 ms
========================================  ==================

Table 1. SMOP performance

News

October 15, 2014

Version 0.26.3 is available for beta testing. The next version, 0.27, is planned to compile the octave scripts library, which contains over 120 KLOC in almost 1,000 matlab files. There are 13 compilation errors with smop 0.26.3.

Installation

  • Network installation is the best method if you just want it to run the example:

    $ easy_install smop --user
  • Install from the sources if you are behind a firewall:

    $ tar zxvf smop.tar.gz
    $ cd smop
    $ python setup.py install --user
  • Fork the github repository if you need the latest fixes.
  • Finally, it is possible to use smop without installing it, but only if you have already installed the dependencies -- numpy and networkx:

    $ tar zxvf smop.tar.gz
    $ cd smop/smop
    $ python main.py solver.m
    $ python solver.py

Working example

We will translate solver.m to present a sample of smop features. The program was borrowed from the matlab programming competition in 2004 (Moving Furniture). To the left is solver.m. To the right is a.py --- its translation to python. Though only 30 lines long, this example shows many of the complexities of converting matlab code to python.

01   function mv = solver(ai,af,w)  01 def solver_(ai,af,w,nargout=1): 
02   nBlocks = max(ai(:));          02     nBlocks=max_(ai[:]) 
03   [m,n] = size(ai);              03     m,n=size_(ai,nargout=2)

02   Matlab uses round brackets both for array indexing and for function calls. To figure out which is which, SMOP computes local use-def information, and then applies the following rule: undefined names are functions, while defined names are arrays.

03   The matlab function size returns a variable number of return values, which corresponds to returning a tuple in python. Since python functions are unaware of the expected number of return values, that number must be passed explicitly in nargout.
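The nargout convention described for line 03 can be sketched as follows. This is only an illustration and not SMOP's run-time code: the name `size_` mirrors the listing, but the body is an assumption written for this example, using plain numpy.

```python
import numpy as np

def size_(a, nargout=1):
    """Hypothetical stand-in for a matlab-style size(); not SMOP's real runtime."""
    # Matlab arrays always have at least two dimensions.
    shape = np.atleast_2d(a).shape
    if nargout == 1:
        # One output: the caller gets the whole size vector.
        return shape
    if nargout >= len(shape):
        # More outputs than dimensions: pad with ones, as matlab does.
        return shape + (1,) * (nargout - len(shape))
    # Fewer outputs than dimensions: the last output absorbs the
    # product of the remaining dimensions.
    head = shape[:nargout - 1]
    tail = int(np.prod(shape[nargout - 1:]))
    return head + (tail,)

m, n = size_(np.zeros((3, 4)), nargout=2)
```

The caller states how many values it expects, and the callee shapes its return tuple accordingly, which is the role nargout plays in the generated code.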

04   I = [0  1  0 -1];              04     I=matlabarray([0,1,0,- 1])
05   J = [1  0 -1  0];              05     J=matlabarray([1,0,- 1,0])
06   a = ai;                        06     a=copy_(ai)
07   mv = [];                       07     mv=matlabarray([])

04   Matlab array indexing starts with one; python indexing starts with zero. The new class matlabarray derives from ndarray, but exposes matlab array behaviour. For example, matlabarray instances always have at least two dimensions -- the shape of I and J is [1 4].

06   Matlab array assignment implies copying; python assignment implies data sharing. We use an explicit copy here.

07   An empty matlabarray object is created, and then extended at line 28. Extending arrays by out-of-bounds assignment is deprecated in matlab, but is widely used nevertheless. A python ndarray can't be resized except in some special cases. Instances of matlabarray can be resized except where it is too expensive.
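To make the shape and indexing remarks for line 04 concrete, here is a small sketch in plain numpy. It does not use or reproduce the matlabarray class; the helper `get1` is a hypothetical name introduced only for this illustration.

```python
import numpy as np

# A matlab row vector [0 1 0 -1] has shape [1 4]; np.atleast_2d gives the
# same two-dimensional shape for a flat python list.
I = np.atleast_2d([0, 1, 0, -1])

def get1(row_vector, k):
    """Hypothetical helper: element k of a 1-by-n array, counting from one."""
    if k < 1 or k > row_vector.shape[1]:
        raise IndexError("index %d is out of bounds" % k)
    # Shift the one-based matlab index to a zero-based numpy index.
    return row_vector[0, k - 1]

first = get1(I, 1)   # what matlab writes as I(1)
last = get1(I, 4)    # what matlab writes as I(4)
```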

08   while ~isequal(af,a)           08     while not isequal_(af,a):
09     bid = ceil(rand*nBlocks);    09         bid=ceil_(rand_() * nBlocks)
10     [i,j] = find(a==bid);        10         i,j=find_(a == bid,nargout=2)
11     r = ceil(rand*4);            11         r=ceil_(rand_() * 4)
12     ni = i + I(r);               12         ni=i + I[r]
13     nj = j + J(r);               13         nj=j + J[r]

09   Matlab functions of zero arguments, such as rand, can be used without parentheses. In python, parentheses are required. To detect such cases, used but undefined variables are assumed to be functions.

10   The expected number of return values from the matlab function find is explicitly passed in nargout.

12   Variables I and J contain instances of the new class matlabarray, which among other features uses one-based array indexing.
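The rule mentioned for lines 02 and 09 (defined names are arrays, undefined names are functions) can be illustrated with a deliberately simplified sketch. It assumes straight-line code given as hand-written (defined, used) pairs; SMOP's actual use-def analysis is not shown in this document, and `classify_names` is a hypothetical name.

```python
def classify_names(statements):
    """Split used names into arrays (already defined) and functions (not defined).

    `statements` is a list of (defined_names, used_names) pairs in program
    order. Control flow is ignored, which a real analysis could not do.
    """
    defined = set()
    arrays, functions = set(), set()
    for defs, uses in statements:
        for name in uses:
            if name in defined:
                arrays.add(name)
            else:
                functions.add(name)
        # Names assigned by this statement are defined from here on.
        defined.update(defs)
    return arrays, functions

# A hand-made summary of the first lines of solver.m.
statements = [
    (("ai", "af", "w"), ()),                  # function parameters
    (("nBlocks",), ("max", "ai")),            # line 02
    (("m", "n"), ("size", "ai")),             # line 03
    (("I",), ()),                             # line 04
    (("a",), ("ai",)),                        # line 06
    (("bid",), ("ceil", "rand", "nBlocks")),  # line 09
    (("i", "j"), ("find", "a", "bid")),       # line 10
    (("r",), ("ceil", "rand")),               # line 11
    (("ni",), ("i", "I", "r")),               # line 12
]
arrays, functions = classify_names(statements)
```

Under this rule `rand` ends up among the functions, so it gets call parentheses, while `I` ends up among the arrays, so `I(r)` becomes an index expression.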

14     if (ni<1) || (ni>m) ||       14         if (ni < 1) or (ni > m) or
               (nj<1) || (nj>n)                            (nj < 1) or (nj > n):
15         continue                 15             continue
16     end                          16
17     if a(ni,nj)>0                17         if a[ni,nj] > 0:
18         continue                 18             continue
19     end                          19
20     [ti,tj] = find(af==bid);     20         ti,tj=find_(af == bid,nargout=2)
21     d = (ti-i)^2 + (tj-j)^2;     21         d=(ti - i) ** 2 + (tj - j) ** 2
22     dn = (ti-ni)^2 + (tj-nj)^2;  22         dn=(ti - ni) ** 2 + (tj - nj) ** 2
23     if (d<dn) && (rand>0.05)     23         if (d < dn) and (rand_() > 0.05):
24         continue                 24             continue
25     end                          25
26     a(ni,nj) = bid;              26         a[ni,nj]=bid
27     a(i,j) = 0;                  27         a[i,j]=0
28     mv(end+1,[1 2]) = [bid r];   28         mv[mv.shape[0] + 1,[1,2]]=[bid,r]
29  end                             29
30                                  30     return mv
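Line 28 grows mv by assigning one row past its end. A plain ndarray cannot grow in place, so a hand translation usually builds a new array instead. The sketch below shows that alternative with numpy's vstack; it is not how matlabarray implements resizing, and `append_row` is a hypothetical helper.

```python
import numpy as np

def append_row(mv, row):
    """Hypothetical helper: return a new array with `row` added as the last row."""
    row = np.atleast_2d(row)
    if mv.size == 0:
        # Nothing stored yet: the new row becomes the whole array.
        return row.copy()
    # vstack allocates a fresh array; the old one is left untouched.
    return np.vstack([mv, row])

mv = np.zeros((0, 2), dtype=int)
for bid, r in [(3, 1), (5, 4)]:
    mv = append_row(mv, [bid, r])
```

Reallocating on every append costs time proportional to the current size, which may be one reason the text notes that resizing is supported only where it is not too expensive.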

Implementation status

Random remarks

With fewer than five thousand lines of python code, SMOP does not pretend to compete with such polished products as matlab or octave. Yet, it is not a toy. The matlab language definition (never published afaik) is full of dark corners, and SMOP tries to follow the original matlab semantics as closely as possible.

There is a price, too.

The generated sources are matlabic, rather than pythonic, which means that library maintainers must be fluent in both languages, and the old development environment must be kept around.

Should the generated program be pythonic or matlabic?

For example should array indexing start with zero (pythonic) or with one (matlabic)?

I believe now that some matlabic accent is unavoidable in the generated python sources. Imagine a matlab program that uses regular expressions, matlab style. We are not going to translate them to python style, and that code will remain forever as a reminder of the program's matlab origin.

Another example. Matlab code opens a file; fopen returns -1 on error. Pythonic code would raise an exception, but we are not going to do that. Instead, we will live with the accent, and smop takes this to the extreme --- the matlab program remains mostly unchanged.
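The fopen case can be sketched as a thin wrapper that keeps the matlab convention. This is an illustration only: `fopen_` is a hypothetical name following the trailing-underscore style of the listing, and it returns a python file object rather than matlab's integer file id.

```python
def fopen_(path, mode="r"):
    """Hypothetical matlab-flavoured open(): a file object, or -1 on failure."""
    try:
        return open(path, mode)
    except OSError:
        # Matlabic accent: report failure through the return value
        # instead of letting the exception propagate.
        return -1

fid = fopen_("/nonexistent_dir_xyz/missing.txt")
if fid == -1:
    status = "could not open file"
else:
    status = "opened"
    fid.close()
```

Translated call sites can then keep their `if fid == -1` checks unchanged, which is the trade-off the paragraph above describes.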

It turns out that generating matlabic code allows for moving much of the project complexity out of the compiler (which is already complicated enough) and into the runtime library, where there is almost no interaction between the library parts.

Which one is faster --- python or octave? I don't know.

Doing reliable performance measurements is notoriously hard, and is of low priority for me now. Instead, I wrote simple drivers go.m and go.py and rewrote rand so that the python and octave versions run the same code. Then I ran the above example on my laptop. The python version ran about twice as fast. What does it mean? Probably nothing. YMMV.

ai = zeros(10,10);
af = ai;

ai(1,1)=2;
ai(2,2)=3;
ai(3,3)=4;
ai(4,4)=5;
ai(5,5)=1;

af(9,9)=1;
af(8,8)=2;
af(7,7)=3;
af(6,6)=4;
af(10,10)=5;

tic;
mv = solver(ai,af,0);
toc
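The python driver go.py is not reproduced here. As a rough guess at what an equivalent setup might look like, the sketch below rebuilds the same two boards with plain numpy (zero-based indices) and times a call. The helper names are hypothetical, and a do-nothing placeholder stands in for the translated solver.

```python
import time
import numpy as np

def make_boards():
    """Build the initial and final 10x10 boards used by the matlab driver."""
    ai = np.zeros((10, 10), dtype=int)
    af = ai.copy()
    # matlab ai(1,1)=2 ... ai(5,5)=1 becomes ai[0, 0] = 2 ... ai[4, 4] = 1
    for k, block in enumerate([2, 3, 4, 5, 1]):
        ai[k, k] = block
    # matlab af(9,9)=1, af(8,8)=2, af(7,7)=3, af(6,6)=4, af(10,10)=5
    for pos, block in [(9, 1), (8, 2), (7, 3), (6, 4), (10, 5)]:
        af[pos - 1, pos - 1] = block
    return ai, af

def time_call(fn, *args):
    """Return (result, elapsed seconds), in the spirit of tic/toc."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

ai, af = make_boards()
# Placeholder for the translated solver; it does no real work.
_, elapsed = time_call(lambda ai, af, w: None, ai, af, 0)
```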

Running the test suite:

$ cd smop
$ make check
$ make test

Command-line options

lei@dilbert ~/smop-github/smop $ python main.py -h
SMOP compiler version 0.25.1
Usage: smop [options] file-list
    Options:
    -V --version
    -X --exclude=FILES      Ignore files listed in comma-separated list FILES
    -d --dot=REGEX          For functions whose names match REGEX, save debugging
                            information in "dot" format (see www.graphviz.org).
                            You need an installation of graphviz to use --dot
                            option.  Use "dot" utility to create a pdf file.
                            For example: 
                                $ python main.py fastsolver.m -d "solver|cbest"
                                $ dot -Tpdf -o resolve_solver.pdf resolve_solver.dot
    -h --help
    -o --output=FILENAME    By default create file named a.py
    -o- --output=-          Use standard output
    -s --strict             Stop on the first error
    -v --verbose

smop's People

Contributors

alexg31, iglesias, johnyf, perimosocordiae, victorlei


smop's Issues

preserve comments

Comments annotate code as a docstring and at the few places where the code itself does not suffice for readability. For this reason it would be nice to provide an option for converting MATLAB comments to python comments. This was mentioned also in #12 and in this issue on chiselapp.
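To illustrate what the request amounts to in the simplest case, here is a naive sketch that rewrites whole-line matlab comments as python comments. It is not part of smop, `convert_comment_lines` is a hypothetical name, and trailing comments or `%` characters inside strings are deliberately left alone because they need a real lexer.

```python
import re

def convert_comment_lines(matlab_source):
    """Naive sketch: turn lines that contain only a % comment into # comments."""
    out = []
    for line in matlab_source.splitlines():
        # Leading whitespace, one or more % signs, then the comment text.
        m = re.match(r"^(\s*)%+\s?(.*)$", line)
        if m:
            out.append(("%s# %s" % (m.group(1), m.group(2))).rstrip())
        else:
            out.append(line)
    return "\n".join(out)

converted = convert_comment_lines("% hello\nx = 1;\n  %% indented")
```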

Join efforts with other projects

There are several other projects that translate matlab to python. The most relevant seem to be these two, also written in python and BSD / MIT licensed:
https://bitbucket.org/juricap/ompc/src
https://github.com/miaoever/Mat2py

I'm not certain which project is furthest along (I chose this one to bug since it was the most recent), but working on finishing one codebase to perfection rather than having three different but incomplete ones ought to be much more efficient...

add 'convert' to project description

perhaps adding 'convert' to the project description would increase the probability of google suggesting it for a "convert matlab to python" search

UNC path is not understood under some rare conditions

For an unknown reason, a UNC path combined with an asterisk, as in smop \foo\bar\bzz*.m, is not understood and somehow prevents smop from running. Other, similar expressions work just fine. For example, smop "\foo\bar\bzz*.m" (note the quotes) is okay.

Recognize structure fields as array

This Matlab function:

    function a = test(m)
             a = m.member(:, 0)
    end

produces this python function:

    def test(m):
        a = m.member(:, 0)
        return a

The obvious conclusion I can make is that smop is detecting any symbol with '.' in it as a function. Any idea what would need to be changed to fix this?

round and filter get converted to round_ and filter_

Example:

function a = test(b)
         [c,d] = butter(4, 'low')
         a = filter(c,d, b)
         a = round(a)
end

gets converted to:

def test(b):
    c, d = butter(4, 'low') # nargout=2
    a = filter_(c, d, b)
    a = round_(a)
    return a

This happens both in the master branch and @perimosocordiae's divergent branch.

Any pointers?

Bug: Dot product and element-wise product translation inverted

Matlab operator * is the dot (matrix) product when multiplying matrices or vectors, yet it is translated to python operator *, which is element-wise multiplication for matrices and vectors. Conversely, matlab operator .*, which is element-wise multiplication, is translated to .dot()
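For reference, the numpy behaviour this report relies on can be checked directly. The sketch uses two small example matrices chosen for this illustration.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# matlab A * B is the matrix product; in numpy that is np.dot(A, B)
# (also written A.dot(B), or A @ B in newer python versions).
matrix_product = np.dot(A, B)

# matlab A .* B is element-wise; in numpy that is the * operator on ndarrays.
elementwise = A * B
```

So a correct translation would map matlab `*` to `dot` and matlab `.*` to `*`, the reverse of what the report describes.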

Bug: np.array assignment is reference assignment

Matlab's operator = copies the matrix, while in numpy only the reference is copied, even when assigning a sub-part of the matrix/vector.
Fix:
a = b
should be:
a = np.array(b)
when b is a vector/matrix.

The copy is not needed when assigning the result of an operation:
a = b * b
for example.

PS: adding spaces to the left and right of the = operator would make the code more readable, as would keeping comments and line feeds.
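The sharing behaviour this report describes can be reproduced in a few lines of plain numpy. The variable names are chosen for this illustration only.

```python
import numpy as np

b = np.array([1, 2, 3])

alias = b                    # plain assignment: both names share the same data
independent = np.array(b)    # np.array copies its input by default
part = b[1:]                 # a slice is a view onto b, not a copy
squared = b * b              # an operation already produces a new array

b[0] = 99
b[2] = 77
```

After the two writes, `alias` and `part` see the new values, while `independent` and `squared` keep the old ones, which matches the point that an explicit copy is only needed for plain assignments.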

crash in interactive mode

=>> function x=foo(y); x=y; end
Traceback (most recent call last):
File "main.py", line 139, in
main()
File "main.py", line 93, in main
resolve.resolve(t,symtab)
File "/home/lei/smop-github/smop/resolve.py", line 67, in resolve
u = G.node[n]["ident"]
KeyError: 'ident'
lei@dilbert ~/smop-github/smop $

fastsolver.py missing all but the headers

I accidentally uploaded a development version -- sorry, I have no experience with git... Meanwhile, download version 0.22 -- it is supposed to work. In particular, you should be able to run

python -m fastsolver.py

without error messages.

what simple and obvious step did I miss?

length function

The length function should be converted to len

currently:

arr{1} = 1;
arr{2} = 2;
length(arr);

converted to

arr[1] = 1
arr[1] = 1
length(arr)
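One caveat when mapping length to len: matlab's length returns the largest dimension of an array (and 0 for an empty one), whereas python's len returns the size of the first dimension. For a list standing in for a cell array the two agree, but for a matrix they can differ. The sketch below is a hypothetical helper using plain numpy, not something taken from smop.

```python
import numpy as np

def length_(a):
    """Hypothetical helper mimicking matlab's length(): largest dimension, 0 if empty."""
    shape = np.atleast_2d(a).shape
    return 0 if 0 in shape else max(shape)

cells = [1, 2]              # a python list standing in for arr{1}, arr{2}
matrix = np.zeros((3, 5))
```

Here `length_(cells)` and `len(cells)` agree, while `length_(matrix)` follows the matlab rule and `len(matrix)` counts rows.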

TBD

  • cython?
  • arrays with [ ], structs
  • command mode
  • comments
  • dot op
  • fastsolver
  • compare performance --> profiler
  • .py3 compatibility
  • pep8
  • ompc examples
  • libermate examples
  • matlab2fortran -> regex.m low prio
  • pypy

function with no argin translated incorrectly

matlab:

function ret = noargin()
    ret = 0

python:

# Autogenerated with SMOP version 0.25
# main.py test.m -o test.py
from __future__ import division
import numpy as np
from runtime import *
def noargin_(,nargout=1):
    #3
    ret=0
    return ret
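The stray comma in `noargin_(,nargout=1)` suggests that the argument list and the nargout parameter are glued together with a fixed separator, though that is only a guess about the cause. One way to avoid this class of bug is to build the parameter list first and join it, as in this sketch; `format_signature` is a hypothetical helper, not smop code.

```python
def format_signature(name, args, nargout_default=1):
    """Hypothetical helper: build a def line in the style of the generated code."""
    # Joining a list means an empty args list adds no separator at all.
    params = list(args) + ["nargout=%d" % nargout_default]
    return "def %s_(%s):" % (name, ",".join(params))

no_args = format_signature("noargin", [])
three_args = format_signature("solver", ["ai", "af", "w"])
```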

fail to run "Working example: solver.m" in README

Hi all,
I am new to Python and need to convert some MATLAB code to Python code, so I tried the "Working example: solver.m" on https://github.com/victorlei/smop/blob/master/README.rst as a start, but got the error message:

~/Downloads/smop-0.25.4/smop> python main.py solver.m
Traceback (most recent call last):
File "main.py", line 164, in
main()
File "main.py", line 157, in main
G = resolve.resolve(func_obj)
File "/nfs/yugong/data/jnwang/Downloads/smop-0.25.4/smop/resolve.py", line 65, in resolve
G = as_networkx(t)
File "/nfs/yugong/data/jnwang/Downloads/smop-0.25.4/smop/resolve.py", line 50, in as_networkx
G.add_node(uu, ident=u) # label=label)
TypeError: add_node() got an unexpected keyword argument 'ident'

I also tried the latest version, smop-0.25.6, and ended up with the same error. Does anyone know what the problem is? I am using openSUSE 13.1, python 2.7.6, and MATLAB R2012b.

Thanks a lot!
