Giter VIP home page Giter VIP logo

binpash / libdash Goto Github PK

View Code? Open in Web Editor NEW
38.0 38.0 7.0 933 KB

The dash shell as a linkable library. Tracks https://git.kernel.org/pub/scm/utils/dash/dash.git, with extended interfaces, bindings for Python and OCaml, and tools for generating JSON representations of shell scripts.

License: Other

Makefile 0.54% Shell 36.25% M4 0.63% C 43.32% Roff 8.71% Max 0.38% OCaml 5.49% Python 4.67%
library ocaml parser posix posix-sh posix-shell python shell shell-script

libdash's People

Contributors

ao2 avatar bkoropoff avatar brainflux avatar davem330 avatar ebblake avatar eric-s-raymond avatar herbertx avatar hvdijk avatar jacmet avatar jillest avatar jrn avatar kreuter avatar larryhynes avatar ldoolitt avatar legionus avatar mcdutchie avatar meyering avatar mgree avatar peda-r avatar pweis8 avatar rocky avatar skitt avatar smorimoto avatar tklauser avatar toofishes avatar tucak avatar vorlonofportland avatar whoopdedo avatar zenczykowski avatar zhmu avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

libdash's Issues

Incorrect handling of single-quoted escape sequences in `ast.ml`

See test/failing/backslash or test/failing/aaaa_single (which comes from @tucak's original bug report's reproducer (mgree/smoosh#5) for the bug resolved in #1).

The problem is, depending on your perspective, in backslash insertion or in E insertion in ocaml/ast.ml. At this point, ast.ml is pretty out of sync with mgree/smoosh's shim.ml. Hopefully upcoming student work on generalizing the parsing interface will give us a simpler model, though we're probably a ways out.

Prompt callbacks

libdash, by default, uses dash expansion prints the PS1 and PS2 in the dash prompt. This behavior is wrong (and possibly dangerous, depending on the use case).

We should make the prompt functions monkeypatchable, allowing users of the library to circumvent dash internals (beyond the alias table) entirely. In an ideal world, we could even compile just the few files necessary.

Parsing and unparsing has super-linear runtime

Parsing and unparsing have super-linear runtime, largely because OCaml list append and string concatenation are not optimized.

Parsing: parse_tilde quadratic (/cubic?) behavior

for n in 1000 2000 4000 8000 16000 32000 64000; do perl -e "print '~'; print 'a' x $n;" > /tmp/t; time ./parse_to_json.native /tmp/t > /dev/null; done

user    0m0.015s
user    0m0.034s
user    0m0.153s
user    0m0.830s
user    0m4.540s
user    0m27.661s
user    3m4.591s

(real/sys runtime omitted for brevity)

Notes:

  • This is likely because of ast.ml: | c::s' -> parse_tilde (acc @ [c]) s'. It's not obvious to me why it seems to be cubic (rather than quadratic) runtime though.
    • An alternative would be to prepend to the list, and then reverse the result at the end.
  • The parse_to_json.native test application is from https://github.com/andromeda/pash/tree/main/compiler/parser
  • To avoid conflating the libdash runtime with the JSON serialization, I disabled the JSON serialization code (print_ast) in the test app

Parsing: to_assign cubic (?) behavior

for n in 1000 2000 4000 8000 16000 32000 64000; do perl -e "print 'f' x $n; print '=pay_respects;'" > /tmp/t1; time ./parse_to_json.native /tmp/t1 > /dev/null; done

user    0m0.011s
user    0m0.021s
user    0m0.098s
user    0m0.571s
user    0m3.749s
user    0m25.097s
user    2m52.969s

This also has an expensive list append operation: ast.ml | C c :: a -> to_assign (v @ [c]) a.

Unparsing: ^ string concatenation considered harmful

OCaml's "^" string operator is not optimized; concatenating n strings one at a time can take O(n^2) runtime (https://discuss.ocaml.org/t/whats-the-fastest-way-to-concatenate-strings/2616/7). This is arguably a compiler issue e.g., CPython optimizes for common cases (https://mail.python.org/pipermail/python-dev/2013-February/124031.html).

ast.ml's unparsing is essentially a series of "^" operations, hence everything is going to have a worst-case runtime that's super-linear.

Here's an example of a long pipeline, showing quadratic runtime for json_to_shell:

for n in 2000 4000 8000 16000 32000 64000 128000; do (echo -n "echo 'Hello '"; for f in `seq 1 $n`; do echo -n "| cat "; done; echo) > /tmp/s1; cat /tmp/s1 | ./parse_to_json.native > /tmp/j1; time ./json_to_shell.native /tmp/j1 | md5sum; done

user    0m0.035s
user    0m0.112s
user    0m0.508s
user    0m1.920s
user    0m13.909s
user    0m52.966s
user    4m26.274s

I also tried removing the Ast.to_string operation in json_to_shell.ml, to show that the JSON deserialization by itself is fast (linear); thus, the quadratic runtime is due to the libdash core.

Unparsing: fresh_marker for heredocs is slow on adversarial inputs

fresh_marker tries to find increasingly long-variants of {EOF, EOFF, EOFFF ..}, until it can find a marker that is not contained in the heredoc. This is slow for adversarial inputs that deliberately use all those markers:

cat <<CTHULHU
EOF
EOFF
EOFFF
EOFFFF
EOFFFFF
CTHULHU

Migrating python bindings from PaSh

  • Copy over bindings and build from PaSh
  • Rename files to something sensible
  • Get tests running in CI
  • Port over OCaml json converters
  • Port over OCaml vs. Python tests
  • Create setup.py/pip-based install routine
  • Migrate PaSh to use setup.py

Unify AST with Smoosh

Right now ast.ml and shim.ml in smoosh are out of sync. It'd be great if there were only one copy of the code that converts dash's data structures to OCaml trees.

Some trickiness: it'd be ideal to not have to port over all of the smoosh runtime terms, just the actual parseable ones. Doing so would add an extra layer of conversion, though, which is probably pretty costly. It should be possible to functorize the dash conversion: given some functions that generate appropriate values of some abstract type, we'll call them in the right order.

Abandon dynamic linking

It has been tremendously painful working with ctypes: it's been hard to get things to work locally, nevermind in CI. Each OPAM version bump burns a day debugging linking issues.

The solution seems clear to me: abandon dynamic linking.

Chapter 20: Interfacing C with OCaml is the general resource we want.

We should simply write accessor functions for getting each field out of each struct. Each struct is an opaque type. Annoying, but straightforward enough.

It's tempting to use an Abstract_tag, which tells the GC about the pointer, or Custom_tag, which lets us use GC hooks to finalize things, etc. But we're already manipulating the stackmarks appropriately, and we don't want the OCaml GC mucking about in there. So: just return the raw pointer.

Shell AST server

Accept shell scripts as POST, return JSON.

Should be super simple, à la python -m http.server.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.