Giter VIP home page Giter VIP logo

macaw's Introduction

License GitHub last commit

Clojars Project

Macaw logo

Macaw

Macaw is a limited Clojure wrapper for JSqlParser. Similar to its parrot namesake, it's intelligent, can be taught to speak SQL, and has many colors (supports many dialects).

Rationale

JSqlParser does great work actually parsing SQL, but its use of the visitor pattern makes Clojure inter-op very awkward. Macaw exists to make working with JSqlParser from within Clojure feel more idiomatic and pleasant, letting you walk over a query with custom callbacks and returning persistent data structures.

Currently it's especially useful for extracting the columns, tables, and side-effecting commands from a SQL string (see Query Parsing) and also intelligently renaming the columns and tables used in it (see Query Rewriting).

Preliminaries

Building

To build a local JAR, use

./bin/build-jar

This will create a JAR in the target directory.

Working with the Java files

To compile the Java files, use

./bin/java-compile

If you're working on Macaw and make changes to a Java file, you must:

  1. Recompile
  2. Restart your Clojure REPL

for the changes to take effect.

Usage

For detailed documentation, refer to the Marginalia documentation here.

Query Parsing

For extracting information from a query, use parsed-query to get a parse object and query->components to turn it into something useful. For example:

;; macaw.core
(-> "SELECT total FROM orders"
    parsed-query
    query->components)
;; => {:columns           #{{:component {:column "total"}, :context ["SELECT"]}},
;;     :has-wildcard?     #{},
;;     :mutation-commands #{},
;;     :tables            #{{:component {:table "orders"}, :context ["FROM" "SELECT"]}},
;;     :table-wildcards   #{}}

The returned map will always have that general shape as of Macaw 0.1.30. Each of the main keys will always refer to a set, and each set will always consist of maps with a :component key (described below) and a :context key (described farther below).

Column and table :components

Columns will have a :column key with the name as it appears in the query, and may also have a :table or :schema key if available.

(-> "SELECT id, orders.total, public.orders.tax FROM public.orders"
    parsed-query
    query->components
    :columns)
;; => #{{:component {:column "id"},                                       :context ["SELECT"]}
;;      {:component {:column "total", :table "orders", :schema "public"}, :context ["SELECT"]}
;;      {:component {:column "tax",   :table "orders", :schema "public"}, :context ["SELECT"]}}

Note that the schema for total was inferred, but neither the table nor the schema for id was inferred since there are similar queries where it could be ambiguous (e.g., with a JOIN).

Macaw will also resolve aliases sensibly:

(-> "SELECT o.id, o.total, u.name FROM public.orders o JOIN users u ON u.id = o.user_id"
    parsed-query
    query->components
    (select-keys [:columns :tables]))
;; => {:columns
;;     #{{:component {:column "id", :table "orders", :schema "public"}, :context ["SELECT"]}
;;       {:component {:column "total", :table "orders", :schema "public"}, :context ["SELECT"]}
;;       {:component {:column "name", :table "users"}, :context ["SELECT"]}
;;       {:component {:column "id", :table "users"}, :context ["JOIN" "SELECT"]}
;;       {:component {:column "user_id", :table "orders", :schema "public"}, :context ["JOIN" "SELECT"]}},
;;     :tables
;;     #{{:component {:table "orders", :schema "public"}, :context ["FROM" "SELECT"]}
;;       {:component {:table "users"}, :context ["FROM" "JOIN" "SELECT"]}}}

Note that :tables is similar to :columns, but only contains the :table and (if available anywhere in the query) :schema keys.

Wildcard :components

The :has-wildcard? and :table-wildcards keys refer to *s:

(-> "SELECT * from orders"
    parsed-query
    query->components
    :has-wildcard?)
;; => #{{:component true, :context ["SELECT"]}}
(-> "SELECT o.*, u.* FROM public.orders o JOIN users u ON u.id = o.user_id"
    parsed-query
    query->components
    :table-wildcards)
;; => #{{:component {:table "users"}, :context ["SELECT"]}
;;      {:component {:table "orders", :schema "public"}, :context ["SELECT"]}}

The shape of :table-wildcards will be the same as the shape of :tables. The :component for has-wildcard? will never be false.

Mutation commands

Any commands that could change the state of the database are returned in :mutation-commands:

(-> "DROP TABLE orders"
    parsed-query
    query->components
    :mutation-commands)
;; => #{{:component "drop", :context []}}

The list of recognized commands as of 0.1.30 is:

$ grep MUTATION_COMMAND java/com/metabase/macaw/AstWalker.java | grep -oEi '".+"' | sort
"alter-sequence"
"alter-session"
"alter-system"
"alter-table"
"alter-view"
"create-function"
"create-index"
"create-schema"
"create-sequence"
"create-synonym"
"create-table"
"create-view"
"delete"
"drop"
"grant"
"insert"
"purge"
"rename-table"
"truncate"
"update"

Context

As can be seen above, every item additional has a :context key containing a stack (most-specific first) of the query components in which the item was found. The definitive list of components if found in com.metabase.macaw.AstWalker.QueryContext, but as of 0.1.30 the list is as follows (and is unlikely to change):

DELETE,
ELSE,
FROM,
GROUP_BY,
HAVING,
IF,
INSERT,
JOIN,
SELECT,
SUB_SELECT,
UPDATE,
WHERE;

Query Rewriting

Editing queries can be done with replace-names. It takes two arguments, the query itself (as a string) and a map of maps. The outer keys of the map are :schemas, :tables, and :columns. The inner maps take the form old-name -> new-name. For example:

(replace-names "SELECT p.id, orders.total FROM people p, public.orders;"
               {:schemas  {"public" "private"}
                :tables   {"people" "users"}
                :columns  {"total" "amount"}})
;; => "SELECT p.id, orders.amount FROM users p, private.orders;"

Error Handling

Macaw makes no effort to recover from errors. Malformed SQL strings will probably raise a JSqlParserException, which Macaw will forward on to your code.

Contributing

External contributions are welcome! We require all third-party contributors to sign our license agreement. Macaw's philosophy towards third-party contributions is largely similar to that of Metabase's. We'd especially like to call out that document's YOLO method of PR submission:

If you come up with something really cool, and want to share it with us, just submit a PR. If it hasn't gone through the above process, we probably won't merge it as is, but if it's compelling, we're more than willing to help you via code review, design review and generally OCD nitpicking so that it fits into the rest of our codebase.

Since Macaw is a much simpler project than Metabase (and also doesn't involve a graphical user interface) we should be able to respond to product proposals fairly quickly.

macaw's People

Contributors

crisptrutski avatar dpsutton avatar piranha avatar tsmacdonald avatar

Stargazers

 avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

macaw's Issues

Alias removal is over-eager; removes tables that are shadowed

(deftest tables-with-complex-aliases-test
  (testing "With an alias that is also a table name"
    (is (= #{"user" "user2_final"}
           (tables
            "SELECT legacy_user.id AS old_id,
                    user.id AS new_id
             FROM user AS legacy_user
             OUTER JOIN user2_final AS user
             ON legacy_user.email = user2_final.email;")))))

Support parsing data-modifying statements in WITH

This is legal in Postgres

But fails

(deftest complicated-mutations-test
  (is (= #{"delete" "insert"}
         (mutations "WITH outdated_orders AS (
                       DELETE FROM orders
                       WHERE
                         date <= '2018-01-01'
                       RETURNING *
                     )
                     INSERT INTO order_log
                     SELECT * from outdated_orders;"))))

with


macaw.core-test
1 non-passing tests:

Error in complicated-mutations-test

expected: (=
	   #{"delete" "insert"}
	   (mutations
	    "WITH outdated_orders AS (
                       DELETE FROM orders
                       WHERE
                         date <= '2018-01-01'
                       RETURNING *
                     )
                     INSERT INTO order_log
                     SELECT * from outdated_orders;"))

   error: net.sf.jsqlparser.JSQLParserException: net.sf.jsqlparser.parser.ParseException: Encountered unexpected token: "DELETE" "DELETE"
    at line 2, column 24.

Was expecting:

    "WITH"


	  

Finish repository setup

  • Add secrets:
    • GITHUB_TOKEN
    • CLOJARS_USERNAME
    • CLOJARS_PASSWORD
  • Set up Clojars
    • (secrets as above)
    • Uncomment deploy.yml, edit if needed

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.