Giter VIP home page Giter VIP logo

Comments (16)

lelit avatar lelit commented on June 12, 2024 1

I think a first step toward the goal is implementing the pre-parser speculated in point 1, able to extract only the comments and keeping them apart, with their absolute position in the original statement.

from pglast.

lelit avatar lelit commented on June 12, 2024 1

I will try to understand the effort needed to expose the pg_query_scan() function as time permits.

from pglast.

lelit avatar lelit commented on June 12, 2024 1

I've implemented a preliminary scan() function in the parser module, that exposes libpgquery's pg_query_scan(). This is just the first step...

from pglast.

lelit avatar lelit commented on June 12, 2024 1

See here for an example.

from pglast.

lelit avatar lelit commented on June 12, 2024

Unfortunately that's difficult: see libpg_query's #15 for details.

As Lukas suggests, with lot of gymnastics we could

  1. with a pre-parser replace all comments with equivalent-length string of spaces, keeping them safe with their original position
  2. parse the resulting comment-free statement
  3. while printing back, we look at each node position, and if it "higher" than the current comment, we emit the comment first, removing it from the "queue"

As an alternative approach, replace point 3 with

  1. inject "syntetic" Comment nodes in the parsed tree, again using original position to find the right place
  2. properly implement Comment printer

from pglast.

Hultner avatar Hultner commented on June 12, 2024

I discussed this briefly with Lele over email and would like to add an excerpt from reply here.

I looked at the issues you referenced and what you mention in #23 as an alternative approach is very similar to the way yapf goes about it from the reference in my previous email. I do agree and think that altering the PG parser is likely to require significant effort, and the alternative approach of “re-injecting” the comment nodes into the AST sounds like a much more viable starting-point to achieve something like that. I am not very familiar with lbpg_query but as long as we have a reference between AST-nodes and their original place in the code doing a injection should be reasonable without too much hacking about, I might even be able to hack a PoC together in a fork if such is the case.

Here's the mentioned comment splicer from yapf. From my point of view this approach is very similar to 3 & 4 above and seems like a reasonable way to move forward. Would love to hear your comments on this.

from pglast.

SKalt avatar SKalt commented on June 12, 2024

Looks like the linked libpg_query issue #15 on preserving comments was closed as implemented:

Closing this, since this has been implemented in the scanner method released in pg_query 2.0:
https://github.com/pganalyze/libpg_query#usage-scanning-a-query-into-its-tokens-using-the-postgresql-scannerlexer
(note it doesn't really fit the logic to put this information in the parser, so you would have to use both the scanner and the parser method in some cases)
-- pganalyze/libpg_query#15 (comment)

Does this news make implementing comment preservation easier? If so, what would the steps be to using the new pg_query functionaility via libpg_query?

from pglast.

SKalt avatar SKalt commented on June 12, 2024

I'm seeing commit 55aed47 on the v3 branch. What would the next steps be? I can contribute test cases if nothing else.

from pglast.

lelit avatar lelit commented on June 12, 2024

The plan is: add a boolean option to the prettify() function, that when True it first collects all comments in the original statement the pass them to the IndentedStream constructor. At that point, its print_node() method should emit in a way to be determined the comments as soon as the node carries a location greater than that of the comments.

from pglast.

lelit avatar lelit commented on June 12, 2024

A first cut of this feature is present in the just released 3.0dev0 version.
I'm not really satisfied of the outcome, but it's a start... Any feedback is appreciated!

from pglast.

SKalt avatar SKalt commented on June 12, 2024

Initial feedback:

  • I'd like the comments to be left in their original -- form if possible; maybe test for newlines in the comment body?
  • some comments get added to the next line when they follow a ,-preceded item.
demos
echo '-- hello

select 1
' > ./normal.sql;

echo 'CREATE TABLE foo(
  bar INT -- an informative comment;
  , baz TEXT -- another comment,
);' > ./tricky.sql

echo '
CREATE OR REPLACE FUNCTION my_func(x INT) RETURNS TABLE(x INT) LANGUAGE plpgsql AS $$ BEGIN return query select * from foo; END $$;
'  > ./plpg.sql

docker run -it -v $(pwd):/workspace --workdir /worksapce python:3.8-buster bash
pip install pglast==3.0.dev0 

function compare_pre_post() {
  local file_to_format=${1:?missing required positional argument}; shift;
  echo "before ----------------------";
  cat $file_to_format | tee /tmp/pre;
  echo "after ------------------------";
  python -m pglast $@ $file_to_format /dev/stdout | tee /tmp/post;
  echo "compared ------------------";
  diff -u /tmp/pre /tmp/post;
}

compare_pre_post ./normal.sql --preserve-comments | sed 's/^/# /g'
# before ----------------------
# -- hello
# 
# select 1
# 
# after ------------------------
# /* hello */
# SELECT 1
# compared ------------------
# --- /tmp/pre  2021-05-04 15:09:26.569431088 +0000
# +++ /tmp/post 2021-05-04 15:09:26.726431096 +0000
# @@ -1,4 +1,2 @@
# --- hello
# -
# -select 1
# -
# +/* hello */
# +SELECT 1
compare_pre_post ./tricky.sql --preserve-comments | sed 's/^/# /g'
# before ----------------------
# CREATE TABLE foo(
#   bar INT -- an informative comment;
#   , baz TEXT -- another comment,
# );
# after ------------------------
# CREATE TABLE foo (
#     bar integer
#   , /* an informative comment; */ baz text
# ) /* another comment, */ 
# compared ------------------
# --- /tmp/pre  2021-05-04 15:10:35.471343512 +0000
# +++ /tmp/post 2021-05-04 15:10:35.639343520 +0000
# @@ -1,4 +1,4 @@
# -CREATE TABLE foo(
# -  bar INT -- an informative comment;
# -  , baz TEXT -- another comment,
# -);
# +CREATE TABLE foo (
# +    bar integer
# +  , /* an informative comment; */ baz text
# +) /* another comment, */ 
compare_pre_post ./malformed.plpg.sql --preserve-comments | sed 's/^/# /g'
# before ----------------------
# 
# CREATE OR REPLACE FUNCTION my_func(x INT) RETURNS TABLE(x INT) LANGUAGE plpgsql AS $$ BEGIN return query select * from foo; END $$;
# 
# after ------------------------
# CREATE OR REPLACE FUNCTION my_func(x integer)
# RETURNS TABLE (x integer)LANGUAGE plpgsql
# AS $$ BEGIN return query select * from foo; END $$
# compared ------------------
# --- /tmp/pre  2021-05-04 15:38:25.591646027 +0000
# +++ /tmp/post 2021-05-04 15:38:25.747646035 +0000
# @@ -1,3 +1,3 @@
# -
# -CREATE OR REPLACE FUNCTION my_func(x INT) RETURNS TABLE(x INT) LANGUAGE plpgsql AS $$ BEGIN return query select * from foo; END $$;
# -
# +CREATE OR REPLACE FUNCTION my_func(x integer)
# +RETURNS TABLE (x integer)LANGUAGE plpgsql
# +AS $$ BEGIN return query select * from foo; END $$

from pglast.

lelit avatar lelit commented on June 12, 2024

Thank you.

The first is doable, at least when one using the prettifying printer, but not when using the compact one: the latter does not emit newlines, so embedded comments obviously must be in the C format.

The second is not possible, as not all nodes carry their original position (for example literal values), so pglast cannot determine whether the comment is before of after them.

from pglast.

lelit avatar lelit commented on June 12, 2024

The logic is pretty simple: the print_comment() method emits the text, and it gets called by print_node() when the given node was originally after the next available comment.
Maybe you can do some experiment and suggest different approaches?

from pglast.

lelit avatar lelit commented on June 12, 2024

I improved the printing of comments, maintaining the original style.

from pglast.

SKalt avatar SKalt commented on June 12, 2024

Looks good to me! I'll see if I can get a chance to play around with the code further in the near future.

from pglast.

lelit avatar lelit commented on June 12, 2024

Present in release v3.0.dev1.
Feel free to re-open this (or a new issue) if there is something else that can be done.
Thank you!

from pglast.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.