Comments (16)
I think a first step toward the goal is implementing the pre-parser speculated about in point 1, able to extract just the comments and keep them aside, together with their absolute position in the original statement.
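To make the idea concrete, here is a minimal pure-Python sketch of such a pre-parser. It is not pglast code: the `Comment` tuple and the `extract_comments()` helper are hypothetical names, and the scanner only understands `--` comments, nestable `/* */` comments, single-quoted strings and double-quoted identifiers. Dollar-quoted strings and `E''` escapes are left out on purpose.

```python
from typing import List, NamedTuple


class Comment(NamedTuple):
    start: int  # absolute offset of the comment's first character
    text: str   # raw comment text, delimiters included


def extract_comments(sql: str) -> List[Comment]:
    """Collect SQL comments with their offsets (hypothetical helper)."""
    comments: List[Comment] = []
    i, n = 0, len(sql)
    while i < n:
        ch = sql[i]
        if ch in ("'", '"'):
            # Skip quoted text; a doubled quote is an escaped quote.
            i += 1
            while i < n:
                if sql[i] == ch:
                    if i + 1 < n and sql[i + 1] == ch:
                        i += 2
                        continue
                    break
                i += 1
            i += 1
        elif sql.startswith("--", i):
            # Line comment: runs up to, but not including, the newline.
            end = sql.find("\n", i)
            end = n if end == -1 else end
            comments.append(Comment(i, sql[i:end]))
            i = end
        elif sql.startswith("/*", i):
            # Block comment: PostgreSQL allows nesting, so track the depth.
            depth, j = 1, i + 2
            while j < n and depth:
                if sql.startswith("/*", j):
                    depth += 1
                    j += 2
                elif sql.startswith("*/", j):
                    depth -= 1
                    j += 2
                else:
                    j += 1
            comments.append(Comment(i, sql[i:j]))
            i = j
        else:
            i += 1
    return comments
```

Calling `extract_comments()` on a statement returns the comments in source order, each paired with the offset at which it started, which is the information a later printing step would need.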
from pglast.
I will try to understand the effort needed to expose the `pg_query_scan()` function as time permits.
I've implemented a preliminary `scan()` function in the parser module that exposes libpg_query's `pg_query_scan()`. This is just the first step...
See here for an example.
Unfortunately that's difficult: see libpg_query's #15 for details.
As Lukas suggests, with a lot of gymnastics we could
- with a pre-parser, replace all comments with an equal-length string of spaces, keeping them aside together with their original position
- parse the resulting comment-free statement
- while printing back, look at each node's position and, if it is "higher" than that of the current comment, emit the comment first, removing it from the "queue"
As an alternative approach, replace point 3 with
- inject "synthetic" `Comment` nodes in the parsed tree, again using the original position to find the right place
- properly implement a `Comment` printer
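The blanking step and the alternative "injection" step can be sketched in plain Python as follows. This is only an illustration under stated assumptions: comments are given as `(offset, text)` pairs, the parsed tree is flattened to a list of `(location, text)` pairs, and both `blank_comments()` and `splice_comments()` are hypothetical helpers, not part of pglast or libpg_query.

```python
from typing import List, Tuple

Item = Tuple[int, str]  # (absolute offset, text)


def blank_comments(sql: str, comments: List[Item]) -> str:
    """Overwrite each comment with spaces so all other offsets stay valid."""
    chars = list(sql)
    for start, text in comments:
        for k in range(start, start + len(text)):
            # Newlines inside block comments are kept, so line numbers
            # reported by the parser still match the original statement.
            if chars[k] != "\n":
                chars[k] = " "
    return "".join(chars)


def splice_comments(nodes: List[Item], comments: List[Item]) -> List[Item]:
    """Merge synthetic comment entries into a location-ordered node list."""
    queue = sorted(comments)
    merged: List[Item] = []
    for location, text in nodes:
        # Any comment that started before this node is emitted first.
        while queue and queue[0][0] < location:
            merged.append(queue.pop(0))
        merged.append((location, text))
    # Comments after the last node trail the statement.
    merged.extend(queue)
    return merged
```

Because the blanked statement has the same length as the original, a node location reported by the parser can be compared directly with the saved comment offsets.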
I discussed this briefly with Lele over email and would like to add an excerpt from the reply here.
I looked at the issues you referenced and what you mention in #23 as an alternative approach is very similar to the way yapf goes about it from the reference in my previous email. I do agree and think that altering the PG parser is likely to require significant effort, and the alternative approach of “re-injecting” the comment nodes into the AST sounds like a much more viable starting point to achieve something like that. I am not very familiar with libpg_query, but as long as we have a reference between AST nodes and their original place in the code, doing an injection should be reasonable without too much hacking about. I might even be able to hack a PoC together in a fork if such is the case.
Here's the mentioned comment splicer from yapf. From my point of view this approach is very similar to 3 & 4 above and seems like a reasonable way to move forward. Would love to hear your comments on this.
Looks like the linked libpg_query issue #15 on preserving comments was closed as implemented:
Closing this, since this has been implemented in the scanner method released in pg_query 2.0:
https://github.com/pganalyze/libpg_query#usage-scanning-a-query-into-its-tokens-using-the-postgresql-scannerlexer
(note it doesn't really fit the logic to put this information in the parser, so you would have to use both the scanner and the parser method in some cases)
-- pganalyze/libpg_query#15 (comment)
Does this news make implementing comment preservation easier? If so, what would be the steps to use the new `pg_query` functionality via `libpg_query`?
I'm seeing commit 55aed47 on the v3 branch. What would the next steps be? I can contribute test cases if nothing else.
The plan is: add a boolean option to the `prettify()` function that, when `True`, first collects all comments in the original statement and then passes them to the `IndentedStream` constructor. At that point, its `print_node()` method should emit the comments, in a way yet to be determined, as soon as the node carries a `location` greater than that of the comments.
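As a rough illustration of that plan, the toy class below keeps the collected comments in a queue and flushes them from `print_node()`. It is a sketch, not pglast's `IndentedStream`: the class name `ToyStream`, the `getvalue()` method and the flat `(text, location)` interface are assumptions made for this example, and nodes are allowed to have no location at all.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple


class ToyStream:
    """Hypothetical stand-in for a printer stream; not pglast's real class."""

    def __init__(self, comments: List[Tuple[int, str]]):
        # Comments sorted by offset act as a queue of pending items.
        self.pending: Deque[Tuple[int, str]] = deque(sorted(comments))
        self.out: List[str] = []

    def print_comment(self, text: str) -> None:
        self.out.append(text)

    def print_node(self, text: str, location: Optional[int]) -> None:
        # A node without a location cannot trigger a flush, so a comment
        # next to it is only emitted when a later, located node is printed.
        if location is not None:
            while self.pending and self.pending[0][0] < location:
                self.print_comment(self.pending.popleft()[1])
        self.out.append(text)

    def getvalue(self) -> str:
        # Whatever is still pending trails the statement.
        while self.pending:
            self.print_comment(self.pending.popleft()[1])
        return " ".join(self.out)


stream = ToyStream([(0, "/* hello */")])
stream.print_node("SELECT", 12)
stream.print_node("1", None)
print(stream.getvalue())
```

The queue keeps the comments in source order, so each one is emitted at most once and never before an earlier comment.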
A first cut of this feature is present in the just released 3.0dev0 version.
I'm not really satisfied with the outcome, but it's a start... Any feedback is appreciated!
Initial feedback:
- I'd like the comments to be left in their original `--` form if possible; maybe test for newlines in the comment body?
- some comments get added to the next line when they follow a `,`-preceded item.
demos
echo '-- hello
select 1
' > ./normal.sql;
echo 'CREATE TABLE foo(
bar INT -- an informative comment;
, baz TEXT -- another comment,
);' > ./tricky.sql
echo '
CREATE OR REPLACE FUNCTION my_func(x INT) RETURNS TABLE(x INT) LANGUAGE plpgsql AS $$ BEGIN return query select * from foo; END $$;
' > ./plpg.sql
docker run -it -v "$(pwd)":/workspace --workdir /workspace python:3.8-buster bash
pip install pglast==3.0.dev0
function compare_pre_post() {
    # First argument: the file to format; remaining arguments go to pglast.
    local file_to_format=${1:?missing required positional argument}; shift;
    echo "before ----------------------";
    cat "$file_to_format" | tee /tmp/pre;
    echo "after ------------------------";
    python -m pglast "$@" "$file_to_format" /dev/stdout | tee /tmp/post;
    echo "compared ------------------";
    diff -u /tmp/pre /tmp/post;
}
compare_pre_post ./normal.sql --preserve-comments | sed 's/^/# /g'
# before ----------------------
# -- hello
#
# select 1
#
# after ------------------------
# /* hello */
# SELECT 1
# compared ------------------
# --- /tmp/pre 2021-05-04 15:09:26.569431088 +0000
# +++ /tmp/post 2021-05-04 15:09:26.726431096 +0000
# @@ -1,4 +1,2 @@
# --- hello
# -
# -select 1
# -
# +/* hello */
# +SELECT 1
compare_pre_post ./tricky.sql --preserve-comments | sed 's/^/# /g'
# before ----------------------
# CREATE TABLE foo(
# bar INT -- an informative comment;
# , baz TEXT -- another comment,
# );
# after ------------------------
# CREATE TABLE foo (
# bar integer
# , /* an informative comment; */ baz text
# ) /* another comment, */
# compared ------------------
# --- /tmp/pre 2021-05-04 15:10:35.471343512 +0000
# +++ /tmp/post 2021-05-04 15:10:35.639343520 +0000
# @@ -1,4 +1,4 @@
# -CREATE TABLE foo(
# - bar INT -- an informative comment;
# - , baz TEXT -- another comment,
# -);
# +CREATE TABLE foo (
# + bar integer
# + , /* an informative comment; */ baz text
# +) /* another comment, */
compare_pre_post ./plpg.sql --preserve-comments | sed 's/^/# /g'
# before ----------------------
#
# CREATE OR REPLACE FUNCTION my_func(x INT) RETURNS TABLE(x INT) LANGUAGE plpgsql AS $$ BEGIN return query select * from foo; END $$;
#
# after ------------------------
# CREATE OR REPLACE FUNCTION my_func(x integer)
# RETURNS TABLE (x integer)LANGUAGE plpgsql
# AS $$ BEGIN return query select * from foo; END $$
# compared ------------------
# --- /tmp/pre 2021-05-04 15:38:25.591646027 +0000
# +++ /tmp/post 2021-05-04 15:38:25.747646035 +0000
# @@ -1,3 +1,3 @@
# -
# -CREATE OR REPLACE FUNCTION my_func(x INT) RETURNS TABLE(x INT) LANGUAGE plpgsql AS $$ BEGIN return query select * from foo; END $$;
# -
# +CREATE OR REPLACE FUNCTION my_func(x integer)
# +RETURNS TABLE (x integer)LANGUAGE plpgsql
# +AS $$ BEGIN return query select * from foo; END $$
Thank you.
The first is doable, at least when using the prettifying printer, but not when using the compact one: the latter does not emit newlines, so embedded comments obviously must be in the C format.
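A small sketch of that distinction, assuming a hypothetical `render_comment()` helper that is not part of pglast: in pretty mode a `--` comment can keep its form because a newline can follow it, while in compact mode it has to be rewritten as a C-style comment. Since PostgreSQL nests block comments, the sketch also neutralises any `/*` or `*/` found inside the body.

```python
def render_comment(text: str, compact: bool) -> str:
    """Render a raw comment for the given printer mode (hypothetical helper)."""
    is_line_comment = text.startswith("--")
    if not compact:
        # Pretty mode emits newlines, so a line comment can keep its form.
        return text + "\n" if is_line_comment else text
    if is_line_comment:
        body = text[2:].strip()
        # Neutralise comment delimiters that would open or close a block.
        body = body.replace("*/", "* /").replace("/*", "/ *")
        return f"/* {body} */"
    # Block comment: collapse embedded newlines to stay on a single line.
    return " ".join(text.split())


print(render_comment("-- hello", compact=True))
```

Breaking up the delimiters changes the comment text slightly; that trade-off is a choice made for this example, and dropping or rejecting such comments would be equally defensible.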
The second is not possible, as not all nodes carry their original position (literal values, for example), so pglast cannot determine whether the comment comes before or after them.
The logic is pretty simple: the `print_comment()` method emits the text, and it gets called by `print_node()` when the given node was originally after the next available comment.
Maybe you can do some experiments and suggest different approaches?
I improved the printing of comments, maintaining the original style.
Looks good to me! I'll see if I can get a chance to play around with the code further in the near future.
Present in release v3.0.dev1.
Feel free to re-open this (or open a new issue) if there is something else that can be done.
Thank you!