Giter VIP home page Giter VIP logo

Comments (5)

cube2222 avatar cube2222 commented on September 22, 2024 1

Hey, thanks for the report.

First time I'm seeing this. Do you possibly have the JSON value that is causing this?

from octosql.

chapmanjacobd avatar chapmanjacobd commented on September 22, 2024

It's the same 10GB file as before https://files.pushshift.io/reddit/submissions/RS_2022-08.zst.

Not sure how to get the JSON value. In the code you should be able to add some logging in the stacktrace to print it but maybe with stuff like memory errors that is more difficult?

when I try to generate a trace it only creates a 16 byte file:

unzstd --memory=2048MB --stdout ~/lb/psaw/files.pushshift.io/reddit/submissions/RS_2022-08.zst | octosql "SELECT * FROM stdin.json" -o csv --profile trace
2022/10/15 13:02:41 profile: trace enabled, trace.out
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x908f1b]

goroutine 7 [running]:
github.com/cube2222/octosql/datasources/json.getOctoSQLValue({0x7, {}, {}, {}, {}, {}, {}, {}, {0x0}, {{0x0, ...}}, ...}, ...)
	/home/xk/octosql/datasources/json/execution.go:196 +0xc1b
github.com/cube2222/octosql/datasources/json.getOctoSQLValue({0x8, {}, {}, {}, {}, {}, {}, {}, {0x0}, {{0xc00027aa00, ...}}, ...}, ...)
	/home/xk/octosql/datasources/json/execution.go:209 +0xa58
github.com/cube2222/octosql/datasources/json.getOctoSQLValue({0x7, {}, {}, {}, {}, {}, {}, {}, {0xc0005f9b60}, {{0x0, ...}}, ...}, ...)
	/home/xk/octosql/datasources/json/execution.go:196 +0xc78
github.com/cube2222/octosql/datasources/json.glob..func1.1()
	/home/xk/octosql/datasources/json/workers.go:68 +0x5b8
created by github.com/cube2222/octosql/datasources/json.glob..func1
	/home/xk/octosql/datasources/json/workers.go:44 +0x45
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x908f1b]

goroutine 8 [running]:
github.com/cube2222/octosql/datasources/json.getOctoSQLValue({0x7, {}, {}, {}, {}, {}, {}, {}, {0x0}, {{0x0, ...}}, ...}, ...)
	/home/xk/octosql/datasources/json/execution.go:196 +0xc1b
github.com/cube2222/octosql/datasources/json.getOctoSQLValue({0x8, {}, {}, {}, {}, {}, {}, {}, {0x0}, {{0xc00027aa00, ...}}, ...}, ...)
	/home/xk/octosql/datasources/json/execution.go:209 +0xa58
github.com/cube2222/octosql/datasources/json.getOctoSQLValue({0x7, {}, {}, {}, {}, {}, {}, {}, {0xc0005f9b60}, {{0x0, ...}}, ...}, ...)
	/home/xk/octosql/datasources/json/execution.go:196 +0xc78
github.com/cube2222/octosql/datasources/json.glob..func1.1()
	/home/xk/octosql/datasources/json/workers.go:68 +0x5b8
created by github.com/cube2222/octosql/datasources/json.glob..func1
	/home/xk/octosql/datasources/json/workers.go:44 +0x45

▓▒▓░ lb:/ (main|●2?1) [141/SIGPIPE|2]🍪  ls trace.out 
Permissions Size User Date Modified Git Name
.rw-r--r--@   16 xk   15 Oct 13:02   -N trace.out
████ lb:/ (main|●2?1) 🌍 cat trace.out
go 1.11 trace⏎                    

from octosql.

chapmanjacobd avatar chapmanjacobd commented on September 22, 2024

Interesting... when I only do the first line then I get a more clear error message:

░▒░█ lb:/ (main|●2?1) 🍞 unzstd --memory=2048MB --stdout ~/lb/psaw/files.pushshift.io/reddit/submissions/RS_2022-08.zst | head -1 | octosql "SELECT * FROM stdin.json" -o csv --profile trace
2022/10/15 13:04:57 profile: trace enabled, trace.out
panic: invalid value type to print in CSV: List

goroutine 1 [running]:
github.com/cube2222/octosql/outputs/formats.FormatCSVValue(0xc0003ac480, {0x7, 0x0, 0x0, 0x0, {0x0, 0x0}, {0x0, 0x0, 0x0}, ...})
	/home/xk/octosql/outputs/formats/csv_format.go:65 +0x785
github.com/cube2222/octosql/outputs/formats.(*CSVFormatter).Write(0xc00008b6a0, {0xc0009cc000, 0x5e, 0x0?})
	/home/xk/octosql/outputs/formats/csv_format.go:42 +0x12c
github.com/cube2222/octosql/outputs/eager.(*OutputPrinter).Run.func1({{0xcdd960?, 0x0?}}, {{0xc0009cc000, 0x5e, 0x5e}, 0x0, {0x0, 0x0, 0x0}})
	/home/xk/octosql/outputs/eager/eager.go:42 +0x54
github.com/cube2222/octosql/datasources/json.(*DatasourceExecuting).Run(0xc000497f50, {{0xf63870?, 0xc0003bd3b0?}, 0x0?}, 0xc0001abc50, 0x0?)
	/home/xk/octosql/datasources/json/execution.go:127 +0xc86
github.com/cube2222/octosql/outputs/eager.(*OutputPrinter).Run(0xc0002091c0, {{0xf63870?, 0xc0003bd3b0?}, 0x0?})
	/home/xk/octosql/outputs/eager/eager.go:39 +0x1e4
github.com/cube2222/octosql/cmd.glob..func4(0x157bd60, {0xc00009eb90, 0x1, 0x5?})
	/home/xk/octosql/cmd/root.go:458 +0x3b54
github.com/spf13/cobra.(*Command).execute(0x157bd60, {0xc000032250, 0x5, 0x5})
	/home/xk/go/pkg/mod/github.com/spf13/[email protected]/command.go:856 +0x67c
github.com/spf13/cobra.(*Command).ExecuteC(0x157bd60)
	/home/xk/go/pkg/mod/github.com/spf13/[email protected]/command.go:974 +0x3b4
github.com/spf13/cobra.(*Command).Execute(...)
	/home/xk/go/pkg/mod/github.com/spf13/[email protected]/command.go:902
github.com/spf13/cobra.(*Command).ExecuteContext(...)
	/home/xk/go/pkg/mod/github.com/spf13/[email protected]/command.go:895
github.com/cube2222/octosql/cmd.Execute({0xf637c8?, 0xc000208dc0?})
	/home/xk/octosql/cmd/root.go:471 +0x53
main.main()
	/home/xk/octosql/main.go:24 +0xe8
█░█▒ lb:/ (main|●2?1) [141/SIGPIPE|0|2]🌎 unzstd --memory=2048MB --stdout ~/lb/psaw/files.pushshift.io/reddit/submissions/RS_2022-08.zst | head -1 
{"all_awardings":[],"allow_live_comments":false,"archived":false,"author":"Kresley","author_created_utc":1390870929,"author_flair_background_color":null,"author_flair_css_class":null,"author_flair_richtext":[],"author_flair_template_id":null,"author_flair_text":null,"author_flair_text_color":null,"author_flair_type":"text","author_fullname":"t2_f0c4k","author_patreon_flair":false,"author_premium":false,"awarders":[],"banned_by":null,"can_gild":true,"can_mod_post":false,"category":null,"content_categories":null,"contest_mode":false,"created_utc":1659312011,"discussion_type":null,"distinguished":"moderator","domain":"self.podcasts","edited":false,"event_end":1659326400.0,"event_is_live":false,"event_start":1659312000.0,"gilded":0,"gildings":{},"hidden":false,"hide_score":false,"id":"wcciq1","is_created_from_ads_ui":false,"is_crosspostable":true,"is_meta":false,"is_original_content":false,"is_reddit_media_domain":false,"is_robot_indexable":true,"is_self":true,"is_video":false,"link_flair_background_color":"#7e53c1","link_flair_css_class":"announce","link_flair_richtext":[],"link_flair_template_id":"efc6fb7e-75ae-11ea-baed-0e1ba7588035","link_flair_text":"Announcement","link_flair_text_color":"light","link_flair_type":"text","locked":false,"media":null,"media_embed":{},"media_only":false,"name":"t3_wcciq1","no_follow":false,"num_comments":8,"num_crossposts":1,"over_18":false,"parent_whitelist_status":"all_ads","permalink":"\/r\/podcasts\/comments\/wcciq1\/have_questions_about_recent_supreme_court\/","pinned":false,"post_hint":"self","preview":{"enabled":false,"images":[{"id":"Qalc2eReon-KD6FX7LpDmEuzwjap9_yobQbqDYd2RoQ","resolutions":[{"height":108,"url":"https:\/\/external-preview.redd.it\/aY2_qQw82pmeuXTAHhzUC3BvZrwYhi9GKxP6Gw9ECQU.jpg?width=108&crop=smart&auto=webp&s=01924a24a46b6e764caf58f310678ac69a6c6c8b","width":108},{"height":216,"url":"https:\/\/external-preview.redd.it\/aY2_qQw82pmeuXTAHhzUC3BvZrwYhi9GKxP6Gw9ECQU.jpg?width=216&crop=smart&auto=webp&s=ae46ea72110dc7c060d102b496cfd457ecd6b950","width":216},{"height":320,"url":"https:\/\/external-preview.redd.it\/aY2_qQw82pmeuXTAHhzUC3BvZrwYhi9GKxP6Gw9ECQU.jpg?width=320&crop=smart&auto=webp&s=f6b853bbf5f384084a45c8dde00d5d7b97356d69","width":320}],"source":{"height":500,"url":"https:\/\/external-preview.redd.it\/aY2_qQw82pmeuXTAHhzUC3BvZrwYhi9GKxP6Gw9ECQU.jpg?auto=webp&s=dc05866f0c0b48980e43549ceb0955176f30e3c6","width":500},"variants":{}}]},"pwls":6,"quarantine":false,"removed_by":null,"removed_by_category":null,"retrieved_on":1660025808,"score":16,"secure_media":null,"secure_media_embed":{},"selftext":"We're lucky to have an attorney who is *also* a veteran podcaster kind enough to join here for a live Reddit Talk about recent US Supreme Court decisions, Justices and perhaps some other federal law decisions you may (or may not have) heard about in the news lately.  He may also have some friends and guests join. \n\n\nCharles Star is an attorney in Brooklyn, the host of [Hostile Witness](https:\/\/www.soundcloud.com\/hwpod), a panelist on [ALAB Series](https:\/\/www.alabseries.com\/) and the guy who did most of the talking on the late [Mic Dicta](https:\/\/soundcloud.com\/micdicta). He tweets at [@ugarles](https:\/\/twitter.com\/Ugarles) (and has a reddit account he almost never uses). He writes about legal issues from time to time and has bylines in Vice, Slate and a bunch of places that no longer exist. (He'd probably appear on your podcast.)  \n\n\n**What is Reddit Talk?**\n\nTalk allows us to host interactive sessions with podcasters and users of the community can join in and interact with the speakers in real time from either new.reddit on desktop or though (an official) Reddit app. A Talk can handle up to 30 different speakers and up to 100k listeners in the audience can tune in at the same time.\n\nYou can learn more about Reddit Talk from the user guide [here](https:\/\/reddithelp.com\/hc\/en-us\/articles\/4404349195284-What-is-Reddit-Talk-) .\n\n**Ask Away!**\n\nIf you can't make it live at that time, you may leave any questions you might have  as a comment on this announcement post and we'll try to pick some to read during the Talk. A recording of the Talk should continue to be available here after it concludes.\n\nYou can tap on **Follow** on this post if you'd like to receive a reminder notification when the Talk will begin, if you'd like to come ask a question live.","send_replies":true,"spoiler":false,"stickied":false,"subreddit":"podcasts","subreddit_id":"t5_2qh1d","subreddit_name_prefixed":"r\/podcasts","subreddit_subscribers":2322456,"subreddit_type":"public","suggested_sort":null,"thumbnail":"self","thumbnail_height":null,"thumbnail_width":null,"title":"Have questions about recent Supreme Court decisions? So do we! Join us Sunday @ 8:00 p.m. EST with a lawyer\/podcaster who can help answer how we got here.","top_awarded_type":null,"total_awards_received":0,"treatment_tags":[],"upvote_ratio":0.71,"url":"https:\/\/www.reddit.com\/r\/podcasts\/comments\/wcciq1\/have_questions_about_recent_supreme_court\/","view_count":null,"whitelist_status":"all_ads","wls":6}

from octosql.

chapmanjacobd avatar chapmanjacobd commented on September 22, 2024

That makes sense to me. Maybe instead of panic it should just be Error:

Error: typecheck error: invalid type for CSV: List

It works with no problems when specifying all columns except the list and object type columns. 38 million rows in 500 seconds.

unzstd --memory=2048MB --stdout ~/lb/psaw/files.pushshift.io/reddit/submissions/RS_2022-08.zst | octosql "SELECT author_flair_css_class,allow_live_co
mments,archived,author,author_patreon_flair,author_created_utc,author_flair_background_color,author_flair_template_id,author_flair_text,author_flair_text_color,author_flair_type,author_fullnam
e,author_premium,banned_by,can_gild,can_mod_post,category,content_categories,contest_mode,created_utc,discussion_type,distinguished,domain,edited,event_end,event_is_live,event_start,gilded,hid
den,hide_score,is_reddit_media_domain,id,is_created_from_ads_ui,is_crosspostable,is_meta,is_original_content,is_robot_indexable,is_self,is_video,link_flair_background_color,link_flair_css_clas
s,link_flair_template_id,link_flair_text,link_flair_text_color,link_flair_type,locked,media,media_only,name,no_follow,num_comments,num_crossposts,over_18,parent_whitelist_status,permalink,pinn
ed,post_hint,pwls,quarantine,removed_by,removed_by_category,retrieved_on,score,secure_media,send_replies,spoiler,stickied,subreddit,subreddit_id,subreddit_name_prefixed,subreddit_subscribers,s
ubreddit_type,suggested_sort,thumbnail_height,thumbnail,thumbnail_width,title,top_awarded_type,total_awards_received,upvote_ratio,url,view_count,whitelist_status,wls FROM stdin.json" -o csv > 
RS_2022-08.csv

Maybe the only thing to fix here is to bubble up the error somehow so that it does not show up as "invalid memory address or nil pointer dereference"

from octosql.

cube2222 avatar cube2222 commented on September 22, 2024

Thanks for the writeup!

I'll admit I have no idea why you got the nil dereference instead of the List panic there, but I'll look into it (and probably, as you suggested, turn it into a normal error, to avoid further similar issues).

Cheers!

from octosql.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.