src-d / gitbase
SQL interface to git repositories, written in Go. https://docs.sourced.tech/gitbase
License: Apache License 2.0
Hello,
How is this project related to https://github.com/cloudson/gitql ?
I guess you did some research before creating this organization and this repository.
We have ways of joining almost all tables in gitquery, except commits and trees.
We could have another UDF to perform the join between these two tables:
commit_has_tree(commit_hash, tree_hash)
For example, select commit messages with Go files:
SELECT message
FROM commits
INNER JOIN tree_entries
ON commit_has_tree(hash, tree_hash)
WHERE name LIKE '%.go';
If we add this UDF, I also propose renaming commit_contains to commit_has_blob, because a commit can contain many things and the new name is consistent with this naming.
Thoughts? /cc @mcarmonaa @jfontan @ajnavarro
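A minimal sketch of what such a predicate could compute, assuming "has tree" means the tree is the commit's root tree or any subtree reachable from it. The store and tree types and the commitHasTree method are hypothetical in-memory stand-ins, not gitquery or go-git APIs:

```go
package main

import "fmt"

// tree is a hypothetical in-memory tree object. It lists only the hashes
// of its direct subtrees; blob entries are left out for brevity.
type tree struct {
	subtrees []string
}

// store is a hypothetical object store: tree hash -> tree, and
// commit hash -> root tree hash.
type store struct {
	trees   map[string]tree
	commits map[string]string
}

// commitHasTree reports whether treeHash is the root tree of the commit
// or is reachable from it through subtrees (an assumed semantics).
func (s store) commitHasTree(commitHash, treeHash string) bool {
	root, ok := s.commits[commitHash]
	if !ok {
		return false
	}
	seen := map[string]bool{}
	stack := []string{root}
	for len(stack) > 0 {
		h := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if h == treeHash {
			return true
		}
		if seen[h] {
			continue
		}
		seen[h] = true
		stack = append(stack, s.trees[h].subtrees...)
	}
	return false
}

func main() {
	s := store{
		trees: map[string]tree{
			"t-root": {subtrees: []string{"t-src"}},
			"t-src":  {},
		},
		commits: map[string]string{"c1": "t-root"},
	}
	fmt.Println(s.commitHasTree("c1", "t-src"))
	fmt.Println(s.commitHasTree("c1", "t-other"))
}
```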
Add support for NULL values:
Depends on src-d/go-mysql-server#1
Previous comments:
This is missing the repo_id parameter, right?
After a talk we decided not to add repo_id. The performance of these UDFs will be improved using indexes. At the beginning they will be really slow.
So, if the repo_id is missing and the only things the UDF has are commit_hash and commit_blob, how are we supposed to retrieve that info?
Repository Pool does not have all repositories opened, right? So you can't just iterate them all until you find a match. The UDF should receive something with the repo associated to the given row or something along those lines. Otherwise, where is the UDF supposed to look?
A given commit hash either contains the specified blob or it does not. In the future, we will have a bitmap index to answer this kind of question. Right now, the only way to do it is to iterate over all the repositories.
Also, if the commit is repeated in several repositories, it will appear n times in the result.
Also, you don't need to have all the repositories opened: you can iterate over them, emit the commits of each repository, and filter out the ones that do not match.
So, for each row that uses that UDF we have to iterate all repositories again?
Right now, yes. In the future it will be a simple query to an index. The UDF can also be improved to run at the table iterator level, like another column. That way, you don't need to iterate over all the repositories again for each column.
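The trade-off discussed in this thread can be sketched as follows. This is only an illustration under assumptions: the repository type, commitHasBlobScan, and the map-based blobIndex are hypothetical, and a plain map stands in for the bitmap index mentioned above:

```go
package main

import "fmt"

// repository is a hypothetical in-memory stand-in for a git repository:
// it maps each commit hash to the blob hashes that commit contains.
type repository struct {
	id      string
	commits map[string][]string
}

// commitHasBlobScan is the slow path: for every call (that is, for every
// row) it walks all repositories looking for the commit.
func commitHasBlobScan(repos []repository, commitHash, blobHash string) bool {
	for _, r := range repos {
		blobs, ok := r.commits[commitHash]
		if !ok {
			continue
		}
		for _, b := range blobs {
			if b == blobHash {
				return true
			}
		}
	}
	return false
}

// blobIndex maps commit hash -> set of blob hashes. It is built once and
// then answers each lookup without touching the repositories again.
type blobIndex map[string]map[string]struct{}

func buildBlobIndex(repos []repository) blobIndex {
	idx := make(blobIndex)
	for _, r := range repos {
		for c, blobs := range r.commits {
			if idx[c] == nil {
				idx[c] = make(map[string]struct{})
			}
			for _, b := range blobs {
				idx[c][b] = struct{}{}
			}
		}
	}
	return idx
}

func (idx blobIndex) has(commitHash, blobHash string) bool {
	_, ok := idx[commitHash][blobHash]
	return ok
}

func main() {
	repos := []repository{
		{id: "a", commits: map[string][]string{"c1": {"b1", "b2"}}},
		{id: "b", commits: map[string][]string{"c2": {"b3"}}},
	}
	idx := buildBlobIndex(repos)
	fmt.Println(commitHasBlobScan(repos, "c2", "b3"), idx.has("c2", "b3"))
}
```

Both functions return the same answers; the difference is that the scan repeats the walk over every repository for each row, while the index pays that cost once up front.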
--> Executing query: SELECT author_name from commits order by COUNT(*);
panic: interface conversion: interface is string, not int32
goroutine 1 [running]:
panic(0x9839e0, 0xc420fb87c0)
/usr/local/go/src/runtime/panic.go:500 +0x1a1
github.com/gitql/gitql/sql.(*integerType).Compare(0xd5d788, 0x949d20, 0xc4201ef910, 0x949d20, 0xc429cec240, 0xc429cec240)
<autogenerated>:45 +0x82
github.com/gitql/gitql/sql/plan.(*sorter).Less(0xc42811bf20, 0x0, 0x1792, 0x0)
/home/antonio/work/src/github.com/gitql/gitql/sql/plan/sort.go:156 +0x1e5
sort.medianOfThree(0xcf59c0, 0xc42811bf20, 0x0, 0x1792, 0x2f24)
/usr/local/go/src/sort/sort.go:74 +0x49
sort.doPivot(0xcf59c0, 0xc42811bf20, 0x0, 0xbc97, 0x53a887, 0xc42006c400)
/usr/local/go/src/sort/sort.go:99 +0x601
sort.quickSort(0xcf59c0, 0xc42811bf20, 0x0, 0xbc97, 0x1f)
/usr/local/go/src/sort/sort.go:188 +0x83
sort.Sort(0xcf59c0, 0xc42811bf20)
/usr/local/go/src/sort/sort.go:222 +0x80
github.com/gitql/gitql/sql/plan.(*sortIter).computeSortedRows(0xc4219fe540, 0x1, 0xb)
/home/antonio/work/src/github.com/gitql/gitql/sql/plan/sort.go:126 +0x292
github.com/gitql/gitql/sql/plan.(*sortIter).Next(0xc4219fe540, 0xc425fc24d0, 0xc4201ef610, 0xc42030d9e0, 0x4a19fc, 0xc425fc2480)
/home/antonio/work/src/github.com/gitql/gitql/sql/plan/sort.go:92 +0xd9
github.com/gitql/gitql/sql/plan.(*iter).Next(0xc42571f940, 0xc4201ef600, 0x1, 0x1, 0x0, 0x0)
/home/antonio/work/src/github.com/gitql/gitql/sql/plan/project.go:65 +0x38
main.(*cmdQueryBase).printQuery(0xc4201d6b40, 0xc42571f960, 0x1, 0x1, 0xcf2700, 0xc42571f940, 0xa34a17, 0x6, 0x0, 0x0)
/home/antonio/work/src/github.com/gitql/gitql/cmd/gitql/query_base.go:65 +0x222
main.(*CmdShell).Execute(0xc4201d6b40, 0xc4201a96c0, 0x0, 0x1, 0x1, 0x1)
/home/antonio/work/src/github.com/gitql/gitql/cmd/gitql/shell.go:75 +0xa5a
github.com/jessevdk/go-flags.(*Parser).ParseArgs(0xc420022a20, 0xc42000c5f0, 0x1, 0x1, 0x4, 0x2, 0xc42001c900, 0xc4200ab080, 0xc4200aaf48)
/home/antonio/work/src/github.com/jessevdk/go-flags/parser.go:316 +0x8e6
github.com/jessevdk/go-flags.(*Parser).Parse(0xc420022a20, 0xa36250, 0x7, 0xa44624, 0x1d, 0x0)
/home/antonio/work/src/github.com/jessevdk/go-flags/parser.go:186 +0x74
main.main()
/home/antonio/work/src/github.com/gitql/gitql/cmd/gitql/main.go:16 +0x38b
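The message "interface conversion: interface is string, not int32" is what Go reports when an unchecked type assertion fails. A minimal sketch of a comparison that returns an error instead of panicking; compareInt32 and its error-returning signature are hypothetical and not the project's actual Compare method:

```go
package main

import (
	"errors"
	"fmt"
)

// errNotInt32 is a hypothetical sentinel error for mismatched values.
var errNotInt32 = errors.New("value is not an int32")

// toInt32 uses the two-value form of the type assertion, which never
// panics: ok is false when v holds some other dynamic type.
func toInt32(v interface{}) (int32, error) {
	n, ok := v.(int32)
	if !ok {
		return 0, fmt.Errorf("%w: got %T", errNotInt32, v)
	}
	return n, nil
}

// compareInt32 returns -1, 0 or 1, or an error if either value is not
// an int32.
func compareInt32(a, b interface{}) (int, error) {
	x, err := toInt32(a)
	if err != nil {
		return 0, err
	}
	y, err := toInt32(b)
	if err != nil {
		return 0, err
	}
	switch {
	case x < y:
		return -1, nil
	case x > y:
		return 1, nil
	default:
		return 0, nil
	}
}

func main() {
	if _, err := compareInt32("alice", int32(3)); err != nil {
		fmt.Println("error:", err)
	}
	cmp, _ := compareInt32(int32(1), int32(2))
	fmt.Println(cmp)
}
```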
go-mysql-server is now ignored by go dep. After the initial development is done and the code is more stable we should manage this dependency with dep.
Depends on src-d/go-mysql-server#1
Hello. I am trying to use this tool but experiencing some trouble. When I input go get github.com/sqle/gitquery in my command line, the installation fails with the following messages:
# github.com/sqle/gitquery
opt/go/src/github.com/sqle/gitquery/commits.go:52: cannot use cIter (type object.CommitIter) as type *object.CommitIter in field value:
*object.CommitIter is pointer to interface, not interface
opt/go/src/github.com/sqle/gitquery/commits.go:65: i.i.Next undefined (type *object.CommitIter is pointer to interface, not interface)
opt/go/src/github.com/sqle/gitquery/commits.go:73: i.i.Close undefined (type *object.CommitIter is pointer to interface, not interface)
Since I am a total newbie in Go, I can't tell whether this is a bug or not. Is it? I am running go1.8.3 linux/amd64 on Ubuntu 16.04.
Thanks for reading. Looking forward to your response.
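The compiler messages above say that object.CommitIter is an interface in the installed dependency, while commits.go declares a field of type *object.CommitIter. A pointer to an interface has no methods, so Next and Close are undefined on it. A small self-contained sketch of the working shape, where commitIter, sliceIter and rowIter are hypothetical stand-ins rather than the real go-git or gitquery types:

```go
package main

import (
	"fmt"
	"io"
)

// commitIter is a hypothetical stand-in for an interface type such as
// object.CommitIter.
type commitIter interface {
	Next() (string, error)
	Close()
}

// sliceIter is a toy implementation that yields commit hashes.
type sliceIter struct {
	hashes []string
	pos    int
}

func (s *sliceIter) Next() (string, error) {
	if s.pos >= len(s.hashes) {
		return "", io.EOF
	}
	h := s.hashes[s.pos]
	s.pos++
	return h, nil
}

func (s *sliceIter) Close() {}

// rowIter stores the interface value directly. Declaring the field as
// *commitIter would reproduce the "pointer to interface" errors.
type rowIter struct {
	i commitIter
}

// collect drains the iterator and returns every hash it produced.
func collect(r rowIter) []string {
	defer r.i.Close()
	var out []string
	for {
		h, err := r.i.Next()
		if err != nil {
			return out
		}
		out = append(out, h)
	}
}

func main() {
	r := rowIter{i: &sliceIter{hashes: []string{"6fa4b39", "f6fe463"}}}
	fmt.Println(collect(r))
}
```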
In order to use gitql for scripting, CSV output is more convenient than formatted tables.
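A minimal sketch of what CSV output could look like, using Go's standard encoding/csv package. The renderCSV helper and the sample rows are hypothetical; how gitql would wire this into its query command is not shown here:

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
)

// renderCSV formats a header and rows as CSV text. Fields that contain
// commas, quotes or newlines are quoted by the csv.Writer.
func renderCSV(header []string, rows [][]string) (string, error) {
	var buf bytes.Buffer
	w := csv.NewWriter(&buf)
	if err := w.Write(header); err != nil {
		return "", err
	}
	// WriteAll writes every record and flushes the writer.
	if err := w.WriteAll(rows); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := renderCSV(
		[]string{"hash", "name"},
		[][]string{{"abc123", "v1.0"}, {"def456", "v1, final"}},
	)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```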
➜ gitql git:(f028104) ✗ gitql query 'SELECT * FROM tags'
SELECT * FROM tags
+------+------+--------------+-------------+-------------+---------+--------+
| HASH | NAME | TAGGER EMAIL | TAGGER NAME | TAGGER WHEN | MESSAGE | TARGET |
+------+------+--------------+-------------+-------------+---------+--------+
+------+------+--------------+-------------+-------------+---------+--------+
The table objects is deprecated and must be removed.
Executing the query without alias, the result is as expected:
!> SELECT COUNT(*) FROM commits;
--> Executing query: SELECT COUNT(*) FROM commits;
+----------+
| COUNT(*) |
+----------+
| 48279 |
+----------+
But if we add an alias:
!> SELECT COUNT(*) as c FROM commits limit 10;
--> Executing query: SELECT COUNT(*) as c FROM commits limit 10;
+------------------------------------------+
| C |
+------------------------------------------+
| 6fa4b393c01a84c9adf2e2435fba6de13227eabf |
| f6fe463165824f26efe6aaabaa352032f6f93886 |
| 2ba79c4f8ad74b87ef44dd692d46706adbb9e8d0 |
| d07b93103dc7e8bcba010541efa5b0a2394ea6a7 |
| 054bc50dde1a194bbd9a69a72004c2e18b19852f |
| deaab788af9a4f1ed8ed8193b20e3cffb1555b20 |
| 2cfc70f0de7c8902791d3b23a92d6462b9e11d72 |
| fc8af32b8ddaeddf12542cc233631b3dccf4724a |
| 27d5987585e08376915ca02ebb53cfc0a40a39f0 |
| bc01db39ce5bf4d227bc5ef6d9b95bb5f5390f5c |
+------------------------------------------+
Add objects table iterating over all git objects regardless of their type.
We should implement filter and column pushdown once they land in go-mysql-server.
I don't think we have any computationally expensive columns yet, so we shouldn't care much about column pushdown, but we should care about filter pushdown, which can greatly reduce the amount of data sent.
Right now the names are 2/3 characters:
type Database struct {
name string
cr sql.Table
tr sql.Table
rr sql.Table
ter sql.Table
br sql.Table
or sql.Table
rmr sql.Table
}
Change the names to something that could be understood.
The sql engine is actually backend-agnostic, so we'll create a separate GitHub project for it, so it can be versioned independently.
Master is not compiling because the interfaces have been changed.
Add a server command to start a compatible mysql server.
hi,
which youtube/vitess sqlparser version should I use to build gitql?
build from source:
gitql tag v0.3.0 with youtube/vitess branch release-2.1
sql/parse/parse.go:140: tn.Name.String undefined (type sqlparser.TableIdent has no field or method String)
sql/parse/parse.go:328: undefined: sqlparser.HexVal
sql/parse/parse.go:343: v.Name.Lowered undefined (type string has no field or method Lowered)
using go get:
ZhengdeMacBook-Pro:gitql hanzheng$ go get github.com/gitql/gitql
# github.com/gitql/gitql/sql/parse
sql/parse/parse.go:320: sqlparser.StrVal (type sqlparser.ValType) is not a type
sql/parse/parse.go:321: cannot convert v (type sqlparser.ValExpr) to type string
sql/parse/parse.go:324: undefined: sqlparser.NumVal
sql/parse/parse.go:326: cannot convert v (type sqlparser.ValExpr) to type string
sql/parse/parse.go:328: sqlparser.HexVal (type sqlparser.ValType) is not a type
thanks.
➜ gitql git:(f028104) ✗ gitql query 'SELECT size, name FROM blobs, tree_entries WHERE hash = entry_hash LIMIT 5'
SELECT size, name FROM blobs, tree_entries WHERE hash = entry_hash LIMIT 5
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x4976d0]
goroutine 1 [running]:
panic(0x832420, 0xc420014130)
/usr/local/go/src/runtime/panic.go:500 +0x1a1
github.com/gitql/gitql/git.(*treeEntryIter).Next(0xc4202a67e0, 0xac7a80, 0xc4202a6bc0, 0x0, 0x0)
/home/smola/dev/go/src/github.com/gitql/gitql/git/tree_entries.go:79 +0x50
github.com/gitql/gitql/sql/plan.(*crossJoinIterator).fillRows(0xc420342be0, 0xc42001ee60, 0x18)
/home/smola/dev/go/src/github.com/gitql/gitql/sql/plan/cross_join.go:106 +0x38
github.com/gitql/gitql/sql/plan.(*crossJoinIterator).Next(0xc420342be0, 0x20, 0x7ffb5f379000, 0xc4202a68a0, 0x4)
/home/smola/dev/go/src/github.com/gitql/gitql/sql/plan/cross_join.go:75 +0x365
github.com/gitql/gitql/sql/plan.(*filterIter).Next(0xc4202a6800, 0xc4201f24d0, 0x1, 0x1, 0x2)
/home/smola/dev/go/src/github.com/gitql/gitql/sql/plan/filter.go:55 +0x38
github.com/gitql/gitql/sql/plan.(*limitIter).Next(0xc4202a6820, 0x4, 0x2, 0x1, 0x8b4759)
/home/smola/dev/go/src/github.com/gitql/gitql/sql/plan/limit.go:61 +0x4c
github.com/gitql/gitql/sql/plan.(*iter).Next(0xc4202a6840, 0xc4202a6880, 0x2, 0x2, 0x2)
/home/smola/dev/go/src/github.com/gitql/gitql/sql/plan/project.go:77 +0x38
main.(*CmdQuery).printQuery(0xc420011e00, 0xc4202717c0, 0x2, 0x2, 0xac6180, 0xc4202a6840)
/home/smola/dev/go/src/github.com/gitql/gitql/cmd/gitql/query.go:102 +0x1fc
main.(*CmdQuery).executeQuery(0xc420011e00, 0x0, 0x0)
/home/smola/dev/go/src/github.com/gitql/gitql/cmd/gitql/query.go:89 +0x33a
main.(*CmdQuery).Execute(0xc420011e00, 0xc42014c380, 0x0, 0x2, 0x1, 0x2)
/home/smola/dev/go/src/github.com/gitql/gitql/cmd/gitql/query.go:39 +0x69
github.com/jessevdk/go-flags.(*Parser).ParseArgs(0xc420076780, 0xc42000c310, 0x2, 0x2, 0x2, 0x1, 0xc42001e630, 0xc4200fa780, 0xc4200fa6c8)
/home/smola/dev/go/src/github.com/jessevdk/go-flags/parser.go:316 +0x8e6
github.com/jessevdk/go-flags.(*Parser).Parse(0xc420076780, 0x8b79fd, 0x7, 0x8c3a7d, 0x1d, 0x0)
/home/smola/dev/go/src/github.com/jessevdk/go-flags/parser.go:186 +0x74
main.main()
/home/smola/dev/go/src/github.com/gitql/gitql/cmd/gitql/main.go:15 +0x2e8
INSERT support, initially for the memory backend.
I think being able to export a db dump of all the tables described in README.md would be very useful.
go version
- go version go1.8.1 linux/amd64
go get github.com/sqle/gitquery/cmd/gitquery
src/github.com/sqle/gitquery/cmd/gitquery/query_base.go:9:2: use of internal package not allowed
This kinda prevented us from showing the tool at the MSR tool session.
Vitess SQL parser does not support SHOW and DESCRIBE. So we are currently matching them ad hoc before passing the query to the parser. They should be properly supported in the SQL parser, contributing it to Vitess or, if that's not possible, forking it.
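The ad hoc matching described above might look roughly like this. It is a sketch under assumptions: the regular expressions, the statementKind type and the classify function are hypothetical, not the project's actual code, and only SHOW TABLES and DESCRIBE [TABLE] name are recognised:

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical patterns checked before a query is handed to the parser.
var (
	showTablesRe = regexp.MustCompile(`(?i)^\s*show\s+tables\s*;?\s*$`)
	describeRe   = regexp.MustCompile(`(?i)^\s*(?:describe|desc)\s+(?:table\s+)?(\w+)\s*;?\s*$`)
)

type statementKind int

const (
	kindOther statementKind = iota // anything else goes to the SQL parser
	kindShowTables
	kindDescribe
)

// classify returns the statement kind and, for DESCRIBE, the table name.
func classify(query string) (statementKind, string) {
	if showTablesRe.MatchString(query) {
		return kindShowTables, ""
	}
	if m := describeRe.FindStringSubmatch(query); m != nil {
		return kindDescribe, m[1]
	}
	return kindOther, ""
}

func main() {
	for _, q := range []string{
		"SHOW TABLES;",
		"describe table commits",
		"SELECT hash FROM commits",
	} {
		kind, table := classify(q)
		fmt.Println(kind, table)
	}
}
```

Matching like this stays brittle (comments, quoting and other SHOW variants are not handled), which is the motivation for supporting these statements in the parser itself.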
We need to add a specific session implementation for gitquery
Depends on src-d/go-mysql-server#36