embulk / embulk
Embulk: Pluggable Bulk Data Loader.
Home Page: https://www.embulk.org/
License: Apache License 2.0
Some databases can store nested values.
Some file formats such as JSON or XML have nested values.
To transfer between those databases and file formats, embulk needs to support nested types.
Discussions are:
array(T), like array(long): [1, 2, 3]
array: [1, "str", 0.3]
struct(int, string)
The current preview command runs the input stage only.
I would like to check the result after filters are applied.
Data often includes a lot of broken records. A bulk import can skip them, but we want to load them later as an exceptional case. To do that, we want those error records written to other files or databases.
The difficulty in terms of API design is that the format of the records can differ depending on the plugin type.
So, depending on the plugin, error output needs to store 3 kinds of data:
a) file from a certain position
b) line
c) record with various schema
We can't install guess plugins using RubyGems because GuessExecutor uses fixed list of guess plugins (gzip, charset, newline, csv).
Some CSV files have garbage lines before the table.
Those lines are informative in some cases, but they're not part of the table.
For example,
created date,2015-03-05
created by,me
device,xyz
time,key1,key2,key3,key4
2015-03-05 20:40:14 -0800,a,b,c,d
2015-03-05 20:40:14 -0800,a,b,c,d
2015-03-05 20:40:14 -0800,a,b,c,d
2015-03-05 20:40:14 -0800,a,b,c,d
2015-03-05 20:40:14 -0800,a,b,c,d
2015-03-05 20:40:14 -0800,a,b,c,d
...
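One way to handle such files could be to skip a fixed number of leading lines before parsing. The following is only a minimal sketch: parse_table and its skip_lines argument are hypothetical names, not an Embulk API, and it assumes fields contain no quoted commas.

```ruby
# Hypothetical helper: drop a fixed number of leading lines, then treat the
# next line as the header. Assumes no quoted commas, so a plain split is enough.
def parse_table(text, skip_lines)
  lines = text.lines.map(&:chomp).drop(skip_lines)
  header = lines.first.split(',')
  lines.drop(1).map { |line| header.zip(line.split(',')).to_h }
end

sample = [
  'created date,2015-03-05',
  'created by,me',
  'device,xyz',
  'time,key1,key2,key3,key4',
  '2015-03-05 20:40:14 -0800,a,b,c,d',
  '2015-03-05 20:40:14 -0800,a,b,c,d'
].join("\n")

rows = parse_table(sample, 3)
puts rows.length
puts rows.first['key1']
```

A real parser would also need to decide what to do with the skipped lines, since the issue notes they can be informative.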
When I executed "java -jar embulk-0.1.0.jar bundle ./data" according to README.md, this error occurred.
$ java -jar embulk-0.1.0.jar bundle ./data
Initializing ./data...
Errno::ENOENT: No such file or directory - No such file or directory
stat at org/jruby/RubyFile.java:848
fu_each_src_dest at /foo/bar/embulk/embulk-0.1.0.jar!/META-INF/jruby.home/lib/ruby/1.9/fileutils.rb:1525
fu_each_src_dest0 at /foo/bar/embulk/embulk-0.1.0.jar!/META-INF/jruby.home/lib/ruby/1.9/fileutils.rb:1541
fu_each_src_dest at /foo/bar/embulk/embulk-0.1.0.jar!/META-INF/jruby.home/lib/ruby/1.9/fileutils.rb:1523
cp at /foo/bar/embulk/embulk-0.1.0.jar!/META-INF/jruby.home/lib/ruby/1.9/fileutils.rb:395
run at file:/foo/bar/embulk/embulk-0.1.0.jar!/embulk/command/embulk_run.rb:122
each at org/jruby/RubyArray.java:1613
run at file:/foo/bar/embulk/embulk-0.1.0.jar!/embulk/command/embulk_run.rb:118
(root) at classpath:embulk/command/embulk.rb:38
I looked over lib/embulk/command/embulk_run.rb.
%w[.bundle/config embulk/input_example.rb embulk/output_example.rb examples/csv-stdout.yml examples/sample.csv.gz Gemfile Gemfile.lock].each do |file| # TODO get file list from the jar
  url = resource_class.resource("/embulk/data/bundle/#{file}").to_s
  dst = File.join(path, file)
  FileUtils.mkdir_p File.dirname(dst)
  FileUtils.cp(url, dst)
end
But url seems to be empty, and embulk-0.1.0.jar doesn't include .bundle/config.
In https://github.com/komamitsu/embulk-plugin-redis, I want to use the "bin/embulk guess" command to guess a schema in Redis and generate as much schema information as possible automatically. Of course, it won't be a complete schema because Redis is schema-less.
To achieve that, I think we need to add an InputPlugin#guess method like this:
class InputXxxxx < InputPlugin
  def self.guess(config)
    x = Xxxxx.connect(
      config.param('host', :string, :default => 'localhost'),
      config.param('port', :integer, :default => 8888))
    # Sample every 10th entry.
    xs = x.get((0...x.size).select{|i| i % 10 == 0})
    cols = []
    # Checking the schemas of `xs'...
    # ...
    cols << Column.new(idx, schema_name, schema_type)
    # ...
    cols
  end
end
As this commit shows, creating a new release is complicated:
24cc193
I want to simplify it like this:
rake set_version 0.2.1
(runs steps 1, 2, 3, 4)
It shows embulk's version number...!
Mac OS X 10.10.1
Oracle Java version "1.7.0_45"
Yoshida-no-MacBook-Air:Documents takumi$ java -jar embulk.jar gem list
*** LOCAL GEMS ***
ffi (1.9.3 java)
jar-dependencies (0.1.2)
jruby-openssl (0.9.5 java)
json (1.8.0 java)
krypt (0.0.2)
krypt-core (0.0.2 universal-java)
krypt-provider-jdk (0.0.2)
rake (10.1.0)
rdoc (4.1.2)
Yoshida-no-MacBook-Air:Documents takumi$ java -jar embulk.jar gem install gzip
Fetching: gzip-1.0.gem (100%)
Successfully installed gzip-1.0
1 gem installed
Yoshida-no-MacBook-Air:Documents takumi$ java -jar embulk.jar guess embulk-example/example.yml -o config.yml
2015-02-17 07:09:27,493 [INFO]: main:org.embulk.standards.LocalFileInputPlugin: Listing local files at directory '/Users/takumi/Documents/embulk-example/csv' filtering filename by prefix 'sample_'
2015-02-17 07:09:27,502 [INFO]: main:org.embulk.standards.LocalFileInputPlugin: Loading files [/Users/takumi/Documents/embulk-example/csv/sample_01.csv.gz]
Embulk::PluginLoadError: Unknown guess plugin 'gzip'. embulk/guess/gzip.rb is not installed. Run 'embulk gem search -rd embulk-guess' command to find plugins.
lookup at file:/Users/takumi/Documents/embulk.jar!/embulk/plugin_registry.rb:30
lookup at file:/Users/takumi/Documents/embulk.jar!/embulk/plugin.rb:188
new_java_guess at file:/Users/takumi/Documents/embulk.jar!/embulk/plugin.rb:178
new_java_guess at /Users/takumi/Documents/embulk.jar!/META-INF/jruby.home/lib/ruby/1.9/forwardable.rb:201
run at file:/Users/takumi/Documents/embulk.jar!/embulk/command/embulk_run.rb:275
(root) at classpath:embulk/command/embulk.rb:39
org/embulk/plugin/PluginManager.java:42:in `newPlugin': org.embulk.config.ConfigException: GuessPlugin 'gzip' is not found
from org/embulk/spi/ExecSession.java:106:in `newPlugin'
from org/embulk/spi/Exec.java:54:in `newPlugin'
from org/embulk/exec/GuessExecutor.java:267:in `run'
from org/embulk/spi/FileInputRunner.java:129:in `run'
from org/embulk/exec/GuessExecutor.java:118:in `run'
from org/embulk/spi/FileInputRunner.java:101:in `run'
from org/embulk/exec/GuessExecutor.java:251:in `transaction'
from org/embulk/spi/FileInputRunner.java:95:in `run'
from org/embulk/spi/util/Decoders.java:77:in `transaction'
from org/embulk/spi/util/Decoders.java:33:in `transaction'
from org/embulk/spi/FileInputRunner.java:92:in `run'
from org/embulk/exec/GuessExecutor.java:170:in `transaction'
from org/embulk/spi/FileInputRunner.java:60:in `transaction'
from org/embulk/exec/GuessExecutor.java:114:in `runGuessInput'
from org/embulk/exec/GuessExecutor.java:93:in `doGuess'
from org/embulk/exec/GuessExecutor.java:34:in `access$000'
from org/embulk/exec/GuessExecutor.java:73:in `run'
from org/embulk/exec/GuessExecutor.java:70:in `run'
from org/embulk/spi/Exec.java:21:in `doWith'
from org/embulk/exec/GuessExecutor.java:70:in `guess'
from org/embulk/command/Runner.java:168:in `guess'
from org/embulk/command/Runner.java:70:in `main'
from java/lang/reflect/Method.java:606:in `invoke'
from file:/Users/takumi/Documents/embulk.jar!/embulk/command/embulk_run.rb:275:in `run'
from classpath:embulk/command/embulk.rb:39:in `(root)'
from classpath_3a_embulk/command/classpath:embulk/command/embulk.rb:39:in `(root)'
from org/embulk/cli/Main.java:13:in `main'
Caused by:
file:/Users/takumi/Documents/embulk.jar!/embulk/plugin_registry.rb:30:in `lookup': org.jruby.exceptions.RaiseException: (PluginLoadError) Unknown guess plugin 'gzip'. embulk/guess/gzip.rb is not installed. Run 'embulk gem search -rd embulk-guess' command to find plugins.
from file:/Users/takumi/Documents/embulk.jar!/embulk/plugin.rb:188:in `lookup'
from file:/Users/takumi/Documents/embulk.jar!/embulk/plugin.rb:178:in `new_java_guess'
from /Users/takumi/Documents/embulk.jar!/META-INF/jruby.home/lib/ruby/1.9/forwardable.rb:201:in `new_java_guess'
from file:/Users/takumi/Documents/embulk.jar!/embulk/command/embulk_run.rb:275:in `run'
from classpath:embulk/command/embulk.rb:39:in `(root)'
Yoshida-no-MacBook-Air:Documents takumi$ java -jar embulk.jar gem list
*** LOCAL GEMS ***
ffi (1.9.3 java)
gzip (1.0)
jar-dependencies (0.1.2)
jruby-openssl (0.9.5 java)
json (1.8.0 java)
krypt (0.0.2)
krypt-core (0.0.2 universal-java)
krypt-provider-jdk (0.0.2)
rake (10.1.0)
rdoc (4.1.2)
@frsyuki @nahi Embulk on the spi2 branch doesn't work and raises NameError: uninitialized constant Embulk::GuessCsv::TFGuess. Could you fix it, or do you have any workarounds?
$ java -cp $(echo $(find */target -name "*.jar") | sed "s/ /:/g") org.embulk.cli.Embulk examples/config.yml
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/muga/works/workspace/embulk/embulk-cli/target/dependency/slf4j-log4j12-1.7.9.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/muga/works/workspace/embulk/embulk-core/target/dependency/slf4j-log4j12-1.7.9.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/muga/works/workspace/embulk/embulk-standards/target/dependency/slf4j-log4j12-1.7.9.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Exception in thread "main" org.jruby.exceptions.RaiseException: (NameError) uninitialized constant Embulk::GuessCsv::TFGuess
at org.jruby.RubyModule.const_missing(org/jruby/RubyModule.java:2723)
at RUBY.guess_type(file:/Users/muga/works/workspace/embulk/embulk-cli/target/dependency/embulk-core-0.1.0-SNAPSHOT.jar!/embulk/guess_csv.rb:159)
at RUBY.guess_field_types(file:/Users/muga/works/workspace/embulk/embulk-cli/target/dependency/embulk-core-0.1.0-SNAPSHOT.jar!/embulk/guess_csv.rb:115)
at org.jruby.RubyArray.each(org/jruby/RubyArray.java:1613)
at org.jruby.RubyEnumerable.each_with_index(org/jruby/RubyEnumerable.java:977)
at RUBY.guess_field_types(file:/Users/muga/works/workspace/embulk/embulk-cli/target/dependency/embulk-core-0.1.0-SNAPSHOT.jar!/embulk/guess_csv.rb:115)
at org.jruby.RubyArray.each(org/jruby/RubyArray.java:1613)
at RUBY.guess_field_types(file:/Users/muga/works/workspace/embulk/embulk-cli/target/dependency/embulk-core-0.1.0-SNAPSHOT.jar!/embulk/guess_csv.rb:114)
at RUBY.guess_lines(file:/Users/muga/works/workspace/embulk/embulk-cli/target/dependency/embulk-core-0.1.0-SNAPSHOT.jar!/embulk/guess_csv.rb:29)
at RUBY.guess(file:/Users/muga/works/workspace/embulk/embulk-cli/target/dependency/embulk-core-0.1.0-SNAPSHOT.jar!/embulk/guess_plugin.rb:105)
at RUBY.guess(file:/Users/muga/works/workspace/embulk/embulk-cli/target/dependency/embulk-core-0.1.0-SNAPSHOT.jar!/embulk/guess_plugin.rb:44)
at Embulk$$GuessPlugin$$JavaAdapter_1955955176.guess(Embulk$$GuessPlugin$$JavaAdapter_1955955176.gen:13)
It throws an exception:
java.lang.RuntimeException: java.lang.AssertionError: OutputStreamFileOutput is not implemented yet
We want to apply conversion or filtering to records when:
We need JavaDoc for plugin development in Java.
Embulk has only LocalExecutor, but distributed execution is one of the main goals of Embulk.
To support various distributed execution environments, a pluggable and dynamically loadable executor SPI would be good.
I got the following error while running the output-elasticsearch plugin. The root cause is that the context classloader cannot find the resource file. The problem might also occur when other Java plugins are executed.
org/elasticsearch/env/Environment.java:213:in `resolveConfig': org.elasticsearch.env.FailedToResolveConfigException: Failed to resolve config path [names.txt], tried file path [names.txt], path file [.../config/names.txt], and classpath
from org/elasticsearch/node/internal/InternalSettingsPreparer.java:119:in `prepareSettings'
from org/elasticsearch/client/transport/TransportClient.java:157:in `<init>'
from org/elasticsearch/client/transport/TransportClient.java:115:in `<init>'
Let me know how to set bintray_user:
> ./gradlew build
FAILURE: Build failed with an exception.
* Where:
Build file '/Users/leo/work/git/embulk/build.gradle' line: 29
* What went wrong:
A problem occurred evaluating root project 'embulk'.
> Could not find property 'bintray_user' on com.jfrog.bintray.gradle.BintrayExtension_Decorated@108a46d6.
* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output.
BUILD FAILED
Total time: 2.691 secs
Ruby Symbols are implicitly converted to Strings after they are passed to yield as a parameter in InputPlugin.transaction.
module Embulk
  class InputRedis < InputPlugin
    Plugin.register_input('redis', self)

    def self.transaction(config, &control)
      task = {
        # ...
        :hello => :world
      }
      commit_reports = yield(task, columns, 1)
      # ...
    end

    def run
      require 'pp'
      pp @task  # ==> {...., "hello"=>"world"}
    end
  end
end
I just looked over the source code. org.embulk.config.ModelManager uses com.fasterxml.jackson.databind.ObjectMapper#writeValueAsString, and I guess it converts RubySymbol to a plain String internally, which is related to this issue.
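The effect can be reproduced outside Embulk with any JSON round trip. This sketch uses Ruby's standard json library as a stand-in for the Jackson-based serialization guessed at above; round_trip is a hypothetical helper, not part of Embulk.

```ruby
require 'json'

# Round-trip a task hash through JSON, standing in (as an assumption) for the
# serialization step between transaction and run.
def round_trip(task)
  JSON.parse(JSON.generate(task))
end

task = { :hello => :world }
restored = round_trip(task)
p restored               # Symbol keys and values come back as Strings
p restored.key?(:hello)  # the Symbol key is gone after the round trip
```

JSON has no symbol type, so any serializer that goes through JSON text loses the distinction unless it encodes it explicitly.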
We can write the following how-to documents:
Unlike other plugins, users normally don't know the names of guess plugins, so they don't write those names in configuration files.
To make guess plugins pluggable, Embulk should find and load installed guess plugins automatically.
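A rough sketch of automatic discovery, assuming guess plugins are installed as gems that ship an embulk/guess/<name>.rb file (the path that appears in the PluginLoadError message elsewhere on this page). Gem.find_files is a real RubyGems call, but using it this way is an assumption, and guess_plugin_names is a hypothetical helper.

```ruby
require 'rubygems'

# Hypothetical helper: derive guess plugin names from installed file paths.
def guess_plugin_names(paths)
  paths.map { |path| path[%r{embulk/guess/([^/]+)\.rb\z}, 1] }.compact.uniq.sort
end

# Ask RubyGems for matching files in the load path and in installed gems.
installed = Gem.find_files('embulk/guess/*.rb')
puts guess_plugin_names(installed).inspect
```

The list printed depends entirely on what is installed locally; with no guess plugins it is empty.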
More Ruby plugins!
embulk exits with status 1 when the version (--version) or help (--help) command is executed.
It would be great if it exits with 0 in such cases.
It's a common task for input plugins to guess a schema from schema-less records.
One idea is to provide a library that makes this easy.
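As an illustration of what such a library might offer, here is a minimal sketch that guesses one type per key from sample records. guess_type and guess_schema are hypothetical names, and the rules (all Integers give long, any other all-numeric column gives double, otherwise string) are assumptions rather than Embulk's actual behavior.

```ruby
# Hypothetical helper: pick the narrowest type that fits every non-nil value.
def guess_type(values)
  present = values.compact
  return 'string' if present.empty?
  return 'boolean' if present.all? { |v| v == true || v == false }
  return 'long' if present.all? { |v| v.is_a?(Integer) }
  return 'double' if present.all? { |v| v.is_a?(Numeric) }
  'string'
end

# Hypothetical helper: build a column list from a sample of hash records.
def guess_schema(records)
  keys = records.flat_map(&:keys).uniq
  keys.map do |key|
    { 'name' => key.to_s, 'type' => guess_type(records.map { |r| r[key] }) }
  end
end

records = [
  { 'id' => 1, 'score' => 0.5, 'name' => 'a' },
  { 'id' => 2, 'score' => 3,   'name' => nil }
]
guess_schema(records).each { |column| puts column.inspect }
```

Because the result depends on the sample, the guessed schema would still be written to the configuration file so that later runs stay deterministic.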
Once the build.gradle file stabilizes, we won't need pom.xml any more.
Delete pom.xml and update README.md.
I want to set the error ratio for rolling back the transaction.
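A sketch of what the check could look like, assuming the executor can count failed and total records. rollback? and max_error_ratio are hypothetical names, not existing Embulk options.

```ruby
# Hypothetical check: roll back when the share of failed records exceeds a limit.
def rollback?(error_count, total_count, max_error_ratio)
  return false if total_count.zero?
  error_count.to_f / total_count > max_error_ratio
end

puts rollback?(6, 100, 0.05)
puts rollback?(4, 100, 0.05)
```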
When DataSource.merge finds an array, it appends the new array to the existing array. This is good when it merges a guessed NextConfig into the original ConfigSource.
However, it's not good when merging the NextConfig of a previous execution into the previous ConfigSource.
We need 2 merges:
Overwriting arrays (for the NextConfig of a previous execution):
{"array": [1,2,3]}.merge({"array": [4,5,6]})
=> {"array":[4,5,6]}
Appending arrays (for a guessed NextConfig):
{"array": [1,2,3]}.merge({"array": [4,5,6]})
=> {"array":[1,2,3,4,5,6]}
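The two behaviours above can be sketched with plain Ruby hashes. deep_merge and its append_arrays flag are hypothetical; this is not the DataSource API, only an illustration of the difference.

```ruby
# Recursive merge where the caller chooses how arrays are combined.
# append_arrays = false: the new array replaces the old one.
# append_arrays = true:  the new array is appended to the old one.
def deep_merge(base, other, append_arrays)
  base.merge(other) do |_key, old, new|
    if old.is_a?(Hash) && new.is_a?(Hash)
      deep_merge(old, new, append_arrays)
    elsif old.is_a?(Array) && new.is_a?(Array) && append_arrays
      old + new
    else
      new
    end
  end
end

a = { 'array' => [1, 2, 3] }
b = { 'array' => [4, 5, 6] }
p deep_merge(a, b, false)
p deep_merge(a, b, true)
```

Whether this should be two methods or one method with a flag is a design choice; the sketch uses a flag only to keep the example short.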
We should add more information to PreviewResult, because we check the result of PreviewExecutor and often decide from it whether to execute the run command. The current PreviewResult provides users with the schema and a list of ignored exceptions. I think it would be better to add more:
I also think that the above information can be built by using CommitReport at https://github.com/embulk/embulk/blob/master/embulk-core/src/main/java/org/embulk/exec/PreviewExecutor.java#L106
CsvParserPlugin has max_quoted_size_limit.
https://github.com/embulk/embulk/blob/master/embulk-standards/src/main/java/org/embulk/standards/CsvParserPlugin.java#L73
But it is not used in CsvTokenizer.
https://github.com/embulk/embulk/blob/master/embulk-standards/src/main/java/org/embulk/standards/CsvTokenizer.java#L30
As this comment says, shouldn't it be {"done"=>done}?
https://github.com/enukane/embulk-plugin-input-pcapng-files/pull/2/files#diff-8360174ae27f7aace992035f0fcec11eR38
The "last_path" is useful for retrying LocalFileInputPlugin to recover.
"new ruby-input" generator does not work properly.
java -jar ./embulk-0.4.1.jar new ruby-input sample
% java -jar ./embulk-0.4.1.jar new ruby-input sample
Creating embulk-input-sample/
Creating embulk-input-sample/README.md
Creating embulk-input-sample/LICENSE.txt
Creating embulk-input-sample/.gitignore
Creating embulk-input-sample/Rakefile
Creating embulk-input-sample/Gemfile
Creating embulk-input-sample/embulk-input-sample.gemspec
Creating embulk-input-sample/lib/embulk/input/sample.rb
rake build (and failed.)
% cd embulk-input-sample/
% rake build
rake aborted!
There was a SyntaxError while loading embulk-input-sample.gemspec:
/home/embulk/embulk-input-sample/embulk-input-sample.gemspec:7: syntax error, unexpected tIDENTIFIER, expecting keyword_end
...h the output plugins by "embulk-output" keyword."
... ^
/home/arch/embulk-input-sample/embulk-input-sample.gemspec:7: syntax error, unexpected tSTRING_BEG, expecting keyword_do or '{' or '('
...tput plugins by "embulk-output" keyword."
... ^
/home/embulk/embulk-input-sample/Rakefile:1:in `<top (required)>'
(See full trace by running task with --trace)
An unescaped double-quoted "embulk-output" can't be used inside a double-quoted string in the gemspec file:
spec.description = "Sample input plugin is an Embulk plugin that loads records from Sample so that any output plugins can receive the records. Search the output plugins by "embulk-output" keyword."
It would be better to use single quotes:
spec.description = "Sample input plugin is an Embulk plugin that loads records from Sample so that any output plugins can receive the records. Search the output plugins by 'embulk-output' keyword."
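Another option, sketched here under the assumption that the generator builds this line from a Ruby string, is to let String#inspect produce the literal so that any quotes in the text are escaped. description_line is a hypothetical helper, not the generator's actual code.

```ruby
# Hypothetical generator helper: emit the description as a valid Ruby
# double-quoted literal, with embedded quotes escaped by String#inspect.
def description_line(text)
  "spec.description = #{text.inspect}"
end

puts description_line('Search the output plugins by "embulk-output" keyword.')
```

This keeps the template independent of whichever quote characters the description happens to contain.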
And one more thing.
module Embulk
  module Input
    class SampleInputPlugin < InputPlugin
      Plugin.register_input(sample, self)
Doesn't the plugin name need single quotes?
module Embulk
  module Input
    class SampleInputPlugin < InputPlugin
      Plugin.register_input('sample', self)
This is an important issue!
Embulk can't load java-based plugins (jar files) dynamically. Users need to add the jar files to classpath. Inconvenient...
It would be developer-friendly if plugins could access the 'task' definition with either a Symbol or a String as a key.
def self.transaction(config, &control)
  ...
  task[:foo] = 42
  yield(task, schema, threads)
  ...
  report
end

attr_reader :task

def run
  task[:foo] #=> nil because of ser-de, expecting 42
end
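One possible shape for this is a thin wrapper that normalizes keys to Strings on every access. IndifferentTask is a hypothetical class used only for illustration; it is not part of Embulk.

```ruby
# Hypothetical wrapper: look up task values by Symbol or String key.
class IndifferentTask
  def initialize(hash)
    @data = {}
    hash.each { |key, value| @data[key.to_s] = value }
  end

  # Both task[:foo] and task['foo'] resolve to the same entry.
  def [](key)
    @data[key.to_s]
  end

  def []=(key, value)
    @data[key.to_s] = value
  end
end

task = IndifferentTask.new('foo' => 42)  # e.g. a task after deserialization
p task[:foo]
p task['foo']
```

Storing keys as Strings matches what survives serialization, so the wrapper behaves the same before and after the task is passed between stages.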
With 3 processors, I got the following log messages:
2015-02-10 17:07:42,753 [INFO]: embulk-executor-1:org.embulk.output.postgresql.PostgreSQLCopyBatchInsert: Loading 10 rows (290 bytes)
2015-02-10 17:07:42,753 [INFO]: embulk-executor-0:org.embulk.output.postgresql.PostgreSQLCopyBatchInsert: Loading 10 rows (290 bytes)
2015-02-10 17:07:42,753 [INFO]: embulk-executor-2:org.embulk.output.postgresql.PostgreSQLCopyBatchInsert: Loading 10 rows (290 bytes)
2015-02-10 17:07:42,755 [INFO]: embulk-executor-2:org.embulk.output.postgresql.PostgreSQLCopyBatchInsert: > 0.00 seconds (loaded 10 rows in total)
2015-02-10 17:07:42,755 [INFO]: embulk-executor-0:org.embulk.output.postgresql.PostgreSQLCopyBatchInsert: > 0.00 seconds (loaded 10 rows in total)
2015-02-10 17:07:42,755 [INFO]: embulk-executor-1:org.embulk.output.postgresql.PostgreSQLCopyBatchInsert: > 0.00 seconds (loaded 10 rows in total)
2015-02-10 17:07:42,756 [INFO]: main:org.embulk.exec.LocalExecutor: {done: 3 / 3, running: 0}
2015-02-10 17:07:42,756 [INFO]: main:org.embulk.exec.LocalExecutor: {done: 3 / 3, running: 0}
2015-02-10 17:07:42,757 [INFO]: main:org.embulk.exec.LocalExecutor: {done: 3 / 3, running: 0}
2015-02-10 17:07:42,763 [INFO]: main:org.embulk.output.jdbc.JdbcOutputConnection: SQL: SET search_path TO "public"
2015-02-10 17:07:42,763 [INFO]: main:org.embulk.output.jdbc.JdbcOutputConnection: > 0.00 seconds
2015-02-10 17:07:42,763 [INFO]: main:org.embulk.output.jdbc.JdbcOutputConnection: SQL: DROP TABLE IF EXISTS "load01"
2015-02-10 17:07:42,765 [INFO]: main:org.embulk.output.jdbc.JdbcOutputConnection: > 0.00 seconds
2015-02-10 17:07:42,765 [INFO]: main:org.embulk.output.jdbc.JdbcOutputConnection: SQL: ALTER TABLE "load01_0000000054daab5e0b71b000_BULK_LOAD_TEMP" RENAME TO "load01"
2015-02-10 17:07:42,765 [INFO]: main:org.embulk.output.jdbc.JdbcOutputConnection: > 0.00 seconds
It's difficult to know which message comes from which task processor thread. If each message had a processor index, it would look like:
2015-02-10 17:07:42,753 [INFO]: [1] embulk-executor-1:org.embulk.output.postgresql.PostgreSQLCopyBatchInsert: Loading 10 rows (290 bytes)
2015-02-10 17:07:42,753 [INFO]: [2] embulk-executor-0:org.embulk.output.postgresql.PostgreSQLCopyBatchInsert: Loading 10 rows (290 bytes)
2015-02-10 17:07:42,753 [INFO]: [3] embulk-executor-2:org.embulk.output.postgresql.PostgreSQLCopyBatchInsert: Loading 10 rows (290 bytes)
GuessPlugin completes the "parser" and "decoders" sections of a configuration file. But we also want to make the "in" section guessable.
For example, redis is a schema-less database (not a file format), so its plugin is an InputPlugin (not a FileInputPlugin or ParserPlugin). To make the behavior deterministic, we should write the schema to the configuration file. But writing the whole schema manually is hard work, so plugins should be able to help with it.
The aws-sdk jar is too large to include in the single-package jar.
The idea is to remove the dependency from embulk and create embulk-plugin-s3.
Nothing!
For transactional example
Idea:
I want to execute Embulk using this command line:
$ embulk run config.yml --execute-command "ssh %(find_host %{index}) embulk run-task %{task} %{index}"
With this command, embulk executes the given ssh command for each task. Then embulk run-task executes the given task on another host.
This is the simplest way to run embulk in a distributed environment. We don't have to install Hadoop, YARN, Mesos, etc.
Some distributed execution environments such as mpich2 or Sun Grid Engine may also work with this command line.
See also: #29
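The per-task substitution in the proposed --execute-command option could be sketched as below. expand_command is a hypothetical helper; it only handles %{name} placeholders and leaves anything else, including the %(...) form in the example, untouched because its semantics are not specified here.

```ruby
# Hypothetical expansion of %{name} placeholders in a command template.
# Unknown placeholder names raise KeyError instead of being silently dropped.
def expand_command(template, values)
  template.gsub(/%\{(\w+)\}/) do
    name = Regexp.last_match(1)
    values.fetch(name) { raise KeyError, "unknown placeholder: #{name}" }.to_s
  end
end

command = expand_command('embulk run-task %{task} %{index}',
                         { 'task' => 'task.json', 'index' => 2 })
puts command
```

A real implementation would also need to shell-escape the substituted values before running the command.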
A bulk load is likely to fail when we import a large amount of data (say 1TB). When it fails, Embulk currently requires plugins to roll back the entire transaction so that we can retry.
However, retrying the entire bulk load of such large data takes too long. It would be very good if we could resume the failed transaction from a partially succeeded state.
Difficulties:
Java developers probably don't want to write Ruby code and don't know Ruby's packaging system.
But Embulk uses RubyGems for packaging.
The idea here is that Embulk provides a Gradle plugin to automate the packaging.
The plugin does:
See also: #28
I want to set the log level via embulk command arguments.