
gemini's Introduction

Gemini

Find similar code in Git repositories

Gemini is a tool for searching for similar 'items' in source code repositories. The supported granularity levels for items are:

  • repositories (TBD)
  • files
  • functions

Gemini is based on its sister research project codenamed Apollo.

Run

./hash   <path-to-repos-or-siva-files>
./query  <path-to-file>
./report

You need to prefix commands with docker-compose exec gemini if you run Gemini in Docker. See below for how to start Gemini in Docker or in standalone mode.

Hash

To pre-process a number of repositories for quick duplicate detection, run

./hash ./src/test/resources/siva

Input format of the repositories is the same as in src-d/Engine.

To pre-process repositories for searching for similar functions, run:

./hash -m func ./src/test/resources/siva

Besides the local file system, Gemini supports several distributed storages (see the Distributed storages section below).

Query

To find all duplicates of a single file, run

./query <path-to-single-file>

To find all similar functions defined in a file, run:

./query -m func <path-to-single-file>

If you are interested in similarities for only one function defined in the file, run:

./query -m func <path-to-single-file>:<function name>:<line number where the function is defined>
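
For example, to look for functions similar to a function named parse defined at line 10 of cmd/parser.go (the file path, function name and line number here are only placeholders):

# path, function name and line number below are placeholders
./query -m func ./cmd/parser.go:parse:10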

Report

To find all duplicate files and similar functions in all hashed repositories, run

./report

All repositories must be hashed beforehand, and a community detection library must be installed.

Requirements

Docker

Start containers:

docker-compose up -d

Local directories repositories and query are available as /repositories and /query inside the container.

Examples:

docker-compose exec gemini ./hash /repositories
docker-compose exec gemini ./query /query/consumer.go
docker-compose exec gemini ./report

Standalone

You would need:

  • JVM 1.8
  • Apache Cassandra or ScyllaDB
  • Apache Spark 2.2.x
  • Python 3
  • Bblfshd v2.5.0+

By default, all commands use:

  • an Apache Cassandra or ScyllaDB instance available at localhost:9042
  • Apache Spark, available through $SPARK_HOME

# save some repos in .siva files using Borges
echo -e "https://github.com/src-d/borges.git\nhttps://github.com/erizocosmico/borges.git" > repo-list.txt

# get Borges from https://github.com/src-d/borges/releases
borges pack --loglevel=debug --workers=2 --to=./repos -f repo-list.txt

# start Apache Cassandra
docker run -p 9042:9042 \
  --name cassandra -d rinscy/cassandra:3.11

# or ScyllaDB, with a workaround for https://github.com/gocql/gocql/issues/987
docker run -p 9042:9042 --volume $(pwd)/scylla:/var/lib/scylla \
  --name some-scylla -d scylladb/scylla:2.0.0 \
  --broadcast-address 127.0.0.1 --listen-address 0.0.0.0 --broadcast-rpc-address 127.0.0.1 \
  --memory 2G --smp 1

# to get access to DB for development
docker exec -it some-scylla cqlsh

Configuration for Apache Spark

Use env variables to set the memory for the hash job:

export DRIVER_MEMORY=30g
export EXECUTOR_MEMORY=60g

To use an external cluster, set the URL of the Spark Master through an env var:

MASTER="spark://<spark-master-url>" ./hash <path>
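
Putting the Spark settings together, a hash run against an external cluster might look like the following sketch (the master URL and repository path are placeholders; the memory values are just the ones from the example above):

# master URL and path are placeholders
export DRIVER_MEMORY=30g
export EXECUTOR_MEMORY=60g
MASTER="spark://spark-master.internal:7077" ./hash ./repos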

CLI arguments

All three commands accept parameters for database connection and logging:

  • -h/--host - cassandra/scylla db hostname, default 127.0.0.1
  • -p/--port - cassandra/scylla db port, default 9042
  • -k/--keyspace - cassandra/scylla db keyspace, default hashes
  • -v/--verbose - produce more verbose output, default false
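
For example, a query against a database on another host and a non-default keyspace might look like this (the host and keyspace are placeholders):

# host and keyspace below are placeholders
./query -h 10.0.0.5 -p 9042 -k my_hashes <path-to-file>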

For the query and hash commands, parameters for bblfsh/feature extractor configuration are also available:

  • -m/--mode - similarity modes: file or function, default file
  • --bblfsh-host - babelfish server host, default 127.0.0.1
  • --bblfsh-port - babelfish server port, default 9432
  • --features-extractor-host - features-extractor host, default 127.0.0.1
  • --features-extractor-port - features-extractor port, default 9001
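
For example, a function-mode query against a Babelfish server and feature extractor running on a separate machine might look like this (the host is a placeholder):

# host below is a placeholder
./query -m func --bblfsh-host 10.0.0.7 --features-extractor-host 10.0.0.7 <path-to-file>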

Hash command specific arguments:

  • -l/--limit - limit the number of repositories to be processed. All repositories will be processed by default
  • -f/--format - format of the stored repositories. Supported input data formats that repositories could be stored in are siva, bare or standard, default siva
  • --gcs-keyfile - path to JSON keyfile for authentication in Google Cloud Storage
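
For example, hashing at most 100 repositories stored in bare format might look like this (the path is a placeholder):

# path below is a placeholder
./hash -f bare -l 100 /path/to/bare/repos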

Report specific arguments:

  • --output-format - output format: text or json
  • --cassandra - Enable advanced cql queries for Apache Cassandra database
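
For example, a report in JSON format using the Cassandra-only queries might look like this:

./report --output-format json --cassandra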

Limitations

Currently Gemini targets medium-sized repositories and datasets.

We set reasonable defaults and pre-filtering rules to provide the best results for this case. The rules are:

  • Exclude binary files
  • Exclude empty files from full duplication results
  • Exclude files less than 500B from file-similarity results
  • Similarity deduplication works only for languages supported by Babelfish and only for syntactically correct files

Performance tips

We recommend running Spark with 10 GB+ of memory for each executor and for the driver. Gemini would not benefit from more than 1 CPU per task.

Horizontal scaling does not work well for the first stage of the pipeline; its runtime depends on the size of the biggest repositories in the dataset. The rest of the pipeline scales well.

Distributed storages

Gemini supports different distributed storages in local and cluster mode. All the necessary jars are already included in the fat jar.

HDFS

Path format to git repositories: hdfs://hdfs-namenode/path
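
For example, hashing repositories stored in HDFS might look like this (the namenode and path are placeholders):

# namenode and path below are placeholders
./hash hdfs://hdfs-namenode/path/to/repos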

To configure HDFS in local or cluster mode please consult Hadoop documentation.

Google Cloud Storage

Path format to git repositories: gs://bucket/path

To connect to GCS locally, use the --gcs-keyfile flag with the path to a JSON keyfile.
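
For example, hashing repositories stored in GCS locally might look like this (the bucket, path and keyfile location are placeholders):

# bucket, path and keyfile location below are placeholders
./hash --gcs-keyfile /path/to/keyfile.json gs://my-bucket/repos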

To use GCS in cluster mode please consult Google Cloud Storage Connector documentation.

Amazon Web Services S3

Path format to git repositories: s3a://bucket/path

To connect to S3 locally, use the following flags:

  • --aws-key - AWS access key
  • --aws-secret - AWS access secret
  • --aws-s3-endpoint - region endpoint of your S3 bucket

Due to limitations, passing the key and secret as part of the URI is not supported.
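
For example, hashing repositories stored in S3 locally might look like this (the bucket, region endpoint and credentials are placeholders):

# bucket, endpoint and credentials below are placeholders
./hash --aws-key <access-key> --aws-secret <secret> --aws-s3-endpoint s3.eu-west-1.amazonaws.com s3a://my-bucket/repos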

To use AWS S3 in cluster mode please consult the hadoop-aws documentation.

Known bugs

  • Search for similarities in C# code isn't supported right now (patch with workaround)
  • The timeout for UAST extraction is relatively low for real datasets, in our experience, and it isn't configurable (patch1 and patch2 with workaround)
  • For the standard & bare formats gemini prints a wrong repositories listing (issue)

Development

Compile & Run

If the env var DEV is set, ./sbt is used to compile and run all non-Spark commands: ./hash and ./report. This is convenient for local development: not requiring a separate "compile" step allows for a dev workflow similar to the experience with interpreted languages.
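
For example, assuming any non-empty value of DEV enables the sbt-based path, a quick local run might look like this:

# assumes any non-empty DEV value triggers the sbt-based path
DEV=true ./report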

Build

To build the final .jars for all commands:

./sbt assemblyPackageDependency
./sbt assembly

Instead of one fat jar we build two, separating all the dependencies from the actual application code to allow for lower build times in case of simple changes.

Test

To run the tests:

./sbt test

Re-generate code

The latest generated gRPC code is already checked in under src/main/scala/tech/sourced/featurext. If you update any of the src/main/proto/*.proto files, you will need to re-generate the gRPC code for the Feature Extractors:

./src/main/resources/generate_from_proto.sh

To generate new protobuf message fixtures for tests, you may use bblfsh-sdk-tools:

bblfsh-sdk-tools fixtures -p .proto -l <LANG> <path-to-source-code-file>

License

Copyright (C) 2018 source{d}. This project is licensed under the GNU General Public License v3.0.

gemini's People

Contributors

bzz, carlosms, dpordomingo, marnovo, smacker


gemini's Issues

Report: community detection library

Part of the #54: create an internal library for community detection.

Small internal library in Python (a function or class) for community detection using IGraph, based on the algorithm from graph.py#detect_communities().

It should include some very basic tests (but not really test the IGraph implementation itself).

Release v0.0.1

Setup release automation if needed and produce a first release.

  • tag
  • release notes
  • GH release
  • CI automation on tag
  • push image to DockerHub (handled separately under #33 )

Add URLs to the output of report and query commands

Show links to the files when duplicates are found.

  • Add reference hash to DB.
  • For Github https://github.com/<repo>/blob/<ref_hash>/<file_path>
  • For GitLab, the URL schema is the same as GitHub's
  • For BitBucket, according to the docs: https://bitbucket.org/<repo>/src/<ref_hash>/<file_path>

Improve listRepositories function

It should be able to work with all "formats": siva (regular and buckets), bare, regular.
It should also provide different output depending on the number of repositories.

Report: community detection cli app

Part of #54: small app to read connected components, do community detection and pretty-print the results.

A CLI app in Python that reads the connected components from the Parquet created in #58, does community detection using the library in #59, and prints the output to STDOUT, in a format consistent with the current duplicate output format.

A call to this app should also be included in ./report, right after duplicate detection, so that for an end user the whole process looks like a single application.

It is not mandatory, but it might be a good idea to use some kind of simple templating library for formatting the output, so templates can be shared between the JVM and Python.

Flaky tests: Dataframe containing duplicated files is not properly saved

Problem

Tests sometimes fail with the following output.

How to reproduce

Run ./sbt test many times and it will sometimes fail. It is difficult to reproduce in CI (because it takes more time), but I could see that error there a couple of times.

Using the tip of dpordomingo/gemini::reproduce-issue-22, I created a gist with the 3 scenarios I found:

Test succeeds

  • No data is stored in the test_hashes_duplicates

Test fails

  • No data at all is stored in the test_hashes_duplicates
  • Partial data is stored in the test_hashes_duplicates
    • report3-failing.txt -> output of the failing test
    • report3-failing-db.txt -> content of the testing keyspaces. Only half of the data retrieved by the engine was saved in test_hashes_duplicates (the srcd/borges repo is hashed, but the erizocosmico/borges repo is lost).

More info

It requires more investigation, but it could be related to how data is stored during the tests, before the test cases start running.

Currently, in the beforeAll section, two keyspaces are created (test_hashes_duplicates for tests needing duplicates, test_hashes_uniques for tests needing no duplicates).
This is done in two stages:

  1. engine returns a DataFrame containing all the files in the given siva files,
  2. the DataFrame is stored in Cassandra <-- (I think something fails at this point)

After beforeAll the test cases run, and (sometimes) the ones asserting that there are duplicate files in test_hashes_duplicates fail.

It can be seen in the logs that in all situations the DataFrame is populated with the right contents (1, 2, 3).
But then:

  • when no data is stored at all, at the save stage the output always contains Wrote 0 rows to test_hashes_duplicates.blob_hash_files
  • when half of the data is stored, at the save stage there is one row that says Wrote 47 rows to test_hashes_duplicates.blob_hash_files
  • when everything goes well, at the save stage there are some rows that say Wrote n rows to test_hashes_duplicates.blob_hash_files, and they sum to 80 (33 rows + 47 rows)

Hints

  • Why, when duplicate data is partially saved into test_hashes_duplicates, are the missing files from github.com/erizocosmico/borges.git? (here)
  • Why is the DataFrame always populated with the right data, but sometimes its data is not properly saved?

Report: add similar files

Umbrella issue for updating the current implementation to use "similarity" from Apollo:

  • Query DB, detect connected components, write Parquet in Scala #58
  • Internal library in Python (function/class) for community detection with IGraph #85
  • App in Python, reading Parquet, doing community detection using the library above and printing the output #60

Processing of siva files bigger than 2 GB

There is a limit for a job in Spark: it's 2 GB. We need to investigate how to change it, if possible, and how that will affect Spark (the limit was introduced for some reason).

If somebody else looks at it, here is a tip: it looks like the limit actually comes not from Spark but from the JVM. I could be wrong. JFYI.

Exception:

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 415 in stage 1.0 failed 4 times, most recent failure: Lost task 415.3 in stage 1.0 (TID 1072, 10.2.15.79, executor 8): java.lang.IllegalArgumentException: Size exceeds Integer.MAX_VALUE
	at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:869)
	at tech.sourced.siva.SivaReader.getEntry(SivaReader.java:42)
	at tech.sourced.engine.provider.RepositoryObjectFactory$$anonfun$genSivaRepository$1.apply(RepositoryProvider.scala:209)

Run Gemini file-level duplicate detection on PGA

Document in the README the resources needed to successfully process 1k, 2k, 10k, 100k and the whole PGA of .siva files.

A good start would be to:

  • document the known configuration of the cluster we use internally
  • run Gemini hash, documenting how long it takes to finish,
  • what CPU-load, memory and IO-throughput workload it creates on that cluster (e.g. from the Sysdig dashboard; to get access, file an issue)
  • see which resource is a bottleneck
  • try to optimize, in order to utilize that resource better (e.g. in case of throughput, have more executor JVMs running on the same machine)
  • see if we are hit by, and can help with, some Engine issues

Add Scala linter to CI

TODO:

  • add/document sbt task with linter and code style
  • integrate linter in CI

As also noted in #48 (review):

It would be nice if all of us used the same configuration for code style.

Query: implement actual query logic

Part of the #53

Add query.py#query() logic to Gemini Scala API:

  • no need to handle args.id case
  • need to agree with the ML team on a way to read WMH params like htnum and band_size, which must be the same as those used for hashing
  • use a DB from an Apollo hash run
  • use the vocabulary in .asdf (OrderedDocumentFrequencies), built by Apollo hash with --docfreq

Update query integration test

Part of the #53

We have one that does hash and then looks for duplicates. We need to update it to test similarity too.

I suggest such a plan:

  • Amend current test dataset to include similar file(s)
  • Populate the DB table hashtables with a fixture created using Apollo with that dataset
  • Use docFreq & params from Apollo as well
  • Run bblfsh server & fe server
  • Check query output and make sure both duplicates and similar files appear in the output

Improve docker image management in CI

We have scripts that launch Docker images, used by Travis CI. To skip them we look for the STYLE_CHECK env var, but that is no longer the only case where we should skip them.

if [[ -z "${STYLE_CHECK}" ]]; then

From this thread #83 (comment)

@carlosms

Maybe we should find a better way to skip docker images? For instance we will have at some point lint for python, scala, and other possible new tests that need a different env.
Would it work if in travis.yml we moved the before_script: inside each matrix entry? It would be more verbose, but easier to spot where we are launching docker.
I'm not sure about the best option, but it's something worth looking into.

@smacker

The current check is copy-pasted from a script we already use. But I agree we need something better. Maybe also use docker-compose, as proposed by smola.

And from this other thread #83 (comment)

@smola

I understand this is not strictly related to this PR, since you were already doing this previously with scripts/start_docker_db.sh, but you might consider using docker-compose for this in the future.

It can be easily pre-installed in Travis (see here) and it's a single file with more concise syntax to define Docker dependencies. You can choose to start all of them with up, or a specific one with up bblfsh.

@smacker

Yes. We are going to have 3 Docker images (db, bblfsh, feature extraction) as dependencies for gemini. It makes perfect sense to use docker-compose in the future. (Actually, I already do locally.)

FE: feature extractors

This is an umbrella issue for Feature Extractors (FE) implementation:

  • Define gRPC Service, messages in src/main/proto/*.proto #52
  • Generation scripts: server in python src/main/python/generated.py, #57
    Python CLI app src/main/python/main.py configures the port
  • Generation scripts: client in scala src/main/scala #63
  • Docker file with feature extractor (used on CI) #73
  • Implement FE service: call sourced.ml.extractors #73
  • Add missing weight parameter to extractors #79
  • Integration Tests: add CI profile, that for a given UAST checks the response is not failing #81

Perf: measure performance of file-based similarity

Umbrella issue for adding perf measurements to every command for file-based similarity:

  • Add a CLI flag for ./query and ./report to print the time for each stage
  • Instrument FE, exposing one endpoint with JSON
  • Instrument Apache Spark hashing job using org.apache.spark.groupon.metrics.UserMetricsSystem to expose to Spark JSON endpoint

Query: add similar files

Umbrella issue to update the current implementation to use the "similarity" notion from Apollo:

  • UAST extraction #68
  • talk to FE gRPC, to extract features #69
  • Find/write CPU WMH implementation in Scala (tests!) #78
  • Write query.py#query() logic on Scala side #89
  • Update ./query CLI output to include similar files #92
  • Integration test with all components (except hash, use cql fixture for it)

Make all commands apply schema (if not exist)

Right now, only the tests do not rely on an existing schema in the DB.

Each command (hash, query, report) should be changed to do the same and create the schema, instead of failing if it does not exist.

Implement report command

This command will output the duplicate files among all the hashed files:

  • <file_path> | <repo> | <sha1>
  • show the <sha1> only when the command is run with --verbose mode
  • add a link to the file in GH (based on HEAD) -> https://github.com/<repo>/blob/HEAD/<file_path>

Add Docker compose for dependencies

To streamline first user experience:

  • add simple Docker compose script for dependencies: DB and FE - it will enable part of #84
  • make CI use it (so only appropriate containers are started - as suggested in #91)

Simplify arguments for report command

Currently we have 2 flags: group-by and condensed. Both of them require Cassandra and can't work with ScyllaDB. We think the condensed flag is quite confusing, so we propose keeping only the group-by flag.

Integration Tests on CI in Spark Standalone cluster mode

Right now we have a profile on CI that does Integration Tests with Apache Spark in local mode.

In order to catch trickier issues, e.g. with runtime classpath collisions, we also need to test in Apache Spark Standalone cluster mode.

TODOs:

  • add a new profile on CI with INTEGRATION_TESTS=true
  • same test scenario as in local mode, except:
  • start Apache Spark Master and Workers manually
  • run Gemini using the above with MASTER="spark://127.0.0.1:7077" ./hash ...

Release: update release artifacts

Right now, Gemini has 3 use cases:

  1. local, for a developer on a pre-configured environment, using shell scripts
  2. local, for a first-time user, #84 (comment)
  3. k8s with an Apache Spark cluster, Dockerfiles: Gemini with Spark, Feature Extractors

The current release artifacts are only a .tar file for case 1 and a Dockerfile.

This issue is about changing the release process to accommodate recent changes on file-level similarity so:

  • add .sh for starting a feature extractor process on local machine
  • in ./report, check if Python is available
  • make sure Dockerfile for FE re-uses the shell scripts (as much as possible)
  • add publishing Docker container for FE to the release process
    (this will enable testing Gemini on 3rd use-case)

This way, a new release should accommodate both cases 1 and 3 from above.

Release: include LICENSE file in the release artifacts

As discussed in #63 (comment), this project is governed by GPL3 but includes a few (well identified .proto) files under BSD-like licenses, thus we need to make sure we keep the copyright notices/disclaimers and redistribute them as a part of the final released artifacts.

To do so, we just need to:

Query: CPU WMH implementation in Scala

Part of #53 and #55

For both querying and hashing similar files we need to have a CPU-based WeightedMinHash implementation on the JVM, to avoid depending on the MinHashCUDA lib and a GPU.

Here are a few reference implementations of this algorithm:

It might make sense to research whether such a library already exists in the JVM ecosystem, and if not, to implement one in Java as part of the Gemini codebase. This way it can eventually be used from Java, Scala, Clojure, etc. as a standalone library.

Correctness verification is of paramount importance, so we would need to have some tests, and maybe some reference data with hashes of some sets, produced by one of the implementations above.

Hash: add similar files

Umbrella issue for adding hashing similar files using Apache Spark:

  • UAST, feature extraction (same args as in query)
  • Tokenize, Vectorize: wBOW, tf/idf - defining a pipeline
    • generate docFreq.json and params.json
    • vectorize: file->set
  • Use CPU-based WMH implementation in Scala from #67 to hash every set
  • Take the results of the hash pipeline, create hash tables, write to ScyllaDB with the schema as in Apollo #111
  • Test correctness: Apollo and Gemini produce the same results (use the same seed if needed)

Document easy way and hard way to run Gemini

After #95, update the README structure, add

This change can be small: it does not need to be "complete" in documentation coverage or in full conformance with the doc guide from above, but it has to set up the right "structure" of our user-facing documentation, which will be improved later.

Fix WARN on ./report

Right now, after #90 (comment), the ./report command (at least on macOS) produces

WARN 11:57:50 org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner (FileSystem.java:2995) - exception in the cleaner thread but it will continue to run
java.lang.InterruptedException
	at java.lang.Object.wait(Native Method)
	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
	at org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner.run(FileSystem.java:2989)
	at java.lang.Thread.run(Thread.java:748)

It may or may not be related to HADOOP-12829; the goal is to find out and fix it.

Update output of scala applications

For query and report tools:

  • add -v flag, producing more verbose debug output
  • replace println() with proper slf4j logging (same as Apache Spark uses) on different levels INFO/DEBUG

The Spark application logging will be handled in a different issue.

CI: fix by switching to Docker instead of EmbeddedCassandra

Right now the tests use EmbeddedCassandra to speed up CI without Docker.

But a proper INTEGRATION_TESTS=true profile would not be possible without Docker in before_script anyway, so it would make sense to just rely on it in both dev and CI flows (as we do in other projects).

Flaky tests: org.eclipse.jgit.errors.MissingObjectException

On local ./sbt runs as well as on CI, tests sometimes become flaky with org.eclipse.jgit.errors.MissingObjectException.

Example:

ERROR 15:13:13,185 org.apache.spark.internal.Logging$class (Logging.scala:91) - Exception in task 0.0 in stage 1.0 (TID 2)
org.eclipse.jgit.errors.MissingObjectException: Missing commit 4aa29ac236c55ebbfbef149fef7054d25832717f
	at org.eclipse.jgit.internal.storage.file.WindowCursor.open(WindowCursor.java:164)
	at org.eclipse.jgit.revwalk.RevWalk.getCachedBytes(RevWalk.java:903)

CI

This might or might not have something to do with either Engine behaviour or with the test fixtures that we have.

This issue is about investigating the reason and if needed, filing appropriate issues elsewhere to fix it.

Update output of Spark application

For ./hash:

  • migrate to the logger provided by Apache Spark
  • output: mute meaningless Spark logs (through log4j.config)

This would most probably mean refactoring the Gemini class to have the logging dependency injected (through the constructor or a .set...() method), as it will be instantiated by the clients (Spark and non-Spark ones) in different ways.

duplicates in repository

When we have identical files in a repository, only one is written to the DB, and the duplicates won't appear in the report because the primary key is the same for them.

Steps to reproduce:

smacker at Maxims-MacBook-Air in ~/tmp/testrepo on master*
$ ls
CONTRIBUTING.md file.py         file_2.py

file_2.py is a copy of file.py.

Run engine to collect files:

+--------------------+--------------------+---------------+--------------------+----+
|         commit_hash|           file_hash|           path|       repository_id|name|
+--------------------+--------------------+---------------+--------------------+----+
|06e561f1a7d6db4f3...|c4e5bcc8001f80acc...|      file_2.py|file:///Users/sma...|HEAD|
|06e561f1a7d6db4f3...|eaf26a547aa54cde7...|CONTRIBUTING.md|file:///Users/sma...|HEAD|
|06e561f1a7d6db4f3...|c4e5bcc8001f80acc...|        file.py|file:///Users/sma...|HEAD|
+--------------------+--------------------+---------------+--------------------+----+

Check what we have in DB after hash:

cqlsh:hashes> select * from blob_hash_files;

 blob_hash                                | repo                               | file_path
------------------------------------------+------------------------------------+-----------------
 eaf26a547aa54cde7079567d832ac05880eb6bd2 | file:///Users/smacker/tmp/testrepo | CONTRIBUTING.md
 c4e5bcc8001f80acc238877174130845c5c39aa3 | file:///Users/smacker/tmp/testrepo |       file_2.py

(2 rows)

Speedup CI

The longest task is the integration one. Most of the time is taken by:

  • installing Python. We can improve it a bit by installing some deps from apt instead of building them
  • Docker for FE. We can remove it now that we have Python in CI
  • the build. We can try to use the Travis cache to improve it

Also check that we don't run services when they aren't necessary.
