tracemachina / nativelink

NativeLink is an open source high-performance build cache and remote execution server, compatible with Bazel, Buck2, Reclient, and other RBE-compatible build systems. It offers drastically faster builds, reduced test flakiness, and specialized hardware.

Home Page: https://nativelink.com

License: Apache License 2.0

Starlark 4.13% Rust 90.46% Python 0.30% Shell 0.97% Dockerfile 0.16% Nix 2.17% C++ 0.01% Go 1.80%
apache2 bazel buck2 build-automation build-system chromium ci content-addressable-storage free nix re-client remote-execution rust simulation

nativelink's People

Contributors

aaronmondal, adam-singer, aleksdmladenovic, allada, bclark8923, blakehatch, blizzardc0der, caass, chrisstaite, chrisstaite-menlo, cormacrelf, dependabot[bot], dolcetriade, eltociear, froody, harper-carroll, ibilalkayy, jaroeichler, jhpratt, krishmoran, marcussorealheis, matdexir, mhz5, nfarah86, renovate[bot], schahinrohani, steedmicro, triplekai, tyr-one, zbirenbaum


nativelink's Issues

Wrapping GrpcStore with FastSlowStore failure for AC

Now that GrpcStore supports forwarding to an upstream AcServer, I attempted wrapping it in a FastSlowStore to make a local cache and am getting the following error:

[2023-07-17T08:06:26.568Z ERROR ac_server] get_action_result Resp: 0.14148201 Some("0f3c5b706e843dc5323f8a21ed8ea59bd8d55c53310abf8b11471f0e1e95924c") Err(Error { code: Internal, messages: ["Action result not found", "Failed to get_part in get_part_unchunked", "---", "Writer was dropped before EOF was sent", "Failed to recv first chunk in collect_all_with_size_hint", "Failed to read stream to completion in get_part_unchunked"] })

This is obviously not as expected, so something is wrong with the implementation. I haven't dug into what's causing it yet.

benchmarks?

This isn't so much an issue, but I'm curious whether any benchmarks were run against any of the other popular Bazel remote cache implementations.

AC context lost in store trait

When forwarding to an upstream AC store via a GrpcStore, it uses the AcServer. However, if you wrap it in a fast_slow store (for example, to have a local cache), it loses the context and queries the CAS instead of the AC.

Inefficient upload of stderr/stdout for workers

Currently we upload stderr and stdout, wait for them to finish uploading, and only then start uploading the output files.

This is silly because just below we have a FuturesUnordered that we could add the futures to instead and upload them in parallel with the files.

Because these uploads currently run serially, this fix has a high chance of dramatically reducing upload time for fast-running tasks.

Offending code:
https://github.com/allada/turbo-cache/blob/master/cas/worker/running_actions_manager.rs#L628
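As a sketch of the suggested change (with a hypothetical upload_blob helper, not the actual running_actions_manager API), the stdout/stderr uploads could be pushed into the same FuturesUnordered as the file uploads so everything runs concurrently:

use futures::stream::{FuturesUnordered, StreamExt};

// `upload_blob` is a hypothetical stand-in for the real upload routines.
async fn upload_blob(_name: &'static str, _data: Vec<u8>) -> Result<(), String> {
    Ok(())
}

async fn upload_outputs(
    stdout: Vec<u8>,
    stderr: Vec<u8>,
    output_files: Vec<Vec<u8>>,
) -> Result<(), String> {
    // Push stdout/stderr into the same FuturesUnordered as the output files
    // so everything uploads concurrently instead of serially.
    let mut uploads = FuturesUnordered::new();
    uploads.push(upload_blob("stdout", stdout));
    uploads.push(upload_blob("stderr", stderr));
    for file in output_files {
        uploads.push(upload_blob("output_file", file));
    }
    // Drive all uploads to completion, failing fast on the first error.
    while let Some(result) = uploads.next().await {
        result?;
    }
    Ok(())
}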

EvictingMap doesn't remove the oldest entries

I was just looking into how EvictingMap determines which is the least recently used entry, and it appears the ordering is handled entirely by the lru::LruCache implementation: entries are not promoted when they are touch()'d, and they are not placed in the correct order when insert_with_time is used.
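For reference, here is a minimal demonstration of lru::LruCache's promotion semantics (assuming a recent version of the lru crate): get promotes an entry to most-recently-used while peek does not, so a touch() that only peeks will not affect eviction order.

use lru::LruCache;
use std::num::NonZeroUsize;

fn main() {
    let mut cache = LruCache::new(NonZeroUsize::new(2).unwrap());
    cache.put("a", 1);
    cache.put("b", 2);

    // `peek` does NOT promote "a"; it remains the least recently used entry.
    let _ = cache.peek("a");
    cache.put("c", 3); // evicts "a"
    assert!(cache.peek("a").is_none());

    // `get` DOES promote, so the other entry gets evicted instead.
    cache.put("a", 1);      // evicts "b"; cache now holds {"c", "a"}
    let _ = cache.get("c"); // promote "c" to most recently used
    cache.put("d", 4);      // evicts "a", not "c"
    assert!(cache.peek("c").is_some());
}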

Stop vendoring protos

Since rules_rust 0.25 introduced prost/tonic rules, it's now theoretically possible to remove the entire proto directory and trivially generate the current proto target with this:

rust_tonic_library(
    name = "remote_execution",
    proto = "@remote-apis//build/bazel/remote/execution/v2:remote_execution_proto",
)

Modulo some toolchain setup/configuration and adjustments to bring turbo-cache in sync with the upstream proto, this will net us a removal of over 6k LoC. Fantastic 😍

There seems to be one bug remaining in 0.25 that prevents us from implementing this and we'll probably have to wait for 0.25.1 or 0.26, but we're getting close:

Hard linking in `download_to_directory` in `running_actions_manager.rs` does not give guarantees

The linking logic here:
https://github.com/allada/turbo-cache/blob/e172756613b5398f1ccdaaf258f3f7b80ac4b08e/cas/worker/running_actions_manager.rs#L101

does not properly hold the FileEntry object, which is needed to guarantee that the file does not get deleted (among other guarantees). Properly holding a reference to the FileEntry should also fix the bug outlined here: https://github.com/allada/turbo-cache/blob/e172756613b5398f1ccdaaf258f3f7b80ac4b08e/cas/store/filesystem_store.rs#L411
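A rough sketch of the intended pattern (hypothetical types, not NativeLink's actual FileEntry API): keep a strong reference to the store entry alive while, and after, the hard link is created, so a concurrent eviction cannot remove the backing file underneath the action.

use std::path::{Path, PathBuf};
use std::sync::Arc;

// Hypothetical stand-in for the store's FileEntry, whose Drop would normally
// delete the on-disk file; holding the Arc defers that Drop.
struct FileEntryGuard {
    backing_path: PathBuf,
}

fn hard_link_entry(
    entry: Arc<FileEntryGuard>,
    dest: &Path,
) -> std::io::Result<Arc<FileEntryGuard>> {
    std::fs::hard_link(&entry.backing_path, dest)?;
    // Hand the guard back so the caller keeps it alive alongside the new link.
    Ok(entry)
}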

Create test harness for Grpc*

Currently we have no tests for GrpcScheduler or GrpcStore. This has resulted in at least one bug (see #199) and an outstanding ticket to create tests (#154).

In order to test this a gRPC service needs to be spun up to test against. This is non-trivial.

This issue exists to track the effort to create a framework that can create a gRPC service to write tests against.
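As a rough sketch of the harness shape (using tonic's health service purely as a placeholder backend, with tokio, tonic, tonic-health, and tokio-stream assumed as dev-dependencies; the real harness would mount mock CAS/AC/Execution services instead):

use tokio_stream::wrappers::TcpListenerStream;
use tonic::transport::{Channel, Server};

#[tokio::test]
async fn can_spin_up_in_process_grpc_service() -> Result<(), Box<dyn std::error::Error>> {
    // Bind to an OS-assigned port so tests can run in parallel.
    let listener = tokio::net::TcpListener::bind("127.0.0.1:0").await?;
    let addr = listener.local_addr()?;

    // Placeholder service; a real harness would register mock RBE services here.
    let (_health_reporter, health_service) = tonic_health::server::health_reporter();
    tokio::spawn(
        Server::builder()
            .add_service(health_service)
            .serve_with_incoming(TcpListenerStream::new(listener)),
    );

    // The code under test (e.g. GrpcStore/GrpcScheduler) only needs this channel.
    let _channel = Channel::from_shared(format!("http://{addr}"))?
        .connect()
        .await?;
    Ok(())
}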

Add AWS-native k8s Deployment

I'd like to run turbo-cache in a k8s cluster deployed with Pulumi so that we can automatically set it up for users as part of rules_ll. Simple yaml-manifests would be usable for users of raw k8s, Terraform and Pulumi.

I'd be willing to work on this ☺️

Publish more prometheus stats throughout system

Now that Prometheus is added and the API is established, we need to spread the usage around the system.

  • We can't consider our Prometheus publishing stable until prometheus/client_rust#155 is fixed.
  • GrpcServices
    • Implement for AcServer
    • Implement for BytestreamServer
    • Implement for CapabilitiesServer
    • Implement for CasServer
    • Implement for ExecutionServer
    • Implement for WorkerApiServer
  • Schedulers
    • Implement for PlatformPropertiesManager
    • Implement for CacheLookupScheduler
    • Implement for GrpcScheduler
    • #216 Implement for SimpleScheduler
    • #216 Implement for scheduler::Worker
  • Stores
    • Implement for CompressionStore
    • Implement for DedupStore
    • #263 Implement for FastSlowStore
    • #202 Implement for FilesystemStore
    • Implement for GrpcStore
    • #207 Implement for MemoryStore
    • Implement for S3Store
    • Implement for SizePartitioningStore
    • #208 Implement for VerifyStore
  • Workers
    • #213 Implement for LocalWorker
    • #213 Implement for RunningActionsManager
    • Implement for WorkerApiClientWrapper
  • Global State
    • #230 How many active GRPC connections there are
    • Total GRPC connections since server started
    • Time server started (timestamp)
    • Global config setting used
    • Maybe internal tokio metrics. This might require unstable tokio stuff 😦

Add official nix package

As the project matures and gets closer to an actual release version we should start considering packaging options. One such option (that I'm most interested in 😆) is the nix package repository.

@allada Plz ping me when we're getting close to a release and I can prepare a nixpkgs release for turbo-cache. Apart from maybe a CI workflow this probably doesn't require a PR to this repo, just one to nixpkgs.

Add prometheus logging

We need to gather stats on what is happening and when. Prometheus is a great choice for this and there's pretty good rust support.
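As a minimal sketch of what that could look like with the prometheus-client crate mentioned above (recent versions; the metric name here is illustrative, not the actual wiring):

use prometheus_client::encoding::text::encode;
use prometheus_client::metrics::counter::Counter;
use prometheus_client::registry::Registry;

fn main() {
    let mut registry = Registry::default();
    let requests: Counter = Counter::default();
    registry.register(
        "cas_requests",
        "Number of CAS requests handled",
        requests.clone(),
    );

    requests.inc();

    // Render the registry in the Prometheus text exposition format,
    // e.g. to serve from a /metrics endpoint.
    let mut body = String::new();
    encode(&mut body, &registry).unwrap();
    println!("{body}");
}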

Workers that spawn child processeses may cause zombies

In fixing another bug I discovered that tokio's kill command does not kill the entire child process tree; it only kills the immediate child. This can cause zombies if the child spawns further child processes and is then killed.

This is going to be very tricky to write tests for because Bazel uses a sandbox internally, so we may need to break out of the sandbox in order to call setsid() to create a process group.

This library does work, but when I looked at its implementation I wasn't sure we should use it, because it uses spawn_blocking when waiting on processes. This might cause all our threads to block on high-CPU machines if we are not careful.
https://docs.rs/command-group/latest/command_group/index.html

It is very common for people to use the entrypoint_cmd config to wrap their program in a shell script that runs it under Docker, which makes this problem much less of an issue.
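A Unix-only sketch of the setsid + process-group approach described above (using std::process for brevity; tokio's Command exposes the same pre_exec hook):

use std::os::unix::process::CommandExt;
use std::process::{Child, Command};

fn spawn_in_own_group(mut cmd: Command) -> std::io::Result<Child> {
    // Put the child in its own session/process group so the whole tree can be
    // signalled later, not just the immediate child.
    unsafe {
        cmd.pre_exec(|| {
            if libc::setsid() == -1 {
                return Err(std::io::Error::last_os_error());
            }
            Ok(())
        });
    }
    cmd.spawn()
}

fn kill_process_group(child: &Child) {
    // A negative PID addresses the whole process group created by setsid().
    unsafe {
        libc::kill(-(child.id() as i32), libc::SIGKILL);
    }
}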

Support building with only Cargo

To make the project more accessible, we should support building with both Cargo and Bazel.

The major reason for this is that we should support Windows, but in my experience it's quite difficult to get Bazel working on Windows, so I'd like to just use Cargo in that case.

Build stall when CAS reset

I had everything stall during a full Chromium build. It all came back to life after a minute or so, but the cause appears to be this:

[2023-09-07T07:59:41.326Z WARN  h2::proto::streams::recv] recv_reset; remotely-reset pending-accept streams reached limit (20)
[2023-09-07T07:59:41.326Z ERROR cas] Failed running service : hyper::Error(Http2, Error { kind: GoAway(b"", ENHANCE_YOUR_CALM, Library) })

I'm not sure why this happened and it's the first time I've seen it.

Implement blake3

Sha256 is known to be quite slow. There's a new kid in town, Blake3. It's crazy fast, super secure and just better overall.

The Remote Execution API already supports it:
https://github.com/bazelbuild/remote-apis/blob/39c174e10d224c46b556d8d4615863804d5b2ff6/build/bazel/remote/execution/v2/remote_execution.proto#L1900

Bazel appears to be in the process of supporting it:
bazelbuild/bazel#18658

Micro-bench testing shows it is worth the effort:
https://gist.github.com/allada/6b4321a6487c2888ff73ce1cc0fc86ed

All results are on a 16-core i9 (single-threaded):
1GB @ 10:

sha256: 35.546221869s
blake3: 2.346503712s

abs difference: 33.199718157s
% difference r: 1514.86%
% difference i: 6.60%

1MB @ 10_000:

sha256: 34.653737424s
blake3: 2.129524155s

abs difference: 32.524213269s
% difference r: 1627.30%
% difference i: 6.15%

1KB @ 1_000_000:

sha256: 3.714629289s
blake3: 725.049013ms

abs difference: 2.989580276s
% difference r: 512.33%
% difference i: 19.52%

100B @ 1_000_000:

sha256: 453.309362ms
blake3: 113.922135ms

abs difference: 339.387227ms
% difference r: 397.91%
% difference i: 25.13%

This is a significant difference, and when things are under high workload it is often because we are spending so much time hashing. This has a very high chance of dramatically improving the performance of the CAS stores.
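For anyone wanting a rough local comparison, here is a minimal sketch using the sha2 and blake3 crates (the real stores hash streamed chunks rather than one in-memory buffer):

use sha2::{Digest, Sha256};
use std::time::Instant;

fn main() {
    // 64 MiB of zeros as a stand-in input.
    let data = vec![0u8; 64 * 1024 * 1024];

    let start = Instant::now();
    let sha_digest = Sha256::digest(&data);
    println!("sha256 ({} byte digest): {:?}", sha_digest.as_slice().len(), start.elapsed());

    let start = Instant::now();
    let b3_digest = blake3::hash(&data);
    println!("blake3 ({} byte digest): {:?}", b3_digest.as_bytes().len(), start.elapsed());
}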

Workers do not honor timeout

Currently we do not honor the timeout field in the proto for actions. We also need a "max job time" type setting and should force-kill jobs if they go over this limit regardless of what the proto requests.

This should be trivial to implement.
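The enforcement shape could look roughly like this (tokio-based sketch; max_job_time and the spawned command are placeholders):

use std::time::Duration;
use tokio::process::Command;

async fn run_with_timeout(
    requested: Option<Duration>,
    max_job_time: Duration,
) -> std::io::Result<()> {
    // Respect the action's requested timeout but clamp it to the worker limit.
    let limit = requested.unwrap_or(max_job_time).min(max_job_time);

    let mut child = Command::new("sleep").arg("3600").spawn()?;
    match tokio::time::timeout(limit, child.wait()).await {
        Ok(status) => {
            println!("finished: {:?}", status?);
        }
        Err(_elapsed) => {
            // Timed out: force-kill regardless of what the proto requested.
            child.kill().await?;
        }
    }
    Ok(())
}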

Use `crate_universe` instead of `cargo_raze`

cargo_raze seems to be mostly abandoned in favor of crate_universe.

Moving from cargo_raze to crate_universe means essentially just changing build files like this:

"//third-party:prost_types" -> "@crate_index//:prost-types"

The entire third_party directory is then superseded by a single Cargo.Bazel.lock file.

There are two ways crate_universe can be used:

  1. Track dependencies in the WORKSPACE directly and delete Cargo.toml, i.e. something like
    crates_repository(
        ...
        packages = {
            "somecrate": crate.spec(version = "1.2.3")
        },
    )
  2. Keep the Cargo.toml and generate dependencies from that, i.e. something like
    crates_repository(
        ...
        manifests = ["@//:Cargo.toml"],
    )

@allada I already have an implementation of option 1 but I need to update it to the recently changed deps. I'll send a PR when it's ready ❤️

If everything runs on the same host (worker, scheduler, CAS, etc.) it is possible to deadlock

Buck2 hammers the remote execution as hard as it can (which is a good thing). In my testing I was running everything in the same process (which is not how it should be done in production); this caused the maximum number of open files to be reached and then everything deadlocked, because every thread was waiting for another thread to release a file.

This only happens when you are reading and writing from one file to another (i.e. CAS(file) -> worker(file)).
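One possible mitigation, sketched under the assumption of a process-wide limit on simultaneously open files (the limit value and helper names are illustrative):

use std::sync::Arc;
use tokio::sync::{OwnedSemaphorePermit, Semaphore};

const OPEN_FILE_LIMIT: usize = 512;

fn new_open_file_permits() -> Arc<Semaphore> {
    Arc::new(Semaphore::new(OPEN_FILE_LIMIT))
}

async fn open_bounded(
    permits: Arc<Semaphore>,
    path: &str,
) -> std::io::Result<(OwnedSemaphorePermit, tokio::fs::File)> {
    // Wait for a permit before opening; the permit is released when it is
    // dropped, i.e. once the caller is done with the file.
    let permit = permits.acquire_owned().await.expect("semaphore closed");
    let file = tokio::fs::File::open(path).await?;
    Ok((permit, file))
}

A file-to-file copy path would need to acquire both of its permits up front, otherwise the circular wait the issue describes can still occur.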

Unable to build on Ubuntu 20.04

The build now fails for Ubuntu 20.04:

gcc: error: unrecognized command line option '-std=c++20'; did you mean '-std=c++2a'?
gcc: error: unrecognized command line option '-std=c++20'; did you mean '-std=c++2a'?

I also still have a requirement to build on 18.04.

This was broken in 6a72841.

Remote execution is not supported by the remote server, or the current account is not authorized to use remote execution

Looks like a cool project, so I tried to follow the TL;DR, but got this error:

Remote execution is not supported by the remote server, or the current account is not authorized to use remote execution

The "docker-compose up" seemed be build great and get the containers running.

das@das-T14s-g1:~/temp$ sudo docker ps
[sudo] password for das: 
CONTAINER ID   IMAGE                       COMMAND                  CREATED       STATUS          PORTS                                                      NAMES
0c9ef89096f2   allada/turbo-cache:latest   "turbo-cache /root/w…"   2 hours ago   Up 53 minutes   50051-50052/tcp                                            docker-compose_turbo_cache_executor_1
375d537172ef   allada/turbo-cache:latest   "turbo-cache /root/s…"   2 hours ago   Up 53 minutes   50051/tcp, 0.0.0.0:50052->50052/tcp, :::50052->50052/tcp   docker-compose_turbo_cache_scheduler_1
27cdf217dd15   allada/turbo-cache:latest   "turbo-cache /root/l…"   2 hours ago   Up 53 minutes   0.0.0.0:50051->50051/tcp, :::50051->50051/tcp, 50052/tcp   docker-compose_turbo_cache_local_cas_1

Then tried the test example https://github.com/allada/turbo-cache/#tldr

das@das-T14s-g1:~/Downloads/turbo-cache$ bazelisk test //...   --remote_instance_name=main   --remote_cache=grpc://127.0.0.1:50051   --remote_executor=grpc://127.0.0.1:50051
2023/08/30 17:35:52 Downloading https://releases.bazel.build/6.2.1/release/bazel-6.2.1-linux-x86_64...
Extracting Bazel installation...
Starting local Bazel server and connecting to it...
INFO: Invocation ID: d46b187b-1d91-477e-8f72-7d814b681684
ERROR: Remote execution is not supported by the remote server, or the current account is not authorized to use remote execution.

This is likely a beginner error, so sorry about that.

Thanks in advance

Add rate limiting to GrpcStore

When on-boarding an upstream Goma proxy, every file must be uploaded to populate the Redis cache. However, this is causing concurrency errors in the GrpcStore:

[2023-07-17T07:52:43.032Z ERROR cas_server] Error during .has() call in .find_missing_blobs() : Error { code: Internal, messages: ["status: Internal, message: \"h2 protocol error: http2 error: connection error received: unspecific protocol error detected (b\\\"[p]req HEADERS: max concurrency reached\\\")\", details: [], metadata: MetadataMap { headers: {} }", "in GrpcStore::find_missing_blobs"] } - ff2dde80d3e78f42b128dcf6b4fe7b1173908e4276a4ef9ac3819c09f668bbb2

There should be a back-off for handling this situation.
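A simple retry-with-backoff wrapper could look like this (sketch only; the retryable status codes and delays are illustrative):

use std::time::Duration;

async fn with_backoff<T, F, Fut>(mut call: F) -> Result<T, tonic::Status>
where
    F: FnMut() -> Fut,
    Fut: std::future::Future<Output = Result<T, tonic::Status>>,
{
    let mut delay = Duration::from_millis(50);
    for _ in 0..5 {
        match call().await {
            Ok(value) => return Ok(value),
            // Treat transport-level concurrency errors as retryable.
            Err(status)
                if status.code() == tonic::Code::Internal
                    || status.code() == tonic::Code::ResourceExhausted =>
            {
                tokio::time::sleep(delay).await;
                delay *= 2;
            }
            Err(status) => return Err(status),
        }
    }
    // Final attempt after the retries are exhausted.
    call().await
}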

build failed on mac

Hi,

I'm trying to build turbo-cache on Mac. Bazel build returns this error message:

error[E0432]: unresolved import libc
--> external/raze__tempfile__3_3_0/src/file/imp/unix.rs:19:5
|
19 | use libc::{c_char, c_int, link, rename, unlink};
| ^^^^ use of undeclared crate or module libc

error: aborting due to previous error

[idea] Amount of magic for publishing metrics

Looking for feedback on how automatic metrics publishing should be.

Right now I have two approaches: a procedural but verbose one, and a magical macro one.

Here's the syntax for the two:

// FilesystemStore example.
impl<Fe: FileEntry> MetricsComponent for FilesystemStore<Fe> {
    fn gather_metrics(&self, c: &mut CollectorState) {
        c.publish(
            "read_buff_size",
            self.read_buffer_size,
            "Size of the configured read buffer size",
        );
        c.publish(
            "active_drop_spawns",
            &self.shared_context.active_drop_spawns,
            "Number of active drop spawns",
        );
        c.publish(
            "temp_path",
            &self.shared_context.temp_path,
            "Path to the configured temp path",
        );
        c.publish(
            "content_path",
            &self.shared_context.content_path,
            "Path to the configured content path",
        );
        c.publish("evicting_map", &self.evicting_map, "");
    }
}

// VerifyStore example.
impl MetricsComponent for VerifyStore {
    fn gather_metrics(&self, c: &mut CollectorState) {
        c.publish(
            "verify_size",
            self.verify_size,
            "If the verification store is verifying the size of the data",
        );
        c.publish(
            "verify_hash",
            self.verify_hash,
            "If the verification store is verifying the hash of the data",
        );
        c.publish(
            "size_verification_failures",
            &self.size_verification_failures,
            "Number of failures the verification store had due to size mismatches",
        );
        c.publish(
            "hash_verification_failures",
            &self.hash_verification_failures,
            "Number of failures the verification store had due to hash mismatches",
        );
    }
}

// MemoryStore example.
impl MetricsComponent for MemoryStore {
    fn gather_metrics(&self, c: &mut CollectorState) {
        c.publish("evicting_map", &self.evicting_map, "");
    }
}

The macro version would look like this:

// FilesystemStore example.
publish_metrics! {
    FilesystemStore<Fe> {
        evicting_map,
        read_buff_size "Size of the configured read buffer size" Bytes,
        shared_context {
            active_drop_spawns "Number of active drop spawns",
            temp_path "Path to the configured temp path",
            content_path "Path to the configured content path",
        }
    }
}

// VerifyStore example.
publish_metrics! {
    VerifyStore {
        verify_size "If the verification store is verifying the size of the data",
        verify_hash "If the verification store is verifying the hash of the data",
        size_verification_failures "Number of failures the verification store had due to size mismatches",
        hash_verification_failures "Number of failures the verification store had due to hash mismatches",
    }
}

// MemoryStore example.
publish_metrics! {
    MemoryStore {
        evicting_map,
    }
}

These two examples would do 100% identical things, except that the macro one would also make it easy to denote the type (which is a bit tricky to do with the procedural one).

Thoughts?

Let's get goma going

I'm trying to get turbo-cache working with goma. We're running a 512-core goma + buildbarn cluster with about a dozen builds per day. I got somewhere by just replacing buildbarn's CAS with turbo-cache, but would love to try running the whole thing on tc.

I've got a turbo-cache scheduler, cas and worker running but currently stuck on this error from goma:

exec call: error in check missing blobs: rpc error: code = Unimplemented desc = missing blobs: rpc error: code = Unimplemented

EDIT: got it to work! lmk if you want me to test with goma + chromium + lots of cores!

Enable clippy during tests

A draft of this is at #152.

My current migration plan:

  • #158 Change impls of Into to From since the latter gives the former for free
  • #163 Implement is_empty for LenEntry and all its impls

Add remaining fixes separately:

Finally:

Additional issues/questions I encountered so far:

  • Clippy doesn't like the highest_priority_action_first and equal_priority_earliest_first tests in cas/scheduler/tests/action_messages_test.rs. I'm not sure what these tests are trying to test. Is this about Ord? Would testing something like assert!(first_action < current_action) or similar also work?
  • #174 The GetFinishedResult in cas/worker/tests/utils/mock_running_actions_manager.rs pointed out that the largest GetFinishedResult variant contains at least 496 bytes. It's talking about ActionResult here. I'd expect this to be fairly large but should it really be this large at all times?
  • We "hold a RefCell reference across an await point" in cas/store/tests/filesystem_store_test.rs L731. Seems like something is wrong there.

[Idea] Checkpoint support

Today I heard an interesting use case. Sometimes users have processes that take a very long time, like training an ML model, and want to upload resumable checkpoints so that if the program is restarted it resumes from the last checkpoint.

Specific use case:

  1. Training program takes 3 days to run on a single GPU instance.
  2. The intermediate state can be quite large (100GB+), so uploads are slow.
  3. While the intermediate state is being uploaded, we want to keep the ML model training on the same GPU with the same state.
  4. If the task is terminated, turbo-cache should attempt to resume the process from the last saved state.
  5. A special ActionResult will be uploaded to the AC for the task with a last_state tag in the hash (maybe an environment variable?). This will allow actions to be run against whatever the most recent state of the action cache is, for example running some heuristics on the last model being trained (like TensorBoard).

Obviously this would be very difficult to implement right. It would be great if we could just snapshot memory state & files, upload them, and allow them to be resumed, but certain things like GPU drivers present issues. We could instead do this fairly easily by sending a special signal to the program (e.g. SIGUSR1, SIGUSR2, SIGVTALRM); the program would then save its resume-state files to disk and inform the turbo-cache worker process when it is done. Turbo-cache would then upload the state and the special "latest" ActionCache result.

This would obviously represent non-deterministic behavior, but it would be a configured parameter on the worker, so only use cases that specifically request this functionality would be allowed to use it (i.e. opt-in to non-determinism).

Projects that do similar stuff:
https://github.com/checkpoint-restore/criu
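A sketch of what the program side of the signal-driven approach above might look like (tokio-based; the checkpoint file names and the "done" notification are placeholders, since the real mechanism for informing the worker is undecided):

use tokio::signal::unix::{signal, SignalKind};

#[tokio::main]
async fn main() -> std::io::Result<()> {
    let mut checkpoint_requests = signal(SignalKind::user_defined1())?;
    loop {
        tokio::select! {
            _ = checkpoint_requests.recv() => {
                // Placeholder for serializing model/optimizer state.
                tokio::fs::write("checkpoint.bin", b"state").await?;
                // Placeholder "I'm done" notification for the worker process.
                tokio::fs::write("checkpoint.done", b"").await?;
            }
            _ = do_training_step() => {}
        }
    }
}

async fn do_training_step() {
    // Stand-in for one unit of training work.
    tokio::time::sleep(std::time::Duration::from_secs(1)).await;
}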

Remove CacheLookupScheduler

I created CacheLookupScheduler because Goma wasn't getting cache hits. Turns out this was actually due to #176. This is now an entirely unused scheduler type.

I thought I'd raise an issue to determine if we want to simply delete it to avoid the maintenance cost. If I understand the RBE protocol correctly, it should actually never be required...

Create GrpcScheduler tests

There are cases where proxying a Scheduler to another instance would be useful.

An example use case is where an instance sits close to clients that is running a fast-slow store for CAS and AC over a slow network connection. This would allow an action cache lookup scheduler to utilise the local AC and then forward actions that are not found to a remote scheduler that is closer to the workers.

Sanitizer tracking issue

Sanitizer integration in Rust is still quite experimental and tends to produce false positives. I went through a bunch of logs and think the issues below could be real bugs. I've added a few points of interest (POI) in the codebase after some initial disentangling of the error logs.

AddressSanitizer:

  • #187 A lot of leaked memory during cas/store:ref_store_test. Seems to occur in get_test and update_test. POI:
    • The setup_stores call in update_test, in ref_store_owned. Seemingly during some clone operations.
    • The add_store call in setup_stores.
    • The first scope in each of the tests.
    • The ref_store field in the RefStore::new implementation.
    • The unsafe impl Sync for StoreReference {}.
    • The RefStore struct.
    • The name.to_string call in stores.insert in cas/store/lib.rs in add_store.
    • Seems like everything points to the string keys of the hashmap in the stores.insert call.
    • Related issues:

ThreadSanitizer:

  • Data race in cas/worker:running_actions_manager_test in cleanup_happens_on_job_failure. POI:

    • Creation of fast_store in the setup_stores for these tests.
    • The try_join in running_actions_manager.rs in upload_results.
  • Data race in cas/worker:local_worker_test in new_local_worker_removes_work_directory_before_start_test. POI:

    • The new_local_worker in the failing test.
    • The fs::canonicalize call in new_local_worker in cas/worker/local_worker.rs.
  • Data race in cas/store:filesystem_store_test in oldest_entry_evicted_with_access_times_loaded_from_disk. POI:

    • The match statement at the end of the test on store.get_file_entry_for_digest.
    • The fs::create_dir_all call in the test.
    • The write_file calls in that test.

Make custom binaries for different services

As outlined in #116, we should make a few binaries for CAS, Scheduler, Worker instead of forcing every service to run everything.

We will still have a single binary that can do everything, but for special cases where certain dependencies cannot be filled (like wasm or certain OSs/kernels) users could use the split-out binaries.

Make WASM compatible binary optimized for edge computing

As outlined in #116, WASM could be very useful for edge-computing services. Since many of the Bazel files are likely to live in S3, Redis, or other similar layers, it would likely save money, time, and effort if users could hit a local edge point where a WASM module starts up, runs the tasks, then powers down. Since these kinds of services would be data transformations, the compute and resource requirements would likely be low.

[info] Chrome stats

While building Chrome with turbo-cache and a completely full cache, here are some useful stats on disk usage for a single fresh build (iterating over the newest 10k items):

item_size_bytes{quantile="0.00"} 102
item_size_bytes{quantile="0.01"} 883
item_size_bytes{quantile="0.03"} 3540
item_size_bytes{quantile="0.05"} 6320
item_size_bytes{quantile="0.10"} 11102
item_size_bytes{quantile="0.30"} 37572
item_size_bytes{quantile="0.50"} 83920
item_size_bytes{quantile="0.70"} 187360
item_size_bytes{quantile="0.90"} 477504
item_size_bytes{quantile="0.95"} 722688
item_size_bytes{quantile="0.97"} 932288
item_size_bytes{quantile="0.99"} 1433904
item_size_bytes{quantile="1.00"} 9157664

If a worker is killed, stale work directories exist

I had a worker disconnect and then reconnect, and then the scheduler logged:

[2023-09-07T08:29:17.678Z WARN  simple_scheduler] Internal error for worker 7b2e3f8d-0afb-4774-82e3-cecef835dbf3: Error { code: AlreadyExists, messages: ["File exists (os error 17) : Error creating work directory /root/.cache/turbo-cache/work/f5596fdea33511e1c7f513873c087faee689c6ee024c9d90d2908c351e5b0e83"] }

I think #129 may still be required.

Document that users should use `-c opt` when going into production

If the user just compiles using bazel build //cas it will result in a very slow binary. This is because of the way Rust compiles by default; Rust is extremely slow in non-optimized binaries.

We can simply update the readme to tell users to use -c opt when building for production.
