Giter VIP home page Giter VIP logo

dscheck's Introduction

๐Ÿ“ข Note ๐Ÿ“ข

Multicore OCaml project has now been merged into OCaml ๐ŸŽ‰. This repository is no longer developed or maintained. Please follow the updates at the OCaml Github repository.

Citation

If you are citing this work in an academic paper, please cite the ICFP 2020 paper "Retrofitting Parallelism onto OCaml": https://dl.acm.org/doi/10.1145/3408995


Branch trunk Branch 4.14 Branch 4.13 Branch 4.12

Github CI Build Status (trunk branch) Github CI Hygiene Status (trunk branch) AppVeyor Build Status (trunk branch)

Github CI Build Status (4.14 branch) AppVeyor Build Status (4.14 branch)

TravisCI Build Status (4.13 branch) AppVeyor Build Status (4.13 branch)

TravisCI Build Status (4.12 branch) AppVeyor Build Status (4.12 branch)

README

Overview

OCaml is a functional, statically-typed programming language from the ML family, offering a powerful module system extending that of Standard ML and a feature-rich, class-based object system.

OCaml comprises two compilers. One generates bytecode which is then interpreted by a C program. This compiler runs quickly, generates compact code with moderate memory requirements, and is portable to many 32 or 64 bit platforms. Performance of generated programs is quite good for a bytecoded implementation. This compiler can be used either as a standalone, batch-oriented compiler that produces standalone programs, or as an interactive REPL system.

The other compiler generates high-performance native code for a number of processors. Compilation takes longer and generates bigger code, but the generated programs deliver excellent performance, while retaining the moderate memory requirements of the bytecode compiler. The native-code compiler currently runs on the following platforms:

Tier 1 (actively maintained) Tier 2 (maintained when possible)

x86 64 bits

Linux, macOS, Windows, FreeBSD

NetBSD, OpenBSD

x86 32 bits

Linux, Windows

FreeBSD, NetBSD, OpenBSD

ARM 64 bits

Linux, macOS

FreeBSD

ARM 32 bits

Linux

FreeBSD, NetBSD, OpenBSD

Power 64 bits

Linux

Power 32 bits

Linux

RISC-V 64 bits

Linux

IBM Z (s390x)

Linux

Other operating systems for the processors above have not been tested, but the compiler may work under other operating systems with little work.

All files marked "Copyright INRIA" in this distribution are Copyright ยฉ 1996-2021 Institut National de Recherche en Informatique et en Automatique (INRIA) and distributed under the conditions stated in file LICENSE.

Installation

See the file INSTALL.adoc for installation instructions on machines running Unix, Linux, macOS, WSL and Cygwin. For native Microsoft Windows, see README.win32.adoc.

Documentation

The OCaml manual is distributed in HTML, PDF, and Emacs Info files. It is available at

Availability

The complete OCaml distribution can be accessed at

Keeping in Touch with the Caml Community

There is an active and friendly discussion forum at

The OCaml mailing list is the longest-running forum for OCaml users. You can email it at

You can subscribe and access list archives via the Web interface at

An alternative archive of the mailing list is also available at

There also exist other mailing lists, chat channels, and various other forums around the internet for getting in touch with the OCaml and ML family language community. These can be accessed at

In particular, the IRC channel #ocaml on Libera has a long history and welcomes questions.

Bug Reports and User Feedback

Please report bugs using the issue tracker at https://github.com/ocaml/ocaml/issues

To be effective, bug reports should include a complete program (preferably small) that exhibits the unexpected behavior, and the configuration you are using (machine type, etc).

For information on contributing to OCaml, see HACKING.adoc and CONTRIBUTING.md.

Separately maintained components

Some libraries and tools which used to be part of the OCaml distribution are now maintained separately. Please use the issue trackers at their respective new homes:

dscheck's People

Contributors

art-w avatar bartoszmodelski avatar lyrm avatar polytypic avatar sadiqj avatar sudha247 avatar talex5 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

dscheck's Issues

Add a way to skip execution of traces

When debugging a failing case produced by qcheck-lin, it would be nice if dscheck could explore only the interleavings that produce the same intermediate values as reported by qcheck-lin (and skip the other execution traces that are less likely to have a bug). I think it would be enough to raise a specific exception to indicate that we don't need to continue exploring that way: (but not report it as a failure)

if value <> expected then raise Dscheck.Skip ; (* ... *)

(It almost works if the user does the try...with Skip -> () on their side, because the final postcondition is still called)

Add `Domain.join`

It's a bit of a pain to adapt existing code to dscheck because it doesn't have the standard Domain.spawn type (and Domain.join is missing). It would be super neat if enabling tracing only required this prelude:

open Dscheck

(* or *)
module Atomic = Dscheck.Atomic
module Domain = Dscheck.Domain

(with Domain.join added, I suspect that final wouldn't be required anymore?)

Randomness

Many lock-free implementations use some randomness as an easy way to load balance without coordination. This may be a source of nondeterminism and DSCheck does not like it. For example, say we have a test that begins with two threads wanting to put two elements into random slots in an array.

Run:

  1. Thread A draws slot 3 and puts item there
  2. Thread B draws slot 3 and fails
  3. Thread B retries; draws slot 5 and puts item there
    ..
    ..
    more steps

If DSCheck finds some races after step 3, it will keep rerunning the sequence 1-3 to explore events of interest that occur further down the sequence. But on the consecutive runs, thread B may succeed on the first try, yielding a different state. In the positive case, B will become disabled before its schedule and DSCheck crashes explicitly. In the more scary one, we may give a positive result without actually having explored all the interesting interleavings.

It's an easy mistake to make and a pretty difficult one to find. Perhaps, DSCheck should choose some random seed, and then reinitialize OCaml's prng to this value at the beginning of every run?

cc @lyrm, who's observed this issue in pratice, with dscheck crashing uncontrollably on ocaml-multicore/saturn#65

Validation for programs weaker than lock-free

Lots of real-world programs don't have lock-free property and it's often optimal that way. For one, because we mostly have a somewhat fair OS scheduler, and it does not make sense to optimize for the theoretical scenario, where one domain runs indefinitely. But, in absence of concrete bounds for unfairness, dscheck needs to explore all the possible Mazurkiewicz traces to prove properties about programs. In simple words, if thread A spins until thread B does something, dscheck will explore thread's A spinning to infinity.

There's a few things we could do to help here:

  • Add depth limit to the search. The easiest thing to do. It's mentioned in the DPOR paper and was used by CDSCheck guys. The problem is that we don't really know how to determine optimal depth. In the CDSCheck paper, they were looking for bugs in existing implementations to show that the algorithm is useful, in our case, obtaining negative results is just as important.
  • The papers also mention imposing some fairness conditions on the way DSCheck is scheduling things. I think it's potentially better than depth limit, but still a bit hard to ensure we're actually validating all the required states.
  • Letting user place yields. This has the disadvantage, that our atomic's interface will depart from stdlib's, so users have to add indirections (rather than just copy in our atomic in tests with dune rules). But it seems to be the best solution from the perspective of correctness, as it's easy for user to identify cycles in the state space. It also seems quite implementable (although we have to be careful when more than 1 thread is yielding at once).

I don't think that we can reasonably identify the cycles automatically, since we cannot know whether e.g. retries of a spinlock are bounded without reading local state.

Documentation requests

I think I'm a bit confused about how to use dscheck. Perhaps other people will be too.

The README says:

At the core, dscheck runs each test numerous times, exhaustively trying every single interleaving of atomic operations.

I tried adding some prints to it:

module Atomic = Dscheck.TracedAtomic
(* tested implementation needs to use dscheck's atomic module *)

let test_counter () =
  print_endline "\nstart";
  let counter = Atomic.make 0 in
  let incr () =
    print_endline "start incr";
    Atomic.set counter (Atomic.get counter + 1);
    print_endline "end incr";
  in 
  Atomic.spawn incr;
  Atomic.spawn incr;
  Atomic.final (fun () ->
      print_endline "end";
      Atomic.check (fun () -> Atomic.get counter == 2)
    )

let () = Atomic.trace test_counter

Many of the runs don't seem to finish. Is this expected? e.g. I get:

start
start incr

start
start incr

start
start incr
end incr

...

It would be useful if it could run without logging while searching, and then run the failing test case again with logging on. The default trace output doesn't seem very useful (e.g. it's hard to know what Process 3: compare_and_swap 14 is referring to).

I had assumed I could use non-atomic variables to keep track of the expected state and then compare that against the atomic values. e.g. adding a value to a lock-free queue and also to a regular Stdlib.Queue and checking the final contents are the same. Is that unsafe?

Does Atomic.spawn always run the function immediately, up to the first atomic operation? For example, this doesn't find the error:

let test_counter () =
  print_endline "\nstart";
  let x = Atomic.make true in   (* This is supposed to be false *)
  let might_be_set = ref false in
  Atomic.spawn (fun () ->
      print_endline "Setting...";
      might_be_set := true;
      Atomic.set x true;
      print_endline "Set";
    );
  Atomic.spawn (fun () ->
      print_endline "Checking";
      if not !might_be_set then assert (Atomic.get x = false)
    );
  Atomic.final (fun () ->
      print_endline "end";
    )

But switching the order of the spawns does find it. I guess if another process depends on the value then I need to use an atomic.

I tried doing this to encourage it to switch:

let yield = Atomic.make () in
...
Atomic.spawn (fun () -> Atomic.get yield; ...)

That worked if I put it in both spawns, but not in just one of them.

Basically, it would be useful to know what things are safe to do and which prevent it from testing all cases.

(Also: the example in the README doesn't do anything unless you add let () = Atomic.trace test_counter. It would be good to include this - maybe tested with MDX?).

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.