basislibrary's People

Contributors

johnreppy, kfl, robertharper

basislibrary's Issues

Clarify timing of exceptions from curried functions

Some curried functions in the Basis can raise exceptions. In a few cases it is underspecified when these are raised. For example, does Time.fmt ~1 throw, or is the exception deferred until the second argument is provided?

Affected functions:

  • Real.fmt
  • Time.fmt
  • others?
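One way to observe what a given implementation does is a small probe like the one below. It is only a sketch: it assumes, as the Basis description of Time.fmt states, that a negative precision raises Size, and the name `stage` is made up for illustration.

```sml
(* Probe when Time.fmt raises Size for a negative precision. *)
val stage =
  let
    val f = Time.fmt ~1                       (* partial application only *)
  in
    (ignore (f Time.zeroTime); "no exception raised")
    handle Size => "raised at full application"
  end
  handle Size => "raised at partial application"

val () = print (stage ^ "\n")
```

The printed result may differ between implementations, which is exactly the ambiguity this issue asks to resolve.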

Make `Option` and `List` structures more consistent with each other

The option and list datatypes have a lot in common. Both support the following operations:

(* functor *)
val map : ('a -> 'b) -> ('a t -> 'b t)

(* monoidal functor *)
val unit : 'a -> 'a t
val pair : 'a t * 'b t -> ('a * 'b) t

(* pointed functor *)
val empty : 'a t
val null : 'a t -> bool

(* monad *)
val join : 'a t t -> 'a t

The above list of operations is far from exhaustive.

Unfortunately, the API exposed by the Basis Library introduces unnecessary discrepancies:

  • Option has Option.isSome, but List has List.null
  • Option has Option.join, but List has List.concat
  • For some weird reason, List has List.mapPartial, rather than List.concatMap
  • Many of the functions in List also make sense for options, but Option doesn't provide them

Would it be possible to fix these discrepancies?
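To make the comparison concrete, here is a rough sketch of both types implementing one shared signature. The signature name CONTAINER and the structure names OptionC and ListC are hypothetical, and using the cartesian product for pair on lists is my assumption (a zip would be another reading).

```sml
(* Hypothetical common signature built from the operations listed above. *)
signature CONTAINER =
sig
  type 'a t
  val map   : ('a -> 'b) -> 'a t -> 'b t
  val unit  : 'a -> 'a t
  val pair  : 'a t * 'b t -> ('a * 'b) t
  val empty : 'a t
  val null  : 'a t -> bool
  val join  : 'a t t -> 'a t
end

structure OptionC : CONTAINER =
struct
  type 'a t = 'a option
  val map = Option.map
  val unit = SOME
  fun pair (SOME a, SOME b) = SOME (a, b)
    | pair _ = NONE
  val empty = NONE
  fun null x = not (Option.isSome x)
  val join = Option.join
end

structure ListC : CONTAINER =
struct
  type 'a t = 'a list
  val map = List.map
  fun unit x = [x]
  (* Assumption: cartesian product, which agrees with unit and join. *)
  fun pair (xs, ys) =
    List.concat (List.map (fn x => List.map (fn y => (x, y)) ys) xs)
  val empty = []
  val null = List.null
  val join = List.concat
end
```

Only pair and null need fresh code here; everything else is an existing Basis function under a different name.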

Addition of a Void module

Unsure if this is the best format/place for a new proposal (or if I should, say, create a Wiki page). If there's something else I should do, please let me know!


Often, it is useful to have access to an empty "void" type. A simple proposal might look like this:

signature VOID =
  sig
    type void
    type t = void
    val absurd : t -> 'a
  end

structure Void :> VOID =
  struct
    datatype void = Void of void
    type t = void
    fun absurd (Void v) = absurd v
  end

Perhaps a few other auxiliary functions could be useful, as well. For example:

    val fail : string -> t  (* raises `Fail` *)
    val asLeft : ('a,t) either -> 'a
    val asRight : (t,'a) either -> 'a
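As a minimal sketch of how asLeft and asRight could be written on top of absurd: the either datatype is not part of the current Basis, so it is declared locally here with assumed constructor names INL and INR, and the Void structure is trimmed down to what the example needs.

```sml
(* Assumed local definition; the Basis does not currently provide `either`. *)
datatype ('a, 'b) either = INL of 'a | INR of 'b

structure Void :> sig
  type t
  val absurd : t -> 'a
end =
struct
  datatype t = Void of t            (* no finite value of this type exists *)
  fun absurd (Void v) = absurd v
end

(* The impossible branch is discharged with absurd. *)
fun asLeft (INL a) = a
  | asLeft (INR v) = Void.absurd v

fun asRight (INR a) = a
  | asRight (INL v) = Void.absurd v
```

Both functions are total and exhaustive, since the only way to reach the second clause would be to hold a value of the empty type.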

Ambiguity of Date.date offsets larger than 24 hours

Sorry, this is a bug report, not an extension proposal.
I couldn't find a better place to report a bug in the Basis specification.

The Basis Date specification says:

Offsets are taken modulo 24 hours. That is, we express t, in hours, as sgn(t)(24*d + r), where d and r are non-negative, d is integral, and r < 24. The offset then becomes sgn(t)r and sgn(t)(24d) is added to the hours (before converting hours to days).

But SML/NJ 110.97, given an offset larger than 24 hours, ignores the whole-day part:

- Date.toString (Date.date { year=2000, day= 1, hour=0, minute=0, month=Date.Jan, offset=SOME (Time.fromSeconds (0)), second=0});
val it = "Sat Jan 01 00:00:00 2000" : string
- Date.toString (Date.date { year=2000, day= 1, hour=0, minute=0, month=Date.Jan, offset=SOME (Time.fromSeconds (3600 * 24 + 1)), second=0});
val it = "Sat Jan 01 00:00:01 2000" : string

I think this implementation is correct, because an offset larger than 24 hours is unnatural.
Also, the spec says offsets are taken modulo 24 hours.
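For reference, the decomposition t = sgn(t)(24*d + r) quoted above can be sketched as follows. This is only an illustration: it works in whole seconds rather than hours (my assumption, to avoid fractional hours), and the function name splitOffset and the field names are made up.

```sml
(* Split an offset in seconds into whole days and a remainder, both
   carrying the sign of the input, mirroring sgn(t)(24*d + r). *)
fun splitOffset (secs : int) =
  let
    val day  = 24 * 3600
    val sign = Int.sign secs          (* ~1, 0, or 1 *)
    val mag  = Int.abs secs
  in
    { days = sign * (mag div day), remainder = sign * (mag mod day) }
  end

val a = splitOffset (24 * 3600 + 1)   (* {days = 1, remainder = 1} *)
val b = splitOffset (~25 * 3600)      (* {days = ~1, remainder = ~3600} *)
```

For the offset used in the report, 24 hours plus 1 second, the remainder is 1 second and the day part is 1.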

Clarify how string conversion functions in signature REAL treat the sign of a NaN value

In the signature REAL, the conversion functions fromDecimal and toDecimal ensure that the sign field of the IEEEReal.decimal_approx value matches the sign bit of the real value even if the value is NaN.

The description of the functions scan and fromString does not specify how the sign bit is derived in the case of an argument giving the NaN value, so an implementation may arbitrarily choose a value for the sign bit. Is this underspecified behavior intended? If the intention was to preserve the sign bit through conversion of NaN, could we say:

  • fromString is equivalent to Option.composePartial (fromDecimal, IEEEReal.fromString)?
  • scan is equivalent to Option.composePartial (mapPartialFirst fromDecimal, IEEEReal.scan)
    where fun mapPartialFirst f (a, b) = case f a of SOME a' => SOME (a', b) | NONE => NONE?

The description of the function toString says that a NaN value is converted to the string "nan". This ignores the sign bit of the NaN value. Therefore toString is not equivalent to IEEEReal.toString o Real.toDecimal. Is this intended?

This issue is raised as suggested in smlnj/legacy#73 (comment).
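Spelled out as code, the two suggested equivalences might look like the sketch below. The primed names fromString' and scan' are hypothetical, and the sketch assumes Real.fromDecimal has the type given in the Basis specification (IEEEReal.decimal_approx -> real option). Unlike the shorthand above, the reader argument getc is passed explicitly to IEEEReal.scan.

```sml
fun mapPartialFirst f (a, b) =
  case f a of
      SOME a' => SOME (a', b)
    | NONE => NONE

(* fromString routed through the decimal_approx representation. *)
val fromString' : string -> real option =
  Option.composePartial (Real.fromDecimal, IEEEReal.fromString)

(* scan routed through IEEEReal.scan, keeping the rest of the stream. *)
fun scan' getc =
  Option.composePartial (mapPartialFirst Real.fromDecimal, IEEEReal.scan getc)
```

Comparing Real.signBit on the results of Real.fromString "-nan" and fromString' "-nan" would then show whether an implementation preserves the sign.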

Expose ML implementation name in the REPL

I'd like interactive Successor ML implementations to expose their name as a standardized global constant:

val topLevelName : unit -> string

This way, it would be easier to work around differences between implementations, writing code that looks like this:

case topLevelName () of
    "Poly/ML" => use "Foo.PolyML.sml"
  | "MosML" => use "Foo.MosML.sml"
  | "SML/NJ" => use "Foo.SMLNJ.sml"
  | other => raise Fail ("unsupported implementation: " ^ other)

Thus avoiding the need to use arcane external tools for building programs.

Discussion of proposal guidelines and processes

This issue is for discussion of the draft guidelines for SML Basis Library proposals that have been posted to the wiki.

This issue is also intended as a forum for discussing the processes needed to submit, discuss, approve, and release changes to the Basis Library.

Subscript exceptionless signatures

I was curious whether anyone had considered implementing signatures like VECTOR and VECTOR_SLICE
in terms of an opaquely ascribed cursor/iterator type that enforces the constraint 0 <= i <= i + n <= |v|, where |v| is the length of v.

The existing VECTOR/VECTOR_SLICE could then be implemented internally as functions from (int, cursor) -> cursor that raise Subscript when constructing the cursor to be returned, or by having the cursor type transparently ascribed.

Or is this too much hassle for too little gain, because, e.g., the lack of dependent types means two vectors of the same length would have to be represented by different cursor types? I mainly wanted to gauge reactions before thinking too hard about the details of implementing, e.g., vector in terms of this.
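To give the idea a concrete shape, here is a very rough sketch of one possible reading. The signature CURSOR_VECTOR, the structure CursorVector and every operation name are hypothetical. It covers only a single-index cursor, not the full slice constraint, and Vector.sub still performs its own bounds check internally.

```sml
signature CURSOR_VECTOR =
sig
  type 'a vector
  type 'a cursor                      (* abstract: always a valid position *)
  val cursor : 'a vector * int -> 'a cursor option
  val get    : 'a cursor -> 'a        (* cannot fail for a valid cursor *)
  val next   : 'a cursor -> 'a cursor option
end

structure CursorVector :> CURSOR_VECTOR where type 'a vector = 'a Vector.vector =
struct
  type 'a vector = 'a Vector.vector
  (* Invariant kept by construction: 0 <= i < Vector.length v *)
  type 'a cursor = 'a vector * int
  fun cursor (v, i) =
    if 0 <= i andalso i < Vector.length v then SOME (v, i) else NONE
  fun get (v, i) = Vector.sub (v, i)
  fun next (v, i) = cursor (v, i + 1)
end
```

Out-of-range positions surface as NONE when a cursor is constructed, so clients of get never see Subscript.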

Ambiguity of offset precision of Date.date

The Basis Date specification says:

Offsets are taken modulo 24 hours. That is, we express t, in hours, as sgn(t)(24*d + r), where d and r are non-negative, d is integral, and r < 24.

r < 24 means that r is in hours.
And since we express t in hours, r should be integral.

But in general, the Time.time value used for the offset has a finer precision than seconds.

And SML/NJ 110.97 keeps values finer than an hour, as shown below:

- Date.toString (Date.date { year=2000, day= 1, hour=0, minute=0, month=Date.Jan, offset=SOME (Time.fromSeconds (0)), second=0});
val it = "Sat Jan 01 00:00:00 2000" : string
- Date.toString (Date.date { year=2000, day= 1, hour=0, minute=0, month=Date.Jan, offset=SOME (Time.fromSeconds (1)), second=0});
val it = "Sat Jan 01 00:00:01 2000" : string (* +1 sec *)

I think the specification should be modified to keep seconds.

Add a type t to many signatures

I didn't find any issues discussing this, but I guess it has come up in the past;
apologies if I have missed that.

It would be convenient if many of the signatures provided by the Basis came with a 'type t'.
One implementation that adds these is the extended-basis from mltonlib, but it would be nice if the Basis provided them itself.

https://github.com/MLton/mltonlib/blob/92450815c771774f3f1d6ffd245802f262fa9b0f/com/ssh/extended-basis/unstable/detail/bootstrap.sml

A simple use case, in the following example:

signature STRING_STUFF =
sig
  type t;
  val fromString: string -> t option;
  val toString : t -> string;
end

functor Something(StringStuff : STRING_STUFF) =
struct
  (* ... *)
end

(* currently you must first add the type t *)
structure IntStringStuff = struct open Int type t = int end :> STRING_STUFF;
structure Foo = Something(IntStringStuff);

(* If it was included in the basis one could just use the following instead *)
structure Foo = Something(Int :> STRING_STUFF);

Given that it is possible to add these to the existing Basis, as mltonlib does, code like the above can even be backwards compatible given a small shim. I'm not sure what difference, if any, it might make to compile times; if that is an argument against it, it could be measured.

Regular expressions

This is a feature request for adding regular expressions to the Basis Library of Successor ML.

I assume the need for them and their usefulness is obvious; if not, I could elaborate upon request.

Some SML implementations already have them.
Moscow ML has a POSIX 1003.2 variant:
http://mosml.org/mosmllib/Regex.html
SML/NJ has a variant with multiple syntaxes, currently AWK syntax:
http://www.smlnj.org/doc/smlnj-lib/Manual/regexp-lib-part.html

However lack of regular expressions in the BASIS library has dire consequences for everyday programming:
Availability - many implementations don't have them, e.g. PolyML;
Portability - a program written to use one implementation's regular expressions won't run on another's.

Which syntax will be chosen is in my opinion insignificant, but it seems like the multi-syntax approach is more versatile.

Bonus points for a direct way to match a word boundary (\b, or \< and \> in most syntaxes), a very useful feature in my experience, which both cited implementations currently lack.

Unsigned integer types

Many functions only make sense when applied to nonnegative integers (e.g., List.nth). However, because Standard ML doesn't have an unsigned integer type, one has to use signed integers, and then hopefully not forget to implement a nonnegativity check. I'd like a proper unsigned integer type, so that the check is performed automatically and at compile-time, rather than manually and at runtime.

I can anticipate an argument that I should use the existing word type. However, as useful as the word type may be for certain use cases (e.g., processing files with non-textual contents), it isn't a good general-purpose unsigned integer type for the following reasons:

  • It wraps around on overflow or underflow, rather than raising an exception. This is acceptable, perhaps even desirable, for low-level bit twiddling, but not for doing arithmetic in most applications.
  • word constants are represented textually as hexadecimals preceded by 0w. To get the more familiar decimal representations of numeric values, one would have to translate back and forth between word and int all over the place.

Unsigned integers should be provided in modules with the usual INTEGER signature, subject to the following constraints:

  • minInt is always SOME 0
  • abs is the identity function
  • sign never returns ~1
  • ~ raises Overflow if its argument isn't 0
  • - raises Overflow if its first argument is less than the second
  • div and quot are the same function
  • mod and rem are the same function
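A few of these constraints can be sketched as a tiny library-level wrapper. This is not the full INTEGER signature, and the structure name Nat and its operation names are hypothetical. The checks here still happen at runtime when a value is constructed, so this only approximates the compile-time guarantee asked for above.

```sml
structure Nat :> sig
  eqtype nat
  val fromInt : int -> nat          (* raises Overflow on negative input *)
  val toInt   : nat -> int
  val plus    : nat * nat -> nat
  val minus   : nat * nat -> nat    (* raises Overflow if first < second *)
  val negate  : nat -> nat          (* raises Overflow unless argument is 0 *)
  val abs     : nat -> nat          (* identity *)
end =
struct
  type nat = int
  fun fromInt n = if n < 0 then raise Overflow else n
  fun toInt n = n
  fun plus (a, b) = a + b           (* Int addition raises Overflow itself
                                       on fixed-precision int *)
  fun minus (a, b) = if a < b then raise Overflow else a - b
  fun negate n = if n = 0 then 0 else raise Overflow
  fun abs n = n
end
```

Because nat is abstract, a function taking a Nat.nat argument never needs its own nonnegativity check.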

Ephemeral resource management

An ephemeral resource is a runtime entity with nontrivial cleanup logic, which must be executed exactly once. Not more, not less. Examples of ephemeral resources are: file handles, network connections, GUI objects, etc. Java and C++ have dedicated language features (“finalizers” and “destructors”, respectively) for managing ephemeral resources. Standard ML doesn't provide any useful abstractions for this purpose, either in the core language or in the Basis Library. As a result, writing programs that don't leak ephemeral resources in Standard ML is just as hard as it is in C. What a disaster!

IMO, the most successful abstraction ever designed for managing ephemeral resources is ownership, which can be thought of as a stylized variant of substructural types. Several languages (all of which descend from C++) incorporate ownership in their design, but as far as I can tell, the only one that does so in a type-safe manner is Rust.

Of course, neither Standard ML nor Successor ML has substructural types. But that doesn't mean that we can't incorporate ownership support in some way or another. An ownership system for Successor ML must have the following properties:

  • Cleanup logic is guaranteed to run exactly once for every ephemeral resource.
  • It must be implemented entirely as a library. No radical changes to ML's type system are allowed - nothing like Rust's affine types and lifetimes.

Here's my very rough proposal:

  • Unlike Rust, where ephemeral resources are owned by lexical scopes, in Successor ML, ephemeral resources shall be owned by threads. The first owner of an ephemeral resource is the thread that created it, but ownership can be transferred to another thread. When a thread terminates, whether successfully or with an error, it calls the cleanup function of every ephemeral resource it owns, in the reverse order of acquisition.
  • Manually cleaning up ephemeral resources must still be possible. Successive attempts to clean up the same ephemeral resource shall do nothing. Then the ownership system only needs to guarantee that every cleanup function is called at least once.
  • Since Successor ML doesn't have substructural types, we must content ourselves with raising exceptions in certain circumstances that Rust can prevent statically, like use after free.
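The second and third points can be illustrated with a small sketch: a handle whose cleanup runs at most once and whose use after release raises an exception. The structure name Resource and its operations are hypothetical, and thread ownership is left out entirely, since the Basis has no thread API to build on.

```sml
structure Resource :> sig
  type 'a t
  val acquire : 'a * ('a -> unit) -> 'a t   (* value and its cleanup function *)
  val get     : 'a t -> 'a                  (* raises Fail after release *)
  val release : 'a t -> unit                (* repeated calls do nothing *)
end =
struct
  type 'a t = { value : 'a, cleanup : 'a -> unit, live : bool ref }

  fun acquire (v, c) = { value = v, cleanup = c, live = ref true }

  fun get ({ value, live, ... } : 'a t) =
    if !live then value else raise Fail "use after release"

  fun release ({ value, cleanup, live } : 'a t) =
    if !live then (live := false; cleanup value) else ()
end
```

An owner (a thread, in the proposal) would then only have to call release on everything it still holds when it terminates; resources already released by hand are skipped automatically.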
