smlfamily / basislibrary

Repository and Wiki for enriching the Standard ML Basis Library

basislibrary's Issues

Clarify timing of exceptions from curried functions

Some curried functions in the Basis can raise exceptions. In a few cases it is underspecified when these are raised. For example, does Time.fmt ~1 throw, or is the exception deferred until the second argument is provided?

Affected functions:

Real.fmt
Time.fmt
others?
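
A small probe can show what a given implementation does today. This is only a sketch: probeTimeFmt is a hypothetical helper name, and the result depends on the implementation, so no particular answer is claimed here.

```sml
(* Reports when Size is raised for a negative precision passed to Time.fmt.
   The answer is implementation-dependent. *)
fun probeTimeFmt () =
  let
    (* Partial application only; no time value has been supplied yet. *)
    val partial = SOME (Time.fmt ~1) handle Size => NONE
  in
    case partial of
      NONE => "raised at partial application"
    | SOME f =>
        ((ignore (f Time.zeroTime); "not raised")
         handle Size => "raised at full application")
  end
```

The same pattern applies to Real.fmt with an invalid format specification.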

Discussion for proposal 2015-001

This issue is for discussion of proposal 2015-001 (Correction to the ListPair module).

Make `Option` and `List` structures more consistent with each other

The option and list datatypes have a lot in common. Both support the following operations:

(* functor *)
val map : ('a -> 'b) -> ('a t -> 'b t)

(* monoidal functor *)
val unit : 'a -> 'a t
val pair : 'a t * 'b t -> ('a * 'b) t

(* pointed functor *)
val empty : 'a t
val null : 'a t -> bool

(* monad *)
val join : 'a t t -> 'a t

The above list of operations is far from exhaustive.

Unfortunately, the API exposed by the Basis Library introduces unnecessary discrepancies:

Option has Option.isSome, but List has List.null
Option has Option.join, but List has List.concat
For some weird reason, List has List.mapPartial, rather than List.concatMap
Many of the functions in List also make sense for options, but Option doesn't provide them

Would it be possible to fix these discrepancies?
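
To make the overlap concrete, here is a rough sketch of one signature that both types can implement. The names CONTAINER, OptionC and ListC are hypothetical, and choosing the Cartesian product for pair on lists is an assumption (ListPair.zip would be another reasonable choice).

```sml
(* Hypothetical common signature, assembled from the operations listed above. *)
signature CONTAINER =
sig
  type 'a t
  val map   : ('a -> 'b) -> 'a t -> 'b t
  val unit  : 'a -> 'a t
  val pair  : 'a t * 'b t -> ('a * 'b) t
  val empty : 'a t
  val null  : 'a t -> bool
  val join  : 'a t t -> 'a t
end

structure OptionC : CONTAINER =
struct
  type 'a t = 'a option
  val map = Option.map
  val unit = SOME
  fun pair (SOME a, SOME b) = SOME (a, b)
    | pair _ = NONE
  val empty = NONE
  fun null x = not (Option.isSome x)
  val join = Option.join
end

structure ListC : CONTAINER =
struct
  type 'a t = 'a list
  val map = List.map
  fun unit x = [x]
  (* Assumption: pair is the Cartesian product. *)
  fun pair (xs, ys) =
    List.concat (List.map (fn x => List.map (fn y => (x, y)) ys) xs)
  val empty = []
  val null = List.null
  val join = List.concat
end
```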

Discussion for proposal 2022-001 (Add value tau to MATH signature)

This issue is for discussion of proposal 2022-001 (Add value tau to MATH signature).

Addition of a Void module

Unsure if this is the best format/place for a new proposal (or if I should, say, create a Wiki page). If there's something else I should do, please let me know!

Often, it is useful to have access to an empty "void" type. A simple proposal might look like this:

signature VOID =
  sig
    type void
    type t = void
    val absurd : t -> 'a
  end

structure Void :> VOID =
  struct
    datatype void = Void of void
    type t = void
    fun absurd (Void v) = absurd v
  end

Perhaps a few other auxiliary functions could be useful, as well. For example:

    val fail : string -> t  (* raises `Fail` *)
    val asLeft : ('a,t) either -> 'a
    val asRight : (t,'a) either -> 'a
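
These helpers could be written roughly as follows. This is a sketch only: the either datatype and its constructors INL and INR are stand-ins declared locally (assumptions, since no such type is defined in this issue), and the Void structure repeats the definition above so that the fragment is self-contained.

```sml
(* Hypothetical stand-in for an either type. *)
datatype ('a, 'b) either = INL of 'a | INR of 'b

structure Void :> sig
  type void
  type t = void
  val absurd : t -> 'a
end =
struct
  datatype void = Void of void
  type t = void
  fun absurd (Void v) = absurd v
end

(* Never returns normally; always raises Fail. *)
fun fail (msg : string) : Void.t = raise Fail msg

(* The impossible branch is discharged with absurd. *)
fun asLeft (INL a) = a
  | asLeft (INR v) = Void.absurd v

fun asRight (INL v) = Void.absurd v
  | asRight (INR a) = a
```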

Discussion of proposal 2015-007 (Addition of Ref module).

This issue is for discussion of proposal 2015-007 (Addition of Ref module).

Discussion of proposal 2015-004 (Addition of Buffer module)

This issue is for discussion of proposal 2015-004 (Addition of Buffer module).

Discussion of proposal 2015-005 (Addition of Fn module).

This issue is for discussion of proposal 2015-005 (Addition of Fn module).

Discussion of proposal 2015-008 (Additional math functions).

This issue is for discussion of proposal 2015-008 (Additional math functions).

Ambiguity of Date.date offset larger than 24 hours

Sorry, this is a bug report, not an extension proposal.
I couldn't find a better place to report a bug in the Basis specification.

The Basis Date specification says:

Offsets are taken modulo 24 hours. That is, we express t, in hours, as sgn(t)(24*d + r), where d and r are non-negative, d is integral, and r < 24. The offset then becomes sgn(t)r and sgn(t)(24d) is added to the hours (before converting hours to days).

But SML/NJ 110.97, given an offset larger than 24 hours, ignores the excess:

- Date.toString (Date.date { year=2000, day= 1, hour=0, minute=0, month=Date.Jan, offset=SOME (Time.fromSeconds (0)), second=0});
val it = "Sat Jan 01 00:00:00 2000" : string
- Date.toString (Date.date { year=2000, day= 1, hour=0, minute=0, month=Date.Jan, offset=SOME (Time.fromSeconds (3600 * 24 + 1)), second=0});
val it = "Sat Jan 01 00:00:01 2000" : string

I think this implementation is correct, because an offset larger than 24 hours is unnatural.
Also, the spec says that offsets are taken modulo 24 hours.
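
For reference, the decomposition quoted from the specification can be worked through for the second example. The sketch below assumes the offset is given in whole seconds and keeps the remainder r in seconds rather than hours; splitOffset is a hypothetical helper, not a Basis function.

```sml
(* Splits an offset in seconds into sgn(t), d (whole days) and r (the rest),
   so that t = sign * (86400 * days + rest) with 0 <= rest < 86400. *)
fun splitOffset (secs : int) =
  let
    val secondsPerDay = 24 * 3600
    val magnitude = Int.abs secs
  in
    { sign = Int.sign secs,
      days = magnitude div secondsPerDay,
      rest = magnitude mod secondsPerDay }
  end
```

For 3600 * 24 + 1 seconds this gives sign = 1, days = 1 and rest = 1. Read literally, the specification would add the 24 hours of that one day to the hour field and keep an offset of one second.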

Discussion of proposal 2015-006 (Additional string conversion functionality)

This issue is for discussion of proposal 2015-006 (Additional string conversion functionality).

Clarify how string conversion functions in signature REAL treat the sign of a NaN value

In the signature REAL, the conversion functions fromDecimal and toDecimal ensure that the sign field of the IEEEReal.decimal_approx value matches the sign bit of the real value even if the value is NaN.

The description of the functions scan and fromString does not specify how the sign bit is derived in the case of an argument giving the NaN value, so an implementation may arbitrarily choose a value for the sign bit. Is this underspecified behavior intended? If the intention was to preserve the sign bit through conversion of NaN, could we say:

fromString is equivalent to Option.composePartial (fromDecimal, IEEEReal.fromString)?
scan is equivalent to Option.composePartial (mapPartialFirst fromDecimal, IEEEReal.scan)
where fun mapPartialFirst f (a, b) = case f a of SOME a' => SOME (a', b) | NONE => NONE?
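
Written out as code, the suggested equivalences might look like the sketch below. The names fromString' and scan' are hypothetical. The character reader getc is threaded through explicitly, because IEEEReal.scan takes it as its first argument. How the sign of a NaN comes out still depends on the implementation's fromDecimal.

```sml
fun mapPartialFirst f (a, b) =
  case f a of
    SOME a' => SOME (a', b)
  | NONE => NONE

(* Candidate definition of Real.fromString via decimal_approx. *)
val fromString' : string -> real option =
  Option.composePartial (Real.fromDecimal, IEEEReal.fromString)

(* Candidate definition of Real.scan; getc is the underlying char reader. *)
fun scan' getc =
  Option.composePartial (mapPartialFirst Real.fromDecimal, IEEEReal.scan getc)
```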

The description of the function toString says that a NaN value is converted to the string "nan". This ignores the sign bit of the NaN value. Therefore toString is not equivalent to IEEEReal.toString o Real.toDecimal. Is this intended?

This issue is raised as suggested in smlnj/legacy#73 (comment).

Discussion for proposal 2018-001 (Addition of monomorphic buffers)

This issue is for discussion of proposal 2018-001 (Addition of monomorphic buffers).

Note that this proposal is a generalization of the earlier proposal 2015-004.

Expose ML implementation name in the REPL

I'd like interactive Successor ML implementations to expose their name as a standardized global constant:

val topLevelName : unit -> string

This way, it would be easier to work around differences between implementations, writing code that looks like this:

case topLevelName () of
    "Poly/ML" => use "Foo.PolyML.sml"
  | "MosML" => use "Foo.MosML.sml"
  | "SML/NJ" => use "Foo.SMLNJ.sml"
  | other => raise Fail ("unsupported implementation: " ^ other)

This would avoid the need to use arcane external tools for building programs.

Discussion for proposal 2016-2

It'd be nice to define the ways we query external standards and Basis-defined optional modules.

Discussion of proposal guidelines and processes

This issue is for discussion of the draft guidelines for SML Basis Library proposals that have been posted to the wiki.

This issue is also intended as a forum for discussing the processes needed to submit, discuss, approve, and release changes to the Basis Library.

Discussion of 2021-001 (Add getWindowSz function to Posix.TTY structure)

This issue is for discussion of Proposal 2021-001 (Add getWindowSz function to Posix.TTY structure).

Discussion for proposal 2017-001 (Millisecond sleep)

This issue is for discussion of proposal 2017-001 (Millisecond sleep).

Subscript exceptionless signatures

I was curious whether anyone had considered implementing signatures like VECTOR and VECTOR_SLICE
in terms of an opaquely ascribed cursor/iterator type, which ensures the constraint that 0 <= i <= i + n <= |v|, where |v| is the length of v.

The existing VECTOR/VECTOR_SLICE could then be implemented internally as functions from (int, cursor) -> cursor which raise Subscript exceptions when constructing the cursor to be returned, or by having the cursor type transparently ascribed.

Or is this too much hassle for too little gain, because e.g. the lack of dependent types means two vectors of the same length would have to be represented by different cursor types? I mainly wanted to gauge reaction before thinking too hard about the details of implementing e.g. vector in terms of this.
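
A very rough sketch of the idea, to have something concrete to react to. The CURSOR signature and all of its names are hypothetical. The bounds check happens once, when the cursor is made, and the type is opaquely ascribed, so later operations can rely on the invariant and report failure with option instead of Subscript.

```sml
signature CURSOR =
sig
  type 'a cursor
  (* make (v, i, n) succeeds only if 0 <= i <= i + n <= |v|. *)
  val make   : 'a vector * int * int -> 'a cursor option
  val length : 'a cursor -> int
  val sub    : 'a cursor * int -> 'a option
end

structure Cursor :> CURSOR =
struct
  type 'a cursor = 'a vector * int * int

  fun make (v, i, n) =
    let
      val len = Vector.length v
    in
      (* n <= len - i is used instead of i + n <= len to avoid overflow. *)
      if 0 <= i andalso i <= len andalso 0 <= n andalso n <= len - i
      then SOME (v, i, n)
      else NONE
    end

  fun length (_, _, n) = n

  (* The invariant makes the Vector.sub call safe whenever 0 <= k < n. *)
  fun sub ((v, i, n), k) =
    if 0 <= k andalso k < n then SOME (Vector.sub (v, i + k)) else NONE
end
```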

Discussion for proposal 2020-001 (Addition of the Universal module)

ambiguity of offset precision of Date.date

The Basis Date specification says:

Offsets are taken modulo 24 hours. That is, we express t, in hours, as sgn(t)(24*d + r), where d and r are non-negative, d is integral, and r < 24.

r < 24 means that r is in hours.
And by "we express t, in hours", r should be integral.

But in general, the Time.time for offset has a finer precision than seconds.

And SML/NJ 110.97 keeps values finer than an hour, as shown below:

- Date.toString (Date.date { year=2000, day= 1, hour=0, minute=0, month=Date.Jan, offset=SOME (Time.fromSeconds (0)), second=0});
val it = "Sat Jan 01 00:00:00 2000" : string
- Date.toString (Date.date { year=2000, day= 1, hour=0, minute=0, month=Date.Jan, offset=SOME (Time.fromSeconds (1)), second=0});
val it = "Sat Jan 01 00:00:01 2000" : string (* +1 sec *)

I think the specification should be modified to keep seconds.

Discussion for proposal 2018-002 (Additional slice operations)

This issue is for discussion of proposal 2018-002 (Additional slice operations).

Add a type t to many signatures

I didn't find any issues discussing this, but I guess it has possibly come up in the past;
apologies if I have missed that.

It would be convenient if many of the signatures provided by the Basis came with a 'type t'.
An implementation which adds these is the extended-basis from mltonlib, but it would be nice if the Basis provided them itself.

https://github.com/MLton/mltonlib/blob/92450815c771774f3f1d6ffd245802f262fa9b0f/com/ssh/extended-basis/unstable/detail/bootstrap.sml

A simple use case is shown in the following example:
signature STRING_STUFF =
sig
  type t;
  val fromString : string -> t option;
  val toString : t -> string;
end

functor Something(StringStuff : STRING_STUFF) =
struct
  (* ... *)
end
(* currently you must first add the type t *)
structure IntStringStuff = struct open Int type t = int end :> STRING_STUFF;
structure Foo = Something(IntStringStuff);

(* If it was included in the basis one could just use the following instead *)
structure Foo = Something(Int :> STRING_STUFF);

Given that it is possible to add these to the existing Basis as mltonlib does, code like the above can even be backwards compatible given a small shim. I'm not sure what difference, if any, it would make to compile times; if that is an argument against it, it could be measured.

Discussion for proposal 2019-001 (Correction to the PRIM_IO signature)

This issue is for discussion of proposal 2019-001 (Correction to the PRIM_IO signature).

Regular expressions

This is a feature request for adding regular expressions to the Basis Library of Successor ML.

I assume the need for them and their usefulness is obvious; if not, I could elaborate upon request.

Some SML implementations already have them.
Moscow ML has a POSIX 1003.2 variant:
http://mosml.org/mosmllib/Regex.html
SML/NJ has a variant with multiple syntaxes, currently AWK syntax:
http://www.smlnj.org/doc/smlnj-lib/Manual/regexp-lib-part.html

However, the lack of regular expressions in the Basis Library has dire consequences for everyday programming:
Availability - many implementations don't have them, e.g. PolyML;
Portability - a program written to use one implementation's regular expressions won't run on another's.

Which syntax will be chosen is in my opinion insignificant, but it seems like the multi-syntax approach is more versatile.

Bonus points for a direct way of matching a word boundary (\b or \< and \> in most syntaxes), a very useful feature in my experience, which both cited implementations currently lack.

Unsigned integer types

Many functions only make sense when applied to nonnegative integers (e.g., List.nth). However, because Standard ML doesn't have an unsigned integer type, one has to use signed integers, and then hopefully not forget to implement a nonnegativity check. I'd like a proper unsigned integer type, so that the check is performed automatically and at compile-time, rather than manually and at runtime.

I can anticipate an argument that I should use the existing word type. However, as useful as the word type may be for certain use cases (e.g., processing files with non-textual contents), it isn't a good general-purpose unsigned integer type for the following reasons:

It wraps around on overflow or underflow, rather than throwing an exception. This is acceptable, perhaps even desirable, for low-level bit twiddling, but not for doing arithmetic in most applications.
word constants must be written with a 0w prefix (0wx for hexadecimal), and word values are printed as hexadecimals. To get the more familiar decimal representations of numeric values, one would have to translate back and forth between word and int all over the place.

Unsigned integers should be provided in modules with the usual INTEGER signature, subject to the following constraints:

minInt is always SOME 0
abs is the identity function
sign never returns ~1
~ raises Overflow if its argument isn't 0
- raises Overflow if its first argument is less than the second
div and quot are the same function
mod and rem are the same function
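
A minimal sketch of what a few of these constraints look like when layered over int. The structure name Nat and the function fromInt are hypothetical, and only a handful of the INTEGER operations are shown. With an opaque type, the nonnegativity check runs once at fromInt, and any function that takes a nat no longer needs its own check.

```sml
structure Nat :> sig
  eqtype nat
  val minInt  : nat option
  val fromInt : int -> nat      (* raises Overflow on negative input *)
  val toInt   : nat -> int
  val +       : nat * nat -> nat
  val -       : nat * nat -> nat
  val ~       : nat -> nat
  val abs     : nat -> nat
  val sign    : nat -> int
end =
struct
  type nat = int
  val minInt = SOME 0
  fun fromInt n = if n < 0 then raise Overflow else n
  fun toInt n = n
  val op + = Int.+
  (* Subtraction raises Overflow instead of going negative. *)
  fun op - (a, b) = if a < b then raise Overflow else Int.- (a, b)
  (* Negation is only defined for zero. *)
  fun op ~ a = if a = 0 then 0 else raise Overflow
  fun abs a = a
  fun sign a = if a = 0 then 0 else 1
end
```

For example, Nat.- (Nat.fromInt 3, Nat.fromInt 5) raises Overflow rather than producing a negative value.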

Discussion of Basis Library Process document

Please use this issue to discuss the Process document.

Discussion for proposal 2015-003

This issue is for discussion of proposal 2015-003 (Additional operations on sequences).

Ephemeral resource management

An ephemeral resource is a runtime entity with nontrivial cleanup logic, which must be executed exactly once. Not more, not less. Examples of ephemeral resources are: file handles, network connections, GUI objects, etc. Java and C++ have dedicated language features (“finalizers” and “destructors”, respectively) for managing ephemeral resources. Standard ML doesn't provide any useful abstractions for this purpose, either in the core language or in the Basis Library. As a result, writing programs that don't leak ephemeral resources in Standard ML is just as hard as it is in C. What a disaster!

IMO, the most successful abstraction ever designed for managing ephemeral resources is ownership, which can be thought of as a stylized variant of substructural types. Several languages (all of which descend from C++) incorporate ownership in their design, but as far as I can tell, the only one that does so in a type-safe manner is Rust.

Of course, neither Standard ML nor Successor ML has substructural types. But that doesn't mean that we can't incorporate ownership support in some way or another. An ownership system for Successor ML must have the following properties:

Cleanup logic is guaranteed to run exactly once for every ephemeral resource.
It must be implemented entirely as a library. No radical changes to ML's type system are allowed - nothing like Rust's affine types and lifetimes.

Here's my very rough proposal:

Unlike Rust, where ephemeral resources are owned by lexical scopes, in Successor ML, ephemeral resources shall be owned by threads. The first owner of an ephemeral resource is the thread that created it, but ownership can be transferred to another thread. When a thread terminates, whether successfully or with an error, it calls the cleanup function of every ephemeral resource it owns, in the reverse order of acquisition.
Manually cleaning up ephemeral resources must still be possible. Successive attempts to clean up the same ephemeral resource shall do nothing. Then the ownership system only needs to guarantee that every cleanup function is called at least once.
Since Successor ML doesn't have substructural types, we must content ourselves with raising exceptions in certain circumstances that Rust can prevent statically, like use after free.
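
To illustrate the bookkeeping, here is a single-threaded sketch. An explicit owner value stands in for the owning thread, since the Basis has no thread API; the structure Owner and all of its names are hypothetical. Cleanup is idempotent, and an owner releases whatever is still live in reverse order of acquisition, whether its body returns or raises.

```sml
structure Owner =
struct
  type resource = { cleanup : unit -> unit, live : bool ref }
  type owner = resource list ref

  fun newOwner () : owner = ref []

  (* Registers a cleanup function; the newest resource goes to the front. *)
  fun acquire (own : owner) cleanup =
    let
      val r = { cleanup = cleanup, live = ref true }
    in
      own := r :: !own;
      r
    end

  (* Manual cleanup; later attempts on the same resource do nothing. *)
  fun release ({cleanup, live} : resource) =
    if !live then (live := false; cleanup ()) else ()

  (* The list is newest-first, so this runs in reverse acquisition order. *)
  fun releaseAll (own : owner) =
    let
      val rs = !own
    in
      own := [];
      List.app release rs
    end

  (* Plays the role of a thread body: cleans up on return and on exception. *)
  fun withOwner f =
    let
      val own = newOwner ()
      val result = f own handle e => (releaseAll own; raise e)
    in
      releaseAll own;
      result
    end
end
```

Transferring ownership and raising on use after cleanup are left out of this sketch.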