foundation's People

Contributors

adinapoli, aherrmann, akshaymankar, angerman, argumatronic, ciderale, dtaskoff, galderz, jange, jship, klacansky, matthewbauer, mitchellwrosen, ndmitchell, nicolasdp, nwtgck, parsonsmatt, phadej, picnoir, plaprade, purcell, release-candidate, ryanglscott, sighingnow, snoyberg, tekul, teodorlu, threefx, tmcdonell, vincenthz

foundation's Issues

fromBytes UTF8 gets stuck

> print $ F.fromBytes F.UTF8 $ F.fromList [169,32,50,48,48,56,32,98,121,32,65,110,100,121,32,71,105]
("",[169,32,50,48,48,56,32,98,121,32,65,110,100,121,32,71,105])

Seems that it shouldn't just get wedged and ignore the rest of the bytes entirely.
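For illustration, here is a minimal sketch of a decoder that does not get stuck: any byte that cannot start or complete a valid sequence becomes U+FFFD, and decoding resumes at the next byte. It works on plain lists for brevity and is not the library's decoder; `decodeLenient` is a hypothetical name.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr)
import Data.Word (Word8)

-- | Lenient UTF-8 decoding over a list of bytes (hypothetical helper).
-- Invalid bytes are replaced by U+FFFD and decoding continues.
decodeLenient :: [Word8] -> String
decodeLenient [] = []
decodeLenient (b : rest)
  | b < 0x80 = chr (fromIntegral b) : decodeLenient rest
  | b .&. 0xE0 == 0xC0 = multi 1 (fromIntegral b .&. 0x1F) 0x80
  | b .&. 0xF0 == 0xE0 = multi 2 (fromIntegral b .&. 0x0F) 0x800
  | b .&. 0xF8 == 0xF0 = multi 3 (fromIntegral b .&. 0x07) 0x10000
  | otherwise = '\xFFFD' : decodeLenient rest
  where
    -- n continuation bytes expected, lead bits already extracted,
    -- minCp rejects overlong encodings.
    multi :: Int -> Int -> Int -> String
    multi n lead minCp =
      let (conts, after) = splitAt n rest
          cp = foldl (\acc c -> (acc `shiftL` 6) .|. (fromIntegral c .&. 0x3F)) lead conts
          valid =
            length conts == n
              && all (\c -> c .&. 0xC0 == 0x80) conts
              && cp >= minCp
              && cp <= 0x10FFFF
              && not (cp >= 0xD800 && cp <= 0xDFFF)
       in if valid
            then chr cp : decodeLenient after
            else '\xFFFD' : decodeLenient rest
```

On the input from this issue, the leading byte 169 is a stray continuation byte, so the sketch emits one replacement character and then decodes the remaining ASCII bytes normally.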

Add IsString FilePath

It's very useful in practice, and currently people are just going to do:

F.filePathFromString $ F.fromString "C:/Neil/hs-foundation/examples/FieldTrip.txt"

Which isn't really safer, just less convenient. (I remain unconvinced about reinventing virtual file paths at the same time as doing the rest, but this lets me ignore that for the most part.)
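As a rough sketch of what this would enable, assuming a stand-in `Path` type (the library's real FilePath representation is not shown here, and `splitOn` is a hypothetical helper):

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.String (IsString (..))

-- Hypothetical stand-in for the library's FilePath type.
newtype Path = Path [String]
  deriving (Eq, Show)

-- Illustrative parsing only: split on '/' and drop empty segments.
instance IsString Path where
  fromString = Path . filter (not . null) . splitOn '/'

splitOn :: Char -> String -> [String]
splitOn sep s = case break (== sep) s of
  (chunk, []) -> [chunk]
  (chunk, _ : rest) -> chunk : splitOn sep rest

-- With the instance in place, a literal is enough.
example :: Path
example = "C:/Neil/hs-foundation/examples/FieldTrip.txt"
```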

Add native array slices

I ported a piece of Hoogle, available at https://gist.github.com/ndmitchell/6828df5eeca3776b32a0420c73dc95ed.

Profiling this code shows 98% of the time is in copyAtRO, even though I go via a list to convert between ByteString and ByteArray. Specifically, 47% of the time is due to break on newlines, and 51% is due to uncons on the result. A dedicated splitting function could be optimised to avoid the repeated recopying, but that would mean users can't write their own without going O(n^2), which is sad. I think String must support slicing, and given your nice design, that means Array must too.
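To make the request concrete, here is a minimal sketch of a slice as an offset and length over a shared buffer, so that break and uncons adjust two integers instead of copying. It uses `UArray` from the `array` package as a stand-in for the library's own array type, and all names (`Slice`, `breakSlice`, and so on) are hypothetical.

```haskell
import Data.Array.Unboxed (UArray, listArray, (!))
import Data.Word (Word8)

-- | A view into a shared buffer; slicing never copies the buffer.
data Slice = Slice
  { sliceBuf :: UArray Int Word8 -- shared storage
  , sliceOff :: !Int             -- index of the first element in view
  , sliceLen :: !Int             -- number of elements in view
  }

sliceFromList :: [Word8] -> Slice
sliceFromList xs = Slice (listArray (0, n - 1) xs) 0 n
  where
    n = length xs

sliceToList :: Slice -> [Word8]
sliceToList (Slice buf off len) = [buf ! i | i <- [off .. off + len - 1]]

-- | O(1): only the offset and length change.
uncons :: Slice -> Maybe (Word8, Slice)
uncons (Slice buf off len)
  | len <= 0 = Nothing
  | otherwise = Just (buf ! off, Slice buf (off + 1) (len - 1))

-- | Split before the first element satisfying the predicate.
-- The scan is linear in the prefix length, but nothing is copied.
breakSlice :: (Word8 -> Bool) -> Slice -> (Slice, Slice)
breakSlice p s@(Slice buf off len) = (Slice buf off k, Slice buf (off + k) (len - k))
  where
    k = length (takeWhile (not . p) (sliceToList s))
```

With this shape, repeatedly breaking on newlines and calling uncons on the remainder walks the buffer once overall, rather than recopying the tail at each step.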

Rename Core.String.IO to Core.IO

The fact that the IO stuff works with String is more of a detail; these are really IO functions, and the rename gives a nicer symmetry to the library.

Add Core.Partial

Should reexport the nice Partial stuff, plus some partial functions, e.g. fromJust.

arch should be an enum

What are typical values for arch? That's the obvious question when looking at the function.
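One possible shape, sketched on top of `System.Info.arch` from base. The constructor set and the strings matched below are assumptions about commonly reported values, not an exhaustive list, and the names are hypothetical:

```haskell
import qualified System.Info

-- | Hypothetical enumeration; unknown values are kept rather than dropped.
data Arch
  = I386
  | X86_64
  | ARM
  | AArch64
  | OtherArch String
  deriving (Eq, Show)

-- | Map an architecture string onto the enumeration (assumed spellings).
parseArch :: String -> Arch
parseArch s = case s of
  "i386" -> I386
  "x86_64" -> X86_64
  "arm" -> ARM
  "aarch64" -> AArch64
  other -> OtherArch other

-- | The architecture the program was compiled for.
currentArch :: Arch
currentArch = parseArch System.Info.arch
```

An enum like this answers the "typical values" question in the type itself, and pattern matches on it can be checked for exhaustiveness.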

Core.Chunks, add Eq, Ord and Show

I believe that if the Element type of the chunks is already an instance of Eq, Ord or Show, it should not be hard to add the default instances of these classes.
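As a sketch, assuming Chunks has roughly the shape of a list with strict elements (the real definition may differ), the instances can simply be derived:

```haskell
-- Assumed shape of Chunks, for illustration only.
data Chunks a
  = End
  | Chunk !a (Chunks a)
  deriving (Eq, Ord, Show)

-- The derived instances require the same instance on the element type.
sample :: Chunks Int
sample = Chunk 1 (Chunk 2 End)
```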

Add ArrayUArray data collection

Possible definition of ArrayUArray is:

    import Foundation.Array
    data ArrayUArray ty = ArrayUArray (Array (UArray ty))

It should be more efficient than a simple array for many things, such as streaming and appending at the beginning, middle, or end.

Be suitable for Hoogle

Here's my checklist:

  • Strings which work nicely, but pretty minimally
  • Vectors which work nicely, but pretty minimally
  • Ability to source the above from a memory mapped file
  • Ability to read strings in lazy chunks from a file with copying

String encoding improvement

  • UTF16 decoding
  • UTF16 encoding
  • UTF8 Lenient decoding
  • 7bit ASCII support decoding
  • 8bit ISO 8859-1 decoding
  • Modified UTF-8 (GHC UTF-8 encoding for Strings) support for IsString optimisation

Ditch Arrow

No one likes Arrow. Why not supply first/second as either restricted to Tuple, or over Bifunctor?
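Both options are easy to sketch. The tuple-only versions below use hypothetical names (`firstPair`, `secondPair`); the Bifunctor versions are the existing `first` and `second` from `Data.Bifunctor` in base.

```haskell
import Data.Bifunctor (first, second)

-- Option 1: restricted to pairs.
firstPair :: (a -> b) -> (a, c) -> (b, c)
firstPair f (a, c) = (f a, c)

secondPair :: (b -> c) -> (a, b) -> (a, c)
secondPair g (a, b) = (a, g b)

-- Option 2: Bifunctor, which covers pairs...
viaBifunctor :: (Int, Int)
viaBifunctor = second length (first (+ 1) (1, "abc"))

-- ...and other two-parameter types such as Either.
onEither :: Either String Bool
onEither = first show (Left (3 :: Int))
```

The tuple-only form gives simpler type errors, while the Bifunctor form generalises to Either at no extra cost to callers.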

foldTextFile is undocumented

What does it do? Is each thing a line? Or is each thing a fixed size? Or no guarantees? Any blanks?

What I really want is to push the String values into a conduit, which requires a different API anyway...

Use copyMutableArray# and copyArray#

Should make things like copyAtRO go much faster. They're native in GHC 7.6 and above, but slightly broken in 7.6, so make it conditional on 7.8 or above.

Support mmap for string/array

I can call Core.Foreign.fileMapRead, to get a FinalPtr Word8, but I really want an UArray Word8. Why not just have fileMapRead return that directly?

Chunks is just []

Admittedly with a strict element, but calling it StrictList might be a better name? Or maybe it has sufficiently little value that it's unnecessary.

Add builder for strings and boxed arrays

It would be nice to also have builders for strings and boxed arrays like the one for unboxed arrays. That would make it much easier to start building your destination collection while going over an input collection if you can't know the length of the destination collection.
Maybe even a Buildable type class?
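A minimal sketch of the idea for boxed arrays, assuming `Array` from the `array` package as a stand-in for the library's boxed array. The builder tracks the element count next to a difference list, so the destination is allocated once at the end. `Builder`, `singleton`, and `build` are hypothetical names, not the library's API.

```haskell
import Data.Array (Array, elems, listArray)

-- | Element count plus a difference list, so appending is O(1).
data Builder a = Builder !Int ([a] -> [a])

instance Semigroup (Builder a) where
  Builder n f <> Builder m g = Builder (n + m) (f . g)

instance Monoid (Builder a) where
  mempty = Builder 0 id

singleton :: a -> Builder a
singleton x = Builder 1 (x :)

-- | The length is known by now, so the array is sized exactly once.
build :: Builder a -> Array Int a
build (Builder n f) = listArray (0, n - 1) (f [])

-- Usage: the output length is not known until the input has been walked.
evens :: [Int] -> Array Int Int
evens = build . foldMap (\x -> if even x then singleton x else mempty)
```

A Buildable type class could then abstract over `build` for strings, boxed arrays, and unboxed arrays, each with its own destination type.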

Add uncons

I wrote:

f_uncons :: F.ByteArray -> Maybe (F.Word8, F.ByteArray)
f_uncons x = case x F.! 0 of
    Nothing -> Nothing
    Just c -> Just (c, F.drop 1 x)

But it seems like this should be in the library, where it is likely to be more efficient. Naturally, generalise it to all the collection types.

Is the ByteArray synonym worth it?

Writing UArray Word8 helps people learn all pieces of the library faster. And it shows the nice symmetry. I think this is something to show off, rather than hide behind a type alias.

String type features

Features I'd like from the String type:

  • Small strings should be moveable on the heap so they don't cause excessive fragmentation.
  • Large strings should be pinned in memory so they don't cause excessive GC time.
  • In certain applications I might want to never use large strings (or set the threshold much higher) - particularly long running server applications where fragmentation is likely to build. Controlling this with an environment variable seems reasonable (e.g. you can put it in a CAF so it doesn't add much overhead).
  • I want to be able to create a String from a memory mapped file, without copying the string. At the moment that requires creating a String from a Ptr, but happy with other formulations.
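The environment-variable idea from the list above can be sketched as follows. The variable name `HS_STRING_PIN_THRESHOLD`, the default of 1024 bytes, and all function names are assumptions made for illustration only.

```haskell
import System.Environment (lookupEnv)
import System.IO.Unsafe (unsafePerformIO)
import Text.Read (readMaybe)

-- Assumed default, in bytes.
defaultPinThreshold :: Int
defaultPinThreshold = 1024

-- | Pure parsing step: fall back to the default on missing or bad input.
parseThreshold :: Maybe String -> Int
parseThreshold raw = case raw >>= readMaybe of
  Just n | n >= 0 -> n
  _ -> defaultPinThreshold

-- | A CAF: the environment is read at most once, on first use.
-- NOINLINE keeps GHC from duplicating the unsafePerformIO call.
pinThreshold :: Int
pinThreshold =
  unsafePerformIO (parseThreshold <$> lookupEnv "HS_STRING_PIN_THRESHOLD")
{-# NOINLINE pinThreshold #-}

-- | Strings at or above the threshold would be allocated pinned.
shouldPin :: Int -> Bool
shouldPin sizeInBytes = sizeInBytes >= pinThreshold
```

After the first evaluation, `shouldPin` costs a single comparison, which is the low overhead the CAF suggestion is after.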

FilePath API

  • extension manipulation
  • path dropping / append filename
  • parent
  • windows support
  • OS support (readdir)

Add partial functions

fromJust/head/tail/fromLeft/fromRight are all useful functions. They should exist somewhere.
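A sketch of what such a module might contain, using `HasCallStack` from base so that a failure reports the call site. The error messages and the choice to shadow Prelude's `head` and `tail` are illustrative, not the library's definitions.

```haskell
import GHC.Stack (HasCallStack)
import Prelude hiding (head, tail)

fromJust :: HasCallStack => Maybe a -> a
fromJust (Just x) = x
fromJust Nothing = error "fromJust: Nothing"

head :: HasCallStack => [a] -> a
head (x : _) = x
head [] = error "head: empty list"

tail :: HasCallStack => [a] -> [a]
tail (_ : xs) = xs
tail [] = error "tail: empty list"

fromLeft :: HasCallStack => Either a b -> a
fromLeft (Left a) = a
fromLeft (Right _) = error "fromLeft: Right"

fromRight :: HasCallStack => Either a b -> b
fromRight (Right b) = b
fromRight (Left _) = error "fromRight: Left"
```

Keeping these in one clearly named module makes partiality visible at the import site.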

add fileMapReadWith

Something like:

fileMapReadWith :: FilePath -> (UArray Word8 -> IO a) -> IO a

Ditch comparing

compare `on` f

Is quite a bit clearer - comparing adds relatively little. Usually having sortOn is way better than overloading the comparing anyway.
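For comparison, the three spellings side by side, using only functions from base (the sample data is made up):

```haskell
import Data.Function (on)
import Data.List (sortBy, sortOn)
import Data.Ord (comparing)

people :: [(String, Int)]
people = [("carol", 35), ("alice", 41), ("bob", 29)]

-- The spelling suggested here.
byAgeOn :: [(String, Int)]
byAgeOn = sortBy (compare `on` snd) people

-- The spelling that would be dropped.
byAgeComparing :: [(String, Int)]
byAgeComparing = sortBy (comparing snd) people

-- Often the nicest option when sorting by a key.
byAgeSortOn :: [(String, Int)]
byAgeSortOn = sortOn snd people
```

All three produce the same ordering; `sortOn` additionally computes each key only once.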

Rename vector to array

Just a thought, given it's the more "natural" name and you are very much replacing base, which has array.

Move mmap stuff to Core.IO

I'd move the mmap stuff to it (in my mind, mmap is about IO, not foreign).

I'd also move Core.String.IO to it, since it's only 2 functions.

Module names are confusing

Having Overture is weird. I'd either make it Core.Prelude or Foundation. I'd probably make all the other modules Foundation.Number etc.
