scodec-stream's Introduction

scodec

Scala combinator library for working with binary data.

Design Constraints

This library focuses on contract-first and pure functional encoding and decoding of binary data. The following design constraints are considered:

  • Binary structure should mirror protocol definitions and be self-evident under casual reading
  • Mapping binary structures to types should be statically verified
  • Encoding and decoding should be purely functional
  • Failures in encoding and decoding should provide descriptive errors
  • Compiler plugins should not be used

As a result, the library is implemented as a combinator-based DSL. Performance is considered but yields to the above design constraints.
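To illustrate the "descriptive errors" constraint, here is a minimal sketch, assuming scodec 2.x on Scala 3. It decodes a one-byte input with a 16-bit codec and turns the resulting failure into a message. The names `tooShort` and `description` are chosen for illustration only.

```scala
import scodec.*
import scodec.bits.*
import scodec.codecs.*

// Only one byte is supplied, but uint16 needs two, so decoding fails.
val tooShort: Attempt[DecodeResult[Int]] = uint16.decode(hex"ff".bits)

// Failures are ordinary values carrying an Err with a human-readable message.
val description: String = tooShort match
  case Attempt.Successful(result) => s"decoded ${result.value}"
  case Attempt.Failure(cause)     => s"decoding failed: ${cause.messageWithContext}"
```

Because a failure is returned as a value rather than thrown, callers can inspect or report it without exception handling.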

Acknowledgements

Scodec 1.x used Shapeless and is heavily influenced by scala.util.parsing.combinator. As of Scodec 2.x, the library only depends on the standard library.

Administrative

This project is licensed under a 3-clause BSD license.

The scodec channel on Typelevel Discord is a good place to go for help.

Introduction

The primary abstraction is a Codec[A], which supports encoding a value of type A to a BitVector and decoding a BitVector to a value of type A.

The codecs object provides a number of predefined codecs and combinators.

    import scodec.*
    import scodec.bits.*
    import scodec.codecs.*

    // Create a codec for an 8-bit unsigned int followed by an 8-bit unsigned int followed by a 16-bit unsigned int
    val firstCodec = uint8 :: uint8 :: uint16

    // Decode a bit vector using that codec
    val result: Attempt[DecodeResult[(Int, Int, Int)]] = firstCodec.decode(hex"102a03ff".bits)
    // Successful(DecodeResult((16, 42, 1023), BitVector(empty)))

    // Sum the result
    val add3 = (_: Int) + (_: Int) + (_: Int)
    val sum: Attempt[DecodeResult[Int]] = result.map(_.map(add3))
    // Successful(DecodeResult(1081, BitVector(empty)))

Automatic case class binding is supported via tuples:

    case class Point(x: Int, y: Int, z: Int)

    val pointCodec: Codec[Point] = (int8 :: int8 :: int8).as[Point]

    val encoded: Attempt[BitVector] = pointCodec.encode(Point(-5, 10, 1))
    // Successful(BitVector(24 bits, 0xfb0a01))

    val decoded: Attempt[DecodeResult[Point]] = pointCodec.decode(hex"fb0a01".bits)
    // Successful(DecodeResult(Point(-5, 10, 1), BitVector(empty)))

Codecs can also be derived, resulting in usage like:

    case class Point(x: Int, y: Int, z: Int) derives Codec

    val encoded: Attempt[BitVector] = Codec.encode(Point(-5, 10, 1))
    // Successful(BitVector(96 bits, 0xfffffffb0000000a00000001))

    val decoded: Attempt[DecodeResult[Point]] = Codec.decode[Point](hex"fffffffb0000000a00000001".bits)
    // Successful(DecodeResult(Point(-5, 10, 1), BitVector(empty)))

New codecs can be created by implementing the Codec trait, though typically new codecs are created by applying one or more combinators to existing codecs.
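As a minimal sketch of the combinator approach, assuming scodec 2.x on Scala 3: the hypothetical `Greeting` type below is encoded as an 8-bit version followed by a UTF-8 string prefixed with its byte length. `Greeting` and `greetingCodec` are illustrative names, not part of the library.

```scala
import scodec.*
import scodec.bits.*
import scodec.codecs.*

// Hypothetical message type, used only for illustration.
case class Greeting(version: Int, text: String)

// uint8 version, then a UTF-8 string whose byte length is stored in a uint8 prefix.
val greetingCodec: Codec[Greeting] =
  (uint8 :: variableSizeBytes(uint8, utf8)).as[Greeting]

// Encode a value, then decode the resulting bits back.
val encoded: BitVector = greetingCodec.encode(Greeting(1, "hi")).require
// encoded.toHex == "01026869"

val roundTrip: Greeting = greetingCodec.decode(encoded).require.value
// roundTrip == Greeting(1, "hi")
```

No Codec subclass is written by hand here: the structure comes entirely from composing `uint8`, `utf8`, and `variableSizeBytes`.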

See the guide for detailed documentation. Also, see the ScalaDoc.

Ecosystem

Many libraries have support for scodec:

Examples

There are various examples in the test directory, including codecs for:

The protocols module of fs2 has production quality codecs for the above examples.

The skunk database library uses scodec to communicate with Postgres.

The hippo project uses scodec to parse .hprof files.

The bitcoin-scodec library has a codec for the Bitcoin network protocol.

The scodec-msgpack library provides codecs for MessagePack.

The fs2-http project uses FS2, scodec, and shapeless to implement a minimal HTTP client and server.

The scodec-bson library implements BSON codecs and combinators.

Testing Your Own Codecs

If you're creating your own Codec instances, scodec publishes some of its own test tooling in the scodec-testkit module.

Getting Binaries

libraryDependencies += "org.scodec" %%% "scodec-core" % 
  (if (scalaVersion.value.startsWith("2.")) "1.11.9" else "2.1.0")

Building

This project uses sbt and requires node.js to be installed in order to run Scala.js tests. To build, run sbt publishLocal.

Code of Conduct

See the Code of Conduct.


scodec-stream's Issues

Combination of stream decoders doesn't emit all expected items

The output of the following program is abcd, but as I understand it, it should be abcdababc. Am I missing something?

import cats.effect.{ExitCode, IO, IOApp}
import fs2.Stream
import scodec.Err
import scodec.bits.{BitVector, HexStringSyntax}
import scodec.codecs.bits
import scodec.stream.StreamDecoder

object Test extends IOApp {
  val input = hex"1a bc d7 ab 7a bc"

  val decoder = StreamDecoder.many(bits(4)).flatMap { a =>
    //println(s"a: ${a.toHex}")
    StreamDecoder.tryMany(
      bits(4).flatMap { b =>
        //println(s"b: ${b.toHex}")
        if (b.toHex == "7") scodec.codecs.fail[BitVector](Err("AAA"))
        else scodec.codecs.provide(b)
      }
    )
  }

  override def run(args: List[String]): IO[ExitCode] =
    Stream
      .emits(input.toArray.map(BitVector.apply(_)))
      .covary[IO]
      .through(decoder.toPipe)
      .evalTap(a => IO(println(s"Decoded: ${a.toHex}")))
      .compile
      .drain >> IO(ExitCode.Success)
}

Release for FS2 0.10.0-M2

Are there any plans for a release compatible with FS2 0.10.0-M2? (I see that the implementation seems to be ready on the series/1.1.x branch.)

sbt-assembly fails due to repeated and conflicting jar references

Note that both scodec-bits_2.10-1.0.5.jar and scodec-bits_2.10-1.0.4.jar are referenced

A project that only references scodec-core does not fail.

[info] Including from cache: scodec-bits_2.10-1.0.5.jar
[info] Including from cache: scodec-stream_2.10-0.7.0.jar
[info] Including from cache: scalaz-stream_2.10-0.6a.jar
[info] Including from cache: scalaz-effect_2.10-7.1.0.jar
[info] Including from cache: scodec-core_2.10-1.7.0.jar
[info] Including from cache: scodec-bits_2.10-1.0.4.jar
[info] Including from cache: shapeless_2.10.4-2.1.0.jar
[info] Including from cache: quasiquotes_2.10-2.0.0.jar
[info] Including from cache: scala-reflect.jar
[info] Including from cache: scala-library-2.10.5.jar
[info] Including from cache: scalaz-core_2.10-7.1.0.jar
[info] Including from cache: scalaz-concurrent_2.10-7.1.0.jar
[info] Checking every *.class/*.jar file's SHA-1.
[info] Merging files...
[warn] Merging 'META-INF/MANIFEST.MF' with strategy 'discard'
[error] 74 errors were encountered during merge
java.lang.RuntimeException: deduplicate: different file contents found in the following:
/home/rsearle/.ivy2/cache/org.scodec/scodec-core_2.10/bundles/scodec-core_2.10-1.7.0.jar:scodec/bits/Bases$Alphabets$Base64$.class
/home/rsearle/.ivy2/cache/org.scodec/scodec-bits_2.10/bundles/scodec-bits_2.10-1.0.5.jar:scodec/bits/Bases$Alphabets$Base64$.class
/home/rsearle/.ivy2/cache/org.typelevel/scodec-bits_2.10/bundles/scodec-bits_2.10-1.0.4.jar:scodec/bits/Bases$Alphabets$Base64$.class

build.sbt

name := "scodec streaming assembly"

version := "0.1"

scalaVersion := "2.10.5"

resolvers += "Scalaz Bintray Repo" at "http://dl.bintray.com/scalaz/releases"

libraryDependencies ++= Seq(
   "org.scodec" %% "scodec-stream" % "0.7.0"
)

project/build.properties

sbt.version=0.13.7

project/assembly.sbt

addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.13.0")

encode.emit genericity

Is there any reason for encode.emit to return a StreamEncoder[Nothing] instead of being generic and returning a StreamEncoder[A]? Wouldn't something like this work?

    def emit[A](bits: BitVector): StreamEncoder[A] = {
      StreamEncoder.instance[A] { h =>
        Pull.output1(bits) >> Pull.pure(h -> encode.empty[A])
      }
    }

Incorrect behavior in StreamDecoder with listOfN codec

Hello. First off, sorry if this is the wrong place to report the issue. I think the problem originates from the listOfN implementation in scodec, but it manifests itself when using StreamDecoder with a codec that uses listOfN. Please let me know if you'd like me to re-open the issue in the main scodec repo.

If you have some encoded list of N items, but the end of item M < N happens to align with the end of the current decoding buffer (0 bits remaining), the decoder will report a list of M items, then error out because it was expecting N.

I was able to distill this down to a small example:

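import cats.effect.{ExitCode, IO, IOApp}
import fs2.{Chunk, Stream}
import scodec.codecs._
import scodec.stream.StreamDecoder

object Test extends IOApp {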
	def run(args: List[String]): IO[ExitCode] = {
		val in = List(1,2,3,4,5,6,7)
		val codec = listOfN(uint16, uint16)
		val encodedBytes = codec.encode(in).require.toByteArray
		println(encodedBytes.toList)

		val firstPart = encodedBytes.slice(0, 6)
		val secondPart = encodedBytes.slice(6, encodedBytes.length)
		// val stream = Stream.chunk[IO, Byte](Chunk.array(encodedBytes)) // WORKS
		val stream = Stream.chunk[IO, Byte](Chunk.array(firstPart)) ++ Stream.chunk[IO, Byte](Chunk.array(secondPart)) // ERROR

		val pipe = StreamDecoder.many(codec).toPipeByte[IO]

		stream.through(pipe).evalTap{ list => IO(println(s"Decoded: $list")) }.compile.drain >> IO(ExitCode.Success)
	}
}

dependencies:

// a few of these not strictly necessary for the example above, 
// but I was playing with a bit more and it's easier to leave them all in for now
libraryDependencies ++= Seq(
  "org.scodec" %% "scodec-core" % "1.11.7",
  "org.scodec" %% "scodec-stream" % "2.0.0",
  "co.fs2" %% "fs2-core" % "2.2.1",
  "co.fs2" %% "fs2-io" % "2.2.1",
  "co.fs2" %% "fs2-reactive-streams" % "2.2.1",
  "org.typelevel" %% "cats-core" % "2.0.0",
  "org.typelevel" %% "cats-kernel" % "2.0.0",
  "org.typelevel" %% "cats-effect" % "2.0.0"
)

In the example, I split the encoded array into two chunks, where the first chunk is just enough to hold the list length plus two of the list items. The output is as follows:

[info] running Test
List(0, 7, 0, 1, 0, 2, 0, 3, 0, 4, 0, 5, 0, 6, 0, 7)
scodec.stream.CodecError: Insufficient number of elements: decoded 2 but should have decoded 7
[error] Nonzero exit code: 1

If you use the commented-out val stream which uses the whole array in one chunk, the list is successfully decoded.

My current use-case is to receive a stream of data in an Http4s webserver, and as far as I'm aware, I don't have any control over where the byte/buffer boundaries end up. Every once in a while the request/response handler will crash with the error above, apparently due to the way the items in the stream happen to line up.

I think the underlying issue is the way the ListCodec is used under the hood - if you have a listOfN, the N is passed as a limit to Decoder.decodeCollect[List, A](codec, limit)(buffer) and so you can end up with fewer items than intended as long as there are no more bits after the last A in the buffer. I think a solution could be to make listOfN return an Err.InsufficientBits instead of a plain Err("...") so that the StreamDecoder will continue its loop instead of crashing, but maybe you can think of a better way.
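As a rough illustration of that idea (a workaround sketch, not the library's fix), the hypothetical helper `strictListOfN` below decodes the count and then decodes exactly that many elements one at a time. When the buffer runs out early, the element decoder's own failure is returned unchanged, which for `uint16` is an `Err.InsufficientBits`. Whether `StreamDecoder` then waits for more input is assumed from the behavior described above.

```scala
import scala.annotation.tailrec
import scodec.{Attempt, DecodeResult, Decoder}
import scodec.bits.BitVector
import scodec.codecs.uint16

object StrictList {
  // Hypothetical helper, not part of scodec: decodes a count, then exactly
  // that many elements, propagating the element decoder's failure as-is.
  def strictListOfN[A](count: Decoder[Int], elem: Decoder[A]): Decoder[List[A]] =
    count.flatMap { n =>
      Decoder { (bits: BitVector) =>
        @tailrec
        def go(remaining: BitVector, left: Int, acc: List[A]): Attempt[DecodeResult[List[A]]] =
          if (left <= 0) Attempt.successful(DecodeResult(acc.reverse, remaining))
          else
            elem.decode(remaining) match {
              case Attempt.Successful(DecodeResult(a, rest)) => go(rest, left - 1, a :: acc)
              case failure: Attempt.Failure                  => failure
            }
        go(bits, n, Nil)
      }
    }

  val decoder: Decoder[List[Int]] = strictListOfN(uint16, uint16)

  // Count of 2 followed by both elements: succeeds with List(1, 2).
  val complete = decoder.decode(BitVector.fromValidHex("000200010002"))

  // Count of 2 followed by only one element: fails on the missing second element.
  val truncated = decoder.decode(BitVector.fromValidHex("00020001"))
}
```

With a truncated buffer, this reports the missing bits instead of "Insufficient number of elements", at the cost of bypassing the built-in collection decoding.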

Decoder.choiceDecoder works incorrectly when used via streaming

Consider this code

import cats.effect.{ExitCode, IO, IOApp}
import fs2.Stream
import scodec.Attempt.Successful
import scodec.bits.{BitVector, _}
import scodec.codecs._
import scodec.{DecodeResult, Decoder}
import scodec.stream.StreamDecoder

import scala.concurrent.duration._

object Fs2Scodec extends IOApp {

  override def run(args: List[String]): IO[ExitCode] = {
    val cafeMsg = hex"cafe"
    val babeMsg = hex"babe"

    val cafeDec = constant(cafeMsg).map(_ => "cafe")
    val babeDec = constant(babeMsg).map(_ => "babe")

    val choiceDec = Decoder.choiceDecoder(cafeDec, babeDec)

    val Successful(DecodeResult(cafeRes, BitVector.empty)) = choiceDec.decode(cafeMsg.bits)
    val Successful(DecodeResult(babeRes, BitVector.empty)) = choiceDec.decode(babeMsg.bits)

    def streamedRes[T](msg: ByteVector, dec: Decoder[T]) =
      Stream.emits(msg.toArray).covary[IO]
        .metered(100.millis)
        .through(StreamDecoder.tryMany(dec).toPipeByte)
        .compile.toList

    for {
      cafeChoiceStreamRes <- streamedRes(cafeMsg, choiceDec)
      cafeStreamRes <- streamedRes(cafeMsg, cafeDec)
      babeChoiceStreamRes <- streamedRes(babeMsg, choiceDec)
      babeStreamRes <- streamedRes(babeMsg, babeDec)
      _ <- IO.delay(println(
        s"""
           | cafeRes: $cafeRes
           | babeRes: $babeRes
           | cafeChoiceStreamRes: $cafeChoiceStreamRes
           | cafeStreamRes: $cafeStreamRes
           | babeChoiceStreamRes: $babeChoiceStreamRes
           | babeStreamRes: $babeStreamRes
           |""".stripMargin
      ))
    } yield ExitCode.Success
  }
}

Expected result:


 cafeRes: cafe
 babeRes: babe
 cafeChoiceStreamRes: List(cafe)
 cafeStreamRes: List(cafe)
 babeChoiceStreamRes: List(babe)
 babeStreamRes: List(babe)

Actual result:

 cafeRes: cafe
 babeRes: babe
 cafeChoiceStreamRes: List()
 cafeStreamRes: List(cafe)
 babeChoiceStreamRes: List()
 babeStreamRes: List(babe)

The choice stream terminates with error:
scodec.stream.CodecError: composite errors (cannot acquire 16 bits from a vector that contains 8 bits,cannot acquire 16 bits from a vector that contains 8 bits)

  • First, I'd like to discuss whether this behavior should be considered a bug.
  • If so, PR #74 resolves the issue.
