Giter VIP home page Giter VIP logo

magic-bytes's Introduction

Magic bytes

Build Status

Magic Bytes is a javascript library analyzing the first bytes of a file to tell you its type. The procedure is based on https://en.wikipedia.org/wiki/List_of_file_signatures.

Installation

Run npm install magic-bytes.js

Usage

On server:

import filetype from 'magic-bytes.js'

filetype(fs.readFileSync("myimage.png")) // ["png"]

Using HTML:

<input type="file" id="file" />

 <script src="node_modules/magic-bytes.js/dist/browser.js" type="application/javascript"></script>
<script>
    document.getElementById("file").addEventListener('change', (event, x) => {
      const fileReader = new FileReader();
      fileReader.onloadend = (f) => {
        const bytes = new Uint8Array(f.target.result);
        console.log("Possible filetypes: " + filetypeinfo(bytes))
      }
      fileReader.readAsArrayBuffer(event.target.files[0])
    })
</script>

API

The following functions are availble:

  • filetypeinfo(bytes: number[]) Contains typeinformation like name, extension and mime type: [{typename: "zip"}, {typename: "jar"}]
  • filetypenames(bytes: number[]) : Contains type names only: ["zip", "jar"]
  • filetypemime(bytes: number[]) : Contains type mime types only: ["application/zip", "application/jar"]
  • filetypeextensions(bytes: number[]) : Contains type extensions only: ["zip", "jar"]

Both function return an empty array [] otherwise, which means it could not detect the file signature. Keep in mind that txt files for example fall in this category.

You don't have to load the whole file in memory. For validating a file uploaded to S3 using Lambda for example, it may be
enough to load the files first 100 bytes and validate against them. This is especially useful for big files.

see examples for practical usage.

Tests

Run npm test

Example

See examples/

How does it work

The create-snapshot.js creates a new tree. The tree has a similar shape to the following

{
  "0x47": {
    "0x49": {
      "0x46": {
        "0x38": {
          "0x37": {
            "0x61": {
              "matches": [
                {
                  "typename": "gif",
                  "mime": "image/gif",
                  "extension": "gif"
                }
              ]
            }
          },
        }
      }
    }
  }
}

It acts as a giant lookup map for the given byte signatures. To check all available entries, have a look at pattern-tree.js and its generated pattern-tree.snapshot, which acts as a static resource.

Supported types

Please refer to src/pattern-tree.js

Roadmap

  • Specialize type detection(like zip) using offset-subtrees
  • Add encoding detection

magic-bytes's People

Contributors

larskoelpin avatar dependabot[bot] avatar daveteu avatar pastelmind avatar bioruebe avatar jakubborowski-lb avatar mbuhot avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.