mime's Introduction

Mime

.NET wrapper for libmagic

Install

via NuGet:

PM> Install-Package Mime

Requirements

Supported runtimes:

  • linux-musl-x64
  • linux-x64
  • osx-arm64 (tested on macOS 13 Ventura)
  • osx-x64
  • win-x64

Basic usage

using HeyRed.Mime;

// (Optional) Set the path to the magic database file manually.
MimeGuesser.MagicFilePath = "/path/to/magic.mgc";

// Guess the MIME type of a file (overloads accept a byte array or a stream).
MimeGuesser.GuessMimeType("path/to/file"); //=> image/jpeg

// Get the extension of a file (overloads accept a byte array or a stream).
MimeGuesser.GuessExtension("path/to/file"); //=> jpeg

// Get both the MIME type and the extension of a file (overloads accept a byte array or a stream).
MimeGuesser.GuessFileType("path/to/file"); //=> FileType
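The comments above mention overloads that accept a byte array or a stream. A minimal sketch of how those calls might look; the helper class and method names are illustrative and not part of the library, and how the stream overload treats the stream's current position is not documented here, so callers may want to rewind the stream first:

```csharp
using System.IO;
using HeyRed.Mime;

public static class InMemoryDetection
{
    // Detect the MIME type of data already held in memory.
    public static string FromBytes(byte[] data) => MimeGuesser.GuessMimeType(data);

    // Detect the MIME type of data exposed as a stream, e.g. an upload.
    public static string FromStream(Stream data) => MimeGuesser.GuessMimeType(data);
}
```

Several issues further down report that the byte array and stream overloads can return a less specific type than the file path overload for the same file, so it may be worth comparing both on representative inputs.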

Advanced

Want more than just the mime type? Use the Magic class:

string calc = @"C:\Windows\System32\calc.exe";
using var magic = new Magic(MagicOpenFlags.MAGIC_NONE);
magic.Read(calc); //=> PE32+ executable (GUI) x86-64, for MS Windows

// Check encoding (a separate variable name avoids redeclaring `magic` in the same scope):
string textFile = @"F:\Temp\file.txt";
using var encodingMagic = new Magic(MagicOpenFlags.MAGIC_MIME_ENCODING);
encodingMagic.Read(textFile); //=> utf-8

You can also combine flags with the "|" operator. See the full list of flags for more info.
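As a rough illustration of combining flags, the sketch below asks for the MIME type and the encoding in one call. MAGIC_MIME_ENCODING appears in the example above; MAGIC_MIME_TYPE is assumed to exist in MagicOpenFlags because the enum mirrors libmagic's constants. The wrapper class and method names are hypothetical.

```csharp
using HeyRed.Mime;

public static class MimeWithEncoding
{
    // Returns libmagic's combined report for the file, typically of the
    // form "type/subtype; charset=...".
    public static string Describe(string path)
    {
        // Assumption: MAGIC_MIME_TYPE is defined in MagicOpenFlags.
        using var magic = new Magic(
            MagicOpenFlags.MAGIC_MIME_TYPE | MagicOpenFlags.MAGIC_MIME_ENCODING);
        return magic.Read(path);
    }
}
```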

Remarks

  • The Magic class is not thread-safe, but using different instances on different threads seems to work fine.
  • MimeGuesser is thread-safe, since it creates a new instance of the Magic class on each use.

Possible problems

Exception: DllNotFoundException
Solution: Make sure that your bin folder contains the runtimes directory. If you are publishing a platform-dependent app, bin should contain the libmagic-1 (.dll, .so or .dylib) and magic.mgc files.

Exception: BadImageFormatException
Solution: When targeting the AnyCPU platform, make sure the Prefer 32-bit option is unchecked. Or try targeting x64/arm64 instead.

Exception: MagicException: Could not find any valid magic files!
Solution: Make sure your magic.mgc file is located in one of the /runtimes/ subdirectories or alongside libmagic-1.[dll|so|dylib]. Or set the path to a custom database as described in Basic usage.
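When the database cannot be found automatically, one option is to locate magic.mgc yourself and assign it to MimeGuesser.MagicFilePath. The sketch below uses only the .NET standard library; the class name, method name and search order (application directory first, then anything under runtimes) are assumptions made for illustration, not the library's own lookup logic.

```csharp
using System.IO;
using System.Linq;

public static class MagicDatabaseLocator
{
    // Returns the first magic.mgc found next to the app or under its
    // runtimes directory, or null when none exists.
    public static string FindMagicFile(string baseDirectory)
    {
        string direct = Path.Combine(baseDirectory, "magic.mgc");
        if (File.Exists(direct))
        {
            return direct;
        }

        string runtimes = Path.Combine(baseDirectory, "runtimes");
        if (!Directory.Exists(runtimes))
        {
            return null;
        }

        return Directory
            .EnumerateFiles(runtimes, "magic.mgc", SearchOption.AllDirectories)
            .FirstOrDefault();
    }
}

// Possible usage at startup (MagicFilePath is the property shown in Basic usage):
//   string db = MagicDatabaseLocator.FindMagicFile(System.AppContext.BaseDirectory);
//   if (db != null) HeyRed.Mime.MimeGuesser.MagicFilePath = db;
```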

License

MIT

mime's Issues

No such file or directory

var mimeType = MimeGuesser.GuessMimeType(filePath);

On the server, this throws an exception:
Cannot stat 'C:\inetpub\wwwroot\StudentFiles\Portfolio\5179\2\???????? C# 6.0.pdf' (No such file or directory)

But the file exists, and that was checked beforehand. filePath holds the correct path, which contains Cyrillic characters. If the file is renamed to use only English characters, it works normally.

On the dev machine this exception is not thrown; it happens only on the prod server. Why?
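One possible workaround, assuming the failure comes from how the path is handed to the native library rather than from the file itself, is to let .NET open the file and pass its contents to the byte array overload. This is only a sketch: the helper name is hypothetical, it loads the whole file into memory, and other issues report that the byte array overload can give less specific results than the file path overload.

```csharp
using System.IO;
using HeyRed.Mime;

public static class UnicodePathWorkaround
{
    // Reads the file through .NET (which handles Unicode paths) and lets
    // libmagic inspect the bytes instead of opening the path itself.
    public static string GuessMimeTypeViaBytes(string filePath)
    {
        byte[] content = File.ReadAllBytes(filePath);
        return MimeGuesser.GuessMimeType(content);
    }
}
```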

"Could not find any valid magic files!" when included in a Nuget package

I've included the Mime package as part of my own Nuget package NsfwSpy, but when I run the following lines of code I get the error message "Could not find any valid magic files!" on my fresh Ubuntu VM.

var nsfwSpy = new NsfwSpy();
var result = nsfwSpy.ClassifyImage(@"C:\Users\username\Documents\flower.jpg");

Is there something I need to do to include the libmagic dependency in my Nuget package?

Different results on Linux and Windows

The Windows and Linux versions return slightly different results for the same file.

Windows:
SMTP mail, Unicode text, UTF-8 (with BOM) text, with very long lines (881u)
Linux:
SMTP mail, Unicode text, UTF-8 (with BOM) text, with very long lines (881)

example.eml

Could this possibly be changed so that they return the same?

Different results using filePath vs stream input

First I created an empty .xlsx document using Excel. Then I ran it through MimeGuesser.GuessMimeType, once with the filePath overload and once with the stream overload.

The filePath overload produced the correct result, while the stream overload produced an application/zip result.

image

DB path

Provide property to set magic db path.

Libmagic not compatible with Xamarin on mobile devices

As discovered in #36 trying to run code using this dependency will sometimes cause this or a similar error message to appear:

System.DllNotFoundException: 'libmagic-1 assembly: type: member:(null)'

Since this is almost the same issue as in #36, it can easily be fixed on win-x64 machines running Xamarin.UWP by simply moving the libmagic files to the current bin folder, as previously discussed.

But on Xamarin.Android the libmagic files need to be inserted into the apk file manually with an apk editor each time the app is built, and they are only supported by x64 devices, so they would have to be recompiled for ARM Android. I have not managed to get it working yet, and I have no way to test compatibility on iOS devices, but I expect the same issue there as well.

I actually know neither how to create the shared libraries for Android that I guess are needed, nor whether there is a simpler way to fix this issue. If anyone has an idea how to fix it and what to do with the libmagic .so and .mgc files on these mobile platforms, I would be happy to hear about it :)

native lib for M1 Mac os

Hi there

Would you be able to add support for m1 mac os? Seems like you would need an arm64 native lib.

I can help out if needed, but I would need some guidance on how to manage the default platform. As the default platform for mac os will no longer be x64.

I was able to get all the tests passing with a local build of libmagic v5.41, so it looks like no other code changes are required. Just needs another platform added.

I couldn't figure out how to get the dylib from the older version of brew (the newer one is too new), so I built it from source from here
https://astron.com/pub/file/

Thanks for your project.

Cannot run it on Linux

Hi,

I cannot use version 2.2.0 on Linux: neither WSL (Ubuntu 16.04) nor the official MS Docker image works, both with dotnet CLI 1.0.4.

Exception from WSL (Ubuntu 16.04):

Unhandled Exception: HeyRed.Mime.MagicException: Could not find any valid magic files!
   at HeyRed.Mime.Magic..ctor(MagicOpenFlags flags, String dbPath)
   at HeyRed.Mime.MimeGuesser.GuessMimeType(Byte[] buffer, Int32 size)
   at HeyRed.Mime.MimeGuesser.GuessFileType(Byte[] buffer, Int32 size)
   at test.Program.Main(String[] args) in /home/test/Program.cs:line 11

Exception from docker:

Unhandled Exception: System.DllNotFoundException: Unable to load DLL 'libmagic-1': The specified module could not be found.
 (Exception from HRESULT: 0x8007007E)
   at HeyRed.Mime.MagicNative.magic_open(MagicOpenFlags flags)
   at HeyRed.Mime.Magic..ctor(MagicOpenFlags flags, String dbPath)
   at HeyRed.Mime.MimeGuesser.GuessMimeType(Byte[] buffer, Int32 size)
   at HeyRed.Mime.MimeGuesser.GuessFileType(Byte[] buffer, Int32 size)
   at test.Program.Main(String[] args) in /home/test/Program.cs:line 11

Program code:

using System;
using HeyRed.Mime;

namespace Test
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Hello World!");			
            Console.WriteLine(MimeGuesser.GuessFileType(new byte[] { 0xFF, 0xD8, 0xFF  }));
        }
    }
}

libmagic dependency mismatch on MacOS Monterey (5.41)

Hey,
Currently, I am getting the following exception in my application (c# Azure function) when using Mime due to a mismatch of dependencies (I think):

Exception:

Mime: File 5.41 supports only version 16 magic files. `/bin/output/bin/magic.mgc' is version 14.

This is what I ran when installing libmagic:

brew install libmagic
brew link libmagic
env ARCHFLAGS="-arch x86_64" gem install ruby-filemagic -- --with-magic-include=/usr/local/include --with-magic-lib=/usr/local/lib/

This happened on a 64bit Intel Mac running Monterey. The Azure function is running on the following Mime version: 3.0.0

Have you come across this issue before and if so is there a solution?

file type is detected Unknown

Hello

We have an issue we can't figure out; you might be able to help.

On our local dev machines everything is fine, but when we deploy to QA, UAT or prod, file types are detected as 'unknown', and we have no clue why. Is there a log somewhere we can take a look at?

We are using it with .NET Core 2.1.2.
P.S. We tried both 3.0.1 and 3.0.2 and the result is the same.

Update documentation to mention that magic_open isn't thread safe (new Magic())

Hello,
Using this library in a parallel loop (creating new Magic() inside the loop) was causing my app to crash randomly with a heap corruption exception code.

The documentation should be updated to mention that you cannot create new Magic instances simultaneously (or alternatively add a lock inside the constructor) to prevent random crashes for people consuming the library.

Once the separate magic instances are serially constructed, they can be used concurrently with no ill effects.
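A minimal sketch of the pattern described in this issue: construct Magic instances one at a time under a lock, then use one instance per worker thread. The class and method names are hypothetical, and serializing disposal as well is an extra precaution that the report itself does not ask for.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;
using HeyRed.Mime;

public static class SerializedMagic
{
    private static readonly object ConstructionLock = new object();

    // Only one thread at a time may run the Magic constructor.
    public static Magic Create(MagicOpenFlags flags)
    {
        lock (ConstructionLock)
        {
            return new Magic(flags);
        }
    }

    // Reads every path in parallel, with one Magic instance per worker.
    public static IDictionary<string, string> ReadAll(IEnumerable<string> paths)
    {
        var results = new ConcurrentDictionary<string, string>();
        Parallel.ForEach(
            paths,
            () => Create(MagicOpenFlags.MAGIC_NONE),   // per-worker instance
            (path, loopState, magic) =>
            {
                results[path] = magic.Read(path);
                return magic;
            },
            magic =>
            {
                // Precaution (assumption): serialize disposal too.
                lock (ConstructionLock)
                {
                    magic.Dispose();
                }
            });
        return results;
    }
}
```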

All SVG are identified as text/svg instead of text/svg+xml

Hi, I've been using this library for security check on file upload.

As a recommended security check, we shouldn't allow upload of files where the declared mimeType (.svg) does not match the detected mime type.
However, when trying a few file types, we get some misleading results.

.svg files are identified as image/svg, whereas SVG files generated by several applications are reported as 'image/svg+xml'.

Some .xlsx, .docx and .pptx files are identified as application/zip.
Can you have a look and/or point out whether I'm doing something wrong?
I'm essentially just using it like this:

        internal string GetMimeType(byte[] buffer)
        {
            return HeyRed.Mime.MimeGuesser.GuessMimeType(buffer);
        }

Excel file type is detected as "bin"

Steps to reproduce

  • Right click and create a new Excel file
  • use MimeGuesser.GuessFileType to detect file type

Actual result
Is detected as 'bin'

Expected result
detected as 'xlsx'

I attached the file to this issue

Note: If you open the file in excel and save it, the issue is gone (and size increased from 7kb to 8kb)

New Microsoft Excel Worksheet (2).xlsx

missing alpine linux x64 support

dotnet core 3.1 supports alpine linux but mime does not.

when running a dotnet core 3.1 app with this mime library under docker with mcr.microsoft.com/dotnet/core/runtime-deps:3.1-alpine and installed libmagic 5.38 - I get an app crash (segmentation fault).
But when running with the debian based image with mcr.microsoft.com/dotnet/core/runtime-deps:3.1 the app runs successfully (libmagic is installed out of the box).
So I suppose the bundled libmagic-1.so is not compatible with libmagic 5.38 installed in alpine linux.

Appreciate any hint how to get mime running on alpine linux based docker images.

Unable to load magic1.dll manually in windows

it would be great if you update this :>
ty in advance .

Errors :

  1. Unable to load DLL 'libmagic1': The specified module could not be found. (Exception from HRESULT: 0x8007007E)

  2. An attempt was made to load a program with an incorrect format. (Exception from HRESULT: 0x8007000B)

Stack Trace :
at HeyRed.Mime.MagicNative.magic_open(MagicOpenFlags flags)
at HeyRed.Mime.Magic..ctor(MagicOpenFlags flags, String dbPath)
at HeyRed.Mime.MimeGuesser.GuessMimeType(Byte[] buffer, Int32 size)
.....

Copy native libs to correct build output directory for Azure Functions

This issue is basically a continuation of issue #36.

Currently, when running Mime in an Azure Function project app, it appears that the function build targets strip out any unnecessary native libs such as libmagic-1.so and magic.mgc.

For context, Azure Functions are run via func start (see azure-functions-core-tools repository). The build target for this creates a second build directory with unnecessary dependencies and native libs stripped out.

This results in the following output directories:

  • bin/output - the first build output directory
  • bin/output/bin - the second function build output directory

This curiously results in a bin directory within the bin directory. This bin/output/bin directory is what is deployed to the function runtime.

Unfortunately, it doesn't seem like we can easily configure what files are included in the function build output directory and this is causing most of Mime's native libs to be removed.

If we look in the bin/output/bin/runtimes directory, we end up with something like this:

runtimes/
  linux-musl-x64/ <-- No Mime native libs
  linux-x64/    <-- No Mime native libs
  osx-arm64/    <-- No Mime native libs
  osx-x64/
  win-x64/
    native/     <-- Has native Mime libs, but missing magic.mgc
      libgnurx-0.dll
      libmagic-1.dll

As the native libs aren't included, this causes Mime to throw errors around not being able to find libmagic-1 or magic.mgc.

Ideal solution

Ideally, it would be nice if Mime had a built-in way to ensure that its native libs are included in build output for Azure Functions. I'm really just a MSBuild novice so I don't really know what to suggest though.

Workaround

One workaround I've currently settled on is to include the following in my csproj:

<!-- Copy files that function build strips out that should normally be included  -->
<Target Name="CopyFilesAfterBuild" AfterTargets="_GenerateFunctionsPostBuild">
  <ItemGroup>
    <NativeLibs Include="$(OutDir)runtimes\**\libmagic-1.so" />
    <NativeLibs Include="$(OutDir)runtimes\**\libmagic-1.dylib" />
    <NativeLibs Include="$(OutDir)runtimes\**\libmagic-1.dll" />
    <NativeLibs Include="$(OutDir)runtimes\**\libgnurx-0.dll" />
  </ItemGroup>
  <Copy SourceFiles="@(NativeLibs)" DestinationFolder="$(OutDir)bin\runtimes\%(RecursiveDir)" />
  <!-- Doesn't matter which runtime the magic.mgc file comes from, they're all the same -->
  <Copy SourceFiles="$(OutDir)runtimes\win-x64\native\magic.mgc" DestinationFolder="$(OutDir)bin" />
</Target>

Does not pick up Parquet files

Hi, thanks for this awesome library! I noticed it does not detect parquet files. I understand there is no standard MIME type for it, but perhaps the extension returned from GuessExtension could be parquet?

Is there a general set of guidelines for how to handle content-types that aren't known or documented?

Got error with visual studio 2015

Hi,

I want to use the Mime lib in my project to find file extensions, but after installing it in Visual Studio 2015 I get the following error. Please help me out.

error : Invalid static method invocation syntax: "[System.Runtime.InteropServices.RuntimeInformation]::IsOSPlatform($([System.Runtime.InteropServices.OSPlatform]::Linux))". The type "System.Runtime.InteropServices.RuntimeInformation" is either not available for execution in an MSBuild property function or could not be found

image

Difference in MimeType detection

There is a difference in MIME type detection between the file path and byte array overloads for the same file:

// Read the whole file into a byte array (requires using System.IO).
byte[] imageData = File.ReadAllBytes(filePath);

// Compare the byte array overload with the file path overload.
var res = MimeGuesser.GuessFileType(imageData);
var res2 = MimeGuesser.GuessFileType(filePath);

Check with *.docx files.

PNG

Why does MimeGuesser.GuessMimeType identify a PNG image file as "image/jpeg" instead of "image/png"?

Update file to 5.44

Any objections to upgrading to 5.44? I'm generally seeing more useful, specific, and detailed results from 5.44 compared to 5.41.

Unable to find libmagic-1 on Ubuntu 20.04

When running some Azure Function code that uses this dependency on Ubuntu 20.04, it seems that I get the following:

Unhandled exception. System.DllNotFoundException: Unable to load shared library 'libmagic-1' or one of its dependencies. In order to help diagnose loading problems, consider setting the LD_DEBUG environment variable: liblibmagic-1: cannot open shared object file: No such file or directory

I'm using Azure's func CLI tool to run the code, but I'm not entirely sure running this code via the dotnet CLI would yield much better results. Seems like it boils down to MAGIC_LIB_PATH expecting libmagic to be referenced as libmagic-1.

Running apt install libmagic-dev installs libmagic to: /usr/lib/x86_64-linux-gnu/libmagic.so. I would expect there to be a built-in way for this library to detect more robustly where libmagic is, or a way to change MAGIC_LIB_PATH ourselves.

Workaround

Thankfully, there's a workaround. You can create symlinks to the proper libmagic like so:

cd /usr/lib/x86_64-linux-gnu/
sudo ln -s libmagic.so.1.0.0 libmagic-1.so
sudo ln -s libmagic.so.1.0.0 libmagic-1.so.1

HeyRed.Mime.MagicException: Could not find any valid magic files! on Windows Server 2012 R2 when program is on path with ů

Hi, on Windows 11 on two machines and Microsoft Windows Server 2019 Standard
10.0.17763 N/A Build 17763, it works fine.
However on Windows Server 2012 R2, it reports:

HeyRed.Mime.MagicException: Could not find any valid magic files!

I did not install anything manually on the Windows 11 machines.
The application is distributed using MSI installer x64.

Magic files are present (I cut the folder path to show relative path):

image

Do you please have any idea why?

Edit: I noticed that if I change the installation path to C:/ instead of C:/Program Files/path with spaces and symbols like ů, it works correctly.
Edit2: Removing the symbol ů from the path helped. It works correctly on newer machines, though, regardless of path.

linux-musl-x64 is missing from runtimeTargets

Hello, I am trying to make your library work in an Alpine container, and I see that you recently updated the runtimes to support linux-musl-x64 in Mime 3.2.0.
However I still get errors like "HeyRed.Mime.MagicException : Could not find any valid magic files!" in my Alpine container.
I have noticed that my project.assets.json still looks like this:

      "Mime/3.2.0": {
        "type": "package",
        "dependencies": {
          "MimeTypesMap": "1.0.8"
        },
        "compile": {
          "lib/netstandard2.0/Mime.dll": {}
        },
        "runtime": {
          "lib/netstandard2.0/Mime.dll": {}
        },
        "build": {
          "build/Mime.targets": {}
        },
        "runtimeTargets": {
          "runtimes/linux-x64/native/libmagic-1.so": {
            "assetType": "native",
            "rid": "linux-x64"
          },
          "runtimes/linux-x64/native/magic.mgc": {
            "assetType": "native",
            "rid": "linux-x64"
          },
          "runtimes/osx-x64/native/libmagic-1.dylib": {
            "assetType": "native",
            "rid": "osx-x64"
          },
          "runtimes/osx-x64/native/magic.mgc": {
            "assetType": "native",
            "rid": "osx-x64"
          },
          "runtimes/win-x64/native/libgnurx-0.dll": {
            "assetType": "native",
            "rid": "win-x64"
          },
          "runtimes/win-x64/native/libmagic-1.dll": {
            "assetType": "native",
            "rid": "win-x64"
          },
          "runtimes/win-x64/native/magic.mgc": {
            "assetType": "native",
            "rid": "win-x64"
          }
        }
      }

I am not very familiar with runtimes and NuGet packages, but could the issue be that "linux-musl-x64" is still missing from "runtimeTargets", or should I look somewhere else? Thanks

Add Additional Info

The Linux file command returns more information about a file: its type, encoding, endianness, and sometimes even a description. It would be nice to have that data available too, assuming that kind of data is present in the magic.mgc file.

Also, I just wanted to say, thank you so much for making this project, after a few days of banging my head against the wall I found it and it looks promising!

Excel files are detected as "zip"

I have 'xlsx' files. When I try to validate them using the MimeGuesser.GuessExtension, I get "zip".

My expectation was to receive "xlsx" and "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
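Office Open XML documents are ZIP containers, which may be why a generic "zip" comes back. As a possible fallback, independent of libmagic, the archive's entries can be inspected once "zip" is reported. The sketch below relies on the conventional package layout ([Content_Types].xml at the root plus an xl/, word/ or ppt/ folder); the class and method names are hypothetical, and this is a heuristic rather than a full format check.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;

public static class OoxmlSniffer
{
    // Returns "xlsx", "docx" or "pptx" when the ZIP looks like an Office
    // Open XML package, or null when it does not (or is not a valid ZIP).
    public static string DetectExtension(Stream zipStream)
    {
        try
        {
            using var archive = new ZipArchive(zipStream, ZipArchiveMode.Read, leaveOpen: true);
            var names = archive.Entries.Select(e => e.FullName).ToList();

            // Every OOXML package carries this part at the archive root.
            if (!names.Contains("[Content_Types].xml"))
            {
                return null;
            }

            if (names.Any(n => n.StartsWith("xl/", StringComparison.Ordinal))) return "xlsx";
            if (names.Any(n => n.StartsWith("word/", StringComparison.Ordinal))) return "docx";
            if (names.Any(n => n.StartsWith("ppt/", StringComparison.Ordinal))) return "pptx";
            return null;
        }
        catch (InvalidDataException)
        {
            // Not a readable ZIP archive.
            return null;
        }
    }
}
```

A caller could keep the libmagic result by default and only substitute the sniffed value when the guessed extension is "zip" and DetectExtension returns something other than null.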

MacOS support

It would be nice if macOS support could be added; then we would be able to use it in our dev environment as well. I tried to clone the repo and add the .dylib and csproj changes myself, but I couldn't get it running with the same sort of configuration as for Windows and Linux. It couldn't find the lib when it was placed in the runtimes/osx.10.12-x64/native/ folder, with the same csproj config as for the other platforms. When it was placed in the root folder everything was fine (but of course that is not the way to go).
I could provide the .dylib if you need it.

Implement magic_list

Would there be any chance that you could implement the magic_list function to get a list of all supported file types?

So far I wasn't able to capture the output using the Console.SetOut method.

[htm, html, xml] Detected as text/plain

Hi @hey-red, I have encountered an issue (or at least unwanted behavior) when validating htm, html and xml files. Everything works fine as long as the files use the line endings native to the host system.

But when we check an htm file with Windows line endings (CR LF) on Linux, it is detected as text/plain. The same happens with Unix line endings (LF) checked on a Windows host.

Is there some flag for the magic tool that would help me positively identify such files without converting line endings?

htm/html detected as text/plain

Hi,
when I try MimeGuesser on htm and html files I get "text/plain". If I run the file command on the command line, it correctly reports them as "text/html".
(I used the attached file for testing.)
MIME type guesser.zip

I have pointed MagicFilePath to the very same magic.mgc that the file command uses.
Also, I have tested with the one from https://launchpad.net/ubuntu/artful/amd64/libmagic-mgc/1:5.29-3, with the same result: text/plain.
On Linux, if I delete magic.mgc, then file also reports the test file as "text/plain", but if it is able to find magic.mgc, the file is properly detected as "text/html".

Thanks.

Building this project to net 4.5/x86 specification

My original question was about compiling this for .NET Framework 4.5, and whether the current project structure (lib DLLs, etc.) could be explained better.

Having spent some time on it, I did it! I now have a net45 version of this code. But I would like to know whether it is possible to create these runtime DLLs for 32-bit environments. These DLLs are unmanaged.

Different extension compared to the `file` command

It seems that an error occurs here: .gz is recognized as .bin, while the file command recognizes it correctly. Doesn't file have an API to get the extension directly? If I understand correctly, we are actually using MIME type mapping as an alternative.

[Fact]
public void Guess_Gzip_ReturnSameAsNative()
{
    // small gzip file: https://github.com/mathiasbynens/small
    byte[] s_gzipBytes =
    [
        0x1f, 0x8b, 0x08, 0x00, 0xae, 0x86, 0xe1, 0x5b, 0x02, 0x03, 0x03, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00
    ];

    var actualMimeType = GuessMimeType(s_gzipBytes);
    var actualExtension = GuessExtension(s_gzipBytes);

    // $ file gzip.gz --mime
    // → gzip.gz: application/gzip; charset=binary
    string expectedMimeType = "application/gzip";

    // $ file gzip.gz --extension
    // → gzip.gz: gz/tgz/tpz/ipk/vbox-extpack/svgz
    string[] expectedExtensions = [ "gz", "tgz", "tpz", "ipk", "vbox-extpack", "svgz"];

    Assert.Equal(expectedMimeType, actualMimeType);
    Assert.Contains(expectedExtensions, e => e == actualExtension); // ← Exception raised here
}

Assert.Contains() Failure
Not found: (filter expression)
In value:  String[] ["gz", "tgz", "tpz", "ipk", "vbox-extpack", ...]
   at Test.UnitTest.Guess_Gzip_ReturnsSameAsNative() in .../UnitTest.cs:line 28
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Void** arguments, Signature sig, Boolean isConstructor)
   at System.Reflection.MethodBaseInvoker.InvokeWithNoArgs(Object obj, BindingFlags invokeAttr)

This is probably because GuessExtension relies on MimeTypesMap, which depends on the MIME types known by Apache:

public static string GuessExtension(byte[] buffer) => MimeTypesMap.GetExtension(GuessMimeType(buffer));
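Until the mapping itself changes, a caller could layer its own overrides on top of it. The sketch below is not part of Mime or MimeTypesMap: the class name, the override table and the fallback delegate are all illustrative, and the single gzip entry is taken from the expected output of the file command quoted above.

```csharp
using System;
using System.Collections.Generic;

public static class ExtensionOverrides
{
    // Caller-maintained corrections, consulted before the regular mapping.
    private static readonly Dictionary<string, string> Overrides =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            ["application/gzip"] = "gz",
        };

    // Returns the override for mimeType if one exists; otherwise defers to
    // the supplied fallback (for example the library's own mapping).
    public static string Resolve(string mimeType, Func<string, string> fallback)
    {
        return Overrides.TryGetValue(mimeType, out string extension)
            ? extension
            : fallback(mimeType);
    }
}

// Possible usage with the calls shown above:
//   string mime = MimeGuesser.GuessMimeType(buffer);
//   string ext = ExtensionOverrides.Resolve(mime, m => MimeTypesMap.GetExtension(m));
```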
