Comments (28)
It's definitely something in libmagic1 5.09-2. I downloaded the source for libmagic 5.10 and 5.15 (most recent) from ftp://ftp.astron.com/pub/file/ and it started working. Strange.
from python-magic.
Apparently libmagic1 5.09-2 isn't detecting the mime type, so python-magic is treating it as an error and throwing the "MagicException". I wrote my own little C test program using libmagic1, and version 5.11-2ubuntu4 of libmagic1 is returning "application/vnd.ms-excel; charset=binary" as the mime type, but version 5.09-2 is returning the mime type set to "; charset=binary".
I suppose I will have to treat all instances of this MagicException as an "unable to determine mime type" error. Either that or fork my own copy of python-magic and modify the errorcheck_null method to return "application/octet-stream" if the MIME type cannot be determined (which is probably a more RFC-compliant solution). For what it's worth, libmagic1 5.09-2 is correctly identifying it as an Excel spreadsheet, it's just not finding the mime-type for this particular file. I suspect it's a bug in libmagic1.
As another test, I used "dd if=/dev/random" to just create a big file of random binary data, and fed THAT to my program. It correctly identified it as "application/octet-stream", so I don't know why it's choking on this Excel spreadsheet.
My short-term fix will be to modify errorcheck_null to return "application/octet-stream" if result is None.
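For anyone wanting to try the same short-term workaround, here is a minimal sketch of the idea. It is not the actual python-magic source: the function only mimics the (result, func, args) shape of a ctypes errcheck hook, and the `fallback` parameter and `DEFAULT_MIME` name are made up for illustration.

```python
DEFAULT_MIME = "application/octet-stream"


class MagicException(Exception):
    """Stand-in for the exception python-magic raises."""


def errorcheck_null(result, func, args, fallback=None):
    """ctypes-style errcheck hook.

    `result` is None when the underlying C call returned NULL.
    `fallback` is a hypothetical extra parameter: when given, it is
    returned instead of raising.
    """
    if result is None:
        if fallback is not None:
            return fallback
        raise MagicException("libmagic returned NULL")
    return result


# A NULL result falls back to the generic binary type.
print(errorcheck_null(None, None, (), fallback=DEFAULT_MIME))
# A real result passes through untouched.
print(errorcheck_null("application/vnd.ms-excel", None, ()))
```

Since ctypes calls the hook with exactly three arguments, the fallback would have to be bound ahead of time, for example with functools.partial.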
from python-magic.
Sigh. This is a new error-checking path that is supposed to adhere more closely to the documented error behaviour.
"The magic_buffer(), magic_getpath(), and magic_file(), functions return a string on success and NULL on failure. "
The number in the args array is a pointer to the magic_t that was created when you initially initialized the library.
I just pushed a hypothetical fix for this issue to master, take a look at 75eab74.
Does this work for you?
from python-magic.
Looks like there's a bug. You are looking for self.flags in line 88 of magic.py, and that's not an attribute of the Magic class.
$ ./test_magic.py fail
Traceback (most recent call last):
  File "./test_magic.py", line 35, in <module>
    mime_type = magic.from_file(filename, mime=True)
  File "/usr/local/lib/python2.7/dist-packages/python_magic-0.4.6-py2.7.egg/magic.py", line 132, in from_file
    return m.from_file(filename)
  File "/usr/local/lib/python2.7/dist-packages/python_magic-0.4.6-py2.7.egg/magic.py", line 82, in from_file
    return self._handle509Bug(e)
  File "/usr/local/lib/python2.7/dist-packages/python_magic-0.4.6-py2.7.egg/magic.py", line 88, in _handle509Bug
    if e.message is None and (self.flags & MAGIC_MIME):
AttributeError: Magic instance has no attribute 'flags'
from python-magic.
Changing flags to self.flags in the "__init__()" method of class Magic in magic.py works, in that it returns "application/octet-stream" as the mime type.
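A rough sketch of that change, for reference. This is a stripped-down stand-in class, not the real Magic class; the `wants_mime` method is hypothetical, and the flag values are the ones I recall from magic.h, so treat them as an assumption.

```python
# Flag values as recalled from magic.h (assumption, verify against your header).
MAGIC_MIME_TYPE = 0x000010
MAGIC_MIME_ENCODING = 0x000400
MAGIC_MIME = MAGIC_MIME_TYPE | MAGIC_MIME_ENCODING


class Magic(object):
    """Stand-in for the real class, showing only the flags handling."""

    def __init__(self, mime=False):
        flags = 0
        if mime:
            flags |= MAGIC_MIME
        # Storing the value on the instance is the actual fix: a local
        # `flags` variable is gone once __init__ returns.
        self.flags = flags

    def wants_mime(self):
        # Hypothetical helper that reads the stored flags later on.
        return bool(self.flags & MAGIC_MIME)


print(Magic(mime=True).wants_mime())
print(Magic().wants_mime())
```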
from python-magic.
This should be fixed now. Thanks for the report and help with the fix!
from python-magic.
Are you going to bump the setup.py and make this an official change, or do I need to continue pulling the code from GitHub?
from python-magic.
Bump. Can this be released to PyPI please?
from python-magic.
This, or something very similar, is still an issue with 5.15 on ArchLinux. I am experiencing it with an empty file.
In my case the exception raised does not actually have a message property, so I get a double whammy.
Traceback:
$ python reproduce_bug.py
Traceback (most recent call last):
  File "/tmp/python-magic-bug/libs/python-magic/magic.py", line 67, in from_buffer
    return magic_buffer(self.cookie, buf)
  File "/tmp/python-magic-bug/libs/python-magic/magic.py", line 227, in magic_buffer
    return _magic_buffer(cookie, buf, len(buf))
  File "/tmp/python-magic-bug/libs/python-magic/magic.py", line 180, in errorcheck_null
    raise MagicException(err)
magic.MagicException: None

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "reproduce_bug.py", line 10, in <module>
    mime_magic.from_buffer(blob_chunk)
  File "/tmp/python-magic-bug/libs/python-magic/magic.py", line 69, in from_buffer
    return self._handle509Bug(e)
  File "/tmp/python-magic-bug/libs/python-magic/magic.py", line 88, in _handle509Bug
    if e.message is None and (self.flags & MAGIC_MIME):
AttributeError: 'MagicException' object has no attribute 'message'
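The second traceback comes from reading `e.message`, which Python 3 exceptions do not have. A minimal sketch of a read that works on both Python 2 and 3 is below; `error_text` is a hypothetical helper, not something in python-magic.

```python
class MagicException(Exception):
    """Stand-in for python-magic's exception class."""


def error_text(exc):
    """Return the error text carried by `exc`, or None if there is none."""
    # Python 2 exceptions may expose .message; Python 3 ones do not.
    msg = getattr(exc, "message", None)
    if msg is None and exc.args:
        # The first positional argument is available on both versions.
        msg = exc.args[0]
    return msg


print(error_text(MagicException(None)))
print(error_text(MagicException("could not open file")))
```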
OS and library version:
ArchLinux
libmagic version: 5.15-1
How to reproduce:
Create virtualenv with latest python-magic checkout
Create empty file
$ touch empty_file
Create reproduce_bug.py
import magic
mime_magic = magic.Magic(mime_encoding=True)
empty_file = open("empty_file")
blob_chunk = empty_file.read()
mime_magic.from_buffer(blob_chunk)
empty_file.close()
Run test case
$ python ./reproduce_bug.py
from python-magic.
Out of curiosity, I ran your test on two Ubuntu versions:
- Ubuntu 13.10, libmagic 5.11-2ubuntu4
- Ubuntu 12.04 LTS, libmagic1 5.09-2
In both cases I had the latest Git version of python-magic. I don't get an exception; instead I get "None" returned when it analyzes the blob_chunk.
Changed your code to print result:
import magic, pprint
mime_magic = magic.Magic(mime_encoding=True)
empty_file = open("empty_file")
blob_chunk = empty_file.read()
pprint.pprint( mime_magic.from_buffer(blob_chunk) )
empty_file.close()
Here's the output:
$ python reproduce_bug.py
None
Also, in both cases, running the "file" command gave this mime type:
$ file --mime-type empty_file
empty_file: inode/x-empty
from python-magic.
This works for me as well, so clearly file is able to figure it out.
$ file --mime-type empty_file
empty_file: inode/x-empty
from python-magic.
So, a little more info: I am using Python 3. Under Python 2 I get the same output as you (I probably should have mentioned it before, but it slipped my mind).
There are apparently Python bindings included with the "file" distribution here: ftp://ftp.astron.com/pub/file/
Extract the archive and look in ./python
Those bindings were able to identify the file as empty just fine, so I suspect libmagic works fine. It might be that some API changed and python-magic is tripping up on it.
from python-magic.
@mojotx
FYI, the file Python bindings are able to identify the file correctly under Python 2.7 and 3.3.
from python-magic.
@mojotx
OK, so for the record: the example Python bindings included with file also return None when trying to read the MIME type using from_buffer.
python-magic, on the other hand, treats the None return as an error. It then tries to get the proper error using magic_error(args[0]) on line 179, but magic_error also returns None. That None is passed to MagicException, and _handle509Bug then errors out because the exception has no message attribute.
The "file" utility, however, returns the MIME type as inode/x-empty, so something is wrong somewhere.
from python-magic.
OK, I think I located the bug in libmagic: the file_buffer function does not handle the case where the flag is set to MAGIC_MIME_ENCODING. I am going to try to locate the maintainer and see if my fix is correct, or whether it's late and I am tripping.
from python-magic.
I might have found a bug and fix in libmagic
See: https://github.com/glensc/file/blob/master/src/funcs.c#L176
Line should probably be this:
if ((!mime || (mime & MAGIC_MIME_TYPE) || (mime & MAGIC_MIME_ENCODING)) &&
This way when python-magic has mime_encoding=True it still works
from python-magic.
You can report the bug here:
http://bugs.gw.com/my_view_page.php
from python-magic.
So I was able to reproduce the condition using file.
Determining the MIME encoding gives the same error:
$ file --special-files --mime-encoding empty_file
empty_file: ERROR: (null)
However, running it without --special-files works:
$ file --mime-encoding empty_file
empty_file: binary
"--special-files" is a flag that tells file to skip "stat"-style detection, treat the file as a normal file, and read it into a buffer to be analyzed. So, as far as I can tell, it uses magic_buffer rather than magic_file.
There is a similar but more extensive condition with a 1-byte file.
Create a 1-byte file:
$ echo -n 1 > onebytefile
$ file --special-files --mime-encoding onebytefile
onebytefile: ERROR: (null)
$ file --mime-encoding onebytefile
onebytefile: ERROR: (null)
In short, libmagic does not detect the MIME encoding for files that are 0 or 1 byte long when it reads them through a buffer. For the 1-byte file, the "stat" detection does not help either.
So my takeaway is that libmagic needs to add MIME-encoding handling for these 3 conditions.
I am going to try to investigate this more and file a bug (and maybe a potential fix).
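In case it helps others reproduce this, here is a small sketch that builds the two fixture files and runs the same `file` invocations as above. The helper names are made up, it assumes Python 3.7+ and that the `file` binary is on PATH (it returns None otherwise), and the output will depend on your libmagic version.

```python
import os
import shutil
import subprocess
import tempfile


def make_fixture(directory, name, size):
    """Create a file of `size` bytes (0 or more) and return its path."""
    path = os.path.join(directory, name)
    with open(path, "wb") as fh:
        fh.write(b"1" * size)
    return path


def mime_encoding(path, special_files=False):
    """Return what `file --mime-encoding` prints, or None if `file` is missing."""
    exe = shutil.which("file")
    if exe is None:
        return None
    cmd = [exe, "--mime-encoding"]
    if special_files:
        cmd.append("--special-files")
    cmd.append(path)
    done = subprocess.run(cmd, capture_output=True, text=True)
    return done.stdout.strip()


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    for name, size in (("empty_file", 0), ("onebytefile", 1)):
        fixture = make_fixture(workdir, name, size)
        for special in (False, True):
            print(name, "special_files=%s" % special, mime_encoding(fixture, special))
```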
Now to the _handle509Bug fix. I get 2 errors when I run 5.15 against the empty file (and now the 1-byte file). I am not sure exactly which case _handle509Bug handles (since I could not reproduce it), but I would suggest it is a bit broad.
if e.message is None and (self.flags & MAGIC_MIME):
return "application/octet-stream"
For one, it returns "application/octet-stream" for both MAGIC_MIME_TYPE and MAGIC_MIME_ENCODING, since
MAGIC_MIME = MAGIC_MIME_TYPE|MAGIC_MIME_ENCODING
So a more precise fix is probably:
if e.message is None and (self.flags & MAGIC_MIME_TYPE):
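To make the bitmask point concrete, here is a tiny worked example. The flag values are the ones I recall from magic.h (an assumption, check your header), and `has_any` is just an illustrative helper.

```python
# Flag values as recalled from magic.h (assumption).
MAGIC_MIME_TYPE = 0x000010
MAGIC_MIME_ENCODING = 0x000400
MAGIC_MIME = MAGIC_MIME_TYPE | MAGIC_MIME_ENCODING


def has_any(flags, mask):
    """True if `flags` shares at least one bit with `mask`."""
    return bool(flags & mask)


# What Magic(mime_encoding=True) would end up with in this sketch.
encoding_only = MAGIC_MIME_ENCODING

# The broad test matches even though no MIME *type* was requested...
print(has_any(encoding_only, MAGIC_MIME))       # True
# ...while the narrower test does not.
print(has_any(encoding_only, MAGIC_MIME_TYPE))  # False
```

So with the broad test, a caller asking only for the encoding would get "application/octet-stream" back, which is not an encoding at all.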
Additionally, in the case I experienced, the "err" passed into MagicException is None, since apparently this is not an error that libmagic handles (which might be its own bug). As such, the _handle509Bug method actually causes its own bug.
So the full, more precise fix might be:
class MagicException(Exception):
    def __init__(self, magic_err, *args, **kwargs):
        self.magic_err = magic_err
        Exception.__init__(self, magic_err, *args, **kwargs)
And in _handle509Bug:
    if e.magic_err is not None and e.message is None and (self.flags & MAGIC_MIME):
        return "application/octet-stream"
If I may ask, was there a bug filed with libmagic to handle the condition described in _handle509Bug?
from python-magic.
Let me know if the updated fix above is more reasonable and I'll do a pull request.
from python-magic.
@mojotx @ahupp
One more question. Why is the mimetype returned in _handle509Bug "application/octet-stream"? Does this happen only for binary files? Can you maybe provide the file that is throwing the exception?
I ask because python-magic is now doing libmagic's job, which can be problematic if all use cases are not fleshed out.
from python-magic.
Additionally if the fix only handles 5.09 issues we should check the magic lib for that version.
from python-magic.
@goodwillcoding In the case of my particular bug (certain Microsoft Office documents not being identified) the problem was resolved in later versions of file/libmagic. I compiled 5.15 from source and that "fixed" my particular problem. I've filed a bug with Canonical/Launchpad (https://bugs.launchpad.net/ubuntu/+source/file/+bug/1243938) asking that the Ubuntu 12.04 LTS version of file and libmagic1 be updated, but I filed no bug report with the maintainers of actual file/libmagic source.
It sounds like you've found some further bugs, with 5.15, and it sounds entirely reasonable to file a bug report with them. Once you file it, if you need assistance with people chiming in saying "me too!" to get it fixed, just let me know. Unfortunately, when it comes to open source projects, the squeaky wheel is often the only one greased.
from python-magic.
Bug filed with file/libmagic maintainers: http://bugs.gw.com/view.php?id=294
from python-magic.
@goodwillcoding I left a "me too" note on your bug report.
from python-magic.
@mojotx thank you.
By the way do you know which magic version fixed your bug? Can you check please? Or give me a sample file to check it on.
from python-magic.
I can't share the sample file. I compiled 5.10 from source and that worked, using a C program linked with libmagic1.
Here's the C code I used. I had to copy the magic.h file from the source (there is no libmagic1-dev), but I just passed the Windows document name as an argument.
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>
#include "magic.h"

#if 0
#define MY_MAGIC_FLAGS (MAGIC_NONE|MAGIC_DEBUG|MAGIC_MIME)
#else
#define MY_MAGIC_FLAGS (MAGIC_NONE|MAGIC_MIME)
#endif

int main(int argc, const char **argv)
{
    magic_t cookie;
    const char *fileInfo;
    const char **fp=argv;

    if (argc<2) {
        fprintf(stderr, "Usage: %s [filename1 . . . filenameN]\n", *fp );
        return 0;
    }

    if ((cookie=magic_open( MY_MAGIC_FLAGS ) ) == NULL ) {
        int err=errno;
        const char *me = magic_error( cookie );
        fprintf( stderr, "Error opening the magic file: %s (%s)\n", strerror(err), (me ? me : "NULL"));
        return -1;
    }

    if ((magic_load(cookie, NULL )) != 0) {
        int err=errno;
        const char *me = magic_error( cookie );
        fprintf( stderr, "Error opening the magic database: %s (%s)\n", strerror(err), (me ? me : "NULL"));
        return -1;
    }

    for (++fp; *fp; ++fp ) {
        printf( "processing %s\n", *fp );
        if ((fileInfo=magic_file( cookie, *fp )) == NULL ) {
            int err=errno;
            const char *me = magic_error( cookie );
            fprintf( stderr, "Error analyzing the file %s: %s (%s)\n", *fp, strerror(err), (me ? me : "NULL"));
            return -1;
        }
        printf( "fileInfo=\"%s\" %s\n", fileInfo, *fp );
    }

    magic_close( cookie );
    return 0;
}
from python-magic.
Also, I used C to prove that it was an issue with the actual libmagic library, and not something in python-magic. My bug was really two separate issues:
- A bug in the libmagic1 implementation for Ubuntu 12.04 LTS, not handling some common file types correctly. Since my issue appears to have been fixed with 5.10, I didn't bother the maintainers of file/libmagic, but instead filed the Launchpad bug to try to get a fix pushed for 12.04 LTS.
- A bug in python-magic, not handling the libmagic1 bug gracefully
from python-magic.
Based on the above conversation, I am doing a pull request that still accommodates the problem described by @mojotx but does not cause problems when other libmagic errors occur, like the one I filed on empty/1-byte files. It is the same fix, but it only runs when the right conditions are met:
- magic version < 510
- when magic_error actually returns an error and its message can be checked.
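Roughly, the two conditions could be gated like the sketch below. This is only an illustration of the idea, not the code in the branch: the function name is hypothetical, how the version number is obtained is left out, and treating an unknown version as "do not apply" is my own assumption.

```python
def should_apply_509_workaround(libmagic_version, magic_err):
    """Decide whether the 5.09 fallback should run.

    libmagic_version: integer such as 509 or 515, or None when unknown.
    magic_err: the text returned by magic_error, or None when libmagic
    reported no error at all.
    """
    # Assumption: with an unknown or newer version, never apply the workaround.
    if libmagic_version is None or libmagic_version >= 510:
        return False
    # Only proceed when libmagic actually reported an error that can be checked.
    return magic_err is not None


print(should_apply_509_workaround(509, "some libmagic error"))
print(should_apply_509_workaround(515, None))
```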
@mojotx can you test the fix using 509, your file and my feature branch here: https://github.com/goodwillcoding/python-magic/tree/updated_509_fix
from python-magic.