Comments (11)
Can you please upload the files themselves? Those XML files look like they were created with --output=XML, not --output=OLDXML.
from pymediainfo.
Sorry, I didn't realize it was --output=OLDXML until I started debugging it.
I didn't want to upload the files, so I debugged the issue and proposed a fix in PR #56.
You'll want to take a look at the attached files in my original message and see the weird characters in the <Writing_application> tag.
from pymediainfo.
I see weird chars indeed, but I cannot make pymediainfo fail with those. Apparently those XML output files are recognised, because I can run pymediainfo.MediaInfo.parse("bad_a1.txt").tracks.
I really need to see one of the files that trigger a failure, especially for the other issue which is probably a bug in MediaInfo itself.
from pymediainfo.
I zeroed out the content in these files, but mediainfo should still be able to parse them.
The attached zip file is really a .rar file. Please rename it or use rar to extract it.
bad_files.zip
from pymediainfo.
Thanks, I'll look into these ASAP.
from pymediainfo.
Can you try the issue54 branch? Does it work for you?
from pymediainfo.
So the way I tested your changes was to pip3 install your repo in a venv and run my test script. While I'm no longer seeing the 'type' error, it seems I'm getting empty tracks on both good and bad files. i.e. it seems unusable. There must be some pretty drastic changes between what I have installed (2.3.0 from https://files.pythonhosted.org/packages/36/6c/b91e5e0a037aac454136666fae12bcdc069ca7a02fc2de708a67ccc7d1bf/pymediainfo-2.3.0.tar.gz) and what's in your repo?
I also tried the issue54 branch. Same empty tracks for both good and bad files and the encoding exception. Will stick with 2.3.0 and my patches.
Traceback (most recent call last):
  File "/mnt/d/ff/get_duration.py", line 19, in <module>
    print("%s" % get_duration('/mnt/d/ff/test.wmv'))
  File "/mnt/d/ff/get_duration.py", line 6, in get_duration
    m = MediaInfo.parse(filename)
  File "/home/choyj/x/venv/lib/python3.6/site-packages/pymediainfo/__init__.py", line 239, in parse
    return cls(xml, encoding_errors)
  File "/home/choyj/x/venv/lib/python3.6/site-packages/pymediainfo/__init__.py", line 136, in __init__
    self.xml_dom = MediaInfo._parse_xml_data_into_dom(xml, encoding_errors)
  File "/home/choyj/x/venv/lib/python3.6/site-packages/pymediainfo/__init__.py", line 139, in _parse_xml_data_into_dom
    return ET.fromstring(xml_data.encode("utf-8", encoding_errors))
UnicodeEncodeError: 'utf-8' codec can't encode character '\ude98' in position 2323: surrogates not allowed
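For anyone following along, the failure in the last frame can be reproduced with the standard library alone. This is a minimal sketch, not pymediainfo's code: the XML string and the parse_xml helper are made up for illustration, and only the encode-then-ET.fromstring call and the '\ude98' character come from the traceback above.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for MediaInfo's XML output. The lone surrogate
# '\ude98' is the character reported in the traceback.
xml_data = (
    "<MediaInfo><track type='General'>"
    "<Writing_application>abc\ude98</Writing_application>"
    "</track></MediaInfo>"
)


def parse_xml(xml_data, encoding_errors="strict"):
    # Same shape as the call in the last traceback frame: encode the text
    # to UTF-8 with the given error handler, then parse the bytes.
    return ET.fromstring(xml_data.encode("utf-8", encoding_errors))


# With the default "strict" handler, the lone surrogate cannot be encoded.
try:
    parse_xml(xml_data)
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# With "replace", the unencodable character becomes "?" and parsing succeeds.
root = parse_xml(xml_data, encoding_errors="replace")
writing_application = root.find("track/Writing_application").text
```

The "replace" handler loses the original character, which is probably acceptable for a garbled Writing_application value but is worth keeping in mind for other fields.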
from pymediainfo.
I also tried the issue54 branch. Same empty tracks for both good and bad files and the encoding exception. Will stick with 2.3.0 and my patches.
You need to use the new encoding_errors parameter: master...issue54#diff-78f1fb93e3c37202e83112bf2277293dR130
As for the type error, can you please give me more info? It was a very simple problem due to a track containing a track attribute, so I'm wondering how it can cause you to see empty tracks.
These are the only changes, nothing drastic: v2.3.0...master
from pymediainfo.
You need to use the new encoding_errors parameter: master...issue54#diff-78f1fb93e3c37202e83112bf2277293dR130
How do I use the new encoding_errors parameter? Do I need to provide an extra parameter to pymediainfo?
As for the type error, can you please give me more info? It was a very simple problem due to a track containing a track attribute, so I'm wondering how it can cause you to see empty tracks. These are the only changes, nothing drastic: v2.3.0...master
Wow, I need to learn github a bit more. Did not know you can see diffs that way! Yes, I'm curious, too. I'll try and debug later tonight when I get home from work. Possible user error on my part given there are few changes.
from pymediainfo.
How do I use the new encoding_errors parameter? Do I need to provide an extra parameter to pymediainfo?
Yes, you can pass it to parse():
pymediainfo/pymediainfo/__init__.py
Lines 170 to 171 in ed6f7ed
from pymediainfo.
Sorry, I should've looked at your code changes. :)
Passing encoding_errors="replace" addressed the UnicodeEncodeErrors I had, so thanks for the fix.
As for the type error, a change you made in the last commit here appears to have broken it. I removed the File/ and now I am able to parse all the bad files I had before, as well as the good ones.
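For reference, here is a rough sketch of what the call looks like in a script like the get_duration.py from the traceback. It assumes a pymediainfo build that includes the encoding_errors parameter discussed in this thread (it is not in 2.3.0). The function body is a guess at what such a script does, not the actual code, and reading duration from the General track is also an assumption.

```python
def get_duration(filename, encoding_errors="replace"):
    """Return the duration reported for the General track, or None."""
    # Imported lazily so this sketch can be defined on a machine without
    # pymediainfo/libmediainfo; calling it still requires both.
    from pymediainfo import MediaInfo

    # encoding_errors is forwarded to the str.encode() call shown in the
    # traceback; "replace" substitutes characters that cannot be encoded.
    media_info = MediaInfo.parse(filename, encoding_errors=encoding_errors)
    for track in media_info.tracks:
        if track.track_type == "General":
            return track.duration
    return None
```

Calling get_duration('/mnt/d/ff/test.wmv') would then go through the same parse() path as before, just with the more forgiving error handler.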
from pymediainfo.