Comments (11)

sbraz commented on May 25, 2024

Can you please upload the files themselves? Those XML files look like they were created with --output=XML, not --output=OLDXML.

from pymediainfo.

choyj commented on May 25, 2024

Sorry, didn't realize it was --output=OLDXML until I started debugging it.

I didn't want to upload the files, so I debugged the issue and proposed a fix in PR #56.

You'll want to take a look at the attached files in my original message and see the weird characters in the <Writing_application> tag.

sbraz commented on May 25, 2024

I see weird chars indeed but I cannot make pymediainfo fail with those. Apparently those XML output files are recognised because I can run pymediainfo.MediaInfo.parse("bad_a1.txt").tracks.
I really need to see one of the files that trigger a failure, especially for the other issue which is probably a bug in MediaInfo itself.

choyj commented on May 25, 2024

I zeroed out the content in these files, but mediainfo should still be able to parse them.

The attached zip file is really a .rar file. Please rename it or use rar to extract it.
bad_files.zip

sbraz commented on May 25, 2024

Thanks, I'll look into these ASAP.

sbraz commented on May 25, 2024

Can you try the issue54 branch? Does it work for you?

choyj commented on May 25, 2024

I tested your changes by pip3-installing your repo in a venv and running my test script. While I'm no longer seeing the 'type' error, I'm getting empty tracks on both good and bad files, so it seems unusable. There must be some pretty drastic changes between what I have installed (2.3.0 from https://files.pythonhosted.org/packages/36/6c/b91e5e0a037aac454136666fae12bcdc069ca7a02fc2de708a67ccc7d1bf/pymediainfo-2.3.0.tar.gz) and what's in your repo?

I also tried the issue54 branch. Same empty tracks for both good and bad files and the encoding exception. Will stick with 2.3.0 and my patches.

Traceback (most recent call last):
  File "/mnt/d/ff/get_duration.py", line 19, in <module>
    print("%s" % get_duration('/mnt/d/ff/test.wmv'))
  File "/mnt/d/ff/get_duration.py", line 6, in get_duration
    m = MediaInfo.parse(filename)
  File "/home/choyj/x/venv/lib/python3.6/site-packages/pymediainfo/__init__.py", line 239, in parse
    return cls(xml, encoding_errors)
  File "/home/choyj/x/venv/lib/python3.6/site-packages/pymediainfo/__init__.py", line 136, in __init__
    self.xml_dom = MediaInfo._parse_xml_data_into_dom(xml, encoding_errors)
  File "/home/choyj/x/venv/lib/python3.6/site-packages/pymediainfo/__init__.py", line 139, in _parse_xml_data_into_dom
    return ET.fromstring(xml_data.encode("utf-8", encoding_errors))
UnicodeEncodeError: 'utf-8' codec can't encode character '\ude98' in position 2323: surrogates not allowed
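
For reference, the failure in the last frame can probably be reproduced without any media file. This is a minimal sketch that mirrors the `ET.fromstring(xml_data.encode("utf-8", encoding_errors))` call from the traceback. The `parse_xml` helper and the `bad_xml` sample string are hypothetical stand-ins, not pymediainfo code or real MediaInfo output.

```python
import xml.etree.ElementTree as ET


def parse_xml(xml_data, encoding_errors="strict"):
    # Same shape as the call in the traceback: encode to UTF-8, then parse.
    return ET.fromstring(xml_data.encode("utf-8", encoding_errors))


# Hypothetical sample: a lone surrogate (U+DE98) inside a tag value,
# similar to the character reported in the UnicodeEncodeError above.
bad_xml = "<Writing_application>enc\ude98oder</Writing_application>"

# With the default "strict" handler, encoding a lone surrogate fails.
try:
    parse_xml(bad_xml)
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# With "replace", the unencodable character becomes "?" and parsing succeeds.
element = parse_xml(bad_xml, encoding_errors="replace")
recovered_text = element.text
```

A lone surrogate is not valid in UTF-8, so only a lenient error handler lets the string through, at the cost of losing the original character.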

sbraz commented on May 25, 2024

> I also tried the issue54 branch. Same empty tracks for both good and bad files and the encoding exception. Will stick with 2.3.0 and my patches.

You need to use the new encoding_errors parameter: master...issue54#diff-78f1fb93e3c37202e83112bf2277293dR130

As for the type error, can you please give me more info? It was a very simple problem due to a track containing a track attribute so I'm wondering how it can cause you to see empty tracks.
These are the only changes, nothing drastic: v2.3.0...master

choyj commented on May 25, 2024

> You need to use the new encoding_errors parameter: master...issue54#diff-78f1fb93e3c37202e83112bf2277293dR130

How do I use the new encoding_errors parameter? Do I need to provide an extra parameter to pymediainfo?

> As for the type error, can you please give me more info? It was a very simple problem due to a track containing a track attribute so I'm wondering how it can cause you to see empty tracks. These are the only changes, nothing drastic: v2.3.0...master

Wow, I need to learn GitHub a bit more. I did not know you could see diffs that way! Yes, I'm curious, too. I'll try to debug later tonight when I get home from work. It's possibly user error on my part, given there are so few changes.

sbraz commented on May 25, 2024

> How do I use the new encoding_errors parameter? Do I need to provide an extra parameter to pymediainfo?

Yes, you can pass it to parse():

def parse(cls, filename, library_file=None, cover_data=False,
          encoding_errors="strict"):
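
A call site might then look roughly like the sketch below. It assumes the `parse()` signature quoted above, and that the duration is exposed as a `duration` attribute on the track whose `track_type` is `"General"`. The `get_duration` function only echoes the script name from the traceback; its body is a guess, not the original script.

```python
def get_duration(filename):
    # Imported here so the sketch can be defined without pymediainfo installed.
    from pymediainfo import MediaInfo

    # "replace" swaps characters that cannot be encoded to UTF-8 for "?"
    # instead of raising UnicodeEncodeError (the default is "strict").
    media_info = MediaInfo.parse(filename, encoding_errors="replace")

    # Assumption: the container-level duration sits on the "General" track.
    for track in media_info.tracks:
        if track.track_type == "General":
            return track.duration
    return None
```

Called as `get_duration('/mnt/d/ff/test.wmv')`, this should return a value when a General track is found, and `None` when there is none.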

choyj commented on May 25, 2024

Sorry, I should've looked at your code changes. :)
Passing encoding_errors="replace" addressed the UnicodeEncodeErrors I had, so thanks for the fix.

As for the type error, a change you made in the last commit here appears to have broken it. I removed the File/ and now I am able to parse all the bad files I had before as well as the good ones.
