liamks / libpytunes
Python Itunes Library parser
Home Page: https://github.com/liamks/pyitunes
License: MIT License
Hello! I have a few use cases where I'm loading iTunes libraries from sources other than an XML file on disk (for example, from a compressed archive or over a network). Previously, this was trivially possible, since Library.__init__() took an open file-like object as input to read XML from. With the recent change to replace the obsolete readPlist function (26b3fdf), Library.__init__() now expects itunesxml to be a filename, since it calls open(itunesxml, 'rb') internally.
Would it be possible to restore the original behavior, perhaps intelligently detecting whether itunesxml is a filename or a file-like object? Something like:
if os.path.isfile(itunesxml):
    # itunesxml is a filename, and must be opened
    with open(itunesxml, 'rb') as f:
        self.il = plistlib.load(f)
else:
    # itunesxml is a file-like object and can be loaded directly
    self.il = plistlib.load(itunesxml)
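A standalone sketch of the same idea, outside the Library class. The helper name load_itunes_plist is hypothetical and not part of libpytunes. It checks the argument's type instead of calling os.path.isfile, on the assumption that anything that is not a path is an open binary file-like object:

```python
import io
import os
import plistlib


def load_itunes_plist(source):
    """Load an iTunes XML plist from a path or an open binary file-like object.

    Hypothetical helper for illustration; not part of libpytunes.
    """
    if isinstance(source, (str, bytes, os.PathLike)):
        # Path-like input: open the file ourselves and close it afterwards.
        with open(source, "rb") as f:
            return plistlib.load(f)
    # Anything else is assumed to be a binary file-like object with read().
    return plistlib.load(source)


# Usage with an in-memory library, e.g. bytes read from an archive or a socket.
xml_bytes = plistlib.dumps({"Tracks": {"1": {"Name": "Example"}}})
library = load_itunes_plist(io.BytesIO(xml_bytes))
print(library["Tracks"]["1"]["Name"])
```

With this shape, a caller who already holds an open stream does not need to write a temporary file first.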
As far as I can tell, this function is defined but not used anywhere. It checks for availability of an xspf dependency, but nothing in the package installs xspf.
Appears to be cruft. Can we remove it, along with checks for xspfAvailable? Or am I misunderstanding?
Just got this using Python 2.7 in a conda environment with TensorFlow. My library definitely has a lot of accents.
---------------------------------------------------------------------------
UnicodeEncodeError Traceback (most recent call last)
<ipython-input-2-df9cbca52326> in <module>()
19 if song and song.rating:
20 if song.rating > 80:
---> 21 print("{n}, {r}".format(n=song.name, r=song.rating))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf1' in position 5: ordinal not in range(128)
I'm simply testing out the code for using pickle.
Just curious how long you want to support legacymode? We could remove a fair bit of clutter without it.
Consider adding a quick note about how to install. I haven't been able to find it on PyPI. Are users expected to pull down the project and use it locally?
Thanks for the project!
While I'm working on fixing my first issue, I ran into a problem with the most recent build of the library.
Loading my library, I ran into this error:
Python 2.7.10 (default, May 25 2015, 13:06:17)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.56)] on darwin
>>> import pyItunes
>>> l = pyItunes.Library("/Users/roger/projects/Fun with Data/itunes/iTunes Music Library.xml")
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "/Users/roger/projects/pyitunes/pyItunes/Library.py", line 36, in __init__
self.getSongs()
File "/Users/roger/projects/pyitunes/pyItunes/Library.py", line 77, in getSongs
s.location = text_type(urlparse.unquote(urlparse.urlparse(attributes.get('Location')).path[1:]))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xcc in position 40: ordinal not in range(128)
0xcc corresponds to Ì, a "Latin capital letter I with grave". What's weird about this error is that this character doesn't appear at all in my library:
Rogers-iMac:pyitunes roger$ cat "/Users/roger/projects/Fun with Data/itunes/iTunes Music Library.xml" | grep -c Ì
0
The relevant line of code is here:
if ( self.musicPathXML is None or self.musicPathSystem is None ):
    # s.location = text_type(urlparse.unquote(urlparse.urlparse(attributes.get('Location')).path[1:]),"utf8")
    s.location = text_type(urlparse.unquote(urlparse.urlparse(attributes.get('Location')).path[1:]))
else:
    # s.location = text_type(urlparse.unquote(urlparse.urlparse(attributes.get('Location')).path[1:]).replace(self.musicPathXML,self.musicPathSystem),"utf8")
    s.location = text_type(urlparse.unquote(urlparse.urlparse(attributes.get('Location')).path[1:]).replace(self.musicPathXML,self.musicPathSystem))
Commenting out the first s.location line allows me to load the library.
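One possible explanation for the 0xcc byte, offered as an assumption rather than a diagnosis: 0xcc is Ì only in Latin-1. In UTF-8 it is the lead byte of the combining diacritics U+0300 to U+033F, which appear when accented characters are stored in decomposed (NFD) form, as macOS file paths commonly are. The Location values are also percent-encoded in the XML, so a grep for the literal character would not match either way. A small Python 3 sketch of the byte-level difference:

```python
import unicodedata

# "ñ" as one precomposed code point (NFC) versus "n" + combining tilde (NFD).
composed = unicodedata.normalize("NFC", "ma\u00f1ana")
decomposed = unicodedata.normalize("NFD", composed)

composed_bytes = composed.encode("utf-8")
decomposed_bytes = decomposed.encode("utf-8")

print(composed_bytes)    # ñ is encoded as the two bytes c3 b1
print(decomposed_bytes)  # n followed by U+0303, encoded as cc 83

# 0xcc appears only in the decomposed form, as the lead byte of U+0303.
has_cc_composed = 0xCC in composed_bytes
has_cc_decomposed = 0xCC in decomposed_bytes
print(has_cc_composed, has_cc_decomposed)
```

If this is the cause, decoding the unquoted path explicitly as UTF-8 (as the commented-out lines do) would avoid the implicit ASCII decode.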
With my 98,000-track library, on a very fast Mac, every access through pyitunes takes at least 55 seconds (running the sample code in the readme through time). At first I was thinking of forking the project and using some of your code to populate a database and accessing data via the Django or SQLAlchemy ORM. That would have advantages for arbitrary query access, but would be a lot of work. Then I realized it would be easier to cache the first access. Python provides pickle for this purpose. By pickling, I was able to reduce each access to around 5 seconds. Not as good as a database, but a huge improvement. Here is a version of the readme that includes fixes from my previous ticket and includes pickling performance improvements with a user-settable cache expiry:
import os.path
import pickle
import time
from pyItunes import Library
lib_path = "/Users/[username]/Music/iTunes/iTunes Library.xml"
pickle_file = "itl.p"
expiry = 60 * 60  # Refresh pickled file if older than (in seconds)
epoch_time = int(time.time())  # Now
# Generate pickled version of database if stale or doesn't exist
if not os.path.isfile(pickle_file) or os.path.getmtime(pickle_file) + expiry < epoch_time:
    itl_source = Library(lib_path)
    with open(pickle_file, "wb") as f:
        pickle.dump(itl_source, f)
# Always read the library back from the pickled cache
with open(pickle_file, "rb") as f:
    itl = pickle.load(f)
print("\nHighly-rated tracks:")
for id, song in itl.songs.items():
    if song and song.rating:
        if song.rating > 80:
            print("{n}, {r}".format(n=song.name, r=song.rating))
print("\nPlaylists:")
playlists = itl.getPlaylistNames()
for playlist in playlists:
    print(playlist)
print("\nTracks in first playlist")
for song in itl.getPlaylist(playlists[0]).tracks:
    print("{t} {a} - {n}".format(t=song.track_number, a=song.artist, n=song.name))
Would it be possible to add a flag to Playlist to indicate whether the playlist is a Genius-generated playlist or not? The existence of the "Smart Info", "Smart Criteria" or "Genius Track ID" keys indicates whether it's a Genius-generated playlist or not.
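A minimal sketch of such a check, assuming each playlist is available as the raw dictionary that plistlib produces for an entry of the XML's Playlists array. The function name and the sample dictionaries are hypothetical; only the three key names come from the request above:

```python
# Keys named in the request; their presence marks a generated playlist.
GENERATED_PLAYLIST_KEYS = ("Smart Info", "Smart Criteria", "Genius Track ID")


def is_generated_playlist(playlist):
    """Return True if the raw playlist dict carries any of the marker keys."""
    return any(key in playlist for key in GENERATED_PLAYLIST_KEYS)


# Hypothetical playlist entries, shaped like plistlib output.
manual = {"Name": "Road Trip", "Playlist Items": [{"Track ID": 101}]}
genius = {"Name": "Genius Mix", "Genius Track ID": 101, "Playlist Items": []}

print(is_generated_playlist(manual))
print(is_generated_playlist(genius))
```

The result could be stored as a boolean attribute on Playlist while the library is being parsed.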
The primary "Usage Example" in the readme is broken in several ways.
First: if song and song.rating > 80:
At least under Python3.6, this breaks for any song that does not have a rating with
TypeError: '>' not supported between instances of 'NoneType' and 'int'
Can be fixed with:
for id, song in l.songs.items():
    if song and song.rating:
        if song.rating > 80:
            print(song.name, song.rating)
Second:
print("[%d] %s - %s" % (song.number, song.artist, song.name))
According to "Attributes of the Song class" documentation, there is no such attribute as song.number, so this fails immediately. If you correct this to:
print("[%d] %s - %s" % (song.track_number, song.artist, song.name))
the attribute now exists, but it fails for any tracks missing a track number with:
TypeError: %d format: a number is required, not NoneType
This can be fixed (perhaps not gracefully) by changing to
print("[%s] %s - %s" % (song.track_number, song.artist, song.name))
(don't force track_number to be formatted as an int). But now it prints a bunch of "None"s. Could be fixed internally by replacing None with "-" or similar.
Finally, the syntax for string formatting is ancient and should be updated to use format():
print("{t} {a} - {n}".format(t=song.track_number, a=song.artist, n=song.name))
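The idea of replacing None with "-" can be sketched as a small formatting helper. The function name format_track and its placeholder parameter are hypothetical, not part of the library:

```python
def format_track(track_number, artist, name, placeholder="-"):
    """Format a track line, substituting a placeholder for missing values."""

    def show(value):
        # Only None is replaced, so a track number of 0 is still printed.
        return placeholder if value is None else value

    return "[{t}] {a} - {n}".format(
        t=show(track_number), a=show(artist), n=show(name)
    )


print(format_track(3, "U2", "City of Blinding Lights"))
print(format_track(None, "U2", None))
```

Doing the substitution at print time leaves the Song attributes as None, so code that tests for a missing value keeps working.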
I'm seeing pretty big differences between the counts provided by pyitunes and iTunes itself. For example iTunes thinks I have 89,203 tracks but:
l = Library("iTunes Music Library.xml")
(Pdb) len(l.songs)
90744
Seeing similar discrepancies with playlist counts. Ideas?
Hi,
could you upload your library to PyPi, please? This would ease the installation (via pip) and usage.
Cheers, Philipp
The iTunes track_id is not persistent. Any change to your iTunes library, such as changing any metadata, may very well cause the track_id to change. I had this happen to me. Only the persistent_id field remains unchanged no matter what you do to the library. This can result in errant behavior if one is relying on the track_id to never change for a particular track.
Is there a specific reason the persistent_id is not being used to build the dictionaries/lists?
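A rough sketch of indexing by persistent ID, working on the raw plist rather than on the library's Song objects. It assumes each track dictionary carries a "Persistent ID" key; the sample data and the variable names are made up for illustration:

```python
import plistlib

# Hypothetical two-track library, round-tripped through plistlib so that it
# has the same shape as a parsed iTunes XML file.
sample = {
    "Tracks": {
        "101": {"Track ID": 101, "Persistent ID": "A1B2C3D4E5F60718", "Name": "First"},
        "102": {"Track ID": 102, "Persistent ID": "0F1E2D3C4B5A6978", "Name": "Second"},
    }
}
il = plistlib.loads(plistlib.dumps(sample))

# Key the lookup table by persistent ID instead of the volatile track ID.
by_persistent_id = {
    attrs["Persistent ID"]: attrs
    for attrs in il["Tracks"].values()
    if "Persistent ID" in attrs
}

print(by_persistent_id["A1B2C3D4E5F60718"]["Name"])
```

Playlists in the XML still reference tracks by Track ID, so a track-ID lookup would remain useful within a single parse.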
Is there a reason that a Playlist is a list of PlTracks, rather than a list of Songs?
It looks like a PlTrack just pulls a few select attributes from the corresponding Song referenced by a Playlist, but you can't convert a PlTrack to a dict the way you can with a Song.
There might be a technical reason, but otherwise if I were to compile a list of dicts of all the tracks in a playlist, I would have to reinvent the wheel by going back to the original XML and getting track IDs, looking them up.
Are there any plans to update this to work with Apple Music and the new database?
Library.py references these two objects, but it is unclear where they're supposed to be passed in from or what they do. Comments say:
# musicPathXML and musicPathSystem will do path conversion
# when xml is being processed on different OS then iTunes
iTunes is not an OS. I can't make any sense of this. If no one objects I'd like to delete these references.
Do you have a preferred license for your code? Could you add one? I'm looking to use your library in a command-line script to get/update the genre of the currently playing iTunes song and would like to include your code and open source my usage.
I prefer MIT, but there are several recommendations here: http://choosealicense.com/
Thanks for the code!
I like libpytunes for exporting playlist files from the Music app. To make this script usable with Python 3, I had to modify Library.py as follows:
Add this function to the Library class:
    @staticmethod
    def plistLoad(f):
        with open(f, 'rb') as fp:
            plst = plistlib.load(fp)
        return plst
replace line 28 with:
self.il = Library.plistLoad(itunesxml) # Much better support of xml special characters
In this example (from my personal project), we use the included test library to step through all tracks in the library, then go through the playlists and create ordered PlaylistEntry objects. It is assumed that the Playlists section of the XML will only reference tracks that exist in the library.
Is it intentional that the XML Playlists reference non-existent tracks for testing purposes, or should they all be valid, as would be the typical case?
397/402 Aerosmith - Train Kept a Rollin' (Live)
398/402 Dixie Chicks - I Like It
399/402 J.J. Cale & Eric Clapton - Dead End Road
400/402 Gnarls Barkley - Storm Coming
401/402 U2 - City of Blinding Lights
Traceback (most recent call last):
File "./manage.py", line 22, in <module>
execute_from_command_line(sys.argv)
File "/Users/shacker/Sites/virtualenvs/pyrex/lib/python3.5/site-packages/django/core/management/__init__.py", line 363, in execute_from_command_line
utility.execute()
File "/Users/shacker/Sites/virtualenvs/pyrex/lib/python3.5/site-packages/django/core/management/__init__.py", line 355, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/Users/shacker/Sites/virtualenvs/pyrex/lib/python3.5/site-packages/django/core/management/base.py", line 283, in run_from_argv
self.execute(*args, **cmd_options)
File "/Users/shacker/Sites/virtualenvs/pyrex/lib/python3.5/site-packages/django/core/management/base.py", line 330, in execute
output = self.handle(*args, **options)
File "/Users/shacker/dev/pyrex/itl/management/commands/sync.py", line 130, in handle
plists = [itl.getPlaylist(p) for p in itl.getPlaylistNames()]
File "/Users/shacker/dev/pyrex/itl/management/commands/sync.py", line 130, in <listcomp>
plists = [itl.getPlaylist(p) for p in itl.getPlaylistNames()]
File "/Users/shacker/dev/libpytunes/libpytunes/Library.py", line 118, in getPlaylist
t = self.songs[id]
KeyError: 63488
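Until the question is settled, a caller can tolerate dangling references by looking tracks up with dict.get instead of indexing. This is a sketch with hypothetical names and plain dictionaries standing in for the library's song table, not the getPlaylist implementation:

```python
def resolve_playlist_tracks(track_ids, songs):
    """Split playlist track IDs into resolved songs and missing IDs."""
    found, missing = [], []
    for track_id in track_ids:
        song = songs.get(track_id)
        if song is None:
            # The playlist references a track that is not in the library.
            missing.append(track_id)
        else:
            found.append(song)
    return found, missing


# Hypothetical data: track 63488 is referenced but absent from the library.
songs = {101: "Storm Coming", 102: "City of Blinding Lights"}
found, missing = resolve_playlist_tracks([101, 63488, 102], songs)
print(found)
print(missing)
```

Collecting the missing IDs makes it possible to log or report them rather than fail on the first one.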
I'm running into a very simple problem that causes the sample program in the readme to fail. Song.number is never populated. In fact, it doesn't even exist in the Song class in Song.py. I added a couple of lines (to Library.py and Song.py) that took care of this and fixed it...
Library.py
diff --git a/pyItunes/Library.py b/pyItunes/Library.py
index 60bb75b..2b91266 100644
--- a/pyItunes/Library.py
+++ b/pyItunes/Library.py
@@ -37,6 +37,7 @@ class Library:
format = "%Y-%m-%d %H:%M:%S"
for trackid,attributes in self.il['Tracks'].items():
s = Song()
+ s.number = int(trackid)
s.name = attributes.get('Name')
s.artist = attributes.get('Artist')
s.album_artist = attributes.get('Album Artist')
and
Song.py
diff --git a/pyItunes/Song.py b/pyItunes/Song.py
index e6f00a2..4dc605e 100644
--- a/pyItunes/Song.py
+++ b/pyItunes/Song.py
@@ -2,6 +2,7 @@ class Song:
"""
Song Attributes:
name (String)
+ number (Integer)
artist (String)
album_artist (String)
composer = None (String)
@@ -32,6 +33,7 @@ class Song:
length = None (Integer)
"""
name = None
+ number = None
artist = None
album_artist = None
composer = None
Spot-checking my library, these two fields seem to yield the same results. Is there any case where they're different? If not, should we drop one of them? Which one?
Hi,
I am using the library to load the data into MongoDB to generate some stats about my iTunes Library. An interesting thing to look at is the release date. Could this be added to the song attributes?
I realise that not all entries have it, but might still be interesting to do. Happy to try to add it myself if it helps.
Thanks,
Carlos
I use the checkmark in iTunes to organize a lot of my music. One of the main reasons I wanted to use libpytunes was to build a project to sync all my checked songs (10K+) to my Android. But this attribute doesn't seem to be recorded in the song object.
In the iTunes XML it's recorded for unchecked songs as:
<key>Disabled</key><true/>
And for checked songs, the Disabled key is simply missing (thanks, Apple).
Would it be possible to add this attribute?
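Since the Disabled key is simply absent for checked songs, the checked state can be derived with a default. A sketch against a tiny hand-written plist (the track names are made up, and the checked dictionary is hypothetical rather than an existing Song attribute):

```python
import plistlib

# Minimal hand-written library: track 2 is unchecked, track 1 has no Disabled key.
xml = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
  <key>Tracks</key>
  <dict>
    <key>1</key>
    <dict><key>Name</key><string>Checked song</string></dict>
    <key>2</key>
    <dict><key>Name</key><string>Unchecked song</string><key>Disabled</key><true/></dict>
  </dict>
</dict>
</plist>"""

il = plistlib.loads(xml)

# A missing Disabled key means the song is checked.
checked = {
    track_id: not attrs.get("Disabled", False)
    for track_id, attrs in il["Tracks"].items()
}
print(checked)
```

The same attributes.get('Disabled', False) lookup could be used where the other Song attributes are populated.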
I'd like to filter my library as it's being parsed, since I have a mixture of music, podcasts, and audiobooks. Is there an attribute to identify which of these a Song is?
How would I use this to save any changes back to iTunes?
I think currently this is not possible yet?