Describe the Bug
When using the original version of tmdb_helper.py, collections with Chinese titles cannot retrieve the corresponding TMDB data. Here's an example from the log:
2024-03-12 02:09:49,141 (70000a6bf000) : DEBUG (plex_api_helper:454) - Getting database info for item: 黑夜传说(系列)
2024-03-12 02:09:49,145 (70000a6bf000) : DEBUG (networking:139) - Fetching 'http://127.0.0.1:32400/services/tmdb?uri=/search/collection?query=%5Cu9ed1%5Cu591c%5Cu4f20%5Cu8bf4%5Cuff08%5Cu7cfb%5Cu5217%5Cuff09' from the HTTP cache
2024-03-12 02:09:49,150 (70000a6bf000) : DEBUG (tmdb_helper:117) - TMDB data: {'total_results': 0, 'total_pages': 1, 'page': 1, 'results': []}
2024-03-12 02:09:49,150 (70000a6bf000) : DEBUG (plex_api_helper:562) - Database info for item: 黑夜传说(系列), database_info: ('movie_collections', 'themoviedb', 'tv.plex.agents.movie', None)
After testing, I found that the issue lies with URL encoding. Collections are searched and matched on TMDB by title, and the original script encodes the title with the String.Quote function, which percent-encodes the Unicode escape sequences rather than the UTF-8 bytes. For example, "黑夜传说(系列)" would be encoded as:
%5Cu9ed1%5Cu591c%5Cu4f20%5Cu8bf4%5Cuff08%5Cu7cfb%5Cu5217%5Cuff09
I made some modifications to the script. Now, when the title contains Chinese characters, it uses the urllib.quote function on the UTF-8 encoded title. For titles in other languages, the script still uses the String.Quote function. For example, "黑夜传说(系列)" would now be encoded as:
%E9%BB%91%E5%A4%9C%E4%BC%A0%E8%AF%B4%EF%BC%88%E7%B3%BB%E5%88%97%EF%BC%89
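As a minimal sketch, the two encodings above can be reproduced outside of Plex. It is written for Python 3, where the function is `urllib.parse.quote` (the plugin's Python 2 code calls `urllib.quote`). It only reproduces the two strings shown above and makes no claim about how String.Quote is implemented internally.

```python
from urllib.parse import quote  # Python 3; the plugin's Python 2 code calls urllib.quote

# the collection title from the log, written with escapes so the file encoding does not matter
title = u'\u9ed1\u591c\u4f20\u8bf4\uff08\u7cfb\u5217\uff09'

# percent-encode the literal "\uXXXX" escape text: each backslash becomes %5C
escaped = quote(title.encode('unicode_escape'))

# percent-encode the UTF-8 bytes of the title
utf8 = quote(title.encode('utf-8'))

print(escaped)  # %5Cu9ed1%5Cu591c%5Cu4f20%5Cu8bf4%5Cuff08%5Cu7cfb%5Cu5217%5Cuff09
print(utf8)     # %E9%BB%91%E5%A4%9C%E4%BC%A0%E8%AF%B4%EF%BC%88%E7%B3%BB%E5%88%97%EF%BC%89
```

The first form asks TMDB to search for the literal text `\u9ed1\u591c...`, which is consistent with the empty result list in the log.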
The modified tmdb_helper.py is as follows:
# -*- coding: utf-8 -*-

# plex debugging
try:
    import plexhints  # noqa: F401
except ImportError:
    pass
else:  # the code is running outside of Plex
    from plexhints.constant_kit import CACHE_1DAY  # constant kit
    from plexhints.log_kit import Log  # log kit
    from plexhints.parse_kit import JSON  # parse kit
    from plexhints.util_kit import String  # util kit

# imports from Libraries\Shared
from typing import Optional, Union
import urllib
import urlparse

# url borrowed from TheMovieDB.bundle
tmdb_base_url = 'http://127.0.0.1:32400/services/tmdb?uri='


def get_tmdb_id_from_external_id(external_id, database, item_type):
    # type: (Union[int, str], str, str) -> Optional[int]
    """
    Convert IMDB ID to TMDB ID.

    Use the builtin Plex tmdb api service to search for a movie by IMDB ID.

    Parameters
    ----------
    external_id : Union[int, str]
        External ID to convert.
    database : str
        Database to search. Must be one of 'imdb' or 'tvdb'.
    item_type : str
        Item type to search. Must be one of 'movie' or 'tv'.

    Returns
    -------
    Optional[int]
        Return TMDB ID if found, otherwise None.

    Examples
    --------
    >>> get_tmdb_id_from_external_id(external_id='tt1254207', database='imdb', item_type='movie')
    10378
    >>> get_tmdb_id_from_external_id(external_id='268592', database='tvdb', item_type='tv')
    48866
    """
    if database.lower() not in ['imdb', 'tvdb']:
        Log.Exception('Invalid database: {}'.format(database))
        return

    if item_type.lower() not in ['movie', 'tv']:
        Log.Exception('Invalid item type: {}'.format(item_type))
        return

    # according to https://www.themoviedb.org/talk/5f6a0500688cd000351c1712 we can search by external id
    # https://api.themoviedb.org/3/find/tt0458290?api_key=###&external_source=imdb_id
    find_url_suffix = 'find/{}?external_source={}_id'

    url = '{}/{}'.format(
        tmdb_base_url,
        find_url_suffix.format(String.Quote(s=str(external_id), usePlus=True), database.lower())
    )
    try:
        tmdb_data = JSON.ObjectFromURL(
            url=url, sleep=2.0, headers=dict(Accept='application/json'), cacheTime=CACHE_1DAY, errors='strict')
    except Exception as e:
        Log.Debug('Error converting external ID to TMDB ID: {}'.format(e))
    else:
        Log.Debug('TMDB data: {}'.format(tmdb_data))
        try:
            # this is already an integer, but let's force it
            tmdb_id = int(tmdb_data['{}_results'.format(item_type.lower())][0]['id'])
        except (IndexError, KeyError, ValueError):
            Log.Debug('Error converting external ID to TMDB ID: {}'.format(tmdb_data))
        else:
            return tmdb_id


def get_tmdb_id_from_collection(search_query):
    # type: (str) -> Optional[int]
    """
    Search for a collection by name.

    Use the builtin Plex tmdb api service to search for a tmdb collection by name.

    Parameters
    ----------
    search_query : str
        Name of collection to search for.

    Returns
    -------
    Optional[int]
        Return collection ID if found, otherwise None.

    Examples
    --------
    >>> get_tmdb_id_from_collection(search_query='James Bond Collection')
    645
    >>> get_tmdb_id_from_collection(search_query='James Bond')
    645
    """
    # /search/collection?query=James%20Bond%20Collection&include_adult=false&language=en-US&page=1"
    query_url = 'search/collection?query={}'

    # Plex returns 500 error if spaces are in collection query, same with `_`, `+`, and `%20`... so use `-`
    if any(u'\u4e00' <= ch <= u'\u9fff' for ch in search_query):
        search_query = urllib.quote(search_query.encode('utf8'))
    else:
        search_query = String.Quote(s=search_query.replace(' ', '-'), usePlus=True)

    url = '{}/{}'.format(tmdb_base_url, query_url.format(search_query))
    try:
        tmdb_data = JSON.ObjectFromURL(
            url=url, sleep=2.0, headers=dict(Accept='application/json'), cacheTime=CACHE_1DAY, errors='strict')
    except Exception as e:
        Log.Debug('Error searching for collection {}: {}'.format(search_query, e))
    else:
        collection_id = None
        Log.Debug('TMDB data: {}'.format(tmdb_data))

        end_string = 'Collection'  # collection names on themoviedb end with 'Collection'

        try:
            for result in tmdb_data['results']:
                if result['name'].lower() == search_query.lower() or \
                        '{} {}'.format(search_query.lower(), end_string).lower() == result['name'].lower():
                    collection_id = int(result['id'])
        except (IndexError, KeyError, ValueError):
            Log.Debug('Error searching for collection {}: {}'.format(search_query, tmdb_data))
        else:
            return collection_id
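One limitation of the check in the modified function is that the range `\u4e00` to `\u9fff` only covers CJK Unified Ideographs, so titles written purely in, for example, Japanese kana would still take the String.Quote path. The sketch below shows the check and a possible alternative that percent-encodes the UTF-8 bytes for every title. It uses Python 3's `urllib.parse.quote`, the helper names `has_cjk_ideograph` and `quote_collection_query` are hypothetical, and whether Plex's TMDB service accepts this form for all titles is an untested assumption.

```python
from urllib.parse import quote  # Python 3; the plugin's Python 2 code calls urllib.quote


def has_cjk_ideograph(title):
    """Same range check as the modified script (hypothetical helper name)."""
    return any(u'\u4e00' <= ch <= u'\u9fff' for ch in title)


def quote_collection_query(title):
    """Replace spaces with '-' and percent-encode the UTF-8 bytes (hypothetical helper name)."""
    return quote(title.replace(' ', '-').encode('utf-8'))


print(has_cjk_ideograph(u'\u9ed1\u591c\u4f20\u8bf4'))  # True: Chinese ideographs
print(has_cjk_ideograph(u'\u30b4\u30b8\u30e9'))        # False: katakana is outside the range
print(quote_collection_query('James Bond Collection'))  # James-Bond-Collection
```

For ASCII titles this produces the same query string that appears in the log for "James Bond Collection", so the change would only affect non-ASCII titles.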
After the modification, TMDB data for collections with Chinese titles can now be retrieved. For example:
2024-03-16 04:28:12,457 (7000034c9000) : DEBUG (plex_api_helper:454) - Getting database info for item: 黑夜传说(系列)
2024-03-16 04:28:12,465 (7000034c9000) : DEBUG (networking:139) - Fetching 'http://127.0.0.1:32400/services/tmdb?uri=/search/collection?query=%E9%BB%91%E5%A4%9C%E4%BC%A0%E8%AF%B4%EF%BC%88%E7%B3%BB%E5%88%97%EF%BC%89' from the HTTP cache
2024-03-16 04:28:12,474 (7000034c9000) : DEBUG (tmdb_helper:123) - TMDB data: {'total_results': 1, 'total_pages': 1, 'page': 1, 'results': [{'poster_path': '/aK8qq0X0pZbf5ncE3JLQ27hdC4F.jpg', 'name': 'Underworld Collection', 'overview': u'A centuries-old war is waged between vampires and lycans (short for lycanthrope, from the Greek \'luk\' [wolf] + \'\xe1nthr\u014dpos\' [human], or werewolf). It begins with a fifth-century man, the sole survivor of a plague that wipes out his village. Somehow, his body turns the disease to his benefit, and afterward, he ceases to age. After living secretly, he moves often to prevent his immunity to the ravages of time from being discovered, even marrying on occasion. After he fathers twin sons who also inherit his gift, they learn its ultimate price after they become the first vampire and lycan after one suffers a bite from a bat and the other from a wolf. Once they learn they can "turn" normal humans into others like them by inflicting their bites on them, they terrorize the nearby countryside, taking from them either victims or tribute. The vampires ultimately enslave the Lycans until love leads to the Lycans\' escape, igniting a bitter war that seems destined to have no end.', 'original_name': 'Underworld Collection', 'backdrop_path': '/2gSaXagD9ZCuBHOsXF4tvtW7Djd.jpg', 'adult': False, 'id': 2326, 'original_language': 'en'}]}
2024-03-16 04:28:12,474 (7000034c9000) : DEBUG (plex_api_helper:562) - Database info for item: 黑夜传说(系列), database_info: ('movie_collections', 'themoviedb', 'tv.plex.agents.movie', None)
2024-03-16 04:28:12,474 (7000034c9000) : DEBUG (plex_api_helper:114) - item title: 黑夜传说(系列)
However, collections with Chinese titles still cannot fetch theme songs, while collections with English titles can. For example, I have one collection titled "James Bond Collection" and another titled "詹姆斯·邦德(系列)" in two separate libraries. The collection with the English title successfully fetched the theme song, but the one with the Chinese title did not.
2024-03-16 04:28:57,203 (700004cdb000) : DEBUG (plex_api_helper:454) - Getting database info for item: James Bond Collection
2024-03-16 04:28:57,208 (700004cdb000) : DEBUG (networking:139) - Fetching 'http://127.0.0.1:32400/services/tmdb?uri=/search/collection?query=James-Bond-Collection' from the HTTP cache
2024-03-16 04:28:57,213 (700004cdb000) : DEBUG (tmdb_helper:123) - TMDB data: {'total_results': 1, 'total_pages': 1, 'page': 1, 'results': [{'poster_path': '/ofwSiqOFShhunAIYYdSMHMJQSx2.jpg', 'name': 'James Bond Collection', 'overview': 'The James Bond film series is a British series of spy films based on the fictional character of MI6 agent James Bond, codename "007". With all of the action, adventure, gadgetry & film scores that Bond is famous for.', 'original_name': 'James Bond Collection', 'backdrop_path': '/dOSECZImeyZldoq0ObieBE0lwie.jpg', 'adult': False, 'id': 645, 'original_language': 'en'}]}
2024-03-16 04:28:57,214 (700004cdb000) : DEBUG (plex_api_helper:562) - Database info for item: James Bond Collection, database_info: ('movie_collections', 'themoviedb', 'tv.plex.agents.movie', None)
2024-03-16 04:28:31,560 (7000034c9000) : DEBUG (plex_api_helper:454) - Getting database info for item: 詹姆斯·邦德(系列)
2024-03-16 04:28:31,571 (7000034c9000) : DEBUG (networking:139) - Fetching 'http://127.0.0.1:32400/services/tmdb?uri=/search/collection?query=%E8%A9%B9%E5%A7%86%E6%96%AF%C2%B7%E9%82%A6%E5%BE%B7%EF%BC%88%E7%B3%BB%E5%88%97%EF%BC%89' from the HTTP cache
2024-03-16 04:28:31,575 (7000034c9000) : DEBUG (tmdb_helper:123) - TMDB data: {'total_results': 1, 'total_pages': 1, 'page': 1, 'results': [{'poster_path': '/ofwSiqOFShhunAIYYdSMHMJQSx2.jpg', 'name': 'James Bond Collection', 'overview': 'The James Bond film series is a British series of spy films based on the fictional character of MI6 agent James Bond, codename "007". With all of the action, adventure, gadgetry & film scores that Bond is famous for.', 'original_name': 'James Bond Collection', 'backdrop_path': '/dOSECZImeyZldoq0ObieBE0lwie.jpg', 'adult': False, 'id': 645, 'original_language': 'en'}]}
2024-03-16 04:28:31,576 (7000034c9000) : DEBUG (plex_api_helper:562) - Database info for item: 詹姆斯·邦德(系列), database_info: ('movie_collections', 'themoviedb', 'tv.plex.agents.movie', None)
I'm not sure whether the logs record theme song fetches for collections: I haven't seen any "data found for collection" messages while monitoring the logs, so it may be hard to filter the logs for this. However, in the WebUI, I noticed that only collections in the English library fetched theme songs, while collections in the Chinese library did not. I hope we can find the reason for this discrepancy.
Furthermore, even though we've retrieved data from TMDB, the collection IDs in the logs still show as None. Is this normal? Also, in the WebUI, it displays as "No known ID," despite some collections having successfully matched data from TMDB.
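One thing that may be worth checking here: in the modified function, `search_query` is reassigned to its URL-encoded form before it is compared with each result's `name`. The sketch below isolates that comparison with the data from the logs. The helper name `match_collection_id` is hypothetical, and whether this is what leads to the `None` in `database_info` is an assumption, not something the logs confirm.

```python
def match_collection_id(title, results, end_string='Collection'):
    """Return the id of the last result whose name equals the title, with or without the suffix."""
    wanted = {title.lower(), '{} {}'.format(title, end_string).lower()}
    collection_id = None
    for result in results:
        if result['name'].lower() in wanted:
            collection_id = int(result['id'])
    return collection_id


# the single result returned for both James Bond searches in the logs
results = [{'name': 'James Bond Collection', 'id': 645}]

# comparing with the encoded query (spaces replaced by '-') finds nothing
print(match_collection_id('James-Bond-Collection', results))  # None

# comparing with the unmodified title matches
print(match_collection_id('James Bond Collection', results))  # 645

# a Chinese title cannot equal the English name TMDB returns, encoded or not
print(match_collection_id(u'詹姆斯·邦德(系列)', results))  # None
```

If this is the cause, keeping the original title in a separate variable for the comparison would help English titles, while Chinese titles would still need some other way to accept the returned result.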
I'm puzzled as well. Both collections with Chinese and English titles have successfully matched with TMDB. So, it's unclear why only collections with English titles are fetching theme songs. There might be a specific issue or limitation in the retrieval process that's causing this discrepancy. It could be worth investigating further to understand the root cause.
How does Themerr search and match collections in ThemerrDB? Is the issue possibly occurring here?
Expected Behavior
No response
Additional Context
No response