cbh123 / narrator
David Attenborough narrates your life
This is a very cool project. Is there a complete step-by-step guide for this? All the additional bits (setting up Replicate, the voice in ElevenLabs, etc.) have lost me a bit...
It would be nice if the pause between narrations was shorter and more natural. (Also, do you know of a method to keep the context/memory of previous frames?)
🎙️ David says:
As I am not allowed to provide descriptions or any other details regarding the individuals in the image you've provided, I can't assist with your request. If you have a different kind of inquiry not involving personal details, feel free to ask!
Any way around this? I assume this is a new limitation from OpenAI?
The README directs users to create a Replicate account and set an API key, but I skipped those steps and (other issues aside), the script worked fine. If that's true, the README can be simplified to remove the Replicate steps.
Hello,
Is anyone else having a similar problem? I set the OpenAI API key as an environment variable with the cmd command setx OPENAI_API_KEY "sk-nU..." and also tried adding a system variable, but I always get the same error when launching narrator.py:
openai.OpenAIError: The api_key client option must be set either by passing api_key to the client or
by setting the OPENAI_API_KEY environment variable
Any tips on what the problem might be?
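One thing worth checking here: on Windows, setx stores the variable for future sessions only, so a terminal that was already open when the command ran will not see it. A minimal diagnostic sketch, assuming the variable names used in this thread (the describe_keys helper is hypothetical), reports which variables the current Python process can actually see:

```python
import os

# Variable names mentioned in this thread; adjust to whatever your copy of narrator.py reads.
KEY_NAMES = ("OPENAI_API_KEY", "ELEVENLABS_API_KEY", "ELEVENLABS_VOICE_ID")


def describe_keys(env=None, names=KEY_NAMES):
    """Return {name: status} without revealing the full secret."""
    env = os.environ if env is None else env
    report = {}
    for name in names:
        value = env.get(name)
        if not value:
            report[name] = "missing"
        else:
            # Show only the first three characters so the key is not leaked into logs.
            report[name] = f"set ({value[:3]}..., {len(value)} chars)"
    return report


if __name__ == "__main__":
    for name, status in describe_keys().items():
        print(f"{name}: {status}")
```

If a variable shows as missing, opening a fresh terminal after setx and running the check again is a cheap first test.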
On Mac, Python 3.10.13,
I had to pip install pillow and simpleaudio.
Now I have
Pillow 10.1.0
simpleaudio 1.0.4
elevenlabs.api.error.APIError: A voice for the voice_id ENfvYmv6CRqDodDZTieQ was not found.
It doesn't seem to be listed here either:
After following the install method and adding the API keys, I receive this error.
👀 David is watching...
The model gpt-4-vision-preview does not exist or you do not have access to it. Learn more: https://help.openai.com/en/articles/7102672-how-can-i-access-gpt-4.', 'type': 'invalid_request_error', 'param': None, 'code': 'model_not_found'}}
The current implementation of the Narrator project on the cbh123:main branch is not functional due to outdated API integrations and missing environment variable handling.
This issue has been addressed and fixed in the following pull request: Update ElevenLabs API Integration, Enhance Security, and Improve Narrator Functionality #53.
To resolve the issues and ensure a working application, please pull in the latest changes from the following forked repository: mgennings/narrator.
These updates include:
.env file.
Thank you for your attention to this matter.
Line 94 in 4bab104
I'm not sure of the best way to include user prompts in the message history here. You don't want to include the actual image every time, but a placeholder showing that the LLM was prompted for each earlier response may help consistency and avoid an 'intro' every time. I'm not certain.
Something like:
{
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe this image (user uploaded image)"},
    ],
},
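One way to sketch that idea, assuming the history is a plain list of chat messages (the helper names and the prompt wording below are hypothetical, not taken from narrator.py): keep every earlier turn as text only, and attach the image to the newest user message alone.

```python
PLACEHOLDER = "Describe this image (user uploaded image)"


def image_turn(base64_image):
    """User message carrying the current frame as a data URL."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image"},
            {
                "type": "image_url",
                "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"},
            },
        ],
    }


def remember(history, narration):
    """Record a finished exchange as text only: placeholder prompt plus the reply."""
    return history + [
        {"role": "user", "content": [{"type": "text", "text": PLACEHOLDER}]},
        {"role": "assistant", "content": narration},
    ]


def build_messages(system_prompt, history, base64_image, max_turns=5):
    """System prompt, the last few text-only exchanges, then the new image."""
    # Each exchange is two messages; guard max_turns=0 because history[-0:] is the whole list.
    recent = history[-2 * max_turns:] if max_turns > 0 else []
    return [{"role": "system", "content": system_prompt}] + recent + [image_turn(base64_image)]
```

Capping the history with max_turns keeps the request size bounded; whether this actually removes the repeated intro is something to test rather than assume.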
capture.py seems to be capturing the images (the webcam is on and the console keeps repeating "Say Cheese").
However, when running narrator.py in a different terminal, I get an error message saying that frames/frame.jpg doesn't exist.
I do not see this directory or these files.
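A possible cause, offered as a guess rather than a diagnosis: if the scripts use a relative path such as frames/frame.jpg, each one resolves it against the directory its terminal was started in, so two terminals in different directories would look in different places. The sketch below (helper names are hypothetical) anchors the path to the script's own folder and waits for the first frame instead of failing immediately:

```python
import os
import time
from pathlib import Path


def frame_path(script_file, folder="frames", name="frame.jpg"):
    """Build the frame path next to the given script, creating the folder if needed."""
    directory = Path(script_file).resolve().parent / folder
    directory.mkdir(parents=True, exist_ok=True)
    return directory / name


def wait_for_frame(path, timeout=10.0, poll=0.5):
    """Return True once the file exists, or False if the timeout expires first."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(path):
            return True
        time.sleep(poll)
    return os.path.exists(path)
```

Both scripts would call frame_path(__file__) so that they agree on the location regardless of the working directory.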
python capture.py
Traceback (most recent call last):
File "/Users/jr/Documents/hobby/narrator/capture.py", line 3, in <module>
from PIL import Image
ModuleNotFoundError: No module named 'PIL'
Not entirely sure what the issue is, wrong Python version? I'm on Python 3.11.4. Tried installing PIL without success:
pip3 install PIL
ERROR: Could not find a version that satisfies the requirement PIL (from versions: none)
ERROR: No matching distribution found for PIL
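For what it's worth, the PIL import name is provided by the Pillow package (an earlier post in this thread mentions pip install pillow), which would explain why pip finds nothing called PIL. A small sketch, assuming the import-to-package mapping below covers what your copy of the scripts imports (the helper name is hypothetical), lists what is still missing:

```python
import importlib.util

# import name -> name to pass to pip; extend to match your copy of the scripts
PACKAGES = {
    "PIL": "Pillow",
    "simpleaudio": "simpleaudio",
    "openai": "openai",
    "elevenlabs": "elevenlabs",
}


def missing_packages(packages=PACKAGES):
    """Return the pip names of modules that cannot be imported in this interpreter."""
    return [
        pip_name
        for module, pip_name in packages.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Try: python -m pip install " + " ".join(missing))
    else:
        print("All listed packages are importable.")
```

Running pip as python -m pip ties the install to the same interpreter that will run capture.py, which avoids installing into a different Python by accident.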
When I run narrator.py I get the following:
Traceback (most recent call last):
File "/Users/ahmed/Desktop/Dev/narrate/narrator/narrator.py", line 12, in <module>
set_api_key(os.environ.get("ELEVENLABS_API_KEY"))
File "/Users/ahmed/Desktop/Dev/narrate/venv/lib/python3.10/site-packages/elevenlabs/simple.py", line 17, in set_api_key
os.environ["ELEVEN_API_KEY"] = api_key
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/os.py", line 684, in __setitem__
value = self.encodevalue(value)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/os.py", line 756, in encode
raise TypeError("str expected, not %s" % type(value).__name__)
TypeError: str expected, not NoneType
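The TypeError is what you would expect when os.environ.get returns None, that is, when ELEVENLABS_API_KEY is not set in the shell that launched the script. A sketch of a fail-fast lookup (require_env is a hypothetical helper, not part of narrator.py or the elevenlabs package) that also accepts the ELEVEN_API_KEY spelling seen in the traceback:

```python
import os


def require_env(*names, env=None):
    """Return the first non-empty variable among names, or raise a readable error."""
    env = os.environ if env is None else env
    for name in names:
        value = env.get(name)
        if value:
            return value
    raise RuntimeError(
        "None of these environment variables is set: " + ", ".join(names)
    )


# Example (not executed here):
#   api_key = require_env("ELEVENLABS_API_KEY", "ELEVEN_API_KEY")
```

Failing with the variable names in the message makes it clearer what to export than a TypeError raised deep inside os.py.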
You know what would be cool? Having this run on a mobile phone. Do you think that would be possible?
Like going on a trip, starting this, and using the phone's camera. That would be so handy!
Thank you!
I'm able to get everything to work, but I run into problems with the Eleven Labs voice reading the text.
I do have a paid account and the voice ID, but I can't get it recognized.
I get the following error after it displays text that correctly describes the image.
I followed the @mgennings instructions for creating a .env file to help streamline that process of setting the keys, but still no luck.
File "C:\narrator-main\narrator-main\narrator.py", line 105, in <module>
main()
File "C:\narrator-main\narrator-main\narrator.py", line 96, in main
play_audio(analysis)
File "C:\narrator-main\narrator-main\narrator.py", line 31, in play_audio
audio = generate(text, voice=os.environ.get("ELEVENLABS_VOICE_ID"))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\exhibits\AppData\Roaming\Python\Python312\site-packages\elevenlabs\simple.py", line 61, in generate
assert isinstance(voice, Voice)
^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError
My apologies if this is a silly problem with an easy fix, but I promise I have googled it and could not fix it.
The capture.py runs smoothly for me, but the narrator.py throws the following error; and it does not depend on the Eleven Labs audio id that I use.
I'd really appreciate any help you'd be able to offer :)
File "\narrator-main\narrator.py", line 114, in <module>
main()
File "\narrator-main\narrator.py", line 105, in main
play_audio(analysis)
File "\narrator-main\narrator.py", line 40, in play_audio
audio = generate(text, voice=os.environ.get("7Wqa3tuynJ4uUcRnTwAI"))
File "\anaconda3\lib\site-packages\elevenlabs\simple.py", line 61, in generate
assert isinstance(voice, Voice)
AssertionError
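In the second traceback the voice ID itself is passed to os.environ.get, which looks up an environment variable with that name and returns None when there is none; the first traceback would behave the same way if ELEVENLABS_VOICE_ID is not set in that shell. A None voice reaching generate would be consistent with the AssertionError shown, though that is an inference from the traceback. A minimal sketch of the difference (resolve_voice_id is a hypothetical helper):

```python
import os


def resolve_voice_id(env=None, variable="ELEVENLABS_VOICE_ID"):
    """Look the voice ID up by variable name and refuse to continue with None."""
    env = os.environ if env is None else env
    voice_id = env.get(variable)
    if not voice_id:
        raise RuntimeError(f"{variable} is not set; export it before running narrator.py")
    return voice_id


# Passing the ID as the *name* of a variable finds nothing:
fake_env = {"ELEVENLABS_VOICE_ID": "7Wqa3tuynJ4uUcRnTwAI"}
print(fake_env.get("7Wqa3tuynJ4uUcRnTwAI"))   # None
print(resolve_voice_id(fake_env))             # 7Wqa3tuynJ4uUcRnTwAI
```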
And people wonder why Nix is necessary... /eyeroll
Building wheels for collected packages: simpleaudio
Building wheel for simpleaudio (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [22 lines of output]
running bdist_wheel
running build
running build_py
creating build
creating build/lib.macosx-10.9-universal2-cpython-39
creating build/lib.macosx-10.9-universal2-cpython-39/simpleaudio
copying simpleaudio/__init__.py -> build/lib.macosx-10.9-universal2-cpython-39/simpleaudio
copying simpleaudio/shiny.py -> build/lib.macosx-10.9-universal2-cpython-39/simpleaudio
copying simpleaudio/functionchecks.py -> build/lib.macosx-10.9-universal2-cpython-39/simpleaudio
creating build/lib.macosx-10.9-universal2-cpython-39/simpleaudio/test_audio
copying simpleaudio/test_audio/c.wav -> build/lib.macosx-10.9-universal2-cpython-39/simpleaudio/test_audio
copying simpleaudio/test_audio/e.wav -> build/lib.macosx-10.9-universal2-cpython-39/simpleaudio/test_audio
copying simpleaudio/test_audio/g.wav -> build/lib.macosx-10.9-universal2-cpython-39/simpleaudio/test_audio
copying simpleaudio/test_audio/left_right.wav -> build/lib.macosx-10.9-universal2-cpython-39/simpleaudio/test_audio
copying simpleaudio/test_audio/notes_2_16_44.wav -> build/lib.macosx-10.9-universal2-cpython-39/simpleaudio/test_audio
running build_ext
building 'simpleaudio._simpleaudio' extension
creating build/temp.macosx-10.9-universal2-cpython-39
creating build/temp.macosx-10.9-universal2-cpython-39/c_src
clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -iwithsysroot/System/Library/Frameworks/System.framework/PrivateHeaders -iwithsysroot/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/Headers -Werror=implicit-function-declaration -Wno-error=unreachable-code -arch root:xnu-10002.60.71.505.1~3/RELEASE_ARM64_T6000 -DDEBUG=0 -I/Users/pmarreck/Documents/narrator/venv/include -I/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/include/python3.9 -c c_src/posix_mutex.c -o build/temp.macosx-10.9-universal2-cpython-39/c_src/posix_mutex.o -mmacosx-version-min=10.6
clang: error: invalid arch name '-arch root:xnu-10002.60.71.505.1~3/RELEASE_ARM64_T6000'
error: command '/usr/bin/clang' failed with exit code 1
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for simpleaudio
Running setup.py clean for simpleaudio
Failed to build simpleaudio
ERROR: Could not build wheels for simpleaudio, which is required to install pyproject.toml-based projects
In README.md,
change:
export OPENAI_API_KEY=<token>
export ELEVENLABS_API_KEY=<eleven-token>
to
export OPENAI_API_KEY=<token>
export ELEVEN_API_KEY=<eleven-token>
Got this error but no clue... anyone?
👀 David is watching...
Traceback (most recent call last):
File "/narrator/narrator.py", line 102, in <module>
main()
File "/narrator/narrator.py", line 88, in main
analysis = analyze_image(base64_image, script=script)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/narrator/narrator.py", line 57, in analyze_image
response = client.chat.completions.create(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/miniconda3/envs/narrator/lib/python3.11/site-packages/openai/_utils/_utils.py", line 299, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/miniconda3/envs/narrator/lib/python3.11/site-packages/openai/resources/chat/completions.py", line 556, in create
return self._post(
^^^^^^^^^^^
File "/miniconda3/envs/narrator/lib/python3.11/site-packages/openai/_base_client.py", line 1055, in post
return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/miniconda3/envs/narrator/lib/python3.11/site-packages/openai/_base_client.py", line 834, in request
return self._request(
^^^^^^^^^^^^^^
File "/miniconda3/envs/narrator/lib/python3.11/site-packages/openai/_base_client.py", line 854, in _request
request = self._build_request(options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/miniconda3/envs/narrator/lib/python3.11/site-packages/openai/_base_client.py", line 435, in _build_request
headers = self._build_headers(options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/miniconda3/envs/narrator/lib/python3.11/site-packages/openai/_base_client.py", line 393, in _build_headers
headers = httpx.Headers(headers_dict)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/miniconda3/envs/narrator/lib/python3.11/site-packages/httpx/_models.py", line 70, in __init__
self._list = [
^
File "/miniconda3/envs/narrator/lib/python3.11/site-packages/httpx/_models.py", line 74, in <listcomp>
normalize_header_value(v, encoding),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/miniconda3/envs/narrator/lib/python3.11/site-packages/httpx/_utils.py", line 53, in normalize_header_value
return value.encode(encoding or "ascii")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'ascii' codec can't encode character '\u201d' in position 58: ordinal not in range(128)
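The character '\u201d' is a right double quotation mark (a curly quote), so one plausible explanation is that the key was exported with typographic quotes pasted from a document or web page, and the quote became part of the value sent in the request header. A sketch for spotting that (the helper names are hypothetical):

```python
import os

# Typographic quotes that often sneak in when copying commands from formatted text.
SMART_QUOTES = "\u2018\u2019\u201c\u201d"


def non_ascii_positions(value):
    """Return (index, character) pairs for every non-ASCII character in value."""
    return [(i, ch) for i, ch in enumerate(value) if ord(ch) > 127]


def clean_key(value):
    """Strip surrounding whitespace and straight or curly quotes from a pasted key."""
    return value.strip().strip("\"'" + SMART_QUOTES)


if __name__ == "__main__":
    key = os.environ.get("OPENAI_API_KEY", "")
    problems = non_ascii_positions(key)
    if problems:
        print("Non-ASCII characters found at:", problems)
        print("Re-export the key with plain quotes, or none at all.")
```

Re-exporting the key by typing the quotes in the terminal, rather than pasting the whole command, is the simplest fix if the check reports anything.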
OpenAI now returns the following error: I'm sorry, but I'm not able to provide visual descriptions of images with real people. If you have any other questions or need information on a different topic, feel free to ask!
Hi,
The calls to the processes can be simplified for the end user. May I get access to open a branch and a PR to simplify this?
Thanks!
I have a paid account with OpenAI but receive this error message, suggesting I am exceeding my rate limit.
The limits for gpt-4 (https://platform.openai.com/account/limits) are:
gpt-4 | 10,000 TPM | 3 RPM | 200 RPD
My output is:
, line 877, in _request
raise self._make_status_error_from_response(err.response) from None
openai.RateLimitError: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}
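Note that the code in this response is insufficient_quota rather than a per-minute limit, which suggests a billing or credit problem rather than too many requests; waiting and retrying would not be expected to help. A sketch of how a caller might tell the two cases apart, assuming the error body has the shape printed above (the function names are hypothetical, and treating every other 429 as retryable is a simplifying assumption):

```python
import random


def is_retryable(status_code, error_body):
    """429 responses are retryable unless the code says the account is out of quota."""
    if status_code != 429:
        return False
    code = (error_body.get("error") or {}).get("code")
    return code != "insufficient_quota"


def backoff_delays(attempts=5, base=1.0, cap=60.0):
    """Exponential backoff with jitter: a list of wait times in seconds."""
    return [min(cap, base * 2 ** n) * random.uniform(0.5, 1.0) for n in range(attempts)]
```

With a limit of 3 requests per minute, spacing narration requests at least 20 seconds apart would also keep a working account under the cap.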
Hello,
This is my first ever "programming" experience and I am having some issues, here is what I get when I try to run narrator.py
What can I do to solve this?
👀 David is watching...
Traceback (most recent call last):
File "/Users/ME/projectai/narrator/narrator.py", line 102, in <module>
main()
File "/Users/ME/projectai/narrator/narrator.py", line 88, in main
analysis = analyze_image(base64_image, script=script)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/ME/projectai/narrator/narrator.py", line 57, in analyze_image
response = client.chat.completions.create(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/anaconda3/envs/narrator/lib/python3.12/site-packages/openai/_utils/_utils.py", line 299, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/opt/anaconda3/envs/narrator/lib/python3.12/site-packages/openai/resources/chat/completions.py", line 556, in create
return self._post(
^^^^^^^^^^^
File "/opt/anaconda3/envs/narrator/lib/python3.12/site-packages/openai/_base_client.py", line 1055, in post
return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/anaconda3/envs/narrator/lib/python3.12/site-packages/openai/_base_client.py", line 834, in request
return self._request(
^^^^^^^^^^^^^^
File "/opt/anaconda3/envs/narrator/lib/python3.12/site-packages/openai/_base_client.py", line 877, in _request
raise self._make_status_error_from_response(err.response) from None
openai.NotFoundError: Error code: 404 - {'error': {'message': 'The model gpt-4-vision-preview does not exist or you do not have access to it. Learn more: https://help.openai.com/en/articles/7102672-how-can-i-access-gpt-4.', 'type': 'invalid_request_error', 'param': None, 'code': 'model_not_found'}}
(narrator) ME@MacBook-Pro-de-ME narrator %
Everything seems to be working:
Getting "📸 Say cheese! Saving frame." and "🎙️ David says:" in the terminal in VSC.
But no audio is playing, and I'm not sure how to troubleshoot. Any ideas?
When running narrator.py, the error message below comes up.
openai.NotFoundError: Error code: 404 - {'error': {'message': 'The model gpt-4-vision-preview has been deprecated, learn more here: https://platform.openai.com/docs/deprecations', 'type': 'invalid_request_error', 'param': None, 'code': 'model_not_found'}}
Looking at https://platform.openai.com/docs/deprecations, the recommended replacement for 'gpt-4-vision-preview' is 'gpt-4o'.
When I replace the model name in narrator.py line 58 with 'gpt-4o', the new error message I get is
openai.BadRequestError: Error code: 400 - {'error': {'message': "Invalid type for 'messages[1].content[1].image_url': expected an object, but got a string instead.", 'type': 'invalid_request_error', 'param': 'messages[1].content[1].image_url', 'code': 'invalid_type'}}
Anyone know a fix for this?
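The 400 error says image_url must be an object rather than a string. In the Chat Completions message format, the image part carries the URL under a url key, and a base64 frame can be passed as a data URL. A sketch of building that part (image_part and user_message are hypothetical helpers; the JPEG MIME type is an assumption based on the frame.jpg file mentioned elsewhere in this thread):

```python
import base64


def image_part(jpeg_bytes):
    """Build the image content part with image_url as an object, not a bare string."""
    encoded = base64.b64encode(jpeg_bytes).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:image/jpeg;base64,{encoded}"},
    }


def user_message(prompt, jpeg_bytes):
    """A user message combining a text prompt with one image."""
    return {
        "role": "user",
        "content": [{"type": "text", "text": prompt}, image_part(jpeg_bytes)],
    }
```

If narrator.py already holds the frame as a base64 string, only the wrapping matters: the string goes inside {"url": ...} instead of being assigned to image_url directly.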
Trying to run narrator and getting this issue:
import simpleaudio as sa
File "/Users/jakeboyles/Documents/repos/narrator/venv/lib/python3.11/site-packages/simpleaudio/__init__.py", line 1, in <module>
from simpleaudio.shiny import *
File "/Users/jakeboyles/Documents/repos/narrator/venv/lib/python3.11/site-packages/simpleaudio/shiny.py", line 5, in <module>
import simpleaudio._simpleaudio as _sa
ImportError: dlopen(/Users/jakeboyles/Documents/repos/narrator/venv/lib/python3.11/site-packages/simpleaudio/_simpleaudio.cpython-311-darwin.so, 0x0002): tried: '/Users/jakeboyles/Documents/repos/narrator/venv/lib/python3.11/site-packages/simpleaudio/_simpleaudio.cpython-311-darwin.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64')), '/System/Volumes/Preboot/Cryptexes/OS/Users/jakeboyles/Documents/repos/narrator/venv/lib/python3.11/site-packages/simpleaudio/_simpleaudio.cpython-311-darwin.so' (no such file), '/Users/jakeboyles/Documents/repos/narrator/venv/lib/python3.11/site-packages/simpleaudio/_simpleaudio.cpython-311-darwin.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64'))
I tried arch -arm64 pip install simpleaudio, but it didn't work.
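The "have 'x86_64', need 'arm64'" message means the compiled simpleaudio extension and the running Python were built for different CPU architectures. A quick way to see what the current interpreter reports, as a sketch (whether a forced reinstall fixes it on this machine is an untested assumption):

```python
import platform
import sys


def interpreter_info():
    """Describe the running interpreter so it can be compared with the .so architecture."""
    return {
        "executable": sys.executable,
        "machine": platform.machine(),   # e.g. 'arm64' or 'x86_64'
        "python": platform.python_version(),
    }


if __name__ == "__main__":
    for key, value in interpreter_info().items():
        print(f"{key}: {value}")
    # If 'machine' is arm64 but the installed wheel is x86_64, one thing to try is
    # reinstalling from the same interpreter, bypassing pip's cache:
    #   python -m pip install --no-cache-dir --force-reinstall simpleaudio
```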
I get the following error when I run it, do I need to change the version of pydantic or elevenlabs for it to work?
Traceback (most recent call last):
File "/Users/Documents/GitHub/narrator/narrator.py", line 8, in <module>
from elevenlabs import generate, play, set_api_key, voices
File "/Users/anaconda3/lib/python3.11/site-packages/elevenlabs/__init__.py", line 1, in <module>
from .api import * # noqa F403
^^^^^^^^^^^^^^^^^^
File "/Users/anaconda3/lib/python3.11/site-packages/elevenlabs/api/__init__.py", line 2, in <module>
from .history import * # noqa F403
^^^^^^^^^^^^^^^^^^^^^^
File "/Users/anaconda3/lib/python3.11/site-packages/elevenlabs/api/history.py", line 6, in <module>
from pydantic import model_validator
ImportError: cannot import name 'model_validator' from 'pydantic' (/Users/anaconda3/lib/python3.11/site-packages/pydantic/__init__.cpython-311-darwin.so)
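model_validator was added in pydantic 2, so this ImportError points to an installed pydantic 1.x that predates it. A sketch for checking the installed version (the helper names are hypothetical; upgrading pydantic may affect other packages in the same Anaconda base environment, so a separate virtual environment may be the safer route):

```python
from importlib import metadata


def major_version(version_string):
    """Extract the leading integer of a version string such as '1.10.8'."""
    return int(version_string.split(".")[0])


def has_model_validator(version_string):
    """pydantic exposes model_validator from version 2 onwards."""
    return major_version(version_string) >= 2


if __name__ == "__main__":
    try:
        installed = metadata.version("pydantic")
    except metadata.PackageNotFoundError:
        print("pydantic is not installed in this environment")
    else:
        available = has_model_validator(installed)
        print(f"pydantic {installed}: model_validator available = {available}")
```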
What Python version do I need for this?
simpleaudio needs 3.8, but I found that another module requires 3.9, so I was not able to get it running. Any help is greatly appreciated.
Hi! Thanks for hacking this one, it's super cool :)
Narrator doesn't work for me; the ElevenLabs API always returns
elevenlabs.api.error.APIError: A voice for the voice_id XXX was not found.
where XXX is the voice ID of my freshly created voice.
Any ideas? I can't find the error in their docs.
Getting this when running narrator:
ValueError: ffplay from ffmpeg not found, necessary to play audio. On mac you can install it with 'brew install ffmpeg'. On linux and windows you can install it from https://ffmpeg.org/
Shouldn't ffmpeg be in the requirements.txt? Where should the executables be placed?
Thanks!
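ffmpeg is a standalone program rather than a Python package, so pip cannot install it from requirements.txt; the ffplay executable needs to be in a directory listed in PATH. A sketch for checking that from Python (the helper name is hypothetical):

```python
import shutil


def find_tool(name="ffplay"):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


if __name__ == "__main__":
    location = find_tool()
    if location:
        print(f"ffplay found at {location}")
    else:
        print("ffplay is not on PATH; install ffmpeg and add its bin folder to PATH.")
```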
Where can I add my API keys so that I don't need to export them every time I launch the project?
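One common approach is to keep the keys in a .env file next to the scripts and load it at startup (an earlier post in this thread mentions a fork that uses a .env file). The python-dotenv package provides this; the sketch below is a simplified standard-library stand-in with a hypothetical load_env_file helper that handles only plain KEY=value lines:

```python
import os


def load_env_file(path=".env", env=None):
    """Read KEY=value lines into env without overwriting variables that are already set."""
    env = os.environ if env is None else env
    loaded = {}
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            # Skip blanks, comments, and anything that is not an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            key = key.strip()
            if key.startswith("export "):
                key = key[len("export "):].strip()
            value = value.strip().strip("\"'")
            if key and key not in env:
                env[key] = value
                loaded[key] = value
    return loaded
```

Calling load_env_file() at the top of narrator.py, before the keys are read, would make the export step unnecessary. Keep the .env file out of version control, since it holds secrets.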