harry0703 / audionotes


Quickly extract the content of audio and video files and organize it into structured Markdown notes

License: MIT License

Dockerfile 4.14% Python 95.86%

ai asr funasr ollama python qwen2 whisper

audionotes's Introduction

AudioNotes

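A system built on FunASR and Qwen2 that turns audio and video into structured notes

It quickly extracts the content of audio and video files, then calls a large language model to organize that content into structured Markdown notes that are easy to skim.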

FunASR: https://github.com/modelscope/FunASR

Qwen2: https://ollama.com/library/qwen2

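Demo

Audio/video recognition and note organization

Chatting with the audio/video content

Usage

① Install Ollama

Download and install the Ollama package for your operating system

https://ollama.com/download

② Pull the model

This guide uses Alibaba's Qwen2 7B as an example: https://ollama.com/library/qwen2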

ollama pull qwen2:7b
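
Before moving on, it can help to confirm that Ollama is running and that the model was actually pulled. The following is a minimal sketch, assuming Ollama listens on its default address http://localhost:11434 and that its /api/tags endpoint lists local models; the helper names (model_names, has_model, fetch_tags) are illustrative and not part of AudioNotes.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: Ollama's default local address


def model_names(tags_payload: dict) -> list:
    """Extract model names from the JSON returned by Ollama's /api/tags."""
    return [m.get("name", "") for m in tags_payload.get("models", [])]


def has_model(tags_payload: dict, wanted: str = "qwen2:7b") -> bool:
    """Return True if `wanted` is among the locally pulled models."""
    return wanted in model_names(tags_payload)


def fetch_tags(base_url: str = OLLAMA_URL, timeout: float = 5.0) -> dict:
    """Ask the local Ollama server for its list of pulled models."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    try:
        print("qwen2:7b available:", has_model(fetch_tags()))
    except (OSError, ValueError) as exc:  # connection refused, timeout, bad JSON
        print("Could not query Ollama:", exc)
```

If the script reports that Ollama cannot be reached, start Ollama first; if it reports False, run the pull command again.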

③ Deploy the service

There are two ways to deploy: with Docker, or as a local installation

Docker deployment (recommended) 🐳

curl -fsSL https://github.com/harry0703/AudioNotes/raw/main/docker-compose.yml -o docker-compose.yml
docker-compose up

Once Docker has started, open http://localhost:15433/

Log in with username admin and password admin (both can be changed in docker-compose.yml)

Local deployment 📦

Requires an accessible PostgreSQL database

conda create -n AudioNotes python=3.10 -y
conda activate AudioNotes
git clone https://github.com/harry0703/AudioNotes.git
cd AudioNotes
pip install -r requirements.txt

Rename .env.example to .env and edit the configuration values in it
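
The same step can be scripted. This is a minimal sketch under one stated assumption: it copies .env.example to .env instead of renaming it, so the template stays in the working tree. The function name ensure_env_file is hypothetical, and the configuration values still have to be edited by hand afterwards.

```python
import shutil
from pathlib import Path


def ensure_env_file(project_dir: str = ".") -> Path:
    """Create .env from .env.example unless .env already exists."""
    root = Path(project_dir)
    example, target = root / ".env.example", root / ".env"
    if target.exists():
        return target  # never overwrite settings that were already edited
    if not example.exists():
        raise FileNotFoundError(f"{example} not found; run this inside the cloned repository")
    shutil.copyfile(example, target)
    return target


if __name__ == "__main__":
    try:
        print("Using configuration file:", ensure_env_file())
    except FileNotFoundError as exc:
        print(exc)
```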

chainlit run main.py

Once the service has started, open http://localhost:8000/

Log in with username admin and password admin (both can be changed in the .env file)
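
If the page does not load, a quick first check is whether anything is listening on the expected port. The sketch below only tests for an open TCP port; it assumes the ports named above (15433 for Docker, 8000 for the local setup), and port_open is an illustrative helper, not part of the project.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # connection refused, timeout, unreachable host
        return False


if __name__ == "__main__":
    # 15433 is the Docker port and 8000 the local port from the steps above.
    for port in (15433, 8000):
        state = "open" if port_open("localhost", port) else "closed"
        print(f"localhost:{port} is {state}")
```

A closed port usually means the service has not finished starting or has exited, so the service logs are the next place to look.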


audionotes's Issues

Translation file for zh-CN not found

Thanks for sharing this project. Does this message affect the results?

 Translation file for zh-CN not found. Using default translation en-US

mac docker, no email register, only login?

Thank you for the great work, but I can't log in at http://localhost:15433/login

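How well the LLM summarizes recordings

Hi, I've noticed that after I ask two or more follow-up questions, the model stops summarizing the recording properly and starts making up answers. Is there a way to fix this? Thanks.

Exception when recognizing a video

Environment:
Local deployment
python 3.10

Steps: selecting a video file for recognition raises an error. The video contains only background music; all of its content is on-screen text.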

2024-07-24 11:46:31,913 - modelscope - INFO - Use user-specified model revision: master
  File "G:\Anaconda-EVN\AudioNotes\lib\site-packages\chainlit\utils.py", line 40, in wrapper
    return await user_function(**params_values)
  File "F:\CodeProject\\AudioNotes\main.py", line 72, in on_chat_start
    asr_result = await transcribe_file(file)
  File "F:\CodeProject\\AudioNotes\main.py", line 56, in transcribe_file
    result = await loop.run_in_executor(None, funasr.transcribe, uploaded_file.path)
  File "G:\Anaconda-EVN\AudioNotes\lib\asyncio\futures.py", line 285, in __await__
    yield self  # This tells Task to wait for completion.
  File "G:\Anaconda-EVN\AudioNotes\lib\asyncio\tasks.py", line 304, in __wakeup
    future.result()
  File "G:\Anaconda-EVN\AudioNotes\lib\asyncio\futures.py", line 201, in result
    raise self._exception.with_traceback(self._exception_tb)
  File "G:\Anaconda-EVN\AudioNotes\lib\concurrent\futures\thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "F:\CodeProject\\AudioNotes\app\services\asr_funasr.py", line 56, in transcribe
    text = res[0]['text']
IndexError: list index out of range
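
The traceback ends at `text = res[0]['text']`, which fails when `res` is an empty list, as can happen for a file with no speech. The following is a minimal sketch of a guard, assuming `res` is a list of dictionaries with a `text` key as that line implies; extract_text is a hypothetical helper and not the project's actual fix.

```python
from typing import Any, Dict, List, Optional


def extract_text(res: Optional[List[Dict[str, Any]]]) -> str:
    """Return the recognized text, or "" when the recognizer produced nothing."""
    if not res:  # covers both None and an empty list
        return ""
    return res[0].get("text", "")


if __name__ == "__main__":
    print(repr(extract_text([])))                   # no speech detected
    print(repr(extract_text([{"text": "hello"}])))  # normal result
```

With a guard like this the caller can show a "no speech detected" message instead of crashing.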

audio_notes_webui fails to start after Docker installation

After installing with Docker, audio_notes_webui fails to start with the following error: exec /usr/local/bin/chainlit: exec format error

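After local deployment, http://localhost:8000/ refuses the connection. What is the cause?

Steps:
1. ollama pull qwen2:7b succeeded, and http://localhost:11434/ returns "Ollama is running".
2. PostgreSQL is installed, and I verified that the database name, username, and password all work.
3. Running main.py prints: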
2024-07-30 15:05:08 - Loaded .env file
2024-07-30 15:05:14 - new registry table has been added: preprocessor_classes
2024-07-30 15:05:15 - new registry table has been added: adaptor_classes
2024-07-30 15:05:15 - new registry table has been added: lid_predictor_classes
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1722323117.217673 22812 config.cc:230] gRPC experiments enabled: call_status_override_on_cancellation, event_engine_client, event_engine_dns, event_engine_listener, http2_stats_fix, monitoring_experiment, pick_first_new, trace_record_callops, work_serializer_clears_time_cache
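Is this output correct? Why can't I open http://localhost:8000/? Checking port usage shows nothing listening on port 8000.

[Feature request] Could uploaded video and audio files support online preview?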

Translation file for zh-CN not found. Using default translation en-US.

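Chainlit in the audio notes web UI cannot find the Chinese language file. How can I translate it manually and add it?

Opening this requires logging in with an account? How do I log in?

Why does this require logging in with an account?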

audio_notes_webui exited with code 137

With the Docker deployment, after uploading a video, recognition runs for a long time and then fails with an error
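
By shell convention, an exit code above 128 means the process was terminated by signal number (code - 128), so 137 corresponds to signal 9 (SIGKILL). In containers this is often, though not always, the result of running out of memory. The decoding can be sketched as below; describe_exit_code is an illustrative helper only.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain a process exit code using the 128 + signal-number convention."""
    if code == 0:
        return "exited normally"
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:  # signal number unknown on this platform
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with error status {code}"


if __name__ == "__main__":
    for code in (0, 1, 137):
        print(code, "->", describe_exit_code(code))
```

If memory is the cause, giving Docker more memory or testing with a shorter file are reasonable first experiments.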

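How can this be deployed on a cloud server such as AutoDL?

How can a public-cloud SaaS model API be connected?

With the current price war among LLM providers, API costs are very low. How can Qwen, DeepSeek, ERNIE (Wenxin), and similar services be connected?

Local deployment: after running main and opening the web page, logging in with the admin username and password reports no permission and the login fails. Why?

With a local deployment, after running main and opening the web page, entering the admin username and password reports no permission and the login fails. Why?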

exec format error

When running docker-compose, I get:
audio_notes_webui | exec /usr/local/bin/chainlit: exec format error

Problem with local setup

conda create -n AudioNotes python=3.10 -y
conda activate AudioNotes
git clone https://github.com/harry0703/AudioNotes.git
cd AudioNotes
pip install -r requirements.txt

The requirements.txt file isn't provided in the code, is it? I'm stuck at this step.

Running docker compose gives this problem: The requested image's platform (linux/arm64) does not match the detected host platform

Running docker compose gives the following problem:

webui The requested image's platform (linux/arm64) does not match the detected host platform (linux/amd64/v3) and no specific platform was requested

exec /usr/local/bin/chainlit: exec format error

docker compose up
...
[+] Running 4/3
 ✔ Network audionotes_audio_notes                                                                                                                       Created                0.2s
 ✔ Container audio_notes_pg                                                                                                                             Created                1.0s
 ✔ Container audio_notes_webui                                                                                                                          Created                0.0s
 ! webui The requested image's platform (linux/arm64) does not match the detected host platform (linux/amd64/v3) and no specific platform was requested                        0.0s
Attaching to audio_notes_pg, audio_notes_webui
audio_notes_pg     | The files belonging to this database system will be owned by user "postgres".
audio_notes_pg     | This user must also own the server process.
audio_notes_pg     |
audio_notes_pg     | The database cluster will be initialized with locale "en_US.utf8".
audio_notes_pg     | The default database encoding has accordingly been set to "UTF8".
audio_notes_pg     | The default text search configuration will be set to "english".
audio_notes_pg     |
audio_notes_pg     | Data page checksums are disabled.
audio_notes_pg     |
audio_notes_pg     | fixing permissions on existing directory /var/lib/postgresql/data ... ok
audio_notes_pg     | creating subdirectories ... ok
audio_notes_pg     | selecting dynamic shared memory implementation ... posix
audio_notes_pg     | selecting default max_connections ... 100
audio_notes_pg     | selecting default shared_buffers ... 128MB
audio_notes_pg     | selecting default time zone ... Etc/UTC
audio_notes_pg     | creating configuration files ... ok
audio_notes_pg     | running bootstrap script ... ok
audio_notes_webui  | exec /usr/local/bin/chainlit: exec format error
audio_notes_pg     | performing post-bootstrap initialization ... ok
audio_notes_pg     | initdb: warning: enabling "trust" authentication for local connections
audio_notes_pg     | initdb: hint: You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb.
audio_notes_pg     | syncing data to disk ... ok
audio_notes_pg     |
audio_notes_pg     |
audio_notes_pg     | Success. You can now start the database server using:
audio_notes_pg     |
audio_notes_pg     |     pg_ctl -D /var/lib/postgresql/data -l logfile start
audio_notes_pg     |
audio_notes_pg     | waiting for server to start....2024-07-20 16:43:24.007 UTC [49] LOG:  starting PostgreSQL 16.3 (Debian 16.3-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
audio_notes_pg     | 2024-07-20 16:43:24.008 UTC [49] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
audio_notes_pg     | 2024-07-20 16:43:24.012 UTC [52] LOG:  database system was shut down at 2024-07-20 16:43:23 UTC
audio_notes_pg     | 2024-07-20 16:43:24.015 UTC [49] LOG:  database system is ready to accept connections
audio_notes_pg     |  done
audio_notes_pg     | server started
audio_notes_webui exited with code 0
audio_notes_pg     | CREATE DATABASE
audio_notes_pg     |
audio_notes_pg     |
audio_notes_pg     | /usr/local/bin/docker-entrypoint.sh: ignoring /docker-entrypoint-initdb.d/*
audio_notes_pg     |
audio_notes_pg     | waiting for server to shut down...2024-07-20 16:43:24.222 UTC [49] LOG:  received fast shutdown request
audio_notes_pg     | .2024-07-20 16:43:24.225 UTC [49] LOG:  aborting any active transactions
audio_notes_pg     | 2024-07-20 16:43:24.231 UTC [49] LOG:  background worker "logical replication launcher" (PID 55) exited with exit code 1
audio_notes_pg     | 2024-07-20 16:43:24.232 UTC [50] LOG:  shutting down
audio_notes_pg     | 2024-07-20 16:43:24.234 UTC [50] LOG:  checkpoint starting: shutdown immediate
audio_notes_pg     | 2024-07-20 16:43:24.284 UTC [50] LOG:  checkpoint complete: wrote 922 buffers (5.6%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.033 s, sync=0.012 s, total=0.053 s; sync files=301, longest=0.003 s, average=0.001 s; distance=4255 kB, estimate=4255 kB; lsn=0/1912048, redo lsn=0/1912048
audio_notes_pg     | 2024-07-20 16:43:24.297 UTC [49] LOG:  database system is shut down
audio_notes_webui  | exec /usr/local/bin/chainlit: exec format error
audio_notes_webui  | exec /usr/local/bin/chainlit: exec format error
audio_notes_pg     |  done
audio_notes_pg     | server stopped
audio_notes_pg     |
audio_notes_pg     | PostgreSQL init process complete; ready for start up.
audio_notes_pg     |
audio_notes_pg     | 2024-07-20 16:43:24.372 UTC [1] LOG:  starting PostgreSQL 16.3 (Debian 16.3-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
audio_notes_pg     | 2024-07-20 16:43:24.372 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
audio_notes_pg     | 2024-07-20 16:43:24.372 UTC [1] LOG:  listening on IPv6 address "::", port 5432
audio_notes_pg     | 2024-07-20 16:43:24.375 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
audio_notes_pg     | 2024-07-20 16:43:24.381 UTC [65] LOG:  database system was shut down at 2024-07-20 16:43:24 UTC
audio_notes_pg     | 2024-07-20 16:43:24.391 UTC [1] LOG:  database system is ready to accept connections
audio_notes_webui exited with code 0
audio_notes_webui  | exec /usr/local/bin/chainlit: exec format error
audio_notes_webui  | exec /usr/local/bin/chainlit: exec format error
audio_notes_webui  | exec /usr/local/bin/chainlit: exec format error
audio_notes_webui exited with code 1
audio_notes_webui  | exec /usr/local/bin/chainlit: exec format error
audio_notes_webui  | exec /usr/local/bin/chainlit: exec format error
audio_notes_webui  | exec /usr/local/bin/chainlit: exec format error
...
audio_notes_webui  | exec /usr/local/bin/chainlit: exec format error
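
The warning in the log says the image was built for linux/arm64 while the host is linux/amd64, and "exec format error" is the typical symptom of running a binary built for a different CPU architecture. The sketch below compares an image platform string with the host architecture; the alias table and function names are illustrative assumptions, not part of Docker or AudioNotes.

```python
import platform

# Common platform.machine() values mapped to Docker architecture names.
_ARCH_ALIASES = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def docker_arch(machine: str) -> str:
    """Translate a platform.machine() value into Docker's architecture name."""
    key = machine.lower()
    return _ARCH_ALIASES.get(key, key)


def image_matches_host(image_platform: str, host_machine: str) -> bool:
    """Compare an image platform such as 'linux/arm64' with the host CPU."""
    parts = image_platform.split("/")
    image_arch = parts[1] if len(parts) > 1 else parts[0]
    return docker_arch(image_arch) == docker_arch(host_machine)


if __name__ == "__main__":
    host = platform.machine()
    print("host architecture:", docker_arch(host))
    print("linux/arm64 image runs natively:", image_matches_host("linux/arm64", host))
```

When the two do not match, the usual options are an image built for the host architecture (if one is published), building the image locally, or running under emulation; which of these applies to this project is not stated in the source.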

Translation file for zh-CN not found. Using default translation en-US

decoding, empty speech

2024-08-01 11:12:51 - Your app is available at http://localhost:8000
2024-08-01 11:12:53 - Translated markdown file for zh-CN not found. Defaulting to chainlit.md.
You are using the latest version of funasr-1.1.4
2024-08-01 11:12:58 - download models from model hub: ms
2024-08-01 11:12:58,680 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-08-01 11:12:58,680 - modelscope - INFO - Use user-specified model revision: master
2024-08-01 11:13:00 - Loading pretrained params from C:\Users\YUMEI\.cache\modelscope\hub\iic\speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch\model.pt
2024-08-01 11:13:00 - ckpt: C:\Users\YUMEI\.cache\modelscope\hub\iic\speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch\model.pt
2024-08-01 11:13:01 - scope_map: ['module.', 'None']
2024-08-01 11:13:01 - excludes: None
2024-08-01 11:13:01 - Loading ckpt: C:\Users\YUMEI\.cache\modelscope\hub\iic\speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch\model.pt, status:
2024-08-01 11:13:03 - Building VAD model.
2024-08-01 11:13:03 - download models from model hub: ms
2024-08-01 11:13:04,067 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-08-01 11:13:04,067 - modelscope - INFO - Use user-specified model revision: master
2024-08-01 11:13:04 - Loading pretrained params from C:\Users\YUMEI\.cache\modelscope\hub\iic\speech_fsmn_vad_zh-cn-16k-common-pytorch\model.pt
2024-08-01 11:13:04 - ckpt: C:\Users\YUMEI\.cache\modelscope\hub\iic\speech_fsmn_vad_zh-cn-16k-common-pytorch\model.pt
2024-08-01 11:13:04 - scope_map: ['module.', 'None']
2024-08-01 11:13:04 - excludes: None
2024-08-01 11:13:04 - Loading ckpt: C:\Users\YUMEI\.cache\modelscope\hub\iic\speech_fsmn_vad_zh-cn-16k-common-pytorch\model.pt, status:
2024-08-01 11:13:04 - Building punc model.
2024-08-01 11:13:04 - download models from model hub: ms
2024-08-01 11:13:04,669 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-08-01 11:13:04,670 - modelscope - INFO - Use user-specified model revision: master
Building prefix dict from the default dictionary ...
2024-08-01 11:13:06 - Building prefix dict from the default dictionary ...
Loading model from cache C:\Users\YUMEI\AppData\Local\Temp\jieba.cache
2024-08-01 11:13:06 - Loading model from cache C:\Users\YUMEI\AppData\Local\Temp\jieba.cache
Loading model cost 0.403 seconds.
2024-08-01 11:13:07 - Loading model cost 0.403 seconds.
Prefix dict has been built successfully.
2024-08-01 11:13:07 - Prefix dict has been built successfully.
2024-08-01 11:13:23 - Loading pretrained params from C:\Users\YUMEI\.cache\modelscope\hub\iic\punc_ct-transformer_cn-en-common-vocab471067-large\model.pt
2024-08-01 11:13:23 - ckpt: C:\Users\YUMEI\.cache\modelscope\hub\iic\punc_ct-transformer_cn-en-common-vocab471067-large\model.pt
2024-08-01 11:13:23 - scope_map: ['module.', 'None']
2024-08-01 11:13:23 - excludes: None
2024-08-01 11:13:25 - Loading ckpt: C:\Users\YUMEI\.cache\modelscope\hub\iic\punc_ct-transformer_cn-en-common-vocab471067-large\model.pt, status:
2024-08-01 11:13:25 - Building SPK model.
2024-08-01 11:13:25 - download models from model hub: ms
2024-08-01 11:13:26,765 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-08-01 11:13:26,765 - modelscope - INFO - Use user-specified model revision: master
Detect model requirements, begin to install it: C:\Users\YUMEI\.cache\modelscope\hub\iic\speech_campplus_sv_zh-cn_16k-common\requirements.txt
install model requirements successfully
2024-08-01 11:13:28 - Loading pretrained params from C:\Users\YUMEI\.cache\modelscope\hub\iic\speech_campplus_sv_zh-cn_16k-common\campplus_cn_common.bin
2024-08-01 11:13:28 - ckpt: C:\Users\YUMEI\.cache\modelscope\hub\iic\speech_campplus_sv_zh-cn_16k-common\campplus_cn_common.bin
2024-08-01 11:13:28 - scope_map: ['module.', 'None']
2024-08-01 11:13:28 - excludes: None
2024-08-01 11:13:28 - Loading ckpt: C:\Users\YUMEI\.cache\modelscope\hub\iic\speech_campplus_sv_zh-cn_16k-common\campplus_cn_common.bin, status:
rtf_avg: 0.148: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00, 1.56s/it]
0%| | 0/1 [00:00<?, ?it/s]2024-08-01 11:13:30 - decoding, utt: be11b39b-1f43-48d0-95ab-ca3ad43f3b79, empty speech