aliyun-speech-recognition

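Alibaba Cloud (Aliyun) speech recognition toolkit

Notes:

  1. Alibaba Cloud's start and end times for individual words (Words) are not accurate. The start and end times for sentences (Sentences) are fairly accurate.

Configuration

Create an app_config.js file in the project root directory and fill in the following content: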

const path = require('path');
const pinyin = require('pinyin');

module.exports = {
    accessKeyId: 'Kyoasdz90lVU7ttN',                        // Your AccessKey Id
    secretAccessKey: 'B2asdVJTaBwyUQ6ANxxe3BwnrRfoqg',      // Your AccessKey Secret
    appkey: 'z67asdyn0aC8byNK',                             // Your appkey
    recognitionResultsPath: path.resolve(__dirname, 'recognition_results'),     // Where speech recognition results are saved
    transcodeAudiosPath: path.resolve(__dirname, 'transcode_audios'),           // Where transcoded audio is saved
    fileServerRoot: 'http://program-hub.cn/index',                              // Root URL of the file server
    reviseTextPath: path.resolve(__dirname, 'revise_text'),                     // Path for revised text
    pinyinOption: { segment: true, style: pinyin.STYLE_TONE2 }                  // Options for annotating text with pinyin; see https://github.com/hotoo/pinyin
};

npm run transcode

Converts media files to the format and encoding Alibaba Cloud requires (mono, 16 kHz sample rate, MP3). The results are saved in transcodeAudiosPath.
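The README does not show how the transcode step is implemented. As a rough sketch of what such a conversion could look like, the snippet below shells out to ffmpeg (assumed to be installed and on the PATH) to produce mono, 16 kHz MP3 output. The function names buildFfmpegArgs and transcode are hypothetical and are not part of this toolkit.

```javascript
const { spawn } = require('child_process');
const path = require('path');

// Hypothetical helper: build the ffmpeg argument list for mono, 16 kHz MP3 output.
function buildFfmpegArgs(inputFile, outputDir) {
    const baseName = path.basename(inputFile, path.extname(inputFile));
    const outputFile = path.join(outputDir, `${baseName}.mp3`);
    return [
        '-y',                       // overwrite an existing output file
        '-i', inputFile,            // input media file
        '-vn',                      // drop any video stream
        '-ac', '1',                 // one audio channel (mono)
        '-ar', '16000',             // 16 kHz sample rate
        '-codec:a', 'libmp3lame',   // encode as MP3
        outputFile,
    ];
}

// Hypothetical helper: run ffmpeg and resolve once it exits with code 0.
function transcode(inputFile, outputDir) {
    return new Promise((resolve, reject) => {
        const child = spawn('ffmpeg', buildFfmpegArgs(inputFile, outputDir), { stdio: 'inherit' });
        child.on('error', reject);  // e.g. ffmpeg is not installed
        child.on('close', code => {
            if (code === 0) {
                resolve();
            } else {
                reject(new Error(`ffmpeg exited with code ${code}`));
            }
        });
    });
}
```

Under these assumptions, calling transcode('media/lesson.mp4', config.transcodeAudiosPath) would write lesson.mp3 into the transcode directory.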

npm run submit_task

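Submits speech recognition tasks to Alibaba Cloud. The files to be recognized must be placed in transcodeAudiosPath, and you need a server with a domain name that reverse-proxies this directory so that Alibaba Cloud can access it. For one way to set this up, see the article "使用 nginx + xshell5 实现内网穿透 (反向隧道)" ("Using nginx + xshell5 for NAT traversal (reverse tunnel)").

Do not shut down the reverse proxy immediately after submitting, because Alibaba Cloud may access the files again later. If it cannot reach them, recognition may fail.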
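Before submitting, it can help to confirm that a file is reachable from outside through the reverse proxy. The sketch below assumes each file in transcodeAudiosPath is served directly under fileServerRoot by its file name. That URL layout is an assumption, not something the README states, and buildFileUrl and checkReachable are hypothetical helpers.

```javascript
// Hypothetical helper: join fileServerRoot and a file name into a public URL.
// Assumes files are exposed as <fileServerRoot>/<fileName>.
function buildFileUrl(fileServerRoot, fileName) {
    const root = fileServerRoot.replace(/\/+$/, '');    // drop trailing slashes
    return `${root}/${encodeURIComponent(fileName)}`;
}

// Hypothetical helper: send a HEAD request and resolve with the HTTP status code.
function checkReachable(url) {
    const client = url.startsWith('https:') ? require('https') : require('http');
    return new Promise((resolve, reject) => {
        const req = client.request(url, { method: 'HEAD' }, res => {
            res.resume();               // discard any response body
            resolve(res.statusCode);
        });
        req.on('error', reject);        // e.g. the proxy is down or DNS fails
        req.end();
    });
}
```

A status of 200 from checkReachable(buildFileUrl(config.fileServerRoot, 'lesson.mp3')) suggests the proxy is serving the file. Running the check from a machine outside your own network gives a more realistic picture of what Alibaba Cloud will see.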

npm run receive_result

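Checks recognition progress and saves the recognition results to recognitionResultsPath.

The processing_tasks.json file in the recognitionResultsPath directory records the tasks whose results have not yet been received or whose recognition failed.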

npm run add_pinyin

Adds pinyin to the speech recognition results.
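How the toolkit applies pinyinOption internally is not shown here. As an illustration only, the sketch below passes a piece of text and an option object to a pinyin function and keeps the first reading at each position. The helper addPinyin is hypothetical. It takes the pinyin function as a parameter so it can be tried without the package installed, and it assumes that function returns a nested array of readings, as the hotoo/pinyin README describes.

```javascript
// Hypothetical helper: annotate text using a pinyin function and options.
// `pinyinFn` is assumed to return a nested array, one inner array of
// candidate readings per position, as described in the hotoo/pinyin README.
function addPinyin(text, pinyinFn, option) {
    return pinyinFn(text, option).map(candidates => candidates[0]);
}

// Usage with the real library (requires `npm install pinyin`):
// const pinyin = require('pinyin');
// const { pinyinOption } = require('./app_config');
// console.log(addPinyin('中心', pinyin, pinyinOption));
```

Keeping only the first candidate is a simplification. With heteronym output enabled in the library, each inner array may hold several readings to choose from.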
