jim-schwoebel / voicebook

🗣️ A book and repo to get you started programming voice computing applications in Python (10 chapters and 200+ scripts).

Home Page: https://neurolex.ai/voicebook

License: Apache License 2.0

Python 83.34% HTML 1.97% MATLAB 0.04% Makefile 0.22% C++ 14.38% Dockerfile 0.01% SWIG 0.04%
voice voice-assistant voice-recognition voice-recording transcription featurization data data-cleaning visualization generation voice-activity-detection voice-control server security encryption-decryption python3 machine-learning wake-word-detection voice-computing

voicebook's Introduction

Voicebook

This is an assortment of all the scripts associated with the Intro to Voice Computing Book. These scripts should give you a jumpstart in learning how to write Python code for voice-related applications.

^^ Note: If you find this code useful, please leave a star :) ^^

Getting started: setting up environment (mac)

Watch the overview video for this book here.

Note that these instructions are for Mac computers; if you're using Windows or Linux, you may need to set up FFmpeg and/or SoX yourself.

First, clone the repository and submodules:

git clone --recurse-submodules -j8 https://github.com/jim-schwoebel/voicebook

Next, run the setup.py script in a virtual environment to make sure you have the dependencies required for all the chapters of the book:

cd voicebook
pip3 install virtualenv
virtualenv env 
source env/bin/activate
python3 setup.py

Now you have all the dependencies necessary to follow along with the chapters in the book. You don’t need to worry about any other installations.
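If you are on Windows or Linux and are unsure whether FFmpeg and SoX are reachable from your shell, a quick check like the one below may help. This is a minimal sketch and not part of the repository: the function name `missing_tools` is hypothetical, and it assumes the executables are named `ffmpeg` and `sox` on your PATH.

```python
import shutil


def missing_tools(tools=("ffmpeg", "sox")):
    """Return the names in `tools` that cannot be found on the PATH."""
    # shutil.which returns None when no matching executable is found.
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("FFmpeg and SoX are both available.")
```

Run it inside the activated virtual environment; any tool it reports as missing needs to be installed separately.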

License

This repository is licensed under the Apache 2.0 License.

Feedback

Any feedback on the book or this repository is greatly appreciated.

  • If you find something that is missing or doesn't work, please consider opening a GitHub issue.
  • If you'd like to be mentored by someone on our team, check out the Innovation Fellows Program.
  • If you are looking for a framework to start building machine learning models in voice computing, check out Allie.
  • If you want to talk to me directly, please send me an email @ [email protected].

Citation

Please use the following citation when citing this book in your research work:

References

Check out the wiki or documentation below to follow along with each chapter in the book. This will help you get up to speed quickly with the 200+ scripts included in this repository.

voicebook's Issues

Chapter 7 - Page 269 - grammar error

Top of page --> however, if you __this code __ has a few issues.

Need to change to --> However, if you look at this code you'll notice a few issues:

I am having trouble installing the Voicebook on Linux

The book was written mostly for macOS, but you can definitely run this on Linux. You just need to install FFmpeg and SoX yourself:

sudo apt-get install sox
sudo apt-get install ffmpeg

You also need to modify requirements.txt so that it works on Linux. I have attached a different requirements.txt, which you can install with this command:

pip3 install -r requirements.txt 

The rest of the modules should then install. I believe the only modules that cannot be installed on Linux are:

pyobjc-framework-Accounts==6.2
pyobjc-framework-AddressBook==6.2
pyobjc-framework-AdSupport==6.2
pyobjc-framework-AppleScriptKit==6.2
pyobjc-framework-AppleScriptObjC==6.2
pyobjc-framework-ApplicationServices==6.2
pyobjc-framework-AuthenticationServices==6.2
pyobjc-framework-AutomaticAssessmentConfiguration==6.2
pyobjc-framework-Automator==6.2
pyobjc-framework-AVFoundation==6.2
pyobjc-framework-AVKit==6.2
pyobjc-framework-BusinessChat==6.2
pyobjc-framework-CalendarStore==6.2
pyobjc-framework-CFNetwork==6.2
pyobjc-framework-CloudKit==6.2
pyobjc-framework-Cocoa==6.2
pyobjc-framework-Collaboration==6.2
pyobjc-framework-ColorSync==6.2
pyobjc-framework-Contacts==6.2
pyobjc-framework-ContactsUI==6.2
pyobjc-framework-CoreAudio==6.2
pyobjc-framework-CoreAudioKit==6.2
pyobjc-framework-CoreBluetooth==6.2
pyobjc-framework-CoreData==6.2
pyobjc-framework-CoreHaptics==6.2
pyobjc-framework-CoreLocation==6.2
pyobjc-framework-CoreMedia==6.2
pyobjc-framework-CoreMediaIO==6.2
pyobjc-framework-CoreML==6.2
pyobjc-framework-CoreMotion==6.2
pyobjc-framework-CoreServices==6.2
pyobjc-framework-CoreSpotlight==6.2
pyobjc-framework-CoreText==6.2
pyobjc-framework-CoreWLAN==6.2
pyobjc-framework-CryptoTokenKit==6.2
pyobjc-framework-DeviceCheck==6.2
pyobjc-framework-DictionaryServices==6.2
pyobjc-framework-DiscRecording==6.2
pyobjc-framework-DiscRecordingUI==6.2
pyobjc-framework-DiskArbitration==6.2
pyobjc-framework-DVDPlayback==6.2
pyobjc-framework-EventKit==6.2
pyobjc-framework-ExceptionHandling==6.2
pyobjc-framework-ExecutionPolicy==6.2
pyobjc-framework-ExternalAccessory==6.2
pyobjc-framework-FileProvider==6.2
pyobjc-framework-FileProviderUI==6.2
pyobjc-framework-FinderSync==6.2
pyobjc-framework-FSEvents==6.2
pyobjc-framework-GameCenter==6.2
pyobjc-framework-GameController==6.2
pyobjc-framework-GameKit==6.2
pyobjc-framework-GameplayKit==6.2
pyobjc-framework-ImageCaptureCore==6.2
pyobjc-framework-IMServicePlugIn==6.2
pyobjc-framework-InputMethodKit==6.2
pyobjc-framework-InstallerPlugins==6.2
pyobjc-framework-InstantMessage==6.2
pyobjc-framework-Intents==6.2
pyobjc-framework-IOSurface==6.2
pyobjc-framework-iTunesLibrary==6.2
pyobjc-framework-LatentSemanticMapping==6.2
pyobjc-framework-LaunchServices==6.2
pyobjc-framework-libdispatch==6.2
pyobjc-framework-LinkPresentation==6.2
pyobjc-framework-LocalAuthentication==6.2
pyobjc-framework-MapKit==6.2
pyobjc-framework-MediaAccessibility==6.2
pyobjc-framework-MediaLibrary==6.2
pyobjc-framework-MediaPlayer==6.2
pyobjc-framework-MediaToolbox==6.2
pyobjc-framework-Metal==6.2
pyobjc-framework-MetalKit==6.2
pyobjc-framework-ModelIO==6.2
pyobjc-framework-MultipeerConnectivity==6.2
pyobjc-framework-NaturalLanguage==6.2
pyobjc-framework-NetFS==6.2
pyobjc-framework-Network==6.2
pyobjc-framework-NetworkExtension==6.2
pyobjc-framework-NotificationCenter==6.2
pyobjc-framework-OpenDirectory==6.2
pyobjc-framework-OSAKit==6.2
pyobjc-framework-OSLog==6.2
pyobjc-framework-PencilKit==6.2
pyobjc-framework-Photos==6.2
pyobjc-framework-PhotosUI==6.2
pyobjc-framework-PreferencePanes==6.2
pyobjc-framework-PubSub==6.2
pyobjc-framework-PushKit==6.2
pyobjc-framework-Quartz==6.2
pyobjc-framework-QuickLookThumbnailing==6.2
pyobjc-framework-SafariServices==6.2
pyobjc-framework-SceneKit==6.2
pyobjc-framework-ScreenSaver==6.2
pyobjc-framework-ScriptingBridge==6.2
pyobjc-framework-SearchKit==6.2
pyobjc-framework-Security==6.2
pyobjc-framework-SecurityFoundation==6.2
pyobjc-framework-SecurityInterface==6.2
pyobjc-framework-ServiceManagement==6.2
pyobjc-framework-Social==6.2
pyobjc-framework-SoundAnalysis==6.2
pyobjc-framework-Speech==6.2
pyobjc-framework-SpriteKit==6.2
pyobjc-framework-StoreKit==6.2
pyobjc-framework-SyncServices==6.2
pyobjc-framework-SystemConfiguration==6.2
pyobjc-framework-SystemExtensions==6.2
pyobjc-framework-UserNotifications==6.2
pyobjc-framework-VideoSubscriberAccount==6.2
pyobjc-framework-VideoToolbox==6.2
pyobjc-framework-Vision==6.2
pyobjc-framework-WebKit==6.2
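If you would rather derive a Linux-friendly file from the original requirements.txt than use the attached one, the filtering can be sketched as below. This is an illustration under assumptions, not the author's method: the function names and the output filename `requirements-linux.txt` are hypothetical, and it assumes the only packages to drop are the ones whose names start with `pyobjc-framework-`, as in the list above.

```python
def strip_macos_only(lines, prefixes=("pyobjc-framework-",)):
    """Drop requirement lines whose package name starts with a macOS-only prefix."""
    kept = []
    for line in lines:
        # Compare case-insensitively and ignore surrounding whitespace.
        if line.strip().lower().startswith(prefixes):
            continue
        kept.append(line)
    return kept


def write_linux_requirements(src_path="requirements.txt",
                             dst_path="requirements-linux.txt"):
    """Copy src_path to dst_path without the macOS-only lines (hypothetical helper)."""
    with open(src_path) as src:
        filtered = strip_macos_only(src.readlines())
    with open(dst_path, "w") as dst:
        dst.writelines(filtered)
```

After calling `write_linux_requirements()` in the repository root, `pip3 install -r requirements-linux.txt` would install the remaining modules. If other macOS-only packages turn up, add their prefixes to the `prefixes` tuple.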

Thanks! LSTM wanted...

Hi, many thanks for the great overview and samples. I will definitely recommend this as a starting point.

It would also be nice to add CNN and RNN examples for audio classification.

Ongoing list of suggestions (book 2)

Introduction

Chapter 1 - Collection

  • Add in James's comments about a short demo that collects audio both far from and close to the microphone, to make the lesson more intuitive. See the feedback folder on Google Drive.

Chapter 3 - Featurization

Chapter 4 - Modeling

  • Introduce AutoML techniques just like in classification.

Chapter 6 - Visualization

  • Add an Elasticsearch / Kibana section.

homebrew install

A Homebrew install is not mentioned in voicebook prior to the python3 setup step ('install' missing?).

Chapter 1 - collection

Add in James's comments about a short demo that collects audio both far from and close to the microphone, to make the lesson more intuitive.

See the feedback folder on Google Drive.

fatal: The remote end hung up unexpectedly fatal: early EOF fatal: index-pack failed

This is often due to a slow internet connection.

For a resolution, see https://stackoverflow.com/questions/38618885/error-rpc-failed-curl-transfer-closed-with-outstanding-read-data-remaining

====

It happens more often than not: I am on a slow internet connection and have to clone a fairly large Git repository. The most common issue is that the connection closes and the whole clone is cancelled.

Cloning into 'large-repository'...
remote: Counting objects: 20248, done.
remote: Compressing objects: 100% (10204/10204), done.
error: RPC failed; curl 18 transfer closed with outstanding read data remaining 
fatal: The remote end hung up unexpectedly
fatal: early EOF
fatal: index-pack failed

After a lot of trial and error and a lot of “remote end hung up unexpectedly” messages, I found a way that works for me. The idea is to do a shallow clone first and then update the repository with its history.

$ git clone http://github.com/large-repository --depth 1
$ cd large-repository
$ git fetch --unshallow
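For anyone scripting this workaround, the same two steps can be wrapped in Python. This is a minimal sketch, assuming `git` is installed and on the PATH; the function names are hypothetical, and the commands only mirror the shallow clone and `git fetch --unshallow` shown above.

```python
import subprocess


def shallow_clone_commands(url, dest, depth=1):
    """Build the git commands for a shallow clone followed by a full history fetch."""
    return [
        # Step 1: fetch only the most recent `depth` commits.
        ["git", "clone", "--depth", str(depth), url, dest],
        # Step 2: from inside the clone, download the rest of the history.
        ["git", "-C", dest, "fetch", "--unshallow"],
    ]


def clone_in_two_steps(url, dest):
    """Run the commands in order; check=True raises on the first failure."""
    for command in shallow_clone_commands(url, dest):
        subprocess.run(command, check=True)
```

Keeping the command list separate from the runner makes it easy to print the commands or retry only the second step if the connection drops during the history fetch.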

Page 40 - Typo

"You know know why voice computing is important."
==>
"You now know why..."

Chapter 1 - Page 35 - Table 1.1.1 mis-labeled [SAMPLE]

The table is not labeled properly. The proper label should be Table 1.3.4. Also, the description is a bit unclear. You may want to eliminate the last sentence.

Note this is intended to just be a demo issue to suggest enhancements on the book.
