audeering / opensmile

The Munich Open-Source Large-Scale Multimedia Feature Extractor

Home Page: https://audeering.github.io/opensmile/

License: Other

CMake 1.23% PowerShell 0.04% Shell 0.22% PHP 1.78% C++ 85.77% Kotlin 0.36% SWIG 0.15% Swift 0.50% Objective-C 0.03% Objective-C++ 0.14% C# 1.17% Python 0.66% C 6.61% Perl 1.27% Gnuplot 0.03% Makefile 0.01% HTML 0.01%
machine-learning feature-extraction audio

opensmile's Introduction

openSMILE (open-source Speech and Music Interpretation by Large-space Extraction) is a complete and open-source toolkit for audio analysis, processing and classification especially targeted at speech and music applications, e.g. automatic speech recognition, speaker identification, emotion recognition, or beat tracking and chord detection.

It is written purely in C++, has a fast, efficient, and flexible architecture, and runs on desktop, mobile, and embedded platforms such as Linux, Windows, macOS, Android, iOS and Raspberry Pi.

See also the standalone opensmile Python package for an easy-to-use wrapper if you are working in Python.

What's new

Please see our blog post on audeering.com for a summary of the new features in version 3.0.

Quick start

Pre-built x64 binaries for Windows, Linux, and macOS are provided on the Releases page. Alternatively, you may follow the steps below to build openSMILE yourself, if desired.

For more details on how to customize builds, build for other platforms, and use openSMILE, see Section Get started in the documentation.

Linux/macOS

Prerequisites:

  • A version of gcc and g++ or Clang needs to be installed that supports C++11.
  • CMake 3.5.1 or later needs to be installed and in the PATH.
  1. In build_flags.sh, set build flags and options as desired.
  2. Run bash build.sh.

Build files will be generated in the ./build subdirectory. You can find the main SMILExtract binary in ./build/progsrc/smilextract.

Windows

Prerequisites:

  • Visual Studio 2017 or higher with C++ components is required.
  • CMake 3.15 or later needs to be installed and in the PATH.
  1. In build_flags.ps1, set build flags and options as desired.
  2. Run powershell -ExecutionPolicy Bypass -File build.ps1.

Build files will be generated in the ./build subdirectory. You can find the main SMILExtract.exe binary in ./build/progsrc/smilextract.

Documentation

You can find extensive documentation with step-by-step instructions on how to build openSMILE and get started at https://audeering.github.io/opensmile/.

History

The toolkit was first developed at the Institute for Human-Machine Communication at the Technische Universität München in Munich, Germany. It was started within the SEMAINE EU-FP7 research project. The toolkit is now owned and maintained by audEERING GmbH, who provide intelligent audio analysis solutions, automatic speech emotion recognition, and paralinguistic speech analysis software packages as well as consulting and development services on these topics.

Contributing and Support

We welcome contributions! For feedback and technical support, please use the issue tracker.

Licensing

openSMILE follows a dual-licensing model. Since the main goal of the project is widespread use of the software to facilitate research in the field of machine learning from audio-visual signals, the source code and binaries are freely available for private, research, and educational use under an open-source license (see LICENSE). The open-source version of openSMILE may not be used for any sort of commercial product. Fundamental research in companies, for example, is permitted, but if a product is the result of the research, we require you to buy a commercial development license. Contact us at [email protected] (or visit us at https://www.audeering.com) for more information.

Original authors: Florian Eyben, Felix Weninger, Martin Wöllmer, Björn Schuller
Copyright © 2008-2013, Institute for Human-Machine Communication, Technische Universität München, Germany
Copyright © 2013-2015, audEERING UG (haftungsbeschränkt)
Copyright © 2016-2023, audEERING GmbH

Citing

Please cite openSMILE in your publications by citing the following paper:

Florian Eyben, Martin Wöllmer, Björn Schuller: "openSMILE - The Munich Versatile and Fast Open-Source Audio Feature Extractor", Proc. ACM Multimedia (MM), ACM, Florence, Italy, ISBN 978-1-60558-933-6, pp. 1459-1462, 25.-29.10.2010.

opensmile's Issues

eGeMAPS Implementation

I am a graduate student who uses eGeMAPS on a regular basis — both low level descriptors and functionals. I am studying for my qualifying exam and I am expected to know exactly how these are implemented. How do I find their implementation in the opensmile code base?

I bought the thesis book by the opensmile author, which is helpful. However, many details are missing, and I want to be able to know how the math is being represented in code. I can find the config but not the implementation.

Let’s start with the first low level descriptor — Loudness. Where is this implemented?

(MSG): No filename given

Thanks for your amazing software.
I encountered some messages when using openSMILE to extract features from a .wav file.
I wonder what is wrong with my usage.
Could you give me some advice?
The command I used looks like this:
cd /content/drive/MyDrive/opensmile && ./SMILExtract -C /content/drive/MyDrive/opensmile/config/is09-13/IS09_emotion.conf -I /content/drive/MyDrive/audio_0.wav -O /content/drive/opensmile_extract/audio_0.txt

What are the valid output range of feature values?

I used openSMILE to extract features from audio with "ComParE2016", which returned 6373 features. I converted them to a NumPy array and plotted them. I would like to know whether these values are correct or whether some of them are outliers, because some values are on the order of 10^13 while many others are just below 5000.

Ndst in ./src/lldcore/intensity.cpp

Hi, I was toying around with the intensity/loudness calculations and noticed that if I make a standard config with a frame step of 0.01 s and a frame size of 0.025 s:

   // The frame size in seconds (0.0 = full input, same as frameMode=full)
frameSize = 0.025000

   // The frame step (frame sampling period) in seconds (0.0 = set to the same value as 'frameSize')
frameStep = 0.010000

then Ndst always seems to have value 1, so the line:

long safeN = MIN(Nsrc,MIN(nWin,Ndst));

always evaluates to 1 (while Nsrc and nWin will usually be 400 at 16 kHz), so only the first data point in the frame is used (and the Hamming window is effectively just a scaling value).

This looks like a bug, but maybe I'm just misinterpreting or misusing the code. Is there a way to get Ndst to behave better, or is this expected?
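
The arithmetic in this report can be checked with a small sketch. It only mirrors the quoted MIN expression and the numbers given above; the function names are hypothetical, and the value Ndst = 1 is taken from the report, not verified against the openSMILE sources.

```python
def frame_samples(frame_size_s: float, sample_rate_hz: int) -> int:
    """Number of samples in one frame of the given duration."""
    return round(frame_size_s * sample_rate_hz)


def safe_n(n_src: int, n_win: int, n_dst: int) -> int:
    """Mirror of the quoted expression MIN(Nsrc, MIN(nWin, Ndst))."""
    return min(n_src, min(n_win, n_dst))


n = frame_samples(0.025, 16000)  # 25 ms frame at 16 kHz
# Ndst = 1 is the value reported in the issue (an assumption here).
print(n, safe_n(n, n, 1))
```

With Nsrc and nWin both equal to the frame length, the result is bounded by Ndst alone, which is why the report sees only one sample being processed.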

Installation with PortAudio

Hey guys, thanks for building this interesting tool,

I am following the tutorial on Getting Started with Portaudio.

As instructed, I first installed PortAudio with vcpkg. Then, in "build_flags.ps1" in the opensmile directory, I uncommented "-DCMAKE_TOOLCHAIN_FILE=C:\Users\schne\Documents\github\everyday_audio\packages\vcpkg\scripts\buildsystems\vcpkg.cmake"

Then I compiled opensmile with powershell -ExecutionPolicy Bypass -File build.ps1

However, when I run SMILExtract -H cPortaudio, I do not see any configuration options for PortAudio components (see the output below).

I've tried to install with the triplets "portaudio:x64-windows", "portaudio:x64-windows-static", and "portaudio:x86-windows". All have the same result.

When trying to record with SMILExtract -C config/demo/audiorecorder.conf -sampleRate 44100 -channels 2 -O output.wav, I only get the following error message:

(ERR) [1] configManager: cFileConfigReader::openInput : cannot find input file 'config/demo/audiorecorder.conf'!

I would be glad about some feedback. Thanks in advance for your help.

Here is the output from the x64 Native Tools Command Prompt for VS 2019:

**********************************************************************
** Visual Studio 2019 Developer Command Prompt v16.11.5
** Copyright (c) 2021 Microsoft Corporation
**********************************************************************
[vcvarsall.bat] Environment initialized for: 'x64'

C:\Windows\System32>cd C:\Users\schne\Documents\github\everyday_audio\packages\vcpkg

C:\Users\schne\Documents\github\everyday_audio\packages\vcpkg>.\\bootstrap-vcpkg.bat
Downloading https://github.com/microsoft/vcpkg-tool/releases/download/2021-09-10/vcpkg.exe -> C:\Users\schne\Documents\github\everyday_audio\packages\vcpkg\vcpkg.exe... done.
Validating signature... done.

Telemetry
---------
vcpkg collects usage data in order to help us improve your experience.
The data collected by Microsoft is anonymous.
You can opt-out of telemetry by re-running the bootstrap-vcpkg script with -disableMetrics,
passing --disable-metrics to vcpkg on the command line,
or by setting the VCPKG_DISABLE_METRICS environment variable.

Read more about vcpkg telemetry at docs/about/privacy.md

C:\Users\schne\Documents\github\everyday_audio\packages\vcpkg>.\\vcpkg integrate install
Applied user-wide integration for this vcpkg root.

All MSBuild C++ projects can now #include any installed libraries.
Linking will be handled automatically.
Installing new libraries will make them instantly available.

CMake projects should use: "-DCMAKE_TOOLCHAIN_FILE=C:/Users/schne/Documents/github/everyday_audio/packages/vcpkg/scripts/buildsystems/vcpkg.cmake"

C:\Users\schne\Documents\github\everyday_audio\packages\vcpkg>.\\vcpkg install portaudio:x64-windows-static
Computing installation plan...
The following packages will be built and installed:
    portaudio[core]:x64-windows-static -> 19.7#2
Detecting compiler hash for triplet x64-windows-static...
Restored 0 packages from C:\Users\schne\AppData\Local\vcpkg\archives in 283.2 us. Use --debug to see more details.
Starting package 1/1: portaudio:x64-windows-static
Building package portaudio[core]:x64-windows-static...
-- Using cached PortAudio-portaudio-147dd722548358763a8b649b3e4b41dfffbcfbb6.tar.gz.
-- Cleaning sources at C:/Users/schne/Documents/github/everyday_audio/packages/vcpkg/buildtrees/portaudio/src/dfffbcfbb6-810893c379.clean. Use --editable to skip cleaning for the packages you specify.
-- Extracting source C:/Users/schne/Documents/github/everyday_audio/packages/vcpkg/downloads/PortAudio-portaudio-147dd722548358763a8b649b3e4b41dfffbcfbb6.tar.gz
-- Applying patch fix-library-can-not-be-found.patch
-- Using source at C:/Users/schne/Documents/github/everyday_audio/packages/vcpkg/buildtrees/portaudio/src/dfffbcfbb6-810893c379.clean
-- Found external ninja('1.10.2').
-- Configuring x64-windows-static
-- Building x64-windows-static-dbg
-- Building x64-windows-static-rel
-- Installing: C:/Users/schne/Documents/github/everyday_audio/packages/vcpkg/packages/portaudio_x64-windows-static/share/portaudio/copyright
-- Fixing pkgconfig file: C:/Users/schne/Documents/github/everyday_audio/packages/vcpkg/packages/portaudio_x64-windows-static/lib/pkgconfig/portaudio-2.0.pc
-- Using cached msys-mingw-w64-i686-pkg-config-0.29.2-2-any.pkg.tar.zst.
-- Using cached msys-mingw-w64-i686-libwinpthread-git-8.0.0.5906.c9a21571-1-any.pkg.tar.zst.
-- Using msys root at C:/Users/schne/Documents/github/everyday_audio/packages/vcpkg/downloads/tools/msys2/aa5af7b2aa7e90e8
-- Fixing pkgconfig file: C:/Users/schne/Documents/github/everyday_audio/packages/vcpkg/packages/portaudio_x64-windows-static/debug/lib/pkgconfig/portaudio-2.0.pc
-- Performing post-build validation
-- Performing post-build validation done
Stored binary cache: C:\Users\schne\AppData\Local\vcpkg\archives\4b\4b9f0d859c11b38fc6fb236eddfea460226ce533585ac4170f8bab4f9bb0346b.zip
Installing package portaudio[core]:x64-windows-static...
Elapsed time for package portaudio:x64-windows-static: 8.945 s

Total elapsed time: 12.4 s

The package portaudio provides CMake targets:

    find_package(portaudio CONFIG REQUIRED)
    target_link_libraries(main PRIVATE portaudio_static)


C:\Users\schne\Documents\github\everyday_audio\packages\vcpkg>cd ../opensmile-3.0.0

C:\Users\schne\Documents\github\everyday_audio\packages\opensmile-3.0.0>powershell -ExecutionPolicy Bypass -File build.ps1
-- Selecting Windows SDK version 10.0.19041.0 to target Windows 10.0.19042.
-- The C compiler identification is MSVC 19.29.30136.0
-- The CXX compiler identification is MSVC 19.29.30136.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.29.30133/bin/Hostx64/x64/cl.exe - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.29.30133/bin/Hostx64/x64/cl.exe - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: C:/Program Files/Git/cmd/git.exe (found version "2.24.1.windows.2")
-- Looking for pthread.h
-- Looking for pthread.h - not found
-- Found Threads: TRUE
-- Configuring done
-- Generating done
-- Build files have been written to: C:/Users/schne/Documents/github/everyday_audio/packages/opensmile-3.0.0/build
Microsoft (R) Build Engine version 16.11.1+3e40a09f8 for .NET Framework
Copyright (C) Microsoft Corporation. All rights reserved.

  Checking Build System
  Building Custom Rule C:/Users/schne/Documents/github/everyday_audio/packages/opensmile-3.0.0/src/newmat/CMakeLists.tx
  t
  bandmat.cpp
  cholesky.cpp
  evalue.cpp
  fft.cpp
  hholder.cpp
  jacobi.cpp
  myexcept.cpp
  newfft.cpp
  newmat1.cpp
  newmat2.cpp
  newmat3.cpp
  newmat4.cpp
  newmat5.cpp
  newmat6.cpp
  newmat7.cpp
  newmat8.cpp
  newmat9.cpp
  newmatex.cpp
  newmatnl.cpp
  newmatrm.cpp
  Generating Code...
  Compiling...
  solution.cpp
  sort.cpp
  submat.cpp
  svd.cpp
  Generating Code...
  newmat.vcxproj -> C:\Users\schne\Documents\github\everyday_audio\packages\opensmile-3.0.0\build\src\newmat\newmat.dir
  \Release\newmat.lib
  Building Custom Rule C:/Users/schne/Documents/github/everyday_audio/packages/opensmile-3.0.0/CMakeLists.txt
  juliusSink.cpp
  svm.cpp
  libsvmliveSink.cpp
C:\Users\schne\Documents\github\everyday_audio\packages\opensmile-3.0.0\src\classifiers\libsvmliveSink.cpp(323,42): war
ning C4018: '>': signed/unsigned mismatch [C:\Users\schne\Documents\github\everyday_audio\packages\opensmile-3.0.0\buil
d\opensmile.vcxproj]
C:\Users\schne\Documents\github\everyday_audio\packages\opensmile-3.0.0\src\classifiers\libsvmliveSink.cpp(337,46): war
ning C4018: '>': signed/unsigned mismatch [C:\Users\schne\Documents\github\everyday_audio\packages\opensmile-3.0.0\buil
d\opensmile.vcxproj]
  svmSink.cpp
  commandlineParser.cpp
  componentManager.cpp
  configManager.cpp
  dataMemory.cpp
  dataProcessor.cpp
  dataReader.cpp
  dataSelector.cpp
  dataSink.cpp
  dataSource.cpp
  dataWriter.cpp
  exceptions.cpp
  nullSink.cpp
  smileCommon.cpp
  smileComponent.cpp
  smileLogger.cpp
  vecToWinProcessor.cpp
  Compiling...
  vectorProcessor.cpp
  vectorTransform.cpp
  winToVecProcessor.cpp
  windowProcessor.cpp
  dbA.cpp
  signalGenerator.cpp
  smileResample.cpp
  specResample.cpp
  vadV1.cpp
  specScale.cpp
  acf.cpp
  amdf.cpp
  contourSmoother.cpp
  deltaRegression.cpp
  fftmagphase.cpp
  framer.cpp
  fullinputMean.cpp
  fullturnMean.cpp
  monoMixdown.cpp
  preemphasis.cpp
  Compiling...
  transformFft.cpp
  turnDetector.cpp
  vectorMVN.cpp
  vectorPreemphasis.cpp
  windower.cpp
  exampleSink.cpp
  exampleSource.cpp
  simpleMessageSender.cpp
  ffmpegSource.cpp
  functionalComponent.cpp
  functionalCrossings.cpp
  functionalDCT.cpp
  functionalExtremes.cpp
  functionalLpc.cpp
  functionalMeans.cpp
  functionalMoments.cpp
  functionalOnset.cpp
  functionalPeaks.cpp
  functionalPeaks2.cpp
  functionalPercentiles.cpp
  Compiling...
  functionalRegression.cpp
  functionalSamples.cpp
  functionalSegments.cpp
  functionalTimes.cpp
  functionalModulation.cpp
  functionals.cpp
  libsvmSink.cpp
  arffSink.cpp
  arffSource.cpp
  csvSink.cpp
  csvSource.cpp
  datadumpSink.cpp
  dataPrintSink.cpp
  htkSink.cpp
  htkSource.cpp
  externalSink.cpp
  externalSource.cpp
  externalAudioSource.cpp
  waveSink.cpp
  waveSinkCut.cpp
  Compiling...
  waveSource.cpp
  cens.cpp
  chroma.cpp
  formantLpc.cpp
  formantSmoother.cpp
  lpc.cpp
  lsp.cpp
  pitchDirection.cpp
  pitchJitter.cpp
  pitchShs.cpp
  pitchSmootherViterbi.cpp
  tonefilt.cpp
  tonespec.cpp
  harmonics.cpp
  energy.cpp
  intensity.cpp
  melspec.cpp
  mfcc.cpp
  mzcr.cpp
  pitchACF.cpp
  Compiling...
  pitchBase.cpp
  pitchSmoother.cpp
  plp.cpp
  spectral.cpp
  bowProducer.cpp
  maxIndex.cpp
  valbasedSelector.cpp
  vectorConcat.cpp
  vectorBinaryOperation.cpp
  vectorOperation.cpp
  externalMessageInterface.cpp
  portaudioDuplex.cpp
  portaudioSink.cpp
  portaudioSource.cpp
  portaudioWavplayer.cpp
  openslesSource.cpp
  jniMessageInterface.cpp
  rnn.cpp
  rnnProcessor.cpp
  rnnSink.cpp
  Compiling...
  rnnVad2.cpp
  smileUtilCsv.cpp
  zerosolve.cpp
  JsonClasses.cpp
  openCVSource.cpp
  fftsg.c
  smileUtil.c
  smileUtilSpline.c
  opensmile.vcxproj -> C:\Users\schne\Documents\github\everyday_audio\packages\opensmile-3.0.0\build\Release\opensmile.
  lib
  Building Custom Rule C:/Users/schne/Documents/github/everyday_audio/packages/opensmile-3.0.0/progsrc/smilextract/CMak
  eLists.txt
  Building Custom Rule C:/Users/schne/Documents/github/everyday_audio/packages/opensmile-3.0.0/progsrc/smileapi/CMakeLi
  sts.txt
  SMILEapi.cpp
  SMILExtract.cpp
  opensmile.lib(smileCommon.obj) : MSIL .netmodule or module compiled with /GL found; restarting link with /LTCG; add /
  LTCG to the link command line to improve linker performance
  opensmile.lib(smileCommon.obj) : MSIL .netmodule or module compiled with /GL found; restarting link with /LTCG; add /
  LTCG to the link command line to improve linker performance
  Generating code
     Creating library C:/Users/schne/Documents/github/everyday_audio/packages/opensmile-3.0.0/build/progsrc/smileapi/Re
  lease/SMILEapi.lib and object C:/Users/schne/Documents/github/everyday_audio/packages/opensmile-3.0.0/build/progsrc/s
  mileapi/Release/SMILEapi.exp
  Generating code
C:\Users\schne\Documents\github\everyday_audio\packages\opensmile-3.0.0\src\lld\harmonics.cpp(431): warning C4723: pote
ntial divide by 0 [C:\Users\schne\Documents\github\everyday_audio\packages\opensmile-3.0.0\build\progsrc\smilextract\SM
ILExtract.vcxproj]
C:\Users\schne\Documents\github\everyday_audio\packages\opensmile-3.0.0\src\lld\harmonics.cpp(431): warning C4723: pote
ntial divide by 0 [C:\Users\schne\Documents\github\everyday_audio\packages\opensmile-3.0.0\build\progsrc\smileapi\SMILE
api.vcxproj]
  Finished generating code
  SMILExtract.vcxproj -> C:\Users\schne\Documents\github\everyday_audio\packages\opensmile-3.0.0\build\progsrc\smilextr
  act\Release\SMILExtract.exe
  Finished generating code
  SMILEapi.vcxproj -> C:\Users\schne\Documents\github\everyday_audio\packages\opensmile-3.0.0\build\progsrc\smileapi\Re
  lease\SMILEapi.dll
  Building Custom Rule C:/Users/schne/Documents/github/everyday_audio/packages/opensmile-3.0.0/CMakeLists.txt

C:\Users\schne\Documents\github\everyday_audio\packages\opensmile-3.0.0>SMILExtract -H cPortaudio
(MSG) [2] SMILExtract: openSMILE starting!
(MSG) [2] SMILExtract: config file is: smile.conf
(MSG) [2] cComponentManager: successfully registered 103 component types.

C:\Users\schne\Documents\github\everyday_audio\packages\opensmile-3.0.0>

C:\Users\schne\Documents\github\everyday_audio\packages\opensmile-3.0.0>

An error occurred when building the AAR

FAILURE: Build failed with an exception.

  • What went wrong:
    Execution failed for task ':opensmile:swig'.

A problem occurred starting process 'command 'sed''

Use ffmpeg to send input to opensmile to get features?

I want to use FFmpeg to send input to openSMILE and generate features from eGeMAPS, prosody, or MFCC.
I am able to modify the config files to get live input, but now I want to take the input from a video source, extract the audio via FFmpeg, and send it to openSMILE.
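
One way to approach this, sketched under assumptions, is a two-step pipeline: first let FFmpeg write the audio track to a WAV file, then run SMILExtract on that file. The sketch below only builds the two command lines. The 16 kHz mono 16-bit format and all file paths are placeholders chosen for illustration, and the config path is the MFCC config quoted in another issue on this page. Streaming audio directly from FFmpeg into openSMILE is not covered.

```python
from typing import List


def ffmpeg_extract_cmd(video: str, wav: str, rate: int = 16000) -> List[str]:
    """FFmpeg call that drops the video stream and writes mono 16-bit PCM."""
    return ["ffmpeg", "-y", "-i", video, "-vn",
            "-acodec", "pcm_s16le", "-ar", str(rate), "-ac", "1", wav]


def smilextract_cmd(config: str, wav: str, out: str) -> List[str]:
    """SMILExtract call using the -C/-I/-O options shown on this page."""
    return ["SMILExtract", "-C", config, "-I", wav, "-O", out]


# Placeholder paths; each list can be passed to subprocess.run(cmd, check=True).
print(" ".join(ffmpeg_extract_cmd("input.mp4", "audio.wav")))
print(" ".join(smilextract_cmd("config/mfcc/MFCC12_0_D_A.conf",
                               "audio.wav", "features.csv")))
```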

cFileConfigReader: can't find input file

I am trying to use openSMILE 3.0 from Python on Windows 10. I downloaded it from the official website and extracted the folder. In cmd, I changed the working directory to opensmile\bin and ran SMILExtract -h.
It prints the version and other details, so the installation appears to work.

Then, for feature extraction:

C:\Users\HP\Desktop\opensmile\bin>SMILExtract -C config\mfcc\MFCC12_0_D_A.conf -I Desktop\s.wav -O K.csv
(MSG) [2] SMILExtract: openSMILE starting!
(MSG) [2] SMILExtract: config file is: config\mfcc\MFCC12_0_D_A.conf
(MSG) [2] cComponentManager: successfully registered 103 component types.
(ERR) [1] configManager: cFileConfigReader::openInput : cannot find input file 'config\mfcc\MFCC12_0_D_A.conf'!

Please advise.
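
A relative path such as config\mfcc\MFCC12_0_D_A.conf is looked up relative to the current working directory (here the bin folder), so one thing worth checking is whether the config folder actually lives there. The sketch below resolves the config to an absolute path before building the command. The helper names and the install_root argument are hypothetical, and where the config folder sits in a given release is an assumption to verify locally.

```python
from pathlib import Path
from typing import List


def resolve_config(config: str, install_root: str) -> Path:
    """Return an absolute config path, trying the working directory first
    and then install_root (a hypothetical location of the unpacked release)."""
    candidates = [Path.cwd() / config, Path(install_root) / config]
    for candidate in candidates:
        if candidate.is_file():
            return candidate.resolve()
    tried = ", ".join(str(c) for c in candidates)
    raise FileNotFoundError(f"config not found, tried: {tried}")


def build_command(smilextract: str, config: Path, wav: str, out: str) -> List[str]:
    """Assemble the SMILExtract call with an absolute config path."""
    return [smilextract, "-C", str(config), "-I", wav, "-O", out]
```

Failing early with the list of tried locations makes it clearer which directory the relative path was resolved against.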

Output Inconsistency of Feature Set

Hi, I am using the python library to interact with opensmile.

smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.ComParE_2016 ,
    feature_level=opensmile.FeatureLevel.Functionals,
    verbose=True
)

When I use smile.process_file on a file, or smile.process_signal on the same file loaded through librosa, the extracted features differ. I don't understand how this is possible: it is the same file, so the extraction should give the same result whether I pass a file or a signal. I am using librosa because the signal is also used elsewhere in my code.

What do you advise?
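
One possible cause, not confirmed in this thread, is that librosa.load resamples to 22050 Hz and mixes down to mono unless told otherwise (sr=None keeps the native rate), so the samples handed to process_signal may not be the ones process_file reads from disk. The sketch below uses only the standard library to check whether a PCM WAV file's native rate differs from that default; the helper names are hypothetical.

```python
import wave

LIBROSA_DEFAULT_SR = 22050  # librosa.load resamples to this rate unless sr=None


def native_sample_rate(path: str) -> int:
    """Read the sampling rate stored in a PCM WAV header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


def differs_from_librosa_default(path: str) -> bool:
    """True if librosa.load(path) with default arguments would resample."""
    return native_sample_rate(path) != LIBROSA_DEFAULT_SR
```

If the rates differ, loading with sr=None and passing the matching sampling rate to process_signal would be the first thing to try.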

Why does the output CSV file always contain some bad lines?

When I use IS09_emotion.conf to extract fixed features from a large number of wav files and save the output to the same CSV file, some of the output is unusable: when I read it with pd.read_csv, it always reports bad lines with a warning such as 'Skipping line 725: expected 386 fields, saw 737\n', where 737 can also be 738, 739, and so on. I guess the results of two wav files were written to the same line, and sometimes some values are missing.
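
To find out which rows are affected before loading the file with pandas, the field count of each row can be checked directly. This is a minimal sketch with a hypothetical helper name and a toy input; it locates malformed rows but does not explain why they were written that way.

```python
import csv
import io
from typing import Iterable, List, Tuple


def find_bad_rows(lines: Iterable[str], expected: int,
                  delimiter: str = ",") -> List[Tuple[int, int]]:
    """Return (row_number, field_count) for rows whose width is not `expected`."""
    bad = []
    for number, row in enumerate(csv.reader(lines, delimiter=delimiter), start=1):
        if row and len(row) != expected:
            bad.append((number, len(row)))
    return bad


# Toy data: the third row has too many fields.
sample = io.StringIO("a,b,c\n1,2,3\n4,5,6,7,8\n")
print(find_bad_rows(sample, expected=3))
```

For the case described above, the call would use expected=386 on the opened output file, and the reported rows point to the wav files that need to be extracted again.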

Error when running the build process

Hi, I am trying to set up opensmile on my laptop, and when I run the command "powershell -ExecutionPolicy Bypass -File build.ps1", I get a warning and an error. I can't figure out what is going wrong. Could you give me some help?

zsh: command not found: SMILExtract

Similar to issue #25, I get "zsh: command not found: SMILExtract" when running SMILExtract -h, but the solution there didn't solve my problem. The documentation says "CMake 3.5.1 or higher is required", but the latest release of CMake is only 3.23.2. Could that be the issue? Any help is appreciated!
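
Dotted version numbers compare component by component, so 3.23.2 is later than 3.5.1 and meets the stated requirement. A "command not found" message comes from the shell when the executable is not in a directory on PATH; per the README, the built binary is placed in ./build/progsrc/smilextract. The sketch below illustrates both points. The helper names are hypothetical, and the build directory is only checked relative to the current working directory.

```python
import shutil
from pathlib import Path
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Split a dotted version such as '3.23.2' into integer components."""
    return tuple(int(part) for part in text.split("."))


def locate(name: str = "SMILExtract",
           build_dir: str = "build/progsrc/smilextract") -> Optional[str]:
    """Look on PATH first, then in the build output directory from the README."""
    found = shutil.which(name)
    if found:
        return found
    candidate = Path(build_dir) / name
    return str(candidate) if candidate.is_file() else None


# Components compare numerically: 23 is greater than 5.
print(parse_version("3.23.2") >= parse_version("3.5.1"))
print(locate())
```

If locate() returns a path inside the build directory, running the binary by that path (or adding the directory to PATH) should avoid the shell error.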

How to extract dynamic features with OpenCV

I tried to extract 260-dimensional dynamic features (one feature vector every 0.1 seconds) using the command: /Users/Documents/PMEmo/opensmile-3.0-osx-x64/bin/SMILExtract -C /Users/Documents/PMEmo/opensmile-3.0-osx-x64/config/is09-13/IS13_ComParE.conf -I /Users/Documents/dataset/Chorus/wav/233.wav -csvoutput test.csv. But the output is one-dimensional and doesn't contain features for every 0.1 seconds. Which config or command should I use to extract dynamic features?

In version 2.1.0, I extracted LLD features with a frame size of 60 ms and a step size of 10 ms using the following commands:
import os
import subprocess
# opensmiledir, wavpath, and distfile are assumed to be defined earlier in the script
SMILExtract = os.path.join(opensmiledir,"SMILExtract")
config_file = os.path.join(opensmiledir,"config", "IS13_ComParE_lld.conf")
subprocess.check_call([SMILExtract, "-C", config_file, "-I", wavpath, "-O", distfile])

How about version 3.0?

use of undeclared identifier 'finite' error

The following 4 files are affected:
vectorMVN.cpp
libsvmSink.cpp
functionalRegression.cpp
functionalDCT.cpp

After changing it locally to isfinite, the installation finished without errors.

Extracting features with OpenCV

Thanks for the software!

I am getting the following errors when I provide audio-visual input:

(ERR) [0] commandlineParser: parse: unknown option '-P' on commandline!
(ERR) [1] cComponentManager: cannot add component (instname='openCVSource' type='cOpenCVSource'): unknown component type!!
(ERR) [1] cComponentManager: error during addComponent (returnVal=-1)!

Can you please help? Do I need to download some models separately?

Accept PCM data and return the extracted feature data

Hi, we know that when executing SMILExtract, you need to specify an input file and an output file, e.g.:
SMILExtract -C config/emobase/emobase2010.conf -I ./1.WAV -O ./1.arff

I want to pass in PCM data and get the extracted feature data back. Is this possible?
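
The standalone Python package mentioned in the README offers process_signal (quoted in another issue on this page), which works on an in-memory signal and its sampling rate instead of a file. Whether that fits this use case is an assumption. As a minimal sketch of the preparation step, the code below converts raw signed 16-bit little-endian PCM bytes into floating-point samples using only the standard library; the helper name is hypothetical, and the PCM format is assumed.

```python
import struct
from typing import List


def pcm16le_to_float(data: bytes) -> List[float]:
    """Convert raw signed 16-bit little-endian PCM bytes to floats in [-1.0, 1.0)."""
    if len(data) % 2:
        raise ValueError("16-bit PCM needs an even number of bytes")
    count = len(data) // 2
    samples = struct.unpack(f"<{count}h", data)
    return [s / 32768.0 for s in samples]


# Three samples: silence, half scale, and negative full scale.
raw = struct.pack("<3h", 0, 16384, -32768)
print(pcm16le_to_float(raw))
```

The resulting samples could then be converted to a NumPy array and handed to process_signal together with the sampling rate of the PCM stream.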

Issue on recording using smileapi

Hello,

I was able to install/compile openSMILE with PortAudio and can list the various configuration options of cPortAudio using SMILExtract -H. I was also able to record using SMILExtract with the audiorecorder.conf configuration file and get the recorded wav file (I have to use Ctrl+C to exit).

However, when I try to record with smileapi using the same audiorecorder.conf, nothing is recorded. In my C# code, I call the following smileapi methods in one thread:

smile.Initialize(configfile, options);
smile.Run();

In another thread, I try to stop the smile object with the following calls when some event occurs:

smile.Abort();
smile.Dispose();

This approach creates a wav file with a nonzero file size, but it contains no recorded sound. Any help or advice would be greatly appreciated.

Thanks

Extract emotions from speech

Hello,

Is it possible to extract emotions from speech using opensmile (e.g. happiness, anger, etc.)? The main page states that this toolkit can also be used for emotion recognition, but it is not clear whether this functionality is available (e.g. as a pretrained model) and how it can be achieved.

Any help or advice would be greatly appreciated.
Thanks

potential crash for android example app

I believe the current example app can randomly crash when audio recording is stopped, depending on which line of code gets executed first.

In child thread, we have OpenSmileAdapter running:
https://github.com/audeering/opensmile/blob/master/progsrc/android-template/app/src/main/java/com/audeering/androidtemplate/OpenSmileExamples.kt#L112

In the main thread, we have
https://github.com/audeering/opensmile/blob/master/progsrc/android-template/app/src/main/java/com/audeering/androidtemplate/OpenSmileExamples.kt#L194
which then calls
https://github.com/audeering/opensmile/blob/master/progsrc/android-template/app/src/main/java/com/audeering/androidtemplate/OpenSmileExamples.kt#L123

If line 122 ose.smile_abort() is executed, line 112 state = ose.smile_run() in the child thread finishes executing, and then line 123 ose.smile_free() runs. This is the ideal case. However, if the child thread runs slower and ose.smile_free() is executed before ose.smile_run() has finished properly, it causes

A/libc: FORTIFY: pthread_mutex_lock called on a destroyed mutex (0x79cbf38144)
A/libc: Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 21603 (AudioProcessing), pid 21308

This is quite a rare case: I could go a hundred runs without hitting it. However, there is a chance that it will crash when the conditions are right (which caused me headaches for days).

I am not sure if my fix is the best one, but I moved ose.smile_free() into the child thread, after ose.smile_run(), to make sure the order of execution is properly maintained (if I understand the library correctly), and the crashes have stopped.
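
The ordering described in this fix can be sketched in a language-neutral way. The example below is not the Android or openSMILE API: FakeEngine is a hypothetical stand-in whose run() blocks until abort() is called, used only to show that freeing in the worker thread after run() returns keeps the order fixed regardless of thread timing.

```python
import threading
from typing import List


class FakeEngine:
    """Hypothetical stand-in for the native engine; records the order of calls."""

    def __init__(self) -> None:
        self.events: List[str] = []
        self._stop = threading.Event()

    def run(self) -> None:
        self.events.append("run_start")
        self._stop.wait()          # blocks until abort() is called
        self.events.append("run_end")

    def abort(self) -> None:
        self._stop.set()

    def free(self) -> None:
        self.events.append("free")


def worker(engine: FakeEngine) -> None:
    engine.run()
    engine.free()                  # freed only after run() has returned


engine = FakeEngine()
thread = threading.Thread(target=worker, args=(engine,))
thread.start()
engine.abort()                     # the main thread only signals, it never frees
thread.join()
print(engine.events)
```

Because free() is only ever reached after run() returns in the same thread, the main thread cannot destroy resources that the worker is still using.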

Fail to get file list com.audeering.androidtemplate

getFolderSize() : Exception_1 = java.lang.NullPointerException: Attempt to get length of null array

When I click Test Opensles or any other audio recording button, the application closes and Logcat shows these errors.

Improve documentation

  • fix 'sed' on mac/windows or improve documentation
  • better documentation for adding flags to build opensmile for android
  • either remove the dangling WITH_OPENSLES from gradle.properties, or use it in the build.gradle
  • documentation regarding the default audio recording path (change path manually or grant permission)

eGeMAPS V01b and GeMAPS V01b

I am running OpenSmile 3.0.0 on two machines (Mac and Windows) to get GeMAPS features V01b. In both cases, I get the calculated features for the first frame only in the CSV file, and the following warning:

(WRN) [2] cComponentManager: The following component(s) could not perform any work because destination levels are full but no other component performed any work either: lldconcat. Processing will possibly be incomplete. For more details, enable the execDebug option of cComponentManager.

What could be the problem, and how can I fix it?

Error compiling opensmile on Linux CentOS 7

I'm trying to compile opensmile on Linux CentOS 7 and get the following compilation error:

...
[ 20%] Building CXX object CMakeFiles/opensmile.dir/src/core/componentManager.cpp.o
[ 20%] Building CXX object CMakeFiles/opensmile.dir/src/core/configManager.cpp.o
[ 21%] Building CXX object CMakeFiles/opensmile.dir/src/core/dataProcessor.cpp.o
/mnt/users/xxx/repos/opensmile/src/core/configManager.cpp: In member function ‘void cConfigFileBuffer::loadFile(const char*, FILE*)’:
/mnt/users/xxx/repos/opensmile/src/core/configManager.cpp:1722:6: error: use of deleted function ‘std::basic_ifstream& std::basic_ifstream::operator=(const std::basic_ifstream&)’
is = std::ifstream(filename, std::ifstream::in);
^
In file included from /mnt/users/xxx/repos/opensmile/src/include/core/configManager.hpp:27:0,
from /mnt/users/xxx/repos/opensmile/src/core/configManager.cpp:30:
/usr/include/c++/4.8.2/fstream:427:11: note: ‘std::basic_ifstream& std::basic_ifstream::operator=(const std::basic_ifstream&)’ is implicitly deleted because the default definition would be ill-formed:
class basic_ifstream : public basic_istream<_CharT, _Traits>
^
/usr/include/c++/4.8.2/fstream:427:11: error: use of deleted function ‘std::basic_istream& std::basic_istream::operator=(const std::basic_istream&)’
In file included from /usr/include/c++/4.8.2/iterator:65:0,
from /mnt/users/xxx/repos/opensmile/src/include/rapidjson/document.h:56,
from /mnt/users/xxx/repos/opensmile/src/include/core/configManager.hpp:22,
from /mnt/users/xxx/repos/opensmile/src/core/configManager.cpp:30:
/usr/include/c++/4.8.2/istream:58:11: note: ‘std::basic_istream& std::basic_istream::operator=(const std::basic_istream&)’ is implicitly deleted because the default definition would be ill-formed:
class basic_istream : virtual public basic_ios<_CharT, _Traits>
^
/usr/include/c++/4.8.2/istream:58:11: error: use of deleted function ‘std::basic_ios& std::basic_ios::operator=(const std::basic_ios&)’
In file included from /usr/include/c++/4.8.2/ios:44:0,
from /usr/include/c++/4.8.2/ostream:38,
from /usr/include/c++/4.8.2/iterator:64,
from /mnt/users/xxx/repos/opensmile/src/include/rapidjson/document.h:56,
from /mnt/users/xxx/repos/opensmile/src/include/core/configManager.hpp:22,
from /mnt/users/xxx/repos/opensmile/src/core/configManager.cpp:30:
/usr/include/c++/4.8.2/bits/basic_ios.h:66:11: note: ‘std::basic_ios& std::basic_ios::operator=(const std::basic_ios&)’ is implicitly deleted because the default definition would be ill-formed:
class basic_ios : public ios_base
^
In file included from /usr/include/c++/4.8.2/ios:42:0,
from /usr/include/c++/4.8.2/ostream:38,
from /usr/include/c++/4.8.2/iterator:64,
from /mnt/users/xxx/repos/opensmile/src/include/rapidjson/document.h:56,
from /mnt/users/xxx/repos/opensmile/src/include/core/configManager.hpp:22,
from /mnt/users/xxx/repos/opensmile/src/core/configManager.cpp:30:
/usr/include/c++/4.8.2/bits/ios_base.h:789:5: error: ‘std::ios_base& std::ios_base::operator=(const std::ios_base&)’ is private
operator=(const ios_base&);
^
In file included from /usr/include/c++/4.8.2/ios:44:0,
from /usr/include/c++/4.8.2/ostream:38,
from /usr/include/c++/4.8.2/iterator:64,
from /mnt/users/xxx/repos/opensmile/src/include/rapidjson/document.h:56,
from /mnt/users/xxx/repos/opensmile/src/include/core/configManager.hpp:22,
from /mnt/users/xxx/repos/opensmile/src/core/configManager.cpp:30:
/usr/include/c++/4.8.2/bits/basic_ios.h:66:11: error: within this context
class basic_ios : public ios_base
^
In file included from /mnt/users/xxx/repos/opensmile/src/include/core/configManager.hpp:27:0,
from /mnt/users/xxx/repos/opensmile/src/core/configManager.cpp:30:
/usr/include/c++/4.8.2/fstream:427:11: error: use of deleted function ‘std::basic_filebuf& std::basic_filebuf::operator=(const std::basic_filebuf&)’
class basic_ifstream : public basic_istream<_CharT, _Traits>
^
/usr/include/c++/4.8.2/fstream:72:11: note: ‘std::basic_filebuf& std::basic_filebuf::operator=(const std::basic_filebuf&)’ is implicitly deleted because the default definition would be ill-formed:
class basic_filebuf : public basic_streambuf<_CharT, _Traits>
^
In file included from /usr/include/c++/4.8.2/ios:43:0,
from /usr/include/c++/4.8.2/ostream:38,
from /usr/include/c++/4.8.2/iterator:64,
from /mnt/users/xxx/repos/opensmile/src/include/rapidjson/document.h:56,
from /mnt/users/xxx/repos/opensmile/src/include/core/configManager.hpp:22,
from /mnt/users/xxx/repos/opensmile/src/core/configManager.cpp:30:
/usr/include/c++/4.8.2/streambuf:810:7: error: ‘std::basic_streambuf<_CharT, _Traits>& std::basic_streambuf<_CharT, _Traits>::operator=(const std::basic_streambuf<_CharT, _Traits>&) [with _CharT = char; _Traits = std::char_traits]’ is private
operator=(const basic_streambuf& __sb)
^
In file included from /mnt/users/xxx/repos/opensmile/src/include/core/configManager.hpp:27:0,
from /mnt/users/xxx/repos/opensmile/src/core/configManager.cpp:30:
/usr/include/c++/4.8.2/fstream:72:11: error: within this context
class basic_filebuf : public basic_streambuf<_CharT, _Traits>
^
[ 22%] Building CXX object CMakeFiles/opensmile.dir/src/core/dataReader.cpp.o
gmake[2]: *** [CMakeFiles/opensmile.dir/src/core/configManager.cpp.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
gmake[1]: *** [CMakeFiles/opensmile.dir/all] Error 2
gmake: *** [all] Error 2

My build_flags.sh says:

#!/bin/bash

# This file is part of openSMILE.
#
# Copyright (c) audEERING. All rights reserved.
# See the file COPYING for details on license terms.

build_flags=(
# libsvm and dependent components (cLibsvmLiveSink)
-DBUILD_LIBSVM

# LSTM RNN components
-DBUILD_RNN

# cSvmSink
-DBUILD_SVMSMO

)

cmake_flags=(
# switches between release and debug builds
-DCMAKE_BUILD_TYPE=Release

# use Clang compiler. Comment out to use system-default C and C++ compilers.
# Leave commented out when building for Android, as the NDK version of Clang is required.
#-DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++

# controls whether to build and link to libopensmile as static or shared library
# static builds are preferred unless you need openSMILE plugins
-DSTATIC_LINK=ON

# tune compiler optimizations to the processor of this machine.
# This ensures that openSMILE runs with optimal performance on the machine it was
# compiled on but it may not run at all on other machines.
# Disable if the compiled binary needs to be portable.
-DMARCH_NATIVE=OFF

# whether to compile with PortAudio support
-DWITH_PORTAUDIO=OFF

# whether to compile with FFmpeg support
# 1. download ffmpeg source distribution
# 2. run: mkdir build ; cd build
#         ../configure --enable-shared --disable-static
#         make ; sudo make install
-DWITH_FFMPEG=ON

# whether to compile with OpenCV support
-DWITH_OPENCV=OFF

# whether to compile with OpenSL ES support (only applies when building for Android)
-DWITH_OPENSLES=OFF

# whether to compile with Julius support
-DWITH_JULIUS=OFF

)

build_flags="${build_flags[@]}"

Installation Issues

I am having trouble installing openSMILE on Windows. I don't know exactly what I have to do or where I have to type the specified commands. Could you be a bit more specific in the instructions?

Execution failed for task ':opensmile:compileDebugKotlin'. > Compilation error. See log for more details (Android_template)

java 11
swig 4.0.1
ubuntu 20

I ran this command:
./gradlew app:assembleUseSource

Error message: [Execution failed for task ':opensmile:compileDebugKotlin'. > Compilation error. See log for more details]
I think AndroidManifest.xml may need editing. Please advise.

error log:
manifest-merger-debug-report.txt

AndroidManifest.xml file:

<manifest xmlns:android="http://schemas.android.com/apk/res/android" package="com.audeering.opensmile" />

Prosody extractor output binary output.csv on Linux

  1. Install openSMILE for Linux from source.
  2. Run ./build/progsrc/smilextract/SMILExtract -C config/prosody/prosodyShs.conf -I my.wav -O prosody.csv
  3. prosody.csv appears to be a binary file and is unreadable in text editors.

Output format

Hi,

In the "Get started" page I read this:

The output file format is the HTK parameter file format. For other file formats you must change the ‘cHtkSink’ component type in the configuration file to the type you want.

Which output formats can I choose from? Does this apply to all features, or only to PLP? I am interested in MFCC and paraling features, and I'd like a different output format because I am having trouble reading .arff and .htk files in Python. If you have experience handling these file types in Matlab, Python, or R, any suggestion is welcome.

Thank you in advance for your help.
Silvia
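
For reading .htk output in Python, the following is a minimal sketch rather than an official reader. It assumes the standard HTK parameter file layout (a 12-byte big-endian header followed by uncompressed float32 frames); whether a given openSMILE configuration writes exactly this variant is an assumption to verify against your own files. The function name read_htk is made up for illustration.

```python
import struct


def read_htk(path):
    """Read an HTK parameter file.

    Returns (frames, sample_period, parm_kind), where frames is a list of
    float lists and sample_period is in units of 100 ns.
    Assumes big-endian, uncompressed float32 data.
    """
    with open(path, "rb") as f:
        header = f.read(12)
        if len(header) != 12:
            raise ValueError("file too short for an HTK header")
        # Header: int32 nSamples, int32 sampPeriod, int16 sampSize, int16 parmKind
        n_samples, samp_period, samp_size, parm_kind = struct.unpack(">iihh", header)
        n_values = samp_size // 4  # number of float32 values per frame
        frames = []
        for _ in range(n_samples):
            raw = f.read(samp_size)
            if len(raw) != samp_size:
                raise ValueError("truncated frame data")
            frames.append(list(struct.unpack(">%df" % n_values, raw)))
    return frames, samp_period, parm_kind
```

Each entry of frames is one feature vector, so the result can be passed on to NumPy or pandas if needed.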

OpenSmile output export

Hi, does openSMILE have a component that uses OSC to send data in real time, or can we only save the output and send it afterwards?

SMILExtract -h: command not found

Hi,

I am trying to set up openSMILE on my mac. I am following the instructions for LINUX/MAC (from source) on the "Get started" webpage. Everything is fine, until this point:

This will configure and build the libopensmile library and the SMILExtract binary. After successful compilation, you may want to add the build/progsrc/smilextract directory to your path or copy the binary build/progsrc/smilextract/SMILExtract to a directory in your search path. Finally, start by running:

SMILExtract -h

I added the directory to my path, but the command cannot be found. I am confused about what I could possibly be doing wrong. Any idea why?

Or, if there is another way to set up openSMILE (or a Python package that I can pip install directly), I am all ears.

Thank you to anyone available to give me their 2 cents!
Silvia
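
One way to check whether the search path really contains the binary is sketched below. It is a generic check, not an openSMILE tool; the helper name locate and its extra_dirs parameter are made up for illustration. It relies only on shutil.which from the Python standard library.

```python
import os
import shutil


def locate(name, extra_dirs=()):
    """Return the full path of executable `name`, or None if it is not found.

    Directories in extra_dirs are searched before the current PATH.
    """
    search_path = os.pathsep.join([*extra_dirs, os.environ.get("PATH", "")])
    return shutil.which(name, path=search_path)


if __name__ == "__main__":
    # None here means the shell will not find the command either.
    print(locate("SMILExtract"))
    # Try again with the build directory from the instructions added explicitly.
    print(locate("SMILExtract", ["build/progsrc/smilextract"]))
```

If the first call prints None and the second prints a path, the binary was built but its directory is not on the PATH of the current session.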

Executable SMILExtract.exe doesn't work

Hi,
I tried both installing the newest release of openSMILE and building it from scratch, but whenever I launch SMILExtract.exe, an empty command prompt pops up for the blink of an eye and that's all. I tried on four different PCs (Win10/Win11) with exactly the same result. Does anyone have an idea what might be wrong?
Filip

Error when running build process on Linux

Hi, I am trying to set up openSMILE 3.0 on Linux, and when I run the command "sh build.sh", I get an error. I can't figure out what goes wrong. Could you give me some help? I have cmake version 3.20.2 and gcc version 4.8.5 20150623 (Red Hat 4.8.5-39).
I have attached the error log file:
CMakeError.log

Support for multi-threading

When processing files using multiple threads in pyopensmile it fails with:

opensmile.core.SMILEapi.OpenSmileException: Code: 6

Would be nice if we could make SMILEapi thread-safe.
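
Until the API is thread-safe, one possible workaround is to serialize the calls so that only one thread is inside the extractor at a time. The sketch below is generic: extract stands for any callable that is not thread-safe (hypothetical here), and the names Serialized and process_all are made up. Whether this avoids the error above has not been tested against openSMILE.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Serialized:
    """Wrap a callable so that only one thread runs it at a time."""

    def __init__(self, func):
        self._func = func
        self._lock = threading.Lock()

    def __call__(self, *args, **kwargs):
        with self._lock:
            return self._func(*args, **kwargs)


def process_all(extract, paths, workers=4):
    """Call extract(path) for every path from a thread pool.

    Calls are serialized by a lock; results come back in input order.
    """
    safe_extract = Serialized(extract)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(safe_extract, paths))
```

This keeps the surrounding code multi-threaded (file loading, post-processing) but gives no speed-up for the extraction itself.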

Write CSV output in sequence instead of chunk

When I try to read the last line of the CSV in real time while the openSMILE command is running, I cannot fetch every individual row; chunks of rows are skipped.

I assume that the data is being written to the CSV in chunks instead of one row at a time. If this is the case, is there a way to write the rows one by one in sequence instead of dumping them into the CSV as a chunk?
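
If the rows do reach the file in blocks (an assumption, not confirmed here), a reader can still see every row by consuming everything appended since its last read instead of looking only at the last line. A minimal sketch follows; the function name follow_rows and its parameters are made up for illustration.

```python
import time


def follow_rows(path, poll_interval=0.1, stop=lambda: False):
    """Yield every complete line appended to `path`.

    At end of file, waits poll_interval seconds and retries until stop()
    returns True. A partially written last row stays buffered until its
    newline arrives.
    """
    buffer = ""
    with open(path, "r", newline="") as f:
        while True:
            chunk = f.read()
            if chunk:
                buffer += chunk
                # Emit only newline-terminated rows; keep the remainder.
                *rows, buffer = buffer.split("\n")
                for row in rows:
                    yield row.rstrip("\r")
            elif stop():
                return
            else:
                time.sleep(poll_interval)
```

Typical use is a loop such as for row in follow_rows("prosody.csv"), with the file name replaced by your own output file. A whole chunk of rows is then delivered row by row, in order.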

Some issues while building with opencv flag

I encountered some issues while building with the OpenCV flag. I fixed them on my own, but it would be good to fix them in the source code. Here are the messages.

/home/brk/opensmile/src/video/openCVSource.cpp: In member function ‘virtual int cOpenCVSource::myFinaliseInstance()’:
/home/brk/opensmile/src/video/openCVSource.cpp:252:24: error: ‘CV_CAP_PROP_FPS’ was not declared in this scope
252 | if(mVideoCapture.get(CV_CAP_PROP_FPS) != 0)
/home/brk/opensmile/src/video/openCVSource.cpp: In member function ‘virtual eTickResult cOpenCVSource::myTick(long long int)’:
/home/brk/opensmile/src/video/openCVSource.cpp:311:52: error: ‘CV_BGR2GRAY’ was not declared in this scope
311 | cvtColor( mCurrentFrame_bgr, mCurrentFrame_gray, CV_BGR2GRAY );
/home/brk/opensmile/src/video/openCVSource.cpp: In member function ‘virtual ExtractedFace cOpenCVSource::extractFace(cv::Mat&, cv::Mat&)’:
/home/brk/opensmile/src/video/openCVSource.cpp:542:67: error: ‘CV_HAAR_SCALE_IMAGE’ was not declared in this scope
542 | mEyes_cascade.detectMultiScale( faceROI, eyes, 1.1, 2, 0 |CV_HAAR_SCALE_IMAGE, Size(30, 30) );

Execution failed for task ':opensmile:generateJsonModelDebug'.

Task :opensmile:generateJsonModelDebug FAILED

FAILURE: Build failed with an exception.

  • What went wrong:
    Execution failed for task ':opensmile:generateJsonModelDebug'.

C:\Users...\Desktop\opensmile\progsrc\android-template\opensmile\CMakeLists.txt : C/C++ debug|armeabi-v7a : CMake Error: File C:/Users/.../Desktop/opensmile/progsrc/android-template/opensmile/src/include/core/git_version.hpp.in does not exist.

Can someone help me?

need feature ranges to normalize

I saw the closed issue asking about feature ranges and the suggestion to normalize data relative to all audio file outputs. Due to a relatively small dataset, I would like to normalize using the actual range of these features. Some I can find online, but I was wondering if there was anything in your docs about ranges of these features:

I know MFCCs are 0-1, but am especially having trouble finding ranges for the differential values of these coefficients...

pcm_RMSenergy_sma;
pcm_fftMag_mfcc_sma[1];
pcm_fftMag_mfcc_sma[2];
pcm_fftMag_mfcc_sma[3];
pcm_fftMag_mfcc_sma[4];
pcm_fftMag_mfcc_sma[5];
pcm_fftMag_mfcc_sma[6];
pcm_fftMag_mfcc_sma[7];
pcm_fftMag_mfcc_sma[8];
pcm_fftMag_mfcc_sma[9];
pcm_fftMag_mfcc_sma[10];
pcm_fftMag_mfcc_sma[11];
pcm_fftMag_mfcc_sma[12];
pcm_zcr_sma;
voiceProb_sma;
F0_sma;
pcm_RMSenergy_sma_de;
pcm_fftMag_mfcc_sma_de[1];
pcm_fftMag_mfcc_sma_de[2];
pcm_fftMag_mfcc_sma_de[3];
pcm_fftMag_mfcc_sma_de[4];
pcm_fftMag_mfcc_sma_de[5];
pcm_fftMag_mfcc_sma_de[6];
pcm_fftMag_mfcc_sma_de[7];
pcm_fftMag_mfcc_sma_de[8];
pcm_fftMag_mfcc_sma_de[9];
pcm_fftMag_mfcc_sma_de[10];
pcm_fftMag_mfcc_sma_de[11];
pcm_fftMag_mfcc_sma_de[12];
pcm_zcr_sma_de;
voiceProb_sma_de;
F0_sma_de
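
The approach mentioned above, normalizing relative to the values observed across all of your own audio files, can be sketched as follows. This is an illustration only: it computes ranges from the data instead of relying on documented theoretical ranges, and the function names column_ranges and min_max_scale are made up.

```python
def column_ranges(rows):
    """Return a (min, max) pair for each column over all rows."""
    columns = list(zip(*rows))
    return [(min(c), max(c)) for c in columns]


def min_max_scale(rows, ranges):
    """Scale each column to [0, 1] using the given ranges.

    Columns with a single constant value map to 0.0.
    """
    scaled = []
    for row in rows:
        scaled.append([
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, (lo, hi) in zip(row, ranges)
        ])
    return scaled


if __name__ == "__main__":
    # Each row is one frame; each column is one feature (values are made up).
    data = [[0.0, -1.5], [2.0, 0.5], [4.0, 2.5]]
    print(min_max_scale(data, column_ranges(data)))
```

With a small dataset, compute the ranges on the training portion only and reuse them for the remaining files, so that all files share the same scale.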

libm.so.6: version 'GLIBC_2.27' not found

Hi, when I import opensmile in Python, it gives this error: /lib64/libm.so.6: version `GLIBC_2.27' not found.
My machine reports: ldd (Ubuntu GLIBC 2.31-0ubuntu9.7) 2.31.
Does opensmile not support newer GLIBC versions?

Documentations and instructions for android integration needed

I believe the Android example under opensmile\progsrc\android-template is in a non-working state right now, with just the adapter and a placeholder folder for the library. Do you plan to create a working example project that shows how to use the built .aar file?

I tried the Linux and Windows builds and they worked fine, but the documentation for Android is somewhat lacking. There are no further instructions for Android after the build step.

The "Extracting your first features" section only shows how to use SMILExtract from the command line.

I have a .conf file and I'm looking for a way to pass that to the library (together with .wav input file) in android to get back data just like the first example of that section

SMILExtract -C config/demo/demo1_energy.conf -I example-audio/opensmile.wav -O opensmile.energy.csv

Digging into the template project, I did find some methods defined under OpenSmileAdapter. Could you please describe what those methods do and how I can get the results?

Can I pass RTMP Link as input to opensmile?

I want to pass a streaming URL such as an RTMP link to openSMILE and save the features in a CSV. I can already use PortAudio to capture audio and get the data, but now I want to use an RTMP link as the input.
