
politeness's Introduction

Stanford Politeness API

A new home: ConvoKit (July 2019)

The functionality of this API is now integrated into ConvoKit. This new implementation more directly exposes politeness strategies, which can be used in downstream tasks (see the example scripts in ConvoKit). As such, we are discontinuing support for this API.

Version 2.00 (released March 2017)

A Python 3 version is available here: https://github.com/sudhof/politeness/tree/python3 (refactored from Version 1.01 through the kindness of Benjamin Meyers).

Note: This Python 3 version has not yet been tested by us, nor compared against the results from our paper (listed below). The code used in the paper is still here in the master branch of this repository (keep reading).

Version 1.01 (released October 2014)

Python implementation of a politeness classifier for requests, based on the work described in:

A computational approach to politeness with application to social factors.  	
Cristian Danescu-Niculescu-Mizil, Moritz Sudhof, Dan Jurafsky, Jure Leskovec, Christopher Potts.  
Proceedings of ACL, 2013.

We release this code hoping that others will use and improve on our work.

NOTE: If you use this API in your work please send an email to [email protected] so we can add you to our list of users. Thanks!

Further resources:

Info about our work: http://cs.cornell.edu/~cristian/Politeness.html

A web interface to the politeness model: http://politeness.cornell.edu/

The Stanford Politeness Corpus: http://cs.cornell.edu/~cristian/Politeness_files/Stanford_politeness_corpus.zip

Using this API you can:

  • classify requests with politeness.model.score (using the provided pre-trained model)

  • train new models on new data using politeness.scripts.train_model

  • experiment with new politeness features in politeness.features.vectorizer and politeness.features.politeness_strategies

Input: Requests must be pre-processed into sentences and dependency parses. We used nltk's PunktSentenceTokenizer for sentence tokenization and Stanford CoreNLP version 1.3.3 for dependency parsing. A sample of the expected format for documents is given in politeness.test_documents.
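To make the expected shape concrete, here is a minimal, hedged sketch of scoring one pre-parsed request. The field names and dependency strings below mirror the examples in politeness.test_documents but are illustrative assumptions, not a normative schema; run from the repository root so the politeness package is importable.

# Hedged sketch: score one pre-parsed request with the bundled model.
from politeness.model import score

doc = {
    "text": "Could you please look at this when you get a chance?",
    "sentences": ["Could you please look at this when you get a chance?"],
    # One list of CoreNLP-style dependency strings per sentence (illustrative).
    "parses": [[
        "aux(look-4, Could-1)", "nsubj(look-4, you-2)",
        "advmod(look-4, please-3)", "root(ROOT-0, look-4)",
        "prep_at(look-4, this-6)", "advmod(get-9, when-7)",
        "nsubj(get-9, you-8)", "advcl(look-4, get-9)",
        "det(chance-11, a-10)", "dobj(get-9, chance-11)",
    ]],
}

probs = score(doc)  # e.g. {'polite': 0.7, 'impolite': 0.3}
print(probs)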

Caveat: This work focuses on requests, not all kinds of utterances. The model's predictions on non-request utterances will be less accurate. As a bonus, our code also includes a very simple heuristic to check whether a document looks like a request (see politeness.request_utils).
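A hedged sketch of using that heuristic as a pre-filter before scoring; the helper name check_is_request is an assumption about what politeness.request_utils exposes, so check the module for the actual name:

from politeness.model import score
from politeness.request_utils import check_is_request

# doc as in the scoring sketch above
if check_is_request(doc):
    print(score(doc))
else:
    print("Does not look like a request; the score may be unreliable.")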

Requirements:

Python package requirements are listed in requirements.txt. We recommend setting up a new Python environment using virtualenv and installing the dependencies by running

pip install -r requirements.txt
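For example, the full setup might look like this (standard virtualenv usage, assuming a Unix-like shell):

virtualenv politeness-env
source politeness-env/bin/activate
pip install -r requirements.txt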

Additionally, since the code uses nltk.word_tokenize to tokenize text, you will need to download the tokenizers/punkt/english.pickle nltk resource. If you've worked with nltk before, there's a good chance you've already downloaded this model. Otherwise, open the Python interpreter and run:

import nltk
nltk.download()

In the window that opens, navigate to Models and download the Punkt Tokenizer Models.
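Alternatively, the same resource can be fetched non-interactively with the standard nltk call:

import nltk
nltk.download('punkt')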

Sanity Check:

To make sure everything's working, navigate to the code directory and run

python model.py

This should print out the politeness probabilities for 4 test documents.

Contact: Please email any questions to: [email protected] (Cristian Danescu-Niculescu-Mizil) and [email protected] (Moritz Sudhof)

politeness's People

Contributors

cristiandnm, sudhof


politeness's Issues

Dependency parsing by Stanford CoreNLP doesn't exactly match the sample examples

Hi,

I am using Stanford CoreNLP version 1.3.3 for the dependency parsing, as per the documentation. However, when I run the 4 samples presented in test_documents.py, I get slightly different parse trees and therefore slightly different politeness scores. Below is my output file with the associated parse trees.
testoutput.xlsx

I manually checked the output XML files that the Stanford tool gives me, and it is the parse trees (every type of dependency parse) that differ from the sample.

What is the problem here?

If I continue with the parse trees I am generating for my other inputs, will I get drastically different results, or is that okay?

@sudhof @cristiandnm

pip install politeness doesn't work

Tried using requirements.txt too; it still doesn't work. The error output is:

C:\Users\Arjun Sarihyan\Desktop\Project 2020>pip install -r requirements.txt
Collecting scipy==0.12.0
Using cached scipy-0.12.0.tar.gz (9.1 MB)
ERROR: Command errored out with exit status 1:
command: 'c:\program files\python37\python.exe' -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\ARJUNS~1\AppData\Local\Temp\pip-install-q0239xg4\scipy\setup.py'"'"'; __file__='"'"'C:\Users\ARJUNS~1\AppData\Local\Temp\pip-install-q0239xg4\scipy\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base 'C:\Users\ARJUNS~1\AppData\Local\Temp\pip-install-q0239xg4\scipy\pip-egg-info'
cwd: C:\Users\ARJUNS~1\AppData\Local\Temp\pip-install-q0239xg4\scipy
Complete output (122 lines):
blas_opt_info:
blas_mkl_info:
No module named 'numpy.distutils._msvccompiler' in numpy.distutils; trying from distutils
customize MSVCCompiler
libraries mkl_rt not found in ['c:\\program files\\python37\\lib', 'C:\\', 'c:\\program files\\python37\\libs']
NOT AVAILABLE

blis_info:
  libraries blis not found in ['c:\\program files\\python37\\lib', 'C:\\', 'c:\\program files\\python37\\libs']
  NOT AVAILABLE

openblas_info:
  libraries openblas not found in ['c:\\program files\\python37\\lib', 'C:\\', 'c:\\program files\\python37\\libs']
get_default_fcompiler: matching types: '['gnu', 'intelv', 'absoft', 'compaqv', 'intelev', 'gnu95', 'g95', 'intelvem', 'intelem', 'flang']'
customize GnuFCompiler
Could not locate executable g77
Could not locate executable f77
customize IntelVisualFCompiler
Could not locate executable ifort
Could not locate executable ifl
customize AbsoftFCompiler
Could not locate executable f90
customize CompaqVisualFCompiler
Could not locate executable DF
customize IntelItaniumVisualFCompiler
Could not locate executable efl
customize Gnu95FCompiler
Could not locate executable gfortran
Could not locate executable f95
customize G95FCompiler
Could not locate executable g95
customize IntelEM64VisualFCompiler
customize IntelEM64TFCompiler
Could not locate executable efort
Could not locate executable efc
customize PGroupFlangCompiler
Could not locate executable flang
don't know how to compile Fortran code on platform 'nt'
  NOT AVAILABLE

atlas_3_10_blas_threads_info:
Setting PTATLAS=ATLAS
  libraries tatlas not found in ['c:\\program files\\python37\\lib', 'C:\\', 'c:\\program files\\python37\\libs']
  NOT AVAILABLE

atlas_3_10_blas_info:
  libraries satlas not found in ['c:\\program files\\python37\\lib', 'C:\\', 'c:\\program files\\python37\\libs']
  NOT AVAILABLE

atlas_blas_threads_info:
Setting PTATLAS=ATLAS
  libraries ptf77blas,ptcblas,atlas not found in ['c:\\program files\\python37\\lib', 'C:\\', 'c:\\program files\\python37\\libs']
  NOT AVAILABLE

atlas_blas_info:
  libraries f77blas,cblas,atlas not found in ['c:\\program files\\python37\\lib', 'C:\\', 'c:\\program files\\python37\\libs']
  NOT AVAILABLE

accelerate_info:
  NOT AVAILABLE

blas_info:
  libraries blas not found in ['c:\\program files\\python37\\lib', 'C:\\', 'c:\\program files\\python37\\libs']
  NOT AVAILABLE

blas_src_info:
  NOT AVAILABLE

Running from scipy source directory.
c:\program files\python37\lib\site-packages\numpy\distutils\system_info.py:1896: UserWarning:
    Optimized (vendor) Blas libraries are not found.
    Falls back to netlib Blas library which has worse performance.
    A better performance should be easily gained by switching
    Blas library.
  if self._calc_info(blas):
c:\program files\python37\lib\site-packages\numpy\distutils\system_info.py:1896: UserWarning:
    Blas (http://www.netlib.org/blas/) libraries not found.
    Directories to search for the libraries can be specified in the
    numpy/distutils/site.cfg file (section [blas]) or by setting
    the BLAS environment variable.
  if self._calc_info(blas):
c:\program files\python37\lib\site-packages\numpy\distutils\system_info.py:1896: UserWarning:
    Blas (http://www.netlib.org/blas/) sources not found.
    Directories to search for the sources can be specified in the
    numpy/distutils/site.cfg file (section [blas_src]) or by setting
    the BLAS_SRC environment variable.
  if self._calc_info(blas):
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\ARJUNS~1\AppData\Local\Temp\pip-install-q0239xg4\scipy\setup.py", line 165, in <module>
    setup_package()
  File "C:\Users\ARJUNS~1\AppData\Local\Temp\pip-install-q0239xg4\scipy\setup.py", line 161, in setup_package
    configuration=configuration)
  File "c:\program files\python37\lib\site-packages\numpy\distutils\core.py", line 137, in setup
    config = configuration()
  File "C:\Users\ARJUNS~1\AppData\Local\Temp\pip-install-q0239xg4\scipy\setup.py", line 136, in configuration
    config.add_subpackage('scipy')
  File "c:\program files\python37\lib\site-packages\numpy\distutils\misc_util.py", line 1035, in add_subpackage
    caller_level = 2)
  File "c:\program files\python37\lib\site-packages\numpy\distutils\misc_util.py", line 1004, in get_subpackage
    caller_level = caller_level + 1)
  File "c:\program files\python37\lib\site-packages\numpy\distutils\misc_util.py", line 941, in _get_configuration_from_setup_py
    config = setup_module.configuration(*args)
  File "scipy\setup.py", line 9, in configuration
    config.add_subpackage('integrate')
  File "c:\program files\python37\lib\site-packages\numpy\distutils\misc_util.py", line 1035, in add_subpackage
    caller_level = 2)
  File "c:\program files\python37\lib\site-packages\numpy\distutils\misc_util.py", line 1004, in get_subpackage
    caller_level = caller_level + 1)
  File "c:\program files\python37\lib\site-packages\numpy\distutils\misc_util.py", line 941, in _get_configuration_from_setup_py
    config = setup_module.configuration(*args)
  File "scipy\integrate\setup.py", line 11, in configuration
    blas_opt = get_info('blas_opt',notfound_action=2)
  File "c:\program files\python37\lib\site-packages\numpy\distutils\system_info.py", line 471, in get_info
    return cl().get_info(notfound_action)
  File "c:\program files\python37\lib\site-packages\numpy\distutils\system_info.py", line 734, in get_info
    raise self.notfounderror(self.notfounderror.__doc__)
numpy.distutils.system_info.BlasNotFoundError:
    Blas (http://www.netlib.org/blas/) libraries not found.
    Directories to search for the libraries can be specified in the
    numpy/distutils/site.cfg file (section [blas]) or by setting
    the BLAS environment variable.
----------------------------------------

ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

And this is after creating a virtual environment
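(A note for readers hitting the same wall: scipy 0.12.0 dates from 2013 and has no binary wheel for Python 3.7, so pip falls back to building it from source, which on Windows requires BLAS libraries and a Fortran compiler; hence the failures in the log above. If you only need the code to run, rather than to reproduce the pinned environment exactly, one hedged workaround is to install a modern wheel instead:

pip install scipy

This deviates from requirements.txt, so treat it as a best-effort fix; results may differ slightly from the paper's environment.)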

Pip install does not work

Hi, did you manage to install the politeness package? From what I have tried, the pip install does not work. I also cannot figure out how to use it from bash once you have linked the URL to the Stanford CoreNLP server.

Incompatibility issue with machine-learning files under Python 3.6

I am running model.py and got the following error. It seems like the script has an incompatibility issue with sklearn under Python 3.6.

Traceback (most recent call last):
  File "model.py", line 111, in <module>
    probs = score(doc)
  File "model.py", line 94, in score
    probs = clf.predict_proba(X)
  File "/home/karenzheng/.local/lib/python3.6/site-packages/sklearn/svm/base.py", line 616, in predict_proba
    self._check_proba()
  File "/home/karenzheng/.local/lib/python3.6/site-packages/sklearn/svm/base.py", line 582, in _check_proba
    if not self.probability:
AttributeError: 'SVC' object has no attribute 'probability'

How can I solve this?
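(A note for anyone debugging the same traceback: this pattern typically means the bundled classifier was pickled with an older scikit-learn, and attributes introduced or renamed since then are missing after unpickling in a newer release. A first hedged check, nothing here is project-specific, is to compare your installed version against the pin in requirements.txt:

import sklearn
print(sklearn.__version__)  # compare against the version pinned in requirements.txt

If they differ, installing the pinned version, or retraining the model with politeness.scripts.train_model under your current scikit-learn, should resolve the mismatch.)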
