
statapackagesearch's Introduction


Packagesearch: module to scan Stata .do files and identify SSC packages used by the code

Installation

To install, type the following command into Stata.

net install packagesearch, from("https://aeadataeditor.github.io/Statapackagesearch/")

Syntax (also available in the help file):


      help packagesearch                                              (SJX-X: dmXXXX)
      -------------------------------------------------------------------------------
      
      Title
      
          packagesearch -- Module to search Stata code for the SSC packages used by
              the code
      
      Description
      
          packagesearch provides a tool that scans, parses, and matches all Stata
          .do files in a directory (and its subdirectories) against a list of all
          packages currently hosted at SSC. It outputs a list of candidate SSC
          packages that were (likely) used when code is run.
      
      Syntax
      
              packagesearch , codedir(directorytoscan)[ domain(domain) filesave
                      excelsave nodropfalsepos installfounds]
      
      
      Options
      
          codedir(directorytoscan) is required. It specifies the directory that
              contains the .do files to be scanned for SSC packages.
      
          domain(domain) optionally specifies a domain from which to take
              statistics to help identify likely packages (by default, ssc hot is
              used). The only domain currently available is econ.
      
          filesave outputs a list of all files that were parsed during the scanning
              process.
      
          excelsave saves the results of the scan into an Excel spreadsheet titled
              candidatepackages.xlsx. This file is saved in the specified
              directorytoscan and will include a list of parsed programs if
              filesave is also indicated as an option.
      
          nodropfalsepos disables the default removal of packages that were
              frequently found to be false positives during beta testing.
              Presently, the packages removed by default are: white, missing,
              index, dash, title, cluster, pre, bys
      
          installfounds installs all SSC packages found during the scanning process
              into the current working directory.
      

Description:

The code begins by either collecting a list of all packages hosted at SSC using the whatshot command, or pulling a list of common SSC packages used in economics research (if option domain(econ) is specified).
Next, it identifies all .do files in the specified codedir directory and its subdirectories, then parses each .do file into individual words using the txttool command. Finally, it matches the individual words against the package list collected in the first step and outputs a list of candidate packages that were (likely) used when the Stata code was run.
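
The scan, tokenize, and match steps described above can be approximated outside Stata with standard shell tools. The following is a minimal sketch for illustration, not the module's implementation: the function name `find_candidates`, the directory `demo_code`, and the file `packages.txt` are hypothetical, and the three-line package list is a hand-written stand-in for the list that packagesearch obtains from SSC.

```shell
#!/usr/bin/env bash
# Illustrative sketch only. packagesearch performs these steps inside Stata
# (using whatshot and txttool); every name below is hypothetical.

# find_candidates CODEDIR PACKAGELIST
# Prints each name in PACKAGELIST (one name per line) that occurs as a
# whole word in any .do file under CODEDIR, including subdirectories.
find_candidates() {
    local codedir="$1" pkglist="$2"
    # 1. Collect the contents of all .do files, recursively.
    # 2. Split into words: any run of characters other than letters,
    #    digits, or underscores becomes a single newline.
    # 3. Deduplicate, then keep only words that exactly equal a listed name.
    find "$codedir" -type f -name '*.do' -exec cat {} + \
        | tr -cs '[:alnum:]_' '\n' \
        | sort -u \
        | grep -Fx -f "$pkglist" || true   # no match is not an error
}

# Small demonstration with made-up inputs.
mkdir -p demo_code/sub
printf 'use data, clear\nreghdfe y x, absorb(id)\n' > demo_code/main.do
printf 'estout using table.txt\n' > demo_code/sub/tables.do
printf 'estout\nreghdfe\noutreg2\n' > packages.txt

find_candidates demo_code packages.txt
```

With these inputs the function prints `estout` and `reghdfe` but not `outreg2`, which never appears in the scanned files. Word-level matching of this kind also explains why false positives arise: a package named `title` or `cluster` matches any .do file that uses the same word as an option or a built-in command.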

Testing

The GitHub repository includes a few files for testing the package. To run the tests, first clone the repository:

GITURL=https://github.com/AEADataEditor/Statapackagesearch/
git clone $GITURL
cd Statapackagesearch

and then, if you have Stata installed,

./test/run.sh

and if you don't, but have access to a Stata license (e.g., on GitHub Codespaces with the proper setup), you can run the tests in Docker:

echo "$STATA_LIC_BASE64" | base64 -d > stata.lic
docker run -it --rm \
   -v $(pwd)/stata.lic:/usr/local/stata/stata.lic \
   -v $(pwd):/project \
   -w /project \
   --entrypoint /bin/bash dataeditors/stata17:2023-05-16 \
   ./test/run.sh

Questions?

Contact:
Lydia Reiner ([email protected])
Lars Vilhuber ([email protected])
