ado's Introduction

ado

The ado package provides an R-based interpreter for Stata’s ado language. It’s still under development and isn’t yet suitable for day-to-day use. When it’s completed, the language it supports will be close to but not exactly like Stata, in much the same way that R is descended from but not identical to S. This package is not in any way affiliated with or endorsed by StataCorp.

The target functionality, only some of which is currently completed, covers several areas:

  • Stata macros and loops: Support for Stata’s macros, and for foreach and forvalues loops. Macros and loops are tightly coupled in Stata, because loops are directives to the macro processor rather than the interpreter (i.e., loop over values and repeatedly macro-expand and execute text, rather than loop over values and repeatedly execute already parsed text).
  • Ways of interfacing with R: The ado code this package understands can integrate with R code in two ways: a) R can be used to write new ado-language commands, which can be used alongside the built-in ones; b) R code can be embedded inline in ado code, where it can operate on ado data structures.
  • Data manipulation commands: The most important of Stata’s many data manipulation commands - collapse, gen and egen, drop and keep, and many more.
  • Statistics commands: A selection of the most important and most easily implemented statistics commands. Hypothesis tests, regression and a few other items will be supported.
  • Misc and system commands: Some miscellaneous commands including logging, ways to work with the operating system, and commands for managing files.
  • Graphics: A few fairly thin wrappers around base R’s graphics, for making histograms, scatterplots and the like.

See sections below for more detail on functionality that’s already complete or nearly so.

Parsing and frontend

The interpreter’s frontend is mostly complete. The parser and lexer, semantic analyzer and code generator (which generates R code) are functional and accept nearly the final language we want to support. There are still a few minor bugs, and the differences from Stata are poorly documented.

Macros and loops

Support for macros and loops is complete, and touches many parts of the interpreter architecture. Macro support in particular is tightly integrated into the lexer, because reproducing Stata’s behavior turns out to require it.
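The text-level looping described above can be sketched in a few lines of R. This is only an illustration, not the package's implementation: expand_macros and foreach_expand are hypothetical names, and the sketch handles only simple, non-nested local macros written as `name'.

```r
# Hypothetical helper: expand Stata-style local macros (`name') in a string.
expand_macros <- function(text, macros) {
  for (name in names(macros)) {
    pattern <- paste0("`", name, "'")
    # fixed = TRUE: treat the pattern as a literal string, not a regex
    text <- gsub(pattern, macros[[name]], text, fixed = TRUE)
  }
  text
}

# Hypothetical foreach: the loop body is kept as *text* and is
# macro-expanded again on every pass, once per value.
foreach_expand <- function(var, values, body) {
  vapply(values, function(v) {
    expand_macros(body, setNames(list(v), var))
  }, character(1), USE.NAMES = FALSE)
}

# Each element is the command text that would be handed to the parser.
commands <- foreach_expand("v", c("price", "mpg"), "summarize `v'")
print(commands)
```

Because the body is re-expanded as text on each iteration, a macro can change what the parser sees (for example, the command name itself), which a loop over already parsed code could not do.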

R interface

Both targeted ways of integrating with R are already supported:

  • Text between {{{ and }}} is understood to be R code, and is executed as such without using the ado parser and code generator. The exact environment that this code executes in, and how the ado dataset is visible to it, are still to be determined.
  • There’s a mechanism for defining and registering R functions that follow a particular (at the moment, entirely undocumented) calling convention as ado commands. Commands registered this way can be used just like ones built into the package.
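As a rough illustration of the two mechanisms above, the following R sketch shows one way they could work. Everything in it is an assumption: the names command_registry, register_command, run_command and eval_embedded_r are hypothetical, the real calling convention is undocumented, and the evaluation environment is exactly the part the text says is undecided.

```r
# Hypothetical registry mapping ado command names to R functions.
command_registry <- new.env(parent = emptyenv())

register_command <- function(name, fn) {
  stopifnot(is.character(name), length(name) == 1, is.function(fn))
  assign(name, fn, envir = command_registry)
  invisible(name)
}

run_command <- function(name, ...) {
  if (!exists(name, envir = command_registry, inherits = FALSE)) {
    stop("unrecognized command: ", name)
  }
  get(name, envir = command_registry, inherits = FALSE)(...)
}

# Hypothetical inline-R handling: pull out the first {{{ ... }}} block
# and evaluate it as plain R, bypassing any ado parsing.
eval_embedded_r <- function(text, env = new.env()) {
  pattern <- "(?s)^.*?\\{\\{\\{(.*?)\\}\\}\\}.*$"
  if (!grepl(pattern, text, perl = TRUE)) {
    stop("no {{{ ... }}} block found")
  }
  code <- sub(pattern, "\\1", text, perl = TRUE)
  eval(parse(text = code), envir = env)
}

# Usage: register an R function as a command, then call it by name.
register_command("count", function(data) nrow(data))
n_rows <- run_command("count", data.frame(x = 1:3))
inline_value <- eval_embedded_r("gen y = 1\n{{{ 1 + 2 }}}")
```

Keeping the registry in its own environment means user-registered and built-in commands can be looked up the same way, which matches the claim that registered commands behave like built-in ones.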

Logging and misc

Stata’s logging features are mostly supported. The log and cmdlog commands exist and work as expected, allowing output, input or both to be captured and redirected to files in a much simpler way than with sink().

File manipulation commands are ready and working, though possibly not on Windows.

Installation

There’s no CRAN version yet, so install the dev version from GitHub:

install.packages("devtools")
devtools::install_github("wwbrannon/ado")

ado's People

Contributors

wwbrannon

ado's Issues

Failing to install ado

I am using Windows 10:

devtools::install_github("wwbrannon/ado")
Downloading GitHub repo wwbrannon/ado@master
from URL https://api.github.com/repos/wwbrannon/ado/zipball/master
Installing ado
"C:/PROGRA~1/R/R-34~1.3/bin/x64/R" --no-site-file --no-environ
--no-save --no-restore --quiet CMD INSTALL
"C:/Users/Jinn-Yuh/AppData/Local/Temp/RtmpGUrXiT/devtools42034122381/wwbrannon-ado-18d7cbf"
--library="D:/Dropbox/Stat/R/Library" --install-tests

* installing *source* package 'ado' ...
chmod: not found
configure.win: error: cannot create configure.win.lineno; rerun with a POSIX shell
Warning: running command 'sh ./configure.win' had status 1
ERROR: configuration failed for package 'ado'
* removing 'D:/Dropbox/Stat/R/Library/ado'
In R CMD INSTALL
Installation failed: Command failed (1)

sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=Chinese (Traditional)_Taiwan.950
[2] LC_CTYPE=Chinese (Traditional)_Taiwan.950
[3] LC_MONETARY=Chinese (Traditional)_Taiwan.950
[4] LC_NUMERIC=C
[5] LC_TIME=Chinese (Traditional)_Taiwan.950

attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base

other attached packages:
[1] mcode_0.1.5 BiocInstaller_1.14.3

loaded via a namespace (and not attached):
[1] httr_1.3.1 compiler_3.4.3 R6_2.2.2
[4] tools_3.4.3 withr_2.1.1.9000 githubinstall_0.2.1
[7] curl_3.1 yaml_2.1.16 memoise_1.1.0
[10] data.table_1.10.4-3 yearn_0.1.3 git2r_0.21.0
[13] jsonlite_1.5 digest_0.6.15 devtools_1.13.4

varlist_to_formula

Need a function that converts varlists to R formulas, for use in modeling functions
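One possible shape for such a function, sketched with base R's reformulate(). This is an assumption about the intended behavior rather than the package's code: it treats the first name in the varlist as the dependent variable, as Stata's estimation commands do, and it does not handle factor or interaction operators.

```r
# Sketch: turn a character vector of variable names into a formula.
# Assumption: the first name is the response, the rest are predictors.
varlist_to_formula <- function(varlist) {
  stopifnot(is.character(varlist), length(varlist) >= 1)
  if (length(varlist) == 1) {
    # Only a response: fit an intercept-only model.
    return(reformulate("1", response = varlist))
  }
  reformulate(varlist[-1], response = varlist[1])
}

f <- varlist_to_formula(c("mpg", "wt", "hp"))
fit <- lm(f, data = mtcars)  # mtcars ships with base R's datasets package
```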

Symbols to character

Symbols should be coerced to character in varlist, expression_list and expression where this makes sense.

Variable name abbreviation

Variable names should be able to be abbreviated to the shortest unambiguous substring. Other abbreviation syntax (ranges etc) won't be supported.
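A minimal sketch of how that lookup could work, assuming abbreviations are matched as prefixes (as in Stata) and that an exact match always takes priority. The function name resolve_varname is hypothetical.

```r
# Sketch: resolve a possibly abbreviated variable name against the
# names in the dataset. Errors if the abbreviation matches zero or
# several variables.
resolve_varname <- function(abbrev, varnames) {
  if (abbrev %in% varnames) {
    return(abbrev)  # an exact match wins even if it prefixes other names
  }
  hits <- varnames[startsWith(varnames, abbrev)]
  if (length(hits) == 0) {
    stop("variable ", abbrev, " not found")
  }
  if (length(hits) > 1) {
    stop(abbrev, " is ambiguous: ", paste(hits, collapse = ", "))
  }
  hits
}

vars <- c("price", "mpg", "make", "mp")
resolved <- resolve_varname("pr", vars)   # unique prefix of "price"
exact <- resolve_varname("mp", vars)      # exact match, not ambiguous
```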

c() values

Need support for so-called "c-class values": the system parameters accessible under c(). They need to be usable:

  • in expressions, unquoted
  • in macros

We'll also need to implement the creturn command to list them.

Tracebacks in debugging

There needs to be a way to print a backtrace for an R error / condition thrown at any point in executing a Stata command. It should be one of the OR-able debugging flags.

Documentation updates

Update various non-vignette docs and CRAN peccadilloes:

  • the package NULL
  • R functions
  • README.Rmd
  • DESCRIPTION and in particular its long description
  • cran-comments
  • NEWS.md
  • NAMESPACE

User-created commands

A mechanism for the user to persistently register an R function obeying the calling convention as a Stata command.

This way, we don't need much programming support, because R is the extension language.

Type system

Stata's two data types for string and numeric map fairly cleanly onto R's character and numeric. We need:

  • the op_* functions for the factor and interaction operators
  • relational operators (%==%, %>%, %<%, %>=%, %<=%) that implement a bivalent logic
  • logical operators (%&%, %|%, %!%) that do the same
  • confirm that base R's arithmetic operators work as-is without being wrapped
  • subsetting (expressions like x[1] are legal in Stata) - does it work?
  • assignment - does it work?
  • type constructor operators - does the way this is set up now work?
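To make the "bivalent logic" requirement concrete, here is a sketch of two of the relational operators. It is not the package's implementation. It assumes numeric vectors of equal length, maps Stata's missing value to R's NA, ignores extended missing values, and relies on Stata's rule that missing sorts above every nonmissing number, so a comparison always yields TRUE or FALSE and never NA.

```r
# Equality that never returns NA: two missings compare equal,
# a missing and a nonmissing value compare unequal.
`%==%` <- function(x, y) {
  res <- x == y
  both_na <- is.na(x) & is.na(y)
  res[is.na(res)] <- FALSE
  res | both_na
}

# Less-than that never returns NA: missing is treated as larger
# than any nonmissing number.
`%<%` <- function(x, y) {
  res <- x < y
  res[is.na(x)] <- FALSE               # missing is never less than anything
  res[!is.na(x) & is.na(y)] <- TRUE    # any number is less than missing
  res
}

eq <- c(1, NA, NA) %==% c(1, NA, 2)
lt <- c(1, NA, 3) %<% c(2, 5, NA)
```

Base R's own == and < would return NA in the missing cases, which is why these comparisons need wrappers even though plain arithmetic may not.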

Portability

Makevars and the parser's Makefile have to be generated by a configure script rather than hardcoded.

Return codes

Commands need to be able to have return codes, and the _rc variable has to be set to the last command's return code.
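One way this could be wired up in R is with a custom condition class that carries the code. The sketch below is an assumption, not the package's design: ado_error, run_capture and the state environment are hypothetical names, and the numeric codes are arbitrary examples.

```r
# Hypothetical interpreter state holding the _rc variable.
state <- new.env()
assign("_rc", 0L, envir = state)

# Hypothetical condition constructor: an R error that carries a return code.
ado_error <- function(msg, rc) {
  structure(
    class = c("ado_error", "error", "condition"),
    list(message = msg, call = NULL, rc = rc)
  )
}

# Run a command (here, any zero-argument function), record its return
# code in _rc, and swallow the error as Stata's capture prefix does.
run_capture <- function(fn) {
  rc <- tryCatch(
    {
      fn()
      0L                              # success
    },
    ado_error = function(e) e$rc,     # command-supplied code
    error = function(e) 1L            # arbitrary code for other R errors
  )
  assign("_rc", rc, envir = state)
  invisible(rc)
}

run_capture(function() stop(ado_error("variable not found", 111L)))
rc_after_failure <- get("_rc", envir = state)
run_capture(function() sum(1:10))
rc_after_success <- get("_rc", envir = state)
```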

Command history

The way it stores ado command history is a godawful hack, probably also a race condition, and is Unix-specific. Is there a better way?

Help system

Nothing fancy, just needs to print a usage message

Memory leaks

Check three entry points for memory leaks: do_parse, parse_accept and do_parse_with_callbacks.
