bulatb / yuno

Automated testing for CSE 131 at UCSD

License: MIT License

Python 93.11% Shell 6.89%

yuno's Issues

Move development to master.

GitHub issue integration with commits is only useful if the issues close when finished, not when they ship out with a release. Doing dev on master has no drawbacks if installs don't mean git clone.

Todo:

Decide if a releases public branch makes sense.

Add option to run passed and passing

Add yuno run passed and yuno run passing to parallel failed and failing.

Arbitrary asymmetry is confusing. Plus it seems like a good way to explicitly check for regressions.

Consider buffering test results

Interrupting with ^C throws away the whole run, even test results that could be written to the history with no extra risk of inconsistency. Consider flushing harness stats on interruption.
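A minimal sketch of flushing on interruption, assuming the harness can hand its results to a single writer. The names run_with_flush, run_one, and write_history are hypothetical and do not come from Yuno's code.

```python
def run_with_flush(tests, run_one, write_history):
    """Run tests in order and always persist the results gathered so far.

    run_one and write_history are hypothetical callables standing in for
    the real test runner and history writer.
    """
    results = []
    interrupted = False
    try:
        for test in tests:
            results.append((test, run_one(test)))
    except KeyboardInterrupt:
        # ^C arrived mid-run: stop here, but keep the completed results.
        interrupted = True
    finally:
        # Only fully finished tests are in `results`, so writing them
        # should not leave the history half-updated for any one test.
        write_history(results)
    return results, interrupted
```

The test that was running when ^C arrived is dropped; everything before it is kept.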

On ieng9, I get a NoClassDefFoundError for my AssemblyGenerator class. This is really weird because on my Mac I can run yuno just fine and produce an rc.s, but since all the linking and running happens on ieng9, I can't really test it on my Mac.

Does this have something to do with the compiler invocation? I'm using the Python 3.2 found in /software/common/python-3.2

Excluding paths by regex

yuno run all --ignore ".*/broken|bad-test.rc$"

Should be useful for when people add bad tests.
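One way the filter might look, as a sketch only. filter_ignored is a hypothetical helper, and it assumes the pattern is searched against each repo-relative path.

```python
import re


def filter_ignored(paths, ignore_pattern):
    """Return the paths that do NOT match ignore_pattern.

    Hypothetical helper; uses re.search, so the pattern may match
    anywhere in the path unless it is anchored.
    """
    if not ignore_pattern:
        return list(paths)
    ignored = re.compile(ignore_pattern)
    return [path for path in paths if not ignored.search(path)]
```

With the pattern from the example, the alternation splits the whole expression: a path is skipped if it matches `.*/broken` or if it matches `bad-test.rc$`.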

Globstars behave incorrectly

Certain recursive globs don't search the repo correctly. It seems that globstars don't expand to zero subdirectories.

Examples:

yuno run phase*/**/check*/**
yuno run **/phase*/check*
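For reference, a small sketch of the intended semantics, where a globstar segment may stand for zero or more directories. This is not Yuno's recursive_glob; globstar_match is a hypothetical helper that matches one slash-separated path against one pattern.

```python
from fnmatch import fnmatchcase


def globstar_match(pattern, path):
    """True if path matches pattern, with ** spanning zero or more segments."""
    return _match(pattern.split("/"), path.split("/"))


def _match(patterns, parts):
    if not patterns:
        # Pattern used up: succeed only if the path is used up too.
        return not parts
    head, rest = patterns[0], patterns[1:]
    if head == "**":
        # Either ** expands to nothing, or it swallows one segment
        # and stays active for the remainder of the path.
        return _match(rest, parts) or (bool(parts) and _match(patterns, parts[1:]))
    return bool(parts) and fnmatchcase(parts[0], head) and _match(rest, parts[1:])
```

Under these rules both example patterns match phase1/check2, because each globstar is allowed to expand to nothing.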

Interactive testing

Add an option to use interactive testing.

Let the user step backwards or forwards through the test suite.
If a test fails, pause to let them try a fix and then re-run it or move on.

Use shell filename expansion if available

To allow glob arguments to yuno run and yuno run files on Windows, all glob expansion is handled internally. Bash users have to set -f or use escapes (yuno run "**/") to keep the shell from giving Yuno unexpected arguments.

When running on a Unix system, and a new use_native_globs key isn't false, accept a list of dirs to search (recursively, as normal).

Potential issues:

If the first thing in a shell-expanded glob is "files", it'll look like yuno run files ... and probably do something unexpected. Get rid of the run/files distinction, or change files to --files (ew).

Fail on nonzero exit or text in stderr

In practice, when something exits with nonzero status or writes to stderr, it almost always really is an error, especially for project 2. Currently the testers are too lenient: they treat a nonzero exit code from the compiler as a quirk and just print a warning. Those cases should instead be treated as test failures.

To keep the value the original behavior was supposed to add—mainly letting people use whatever exit code conventions they might want—add the following to yuno run:

--allow-exit-codes code [code ...] - A list of exit codes to treat as passing. Any except 0 should still raise a note.
--allow-stderr (boolean) - If a test triggers a write to stderr, let it pass (if it passes) but print a warning.
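A rough sketch of how the two switches could feed a pass/fail decision. The flag names come from the proposal above; build_parser, judge, and the note strings are assumptions made for illustration.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="yuno run")
    parser.add_argument("--allow-exit-codes", nargs="+", type=int,
                        default=[], metavar="code")
    parser.add_argument("--allow-stderr", action="store_true")
    return parser


def judge(exit_code, stderr_text, output_matches, options):
    """Return (passed, notes) for one test. Hypothetical helper."""
    notes = []
    if exit_code != 0:
        if exit_code not in options.allow_exit_codes:
            return False, ["failed: exit code %d" % exit_code]
        # Allowed, but still worth mentioning.
        notes.append("note: allowed nonzero exit code %d" % exit_code)
    if stderr_text:
        if not options.allow_stderr:
            return False, notes + ["failed: output on stderr"]
        notes.append("warning: test wrote to stderr")
    return output_matches, notes
```

A test still has to produce the expected output to pass; the switches only stop the exit code or stderr from failing it outright.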

Extract base classes from testing to core

It would be nice to offer documented ABCs for the core interfaces used by feature modules. That should help encourage separation of concerns in any future plugins.

Error messages for Run aren't helpful

Getting syntax wrong just dumps a usage message with no explanation. Especially unhelpful when the problem is an unexpected glob expansion, where the user's input was correct but mangled into something else by Bash.

Split system_tests.py into test driver and test runner

All-in-one was hacky but ok. It grew. Now less ok.

run_single_test() and its supporting functions get moved to a separate script
run_all_tests() stays where it is

Make remote testing not hacky

Rename flint and steel to send and listen so people can tell WTH they are.
They both reach into other modules and "dependency-inject" by fiddling with stuff. Make this legit.

Custom diff modes?

--diff custom with config.some_key = "<command> {expected} {actual}"
--diff "<command> {e} {a}"
?
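A sketch of the template idea, assuming the {expected} and {actual} placeholder names from the first form. build_diff_command is a hypothetical helper; it only builds the argument list and leaves running it to the caller.

```python
import shlex


def build_diff_command(template, expected_path, actual_path):
    """Turn a command template into an argv list.

    The template is split first and filled in afterwards, so a path
    containing spaces still ends up as a single argument.
    """
    return [
        part.format(expected=expected_path, actual=actual_path)
        for part in shlex.split(template)
    ]
```

The resulting list can be passed to subprocess without a shell.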

Certify not working on dev

I put a bunch of filenames in a list and tried to pipe it to yuno certify -, but got an error.

$ cat foo
phase1/check2/incdec-const-errors.rc
phase1/check3b/type-conflict-assignment.rc
phase1/check4/if-while-bad.rc
phase1/check5/function-call-bad.rc
phase1/check6b/not-assignable.rc
phase1/check6b/reference-bad.rc
phase1/check7/illegal-exit-bad.rc
phase2/check10/array-declaration-bad.rc
phase2/check11a/array-usage-bad.rc
phase2/check12a/foreach-bad.rc
phase2/check12b/break-continue-bad.rc
phase2/check13a/struct-decl-bad.rc
phase2/check13b/struct-decl-recursive-bad.rc
phase2/check14b/struct-this-bad.rc
phase2/check8/const-init-bad.rc
phase2/check8/divide-by-zero.rc
phase3/check15a/pointer-deref-bad.rc
phase3/check15b/pointer-deref-lhs-bad.rc
phase3/check16/delete-new-bad.rc

$ cat foo | yuno certify -
Generating answer files for piped-in tests
Traceback (most recent call last):
  File "/Users/jakem/Code/cs/cs131/yuno/yuno.py", line 26, in <module>
    main()
  File "/Users/jakem/Code/cs/cs131/yuno/yuno.py", line 23, in main
    program.main(options.tail)
  File "/Users/jakem/Code/cs/cs131/yuno/yuno/certify/certify.py", line 85, in main
    command_handlers[options.command](options)
  File "/Users/jakem/Code/cs/cs131/yuno/yuno/certify/certify.py", line 69, in _certify_pipe
    harness.run_set(tests)
  File "/Users/jakem/Code/cs/cs131/yuno/yuno/certify/testing.py", line 137, in run_set
    test.run_in_harness(self)
  File "/Users/jakem/Code/cs/cs131/yuno/yuno/core/testing.py", line 165, in run_in_harness
    harness.test_failed(self, output, expected_output)
  File "/Users/jakem/Code/cs/cs131/yuno/yuno/core/testing.py", line 241, in test_failed
    if self.diff_routine:
AttributeError: 'AnswerGeneratingHarness' object has no attribute 'diff_routine'
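The traceback shows test_failed reading self.diff_routine on a harness that never had it set. One plausible cause, not confirmed by anything above, is a subclass whose __init__ skips the base initializer. The classes below are simplified stand-ins, not the real ones, and the class-level default is just one possible guard.

```python
class Harness(object):
    # Class-level default: instances whose __init__ never ran the base
    # initializer still see diff_routine as None instead of raising.
    diff_routine = None

    def __init__(self, diff_routine=None):
        self.diff_routine = diff_routine

    def test_failed(self, test, output, expected_output):
        if self.diff_routine:
            return self.diff_routine(expected_output, output)
        return None


class AnswerGeneratingHarness(Harness):
    def __init__(self):
        # Assumption for illustration: the base __init__ is not called.
        self.answers = {}
```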

Add pruning plugin

Since there's nothing to track changes to the test repo, there needs to be a way to prune old tests that were deleted but are now stuck in the history. Probably just yuno prune with no options.
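A sketch of the core check, assuming the history can be read as a list of repo-relative test paths. The real history format is not described here, and prune_history is a hypothetical name.

```python
import os


def prune_history(history_paths, repo_root):
    """Split history entries into those still on disk and those gone."""
    kept, removed = [], []
    for path in history_paths:
        if os.path.isfile(os.path.join(repo_root, path)):
            kept.append(path)
        else:
            removed.append(path)
    return kept, removed
```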

Fork Yuno for Phase II

We should fork Yuno for Phase II and write a new test suite. Here's my idea:

For each test, you have to write two files, testname.c and testname.rc and they should be equivalent.

Additionally, in Yuno's config, you can specify a username and server to ssh into before running (so that I can work locally, but it will run the tests on ieng9).

Then Yuno will run the local RC compiler on the .rc file, scp over the .c and .s files, compile them remotely, and run them, comparing their output.

The basic flow (in bash) being

$ ./RC testname.rc
$ scp rc.s user@ieng9:
$ scp testname.c user@ieng9:
$ ssh user@ieng9 'gcc input.c output.s rc.s; ./a.out' > RC_OUT
$ ssh user@ieng9 'gcc input.c output.s testname.c; ./a.out' > C_OUT
$ diff RC_OUT C_OUT

What do you think?

Consolidate loaders

Almost all can be re-implemented in terms of recursive_glob. Keep load_from_regex for cases that need backtracking and add a fragment-matching function to search_folder for the rest. Matchers for the phase/check syntax should clean up that ugly regex mess.

New usage:

# recursive_glob.py
from fnmatch import fnmatch

def search_folder(folder, fragments, match_by=fnmatch, globstar_mode=False):
    # ...
    is_matching_folder = match_by(item, fragments[0])
    # ...

# core/testing.py
import re

def regex_matcher(fragment, pattern):
    # Takes (name, pattern) like fnmatch, so it can be passed as match_by.
    return re.search(pattern, fragment) is not None

2.7-3.x compatibility

from __future__ import print_function in files using print()
In Certify, use the right raw_input()/input() function for the running Python version.
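A minimal sketch of both points. read_line is a hypothetical alias; the version check itself relies only on sys.version_info.

```python
from __future__ import print_function

import sys

if sys.version_info[0] < 3:
    # Python 2: input() evaluates what the user types, so use raw_input().
    read_line = raw_input  # noqa: F821 (name only exists on Python 2)
else:
    read_line = input

print("prompt helper ready:", read_line.__name__)
```

Certify would then call read_line("...") wherever it prompts the user.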

Globs break system test commands

System tests with glob expansion break in Bash. Use shlex.split and shell=False. Workaround so testing works with #22.
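A sketch of the suggested approach: split the command line with shlex and run it without a shell, so glob characters reach the program untouched. run_command is a hypothetical helper.

```python
import shlex
import subprocess


def run_command(command_line):
    """Run a command line without a shell; return (exit code, stdout, stderr)."""
    # shlex.split honors quotes but never expands globs.
    args = shlex.split(command_line)
    process = subprocess.Popen(
        args,
        shell=False,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    out, err = process.communicate()
    return process.returncode, out, err
```

Because no shell is involved, an argument like `**/` arrives exactly as written in the test command.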

Handle checks with letters better in phase/check mode

Running yuno run check m-n won't find tests in checks between m and n if they have lettered subparts like check<m+1>a. They're still reachable through run <glob>, but it would be better if the phase and check syntax supported them.

It should work like this:

check 6 or check 6a - Exactly match check6/ or check6a/
check 6-10 - Match any checks between 6 and 10, inclusive, and including every sub-check.
check 6a-c - Match checks 6a, 6b, and 6c
check 6b-19a - Match as above, but starting at 6a and going through 15, 15a, 15b, and 19 to 19a. Subchecks in the range should be matched as greedily as possible.
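A sketch of one possible matcher for this syntax. It makes two assumptions: a range starts at its literal lower bound, and a numeric upper bound with no letter includes every lettered sub-check of that number. parse_spec and check_in_range are hypothetical names, and malformed specs are not handled.

```python
import re

_BOUND = re.compile(r"^(\d*)([a-z]?)$")
_FOLDER = re.compile(r"^check(\d+)([a-z]?)$")


def parse_spec(spec):
    """Return (low, high) bounds as (number, letter) tuples."""
    low_text, dash, high_text = spec.partition("-")
    number, letter = _BOUND.match(low_text).groups()
    low = (int(number), letter)
    if not dash:
        return low, low                      # "6" or "6a": exact match
    high_number, high_letter = _BOUND.match(high_text).groups()
    if not high_number:
        return low, (low[0], high_letter)    # "6a-c": same check number
    if not high_letter:
        high_letter = "z"                    # "6-10": all sub-checks of 10
    return low, (int(high_number), high_letter)


def check_in_range(spec, folder_name):
    """True if a folder like 'check6b' falls inside the spec."""
    match = _FOLDER.match(folder_name)
    if match is None:
        return False
    low, high = parse_spec(spec)
    key = (int(match.group(1)), match.group(2))
    # Tuples compare number first, then letter ("" sorts before "a").
    return low <= key <= high
```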

Certify should try to normalize test paths

The compiler includes test paths in its error output, so improper relative paths create all kinds of problems for test cases. Answer files are standardized on dotless, repo-relative paths. Certify should do its best to maintain that convention in its output.
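A sketch of the normalization, assuming "dotless, repo-relative" means no leading ./ and no .. segments, with forward slashes. normalize_test_path is a hypothetical helper.

```python
import os


def normalize_test_path(path, repo_root="."):
    """Return path relative to repo_root, without '.' or '..' segments."""
    absolute = os.path.abspath(path)
    relative = os.path.relpath(absolute, os.path.abspath(repo_root))
    # Use forward slashes so the result looks the same on every platform.
    return relative.replace(os.sep, "/")
```

A path outside the repo still comes back starting with `..`, so a caller would probably want to reject those.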

Compile and Watch documentation

Put words in README.

List display code in show should ignore blank lines in files

But it doesn't.

System tests

Yuno is too big for ad-hoc testing. Using #15, add a test driver for automated end-to-end regression tests. Unit testing may be useful soon, but the most important things to test are side effects and console output.

Write HACKME

Instructions for working with internal interfaces and data files.

Compile errors

Using compile for project two does not always compile all the test files. Most of them fail with a "No assembly written to rc.s" error. When compiling two files, for example, the first one will compile while the other fails. Sometimes, both will compile successfully.

CLI regression

#15 added parsing ambiguity to CLI: yuno run -h should send -h to run, but the launcher eats the -h and quits with help text.
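One way to remove the ambiguity, sketched with argparse: turn off the launcher's automatic -h and collect everything after the command name untouched. The command and tail names mirror options.command and options.tail in the traceback quoted earlier; the rest is an assumption.

```python
import argparse


def build_launcher_parser():
    # add_help=False keeps the launcher from claiming -h for itself.
    parser = argparse.ArgumentParser(prog="yuno", add_help=False)
    parser.add_argument("command")
    # REMAINDER gathers all remaining arguments, including option-like
    # ones such as -h, so they can be forwarded to the command.
    parser.add_argument("tail", nargs=argparse.REMAINDER)
    return parser
```

With this layout, top-level help for a bare `yuno -h` has to be handled by hand before parsing.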

Runtime config switching

Allow configuration settings to be changed at runtime by command-line switch. This will help with automated testing (letting Yuno test itself) and add useful flexibility for things like choosing test repos and file extensions.

Example:
--with <key name> <temporary value> [<value> [<value> ...]]

For configuration settings which require lists, the --with switch should accept as many values as the user wants.
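A sketch of how the switch might be parsed and applied. The --with name comes from the example above; the overrides attribute, apply_overrides, and the config keys used in the test are hypothetical.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="yuno")
    # Each --with contributes one [key, value, ...] list to options.overrides.
    parser.add_argument("--with", dest="overrides", action="append",
                        nargs="+", default=[])
    return parser


def apply_overrides(config, overrides):
    """Return a copy of config with the temporary values applied."""
    merged = dict(config)
    for entry in overrides:
        if len(entry) < 2:
            raise ValueError("--with needs a key and at least one value")
        key, values = entry[0], entry[1:]
        # One value becomes a scalar; several become a list.
        merged[key] = values[0] if len(values) == 1 else values
    return merged
```

A list setting given a single value would come through as a scalar here, so a real version would need to consult the setting's expected type.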

Clean up import spacing. Not sure why stuff like argparse was grouped with the third-party stuff.
Consistently use % or format().
Improve nondescriptive names.

Use ugly Python-style continuations

long_function_name(long_list_of_args,
    'so long',
    'such ew',
    'not wow')

instead of nice non-Python-style

long_function_name(
    long_list_of_args,
    'so long',
    'such ew',
    'not wow'
)