
Comments (26)

JeffBezanson commented on April 29, 2024

True.
tests.j is not meant to be exhaustive; its purpose is to quickly sanity-check a wide range of basic functionality. This is needed to answer the all-important "does the damn thing work?" question after making an unusual change. In fact right now it has some tests that are a bit too thorough, and it takes a bit longer to run than it should. I want it to quickly tell me if I've broken something major. We can start moving out some of the more detailed stuff to other files.
Every commit must pass tests.j. It should finish in a couple of seconds, so there's no excuse not to run it. What goes in there is a fuzzy question, and it's ok for it to overlap with other test suites too.

StefanKarpinski commented on April 29, 2024

Again, not crucial for a first release, reassigning to v2.0.

HarlanH commented on April 29, 2024

I'm starting to collect some notes on what a souped-up Julia testing framework might look like. Let me know if you have strong opinions on how this should go. I expect it'll look sorta like every other test framework... My notes are being put here: https://github.com/HarlanH/julia/wiki/Testing-Notes

JeffBezanson commented on April 29, 2024

My only comment right now is to avoid overuse of macros... something like equals(4) can simply return a function.
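
For instance, a plain higher-order function does the job (just a sketch):

equals(x) = y -> y == x    # equals(4) returns a closure; equals(4)(2 + 2) is true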

StefanKarpinski commented on April 29, 2024

I definitely think that keeping it as simple as possible is desirable. A lot of unit testing frameworks seem to be really over-engineered. Also, I really dislike cutesy things like what rspec does, trying to read like English:

describe Bowling, "#score" do
  it "returns 0 for all gutter game" do
    bowling = Bowling.new
    20.times { bowling.hit(0) }
    bowling.score.should eq(0)
  end
end

Making it read sort of like a sentence but not quite does not help anyone and just makes it annoying to program with (AppleScript also tries to read like English and I find it impossible to program with as a result).

HarlanH commented on April 29, 2024

I agree with simple! Python's unittest framework is pretty nice, but it's tied in with everything being objects, so you end up having to write a fair amount of boilerplate. R's test_that is better in that regard, although it has one more level of syntax than it needs, which I've left out of my proposal. There is a tiny bit of English-ness to the syntax, but it's only at the inner level, and not too cutesy, I hope:

test_context("String Processing")
setup = "The quick brown fox jumps over the lazy dog." 

test_group("whitespace trimming functions")
@test_that strip("\t  hi   \n") equals("hi")
@test_that strip("hi") equals("hi")
@test_that strip("") equals("")

test_group("string length and size functions")
@test_that length("hi mom") equals(6)
@test_that length("") equals(0)

teardown = "noop"

(The setup/teardown lines just show where you'd do the actual work. There's no special syntax or anything -- it's just a script.)

StefanKarpinski commented on April 29, 2024

My immediate reaction is why aren't these just asserts? I mean I know why — it's so that you can get better diagnostic messages. But writing things this way is annoying when all I really want to write is this:

@assert strip("\t  hi   \n") == "hi"

In this particular example, a plain old assertion is actually fine, because if it fails it will complain that assertion failed: strip("\t  hi   \n") == "hi" (that's not quite accurate because our expression printing is a little horked right now, but I just opened an issue about that: #540). However, in situations where you construct tests programmatically, using for loops over various values, this is lousy: the expression is the same every single time and you have no idea which values caused it to fail. I think a better solution would be to automatically fill in the values of variables in the expression and show what they are. That way you get to write a plain old assertion and still get exactly the debugging info you want.
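
Concretely, the loop case looks like this (just a sketch with made-up cases); every failure reports the identical expression, with no hint of which pair triggered it:

for (input, expected) in [("hi", 2), ("hi mom", 6), ("", 1)]
    @assert length(input) == expected    # every failure says `length(input) == expected`
end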

Another thing, of course, is that you may want to be able to tally up how many tests passed and failed. That can be accomplished just by re-defining the assert macro to be non-fatal. What are some of the other things one expects from a unit testing framework? Grouping of tests into semantically meaningful collections? Pretty output while running tests?

HarlanH commented on April 29, 2024

I think you've nailed it. @assert throws an error, quite reasonably; a test suite, on the other hand, should never abort just because one test fails. I don't like the idea of changing or overriding the semantics of @assert.

Yes, there should be pretty-printed summaries and grouped tests. The currently existing grouping relies on Makefiles, which is not appropriate for anything but core Julia testing.

Another common benefit is that if @assert 2+2 == 5 fails, you don't know what 2+2 actually evaluated to, you just know it's not 5. Any reasonable test suite will print the evaluated LHS as part of the failure report, which can of course be very helpful for figuring out what's wrong.

StefanKarpinski commented on April 29, 2024

I guess my fundamental objection is to the way testing frameworks typically re-implement a significant core portion of the programming language just to express expectations. Why do I have to write equality in a different, less clear way? It seems to me to be begging for some metaprogramming instead of re-implementing things like equality and matching under different names ("equals", "matches").

One obvious thing that could be done is to make test assertion macros look at the expression that's asserted and if it fails evaluate portions of it. For example, @assert can look at 2+2 == 5 and see that the head is :comparison. When the head of an asserted expression is comparison, you can print what the left and right hand sides are when the test fails (in general, comparisons are chains, but that doesn't really change anything). Then there's no need to use a separate expectation language that just mirrors what normal code expresses but with worse syntax.
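
A rough sketch of that idea (the macro name @check is made up, and in current Julia a two-sided comparison like 2 + 2 == 5 parses as a :call expression rather than a :comparison one, but the mechanism is the same):

macro check(ex)
    # Two-sided comparison: pull out the operator and both sides so a
    # failure can report what each side evaluated to.
    if ex isa Expr && ex.head == :call && length(ex.args) == 3 &&
            ex.args[1] in (:(==), :(!=), :(<), :(>), :(<=), :(>=))
        op, lhs, rhs = ex.args
        return quote
            l, r = $(esc(lhs)), $(esc(rhs))
            if !($op(l, r))
                println("FAILED: ", $(string(ex)), "   lhs = ", repr(l), ", rhs = ", repr(r))
            end
        end
    end
    # Anything else: just evaluate the condition, no extra diagnostics.
    return :( $(esc(ex)) || println("FAILED: ", $(string(ex))) )
end

# @check 2 + 2 == 5    prints: FAILED: 2 + 2 == 5   lhs = 4, rhs = 5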

Better still, with the metaprogramming approach, when you add some diagnostic enhancement like printing the left and right hand sides of a failed comparison, then all tests everywhere — not just ones that use some special expectation operator — immediately produce better diagnostic messages without changing any test code. This reminds me of how with Julia's dynamic type inference you can improve the performance of the inference without having to change the language spec: existing code will keep working and just run faster. That's because type inference isn't part of the language spec — it's just an optimization. Similarly, I'd like a testing framework where you just assert conditions that you want to see true; enhanced diagnostics when something fails is just an optimization and shouldn't affect how you write tests.

I'm not sure why changing the behavior of @assert for testing would be a problem. Any particular reason that gives you the willies? I feel like running tests with the basic assert definition should work and be somewhat useful (including, in particular, that there's nothing special about the test code), but running with the "fancy" testing framework ought to just make the output better (better diagnostics when things fail, nice status output while running tests, etc.).

HarlanH commented on April 29, 2024

Hm, clever idea, but I'm not convinced -- it just sounds limited. I'm imagining testing data structures for functional equality, algorithms for performance, etc. That's a lot of logic to figure out what to display with no hints from the user. Also, how do you test output to stdout or the type of a thrown exception without a lot of boilerplate? I think you really do want the expectations. And it's not "a significant part of the programming language"! Even one as tight as Julia! :)

To me, an assert is an inline test in production code that crashes early if something is wildly wrong, before anything gets corrupted or the system hangs. It's testing an invariant at a particular point in time. A test suite can do a lot more than that, and certainly anyone wanting to use a TDD approach will want more functionality than just what @assert can provide. They're different, and different enough that overloading the semantics seems weird to me.

StefanKarpinski commented on April 29, 2024

Well, certainly calling it something else like @expect is not insane, but I'm not convinced that an @assert in the middle of a program is really any different from an assertion that some fact should be true. But maybe I need to give it more thought. I certainly agree that @assert alone is not good enough for everything; something like @assert_fails needs its own macro, for example. Testing output to stdout is easy: @assert print_to_string(foo) == "bar". Nothing needs to be added to the testing framework for that, because you can just use the language.
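
For instance, testing output really is just a string comparison once you capture it (in this sketch, sprint stands in for print_to_string):

@assert sprint(print, 1//3) == "1//3"    # capture the printed output, then compare as a string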

StefanKarpinski commented on April 29, 2024

Also, it's the opposite of limited: you can test any condition that can be expressed in the language. The only issue is how well diagnostic output is produced, which can be improved easily without having to change any testing code.

HarlanH commented on April 29, 2024

Hm, well, I'm seeing what you're getting at. It may end up having frustrating limitations in practice, but I can see how to write it, anyway. I'll keep pondering for a little bit and see if anyone else chimes in. If nothing else, going from your solution to my solution is not a huge amount of work, if that turns out to be necessary at some point!

StefanKarpinski commented on April 29, 2024

After some consideration, I think you're right that asserting that a test has a certain value should not use the @assert macro. The thing that convinced me is that you might have asserts in the code being run by the tests, and unless the redefined @assert were lexically scoped, those asserts would get treated as test assertions too, which is clearly wrong.

HarlanH commented on April 29, 2024

Oh, good call! Didn't even think of that! It'll have to be @test_that or something, then.

StefanKarpinski commented on April 29, 2024

How about @expect and @fails for when you expect something to fail?

StefanKarpinski commented on April 29, 2024

Better still, how about just calling it @test — I'm a big fan of as much brevity as is reasonably possible. All other test-related assertions can start with @test_: @test_fails, etc. Those are far less common so brevity is less important.

We can start by making @test basically a copy of the current @assert macro — except that we should actually change @assert so that it only works on single boolean values. Currently it also works on arrays of booleans by applying the all reducer to them. That's really handy for tests, but should probably not be done for normal code assertions.

The next step would be to make the current test suites, essentially untouched, produce better output, maybe with a little bit of test group support added. This can include making the @test macro produce more informative output and printing test progress while the suite runs. Stuff like that.
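
For a sense of the shape, here's roughly how this reads with the Test standard library that Julia eventually grew, where grouping is spelled @testset and the failure-expecting macro is @test_throws:

using Test

@testset "whitespace trimming functions" begin
    @test strip("\t  hi   \n") == "hi"
    @test strip("") == ""
    @test_throws MethodError strip(42)    # no method for trimming an Int
end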

HarlanH commented on April 29, 2024

@test and @test_fails sound great to me!

I'd actually prefer to write a general-purpose test framework (just runtests, these macros, and a couple of utility functions for labeling blocks) first, then move the existing core tests to the framework later. I think it'll be easier for me to wrap my head around it that way.

Also, I really do think it'll be useful, and not that hard, and good practice, to use the Task produce/consume framework to separate tests from display of test results.

StefanKarpinski commented on April 29, 2024

Can you elaborate on the produce/consume bit? Clearly performance isn't a big issue for test results, so using coroutines is totally acceptable. Just unclear on what the advantage would be.

HarlanH commented on April 29, 2024

Sure. By separating the code that does the tests from the code that displays the results, it makes it much easier for people later on to customize the way the results are displayed. Instead of having to dig into core code, they just write a piece of code that consumes TestResults objects and does something with them. Imagine needing to change the output of tests to be parsable by some other build tool, or to be displayed in a GUI or analyzed by an IDE. It's just good separation of concerns, I think.
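
A minimal sketch of that split (TestResult is a made-up record type for this sketch, and in current Julia the old Task produce/consume pattern is written with a Channel):

struct TestResult
    group::String
    expr::String
    passed::Bool
end

# Producer: runs the tests and pushes one result per test.
results = Channel{TestResult}(32) do ch
    put!(ch, TestResult("strings", "strip(\"  hi  \") == \"hi\"", strip("  hi  ") == "hi"))
    put!(ch, TestResult("strings", "length(\"\") == 1", length("") == 1))
end

# Consumer: any display backend just iterates the channel.
for r in results
    println(r.passed ? "PASS" : "FAIL", "  ", r.group, ": ", r.expr)
end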

StefanKarpinski commented on April 29, 2024

Ah, that makes a lot of sense. It's much like using a UNIX pipeline, actually: one command runs the tests and pipes the results to the next which can do whatever it wants to display them. Excellent application of coroutines.

JeffBezanson commented on April 29, 2024

As far as modularity of design, sure. But I don't see why the particular control flow of coroutines is needed --- do we have to iterate through test results as they're produced instead of just storing them? For example, it's popular to store test results in a database, though that would be overkill for us at first.

HarlanH commented on April 29, 2024

I'd argue that real-time output of tests is important. If they're just being stacked up to be returned to the output routine, you can't tell what, if anything, is happening.

StefanKarpinski commented on April 29, 2024

One obvious application is showing how many tests have been run (and how many have failed) while running them. The Perl build and test system does this automatically and it's, you know, nice.

JeffBezanson commented on April 29, 2024

Let's drop this issue in favor of more specific needed features, some of which already have issues.

ViralBShah commented on April 29, 2024

I had been meaning to close this one for the same reason.
