
Comments (9)

silvergasp commented on June 22, 2024

@dmeehan1968 Thought I'd spin this out into a separate discussion. My thoughts on implementing this are as follows:

  1. Abstract out the boost::asio sockets so that you can replace that with a generic byte stream. i.e. boost::asio sockets would become one 'backend' to the new byte stream api that the wire protocol communicates over. Then you could just pipe a serial port into a unix socket and connect that up to the ruby component of cucumber.
  2. Port the gtest backend over to use pigweed's embedded port of gtest.
  3. Slowly remove requirements for dynamic memory usage.
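Step 1 above can be pictured with a small sketch. It is written in Python for brevity; a real port would express the same shape in C++. The names `ByteStream`, `SocketStream`, `LoopbackStream` and `send_and_read_back` are hypothetical and are not part of cucumber-cpp or boost::asio.

```python
import socket
from typing import Protocol


class ByteStream(Protocol):
    """Hypothetical transport interface: anything that can move bytes."""

    def write(self, data: bytes) -> None: ...

    def read_line(self) -> bytes: ...


class SocketStream:
    """One backend: wraps an already connected TCP or Unix socket."""

    def __init__(self, sock: socket.socket) -> None:
        self._sock = sock
        self._reader = sock.makefile("rb")

    def write(self, data: bytes) -> None:
        self._sock.sendall(data)

    def read_line(self) -> bytes:
        return self._reader.readline()


class LoopbackStream:
    """In-memory backend, standing in for a serial port in this sketch."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def write(self, data: bytes) -> None:
        self._buffer += data

    def read_line(self) -> bytes:
        line, sep, rest = bytes(self._buffer).partition(b"\n")
        if not sep:
            return b""  # no complete line buffered yet
        self._buffer = bytearray(rest)
        return line + sep


def send_and_read_back(stream: ByteStream) -> bytes:
    """Protocol code depends only on ByteStream, never on a concrete backend."""
    stream.write(b'["begin_scenario"]\n')
    return stream.read_line()
```

With this shape, swapping a socket for a serial link only means supplying another class with the same two methods.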

This is by no means a minor undertaking, nor is it a priority for me. But it would be nice to have, and I might work on it slowly in the long term. Do you have any input on this, or know of any dead ends that I might run into when implementing this? I'm curious as to what 'incompatibilities' you were referring to here.

from cucumber-cpp.

dmeehan1968 commented on June 22, 2024

@silvergasp It's probably worth some background.

I've been attracted to Cucumber/Gherkin for some years, but I've never been able to make use of it in a meaningful way, regardless of environment. I've worked on C, C++, PHP, and NodeJs projects and tried to employ it in all of them, but I keep hitting obstacles, though I do think that a lot of that can be down to my own mindset. I've found it really challenging to figure out an appropriate 'domain language' for an application, and keep getting bogged down in the details of how to implement the step definitions. As a result, I usually get a few features into something, and then it just becomes too much mental load and I revert to unit testing and hope that the client's view of the project and mine are close enough!

I habitually use unit testing in the development process, to both drive and validate implementation. The problem with unit tests is that they are too low level for customers to understand, and I find it mostly impossible to get them to specify what they want in anything other than high level detail. In fairness, that's great for me, as it gives me lots of latitude in how to implement.

For the past year I've been working on an embedded project. The client had lots of legacy code that was a house of cards (hacks). We had the opportunity to take a clean slate and build in unit tests and MISRA compliance, which were selling points for the end user market. We chose to use C, and I used the CGreen testing library. This turned out to be a very pleasing environment. I added abstractions to all 'classes' within the application so that I could write tests against mocks and isolate each bit of code.

I work mostly in Xcode on Mac for development, and put everything into a library. This library is linked to the test runners.

The abstractions were a little costly though: quite a bit of boilerplate which couldn't easily be automated, and some run-time overhead, although it's not significant to the product. So I've been looking for other ways of getting that mocking ability, and had been looking at the thinnest of C++ wrappers. But that then seems to open a can of worms around testing/mocking frameworks. I tried to pick up Cucumber C++ but struggled with the build environment, and actually the fundamental idea of putting the step definitions on the device (bridged via the wire protocol) was just not going to be viable for anything other than a simple test.

All I really want the behavioural tests to do is to 'observe' the product in operation, and interfere with it in the least way possible. The product already has a debug serial interface, although currently that's compiled out for the release version, but it wouldn't be impossible to make a conditional on an active serial connection or some other 'switch'.

What I think I need is a cucumber test runner in a language I (do/can) understand, that talks to an interface that deals with the serial interface to the device. Largely it would collect the serial output and make validations on what has occurred (as an analogy, it would be a bit like a headless web browser). It needs to be able to instruct the device to reset, and potentially make RTC adjustments so that time can be accelerated to speed up the tests.

The device has two fundamental interfaces: sensors and a wireless network. There would be the need to drive the sensor readings from the specification, in conjunction with the passage of time (for example, a rise in temperature over time), and then the wireless network needs to be 'observable'. Ideally, the product would work without need for the actual network, but that means being able to provide a mock interface within the device, and that might be too much. At the end of the day, although the network interactions would add latency to the tests, the observation of what messages the device sends could be done by observing the database that the messages get pumped into (and there is scope to put in a test dummy that strips away any other application processing so we just validate the message content and provide canned responses, which are actually few and far between).

There is some logic involved in how the device validates its network connectivity (it's mostly sending blind) which is actually hard to test without switching off the wireless gateways or interfering in the network servers' sense of connection 'state', so it might be desirable to mock/stub those outages.

Basically, the device needs a simple protocol over serial so that we can inject commands and observe results, combined with observation of activity on the other side of the network servers.
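A minimal sketch of such a command channel, in Python, might look like the following. The command names (`RESET`, `ADVANCE`) and the `OK`/`ERR` reply format are invented for illustration; nothing in this thread specifies the device's actual protocol. In-memory streams stand in for the serial port so the example runs without hardware.

```python
import io
from typing import BinaryIO, List


class DeviceLink:
    """Line-based command channel over any pair of binary streams.

    The OK/ERR reply convention is an assumption made for this sketch.
    """

    def __init__(self, to_device: BinaryIO, from_device: BinaryIO) -> None:
        self._tx = to_device
        self._rx = from_device
        self.trace: List[str] = []  # device output that was not a command reply

    def command(self, text: str) -> str:
        """Send one command and return the payload of its OK reply."""
        self._tx.write(text.encode("ascii") + b"\n")
        self._tx.flush()
        while True:
            raw = self._rx.readline()
            if not raw:
                raise ConnectionError("device closed the stream before replying")
            line = raw.decode("ascii", errors="replace").strip()
            if line.startswith("OK"):
                return line[2:].strip()
            if line.startswith("ERR"):
                raise RuntimeError(f"device rejected {text!r}: {line[3:].strip()}")
            self.trace.append(line)  # ordinary debug output, kept for later checks


# Demonstration with canned device output instead of a real serial port.
sent = io.BytesIO()
canned = io.BytesIO(b"boot: sensors ready\nOK\ntemp=21.5\nOK 3600\n")
link = DeviceLink(sent, canned)
link.command("RESET")
elapsed = link.command("ADVANCE 3600")
```

Step definitions would then call `link.command(...)` to inject state and assert on `link.trace` to observe what the device did.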

I'm actually unclear how all of that would fit into any available Cucumber variant. I don't see myself as a maintainer or even an active participant in this project; the patch I made to my own fork was merely to try and get it to build in the basic sense with current dependency versions (that's what I meant about incompatibilities). My experience of Cucumber as a 'platform' over the years is that every time I've picked up a variant to experiment with I've spent time patching or working around deviation from the documentation/code base and the current version of anything it interacts with, or at least so it seems.


silvergasp commented on June 22, 2024

Yeah ok, that's a great rundown! Getting step definitions running on a micro-controller is definitely going to be a challenge. Though in concept this would be really nice to have. As a workaround an approach that I am currently exploring is to use a remote procedure call (RPC) to drive python step definitions directly on a micro-controller. e.g.

python step definitions (behave/cucumber) --- calls ---> RPC server on microcontroller. (Pigweed rpc)

Ideally I'd run the step definitions directly on the microcontroller using a port of cucumber-cpp. But in the meantime this seems like the easiest solution.

In terms of my usage: I rarely if ever create end-to-end tests with Cucumber, as I find that they are an enormous amount of work, fragile, and quickly go out of scope. This is particularly acute when dealing with embedded systems. For the most part I usually use a set of integration tests that use fakes/mocks to abstract away sensors and other IO. At the moment the integration tests that run with Cucumber are host-only tests that depend on mocked IO.

Sometimes I will run the same set of integration tests with some IO/sensor devices in the loop as well. But this is fairly new for me and I'm still working out best practice.

Certainly a lot of the Pigweed embedded C++ libs would help with implementing all of this.


dmeehan1968 commented on June 22, 2024

> Getting step definitions running on a micro-controller is definitely going to be a challenge.

I don't think this is actually a challenge, that's really the point of the wire protocol, it allows the Cucumber runner to delegate the execution of the step, and only concern itself with pass/fail reporting.
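A minimal sketch of that delegation, in Python: the runner sends newline-delimited JSON requests and the step-definition side answers them. The message names (`step_matches`, `invoke`, `begin_scenario`, `success`, `fail`) follow the wire protocol as described in Cucumber's feature files, but the exact field layout here should be treated as an assumption to verify against them. The `step` decorator and `handle` function are hypothetical helpers.

```python
import json
import re
from typing import Callable, Dict, Tuple

# Registry of step definitions: id -> (compiled pattern, function).
STEPS: Dict[str, Tuple[re.Pattern, Callable[..., None]]] = {}


def step(pattern: str) -> Callable:
    """Register a function as a step definition (hypothetical helper)."""
    def register(func: Callable[..., None]) -> Callable[..., None]:
        STEPS[str(len(STEPS) + 1)] = (re.compile(pattern), func)
        return func
    return register


@step(r"^the temperature is (\d+) degrees$")
def temperature_is(value: str) -> None:
    if int(value) >= 100:
        raise AssertionError("sensor reading out of range")


def handle(line: str) -> str:
    """Answer one JSON request from the runner with one JSON reply."""
    request = json.loads(line)
    kind = request[0]
    params = request[1] if len(request) > 1 else {}
    if kind == "step_matches":
        matches = []
        for step_id, (pattern, _) in STEPS.items():
            found = pattern.search(params["name_to_match"])
            if found:
                args = [{"val": found.group(i), "pos": found.start(i)}
                        for i in range(1, pattern.groups + 1)]
                matches.append({"id": step_id, "args": args})
        return json.dumps(["success", matches])
    if kind == "invoke":
        _, func = STEPS[params["id"]]
        try:
            func(*params["args"])
        except AssertionError as error:
            return json.dumps(["fail", {"message": str(error)}])
        return json.dumps(["success"])
    # begin_scenario, end_scenario and anything else: acknowledge and carry on.
    return json.dumps(["success"])
```

The runner only ever sees `success` or `fail`, which is the pass/fail reporting role described above; everything else stays on the step-definition side.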

The bigger question is whether you'd ever actually want this. It's akin to running unit tests on the micro. You're not so much testing application logic as you are validating the toolchain, the ability to turn your code into a different instruction set. I've been using my approach of Xcode running unit tests, and then building the application using its tool chain to great effect. There are a couple of gotchas, like word alignment of struct fields, but I've yet to encounter a problem where compiler optimisation causes the application logic to fail. Therefore, I can trust the unit tests I run under Xcode.

The only case where I could see a desire to put the step defs on the device itself is if it's otherwise 'unobservable', i.e. there is no scope to put a debug/trace interface onto it. I have an interesting niggly notion in my head that if you could hook into JTAG much like a debugger does, then one's ability to 'observe' the device becomes super detailed, especially where devices are operating at very high fidelity and outputting meaningful debug would be prohibitive. But that still requires what amounts to a physical wire interface to a device, which wouldn't always be part of the bill of materials for a production device.

> As a workaround an approach that I am currently exploring is to use a remote procedure call (RPC) to drive python step definitions directly on a micro-controller. e.g.

Isn't that reinventing the wire protocol? It is RPC, with a limited set of procedures. As I said above, I see more cons than pros to putting step definitions on the target device, especially for embedded where resource limitations come into view much more quickly than on the average development computer. What I think the wire protocol was originally developed for was to allow step definitions to run in a walled garden, separated from the runner itself.

Cucumber-CPP therefore doesn't have a lot of value for me, because the mental effort of dealing with CPP and the witchcraft of testing/mocking is just too much. I do need to write my step definitions in something of course, but Ruby/PHP/Javascript is likely to be much more elegant. The approach I outlined, where the step definitions are largely just observing a serial stream, means that I can just write fairly simple tests around regex and aggregation to determine if the device performed as expected. As I said before, because it is based heavily on the passage of time, this can be a long-running process, so test brittleness is best avoided, because the cost of retrying is high (unless, as I proposed, I can 'time travel' the device to simulate the passage of time without invalidating what the test is actually testing).
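To make "regex and aggregation" concrete, here is a small Python sketch. The trace format (`<seconds since boot> TEMP <celsius>`) is invented for illustration, as are the function names; the real device's debug output is not described in this thread.

```python
import re
from typing import List, Tuple

# Assumed trace format: "<seconds since boot> TEMP <celsius>".
READING = re.compile(r"^(\d+) TEMP (-?\d+(?:\.\d+)?)$")


def temperature_readings(log: str) -> List[Tuple[int, float]]:
    """Extract (timestamp, temperature) pairs, ignoring unrelated lines."""
    readings = []
    for line in log.splitlines():
        match = READING.match(line.strip())
        if match:
            readings.append((int(match.group(1)), float(match.group(2))))
    return readings


def rose_by_at_least(readings: List[Tuple[int, float]],
                     delta: float, within_seconds: int) -> bool:
    """True if some later reading exceeds an earlier one by delta in the window.

    Assumes the readings are already ordered by timestamp.
    """
    for i, (t0, v0) in enumerate(readings):
        for t1, v1 in readings[i + 1:]:
            if t1 - t0 > within_seconds:
                break  # later readings are even further away in time
            if v1 - v0 >= delta:
                return True
    return False


captured = """\
0 BOOT ok
10 TEMP 20.0
70 TEMP 20.5
130 TEMP 22.5
190 TX alarm sent
"""
readings = temperature_readings(captured)
```

A step such as "Then the temperature rose by 2 degrees within 2 minutes" would reduce to `rose_by_at_least(readings, 2.0, 120)`.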

> For the most part I usually use a set of integration tests that use fakes/mocks to abstract away sensors and other IO.

Which I think is the same as what I'd do, unless I'm going to hook the whole thing up to a heating/cooling system so that I can ACTUALLY change the environment the device is in, which seems daft to even consider!

In all I think that means that Cucumber-CPP isn't providing any value (for me), other than that it does support the wire protocol (I think Ruby is the only flavour that does). Given that I don't see a reason to run the step definitions on the device, all I need is an RPC and/or serial library in whatever flavour of Cucumber I care for, so that I can observe my device, and send it a few limited commands (reset, fake sensor readings, mock wireless network responses, time travel).


silvergasp commented on June 22, 2024

> The bigger question is whether you'd ever actually want this. It's akin to running unit tests on the micro. You're not so much testing application logic as you are validating the toolchain, the ability to turn your code into a different instruction set. I've been using my approach of Xcode running unit tests, and then building the application using its tool chain to great effect. There are a couple of gotchas, like word alignment of struct fields, but I've yet to encounter a problem where compiler optimisation causes the application logic to fail. Therefore, I can trust the unit tests I run under Xcode.

While in an ideal world I think your points here make perfect sense, I think there is a separate set of challenges here beyond just testing your toolchain. What follows here is a bit opinionated and I'm not trying to disseminate opinions as facts. So if you have a difference of opinion I'd love to hear it.

So the way in which I see unit tests is as a special case of an integration test. The idea that a unit test just tests a singular unit of functionality is patently false. But it is a useful assumption to make in the sphere of stable and reliable operating systems such as Windows/Linux/macOS. Realistically, when you are unit testing, you are just testing a set of instructions generated by a toolchain that run on a very particular set of hardware. In the case of a 'standard' operating system (e.g. Linux) the RAM, persistent storage and other IO are abstracted, stable and so well defined that you can 'just run code'. Or 'just run a unit test'. However, in the embedded world this is not necessarily always the case. In both the 'embedded world' and the 'application world' your 'unit test' is in fact an integration test that tests:

  • The code that you write + the unit test framework,
  • Your toolchain,
  • The OS kernel (with the exception of bare metal),
  • The minimum hardware that is required to run your test e.g. RAM, persistent storage, cpu etc.

In my experience it is useful to run tests on bare-metal platforms, but under the assumption that 'unit testing' is in fact integration testing. This obviously falls into the spectrum of integration vs. unit testing. I've certainly found a few bugs in how my so-called 'portable code' does not play nicely with an RTOS and/or the hardware itself. Or, even more commonly, I find that a so-called 'portable RTOS' or a stable (MISRA compatible) HAL does not behave exactly as it should. But I still have to depend on these systems to 'unit test' my code.

> Isn't that reinventing the wire protocol? It is RPC, with a limited set of procedures. As I said above, I see more cons than pros to putting step definitions on the target device, especially for embedded where resource limitations come into view much more quickly than on the average development computer. What I think the wire protocol was originally developed for was to allow step definitions to run in a walled garden, separated from the runner itself.

Not in my case. For context, I generally treat BDD tests as high-level tests where there is an interface between the customer and a developer. But beyond that, the logic is tested as a unit/integration test. I'm not reinventing the 'wire protocol' because the RPC system is intended to be used in production. So the BDD test drives the set of unit tests, which drive the RPC implementation. In many IoT cases you are just doing 'dumb' sensor forwarding, sending a sensor reading off to the interwebs. But in some cases you want to have control authority; this is where it is advantageous to introduce a dedicated RPC.

> Cucumber-CPP therefore doesn't have a lot of value for me, because the mental effort of dealing with CPP and the witchcraft of testing/mocking is just too much. I do need to write my step definitions in something of course, but Ruby/PHP/Javascript is likely to be much more elegant. The approach I outlined, where the step definitions are largely just observing a serial stream, means that I can just write fairly simple tests around regex and aggregation to determine if the device performed as expected. As I said before, because it is based heavily on the passage of time, this can be a long-running process, so test brittleness is best avoided, because the cost of retrying is high (unless, as I proposed, I can 'time travel' the device to simulate the passage of time without invalidating what the test is actually testing).

This is the conclusion that I am leaning towards as well. Cucumber-cpp, as it currently stands, does not offer significant value for an embedded project. While an embedded backend would be nice to have, the investment needed to get things working, compared to working around the issues, is quite significant. Though I'd disagree with the sentiment towards mocking/testing. In some cases you can 'time travel' using a mocked time API. The broader issue of 'test brittleness' on embedded systems is something that I've long been trying to wrap my head around. How would test brittleness not lead to brittleness in production? Either it would make sense to broaden the tests so that they weren't brittle, or improve the product such that any physical causes of brittleness become negligible.
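A minimal sketch of 'time travel' through a mocked time API, in Python: the application logic asks an injected clock for the time instead of reading a real timer, so a test can advance hours instantly. `FakeClock` and `PeriodicReporter` are hypothetical names used only for illustration.

```python
from typing import Callable, List


class FakeClock:
    """Test double for a time source; advance() replaces real waiting."""

    def __init__(self, start: float = 0.0) -> None:
        self._now = start

    def now(self) -> float:
        return self._now

    def advance(self, seconds: float) -> None:
        self._now += seconds


class PeriodicReporter:
    """Hypothetical application logic: emit a report once per interval."""

    def __init__(self, now: Callable[[], float], interval: float) -> None:
        self._now = now
        self._interval = interval
        self._next_due = now() + interval
        self.sent: List[float] = []  # due times of the reports emitted so far

    def poll(self) -> None:
        # Catch up on every interval that has elapsed since the last poll.
        while self._now() >= self._next_due:
            self.sent.append(self._next_due)
            self._next_due += self._interval


clock = FakeClock()
reporter = PeriodicReporter(clock.now, interval=3600.0)
clock.advance(3 * 3600 + 5)  # three hours pass instantly
reporter.poll()
```

In production the same class would be given a real time source such as `time.monotonic`, so the test exercises the same logic that ships.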

I definitely agree with you in some areas but disagree in others. However, I think in some dimensions we are bike-shedding a little, and maybe going beyond the scope of what makes sense within the cucumber framework. I would love to hear more about the success stories that @mattwynne mentioned.


dmeehan1968 commented on June 22, 2024

Whilst I do get your point about what a unit test actually comprises, I think from a computer science point of view it's intended to represent the simplest module of functionality. I think we can all accept that code can't run itself, and therefore we do have to put some faith in the toolchain we use. I do generally accept that one C compiler is reasonably comparable to another in its 'effect', even if its method is different. And yes, we can probably get out of the bike shed ;-)

I do rather wonder whether Cucumber-CPP is a solution looking for a problem. It's essentially a runner, and given that it takes a hands-off approach to the step definitions (which are running over the wire protocol in a separate executable, perhaps on a separate machine), I think it -should- be interchangeable with the other Cucumber variants, as long as they all implemented the wire protocol consistently (and perhaps that's where the crux of the problem lies). It seems to me that it's important (if we are trusting our toolchains) that the test runner IS actually compatible and comparable. Runner options for filtering tests by feature/tag, support for Gherkin, etc. should be universally available.

There is then the secondary aspect of how a developer uses the tools/language of their choice to implement the steps. Or, depending on perspective, the paramount one. It's not my tool of choice, and I suspect not for the majority of embedded developers. If there was a truly compelling reason to adopt C++ (i.e. productivity/quality) then we might see more action here. This is not helped by many of the microprocessor and toolchain vendors still not supporting C++, even if you can get it running with a bit of tinkering.

One thing that you highlighted is that many of these embedded devices are running some sort of interpreter, such as Python, and that seems like a use case that's worth supporting with a Cucumber-flavoured tool. And in doing so, maybe that opens up some other options.


mattwynne commented on June 22, 2024

> I do rather wonder whether Cucumber-CPP is a solution looking for a problem

For context, it's worth noting that the wire protocol (for which the best documentation is probably the feature files) was originally developed to enable the precursor to SpecFlow, back in the days when Cucumber only ran on Ruby and we wanted to be able to use it to run tests on native .NET code.

Cucumber-CPP then came about as a happy accident because the wire protocol existed.

I have heard of people using cucumber-cpp to test Windows C++ GUI apps, where they baked a step definition server into a build of the software so they could automate their GUI through it.

I think it's a good idea to take a step back and look at what people need, and whether there's another way of delivering it. I don't have enough context to understand what the typical use-cases are, and where you'd want the boundaries of test runner / step defs / system under test to be. Maybe there are a few different development contexts we'd want to cater for here.


ursfassler commented on June 22, 2024

Personally, I don't see a need to run cucumber-cpp on the target. I work mostly on Embedded GNU/Linux software in C++. Here, running it on the target would be easy, as the Embedded Linux isn't really different from my development machine. A few projects have been different. Those projects are in C or C++, running on micro-controllers, with no RTOS, a proprietary RTOS, or Zephyr.

I'll need cucumber-cpp for the behavior of the application (or business logic). Usually I abstract away at the driver level (GPIO, LED, serial port, network or higher level network protocol, ...). Those drivers are tested with unit tests, but not part of the acceptance tests. Main reason for this is that it is usually the right abstraction to use in the acceptance tests. Going lower would mean very complicated mocks.
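As an illustration of abstracting at the driver level, here is a short Python sketch (the projects described are in C or C++, so this only shows the shape). `Led`, `FakeLed` and `OverheatIndicator` are hypothetical names: the acceptance test talks to the business logic and inspects a fake driver, never the hardware.

```python
from typing import List, Protocol


class Led(Protocol):
    """Driver-level boundary; a real implementation would toggle a GPIO."""

    def set(self, on: bool) -> None: ...


class FakeLed:
    """Records every state change so a step definition can assert on it."""

    def __init__(self) -> None:
        self.history: List[bool] = []

    def set(self, on: bool) -> None:
        self.history.append(on)


class OverheatIndicator:
    """Hypothetical business logic: light the LED above a threshold."""

    def __init__(self, led: Led, threshold_celsius: float) -> None:
        self._led = led
        self._threshold = threshold_celsius

    def on_temperature(self, celsius: float) -> None:
        self._led.set(celsius > self._threshold)


led = FakeLed()
indicator = OverheatIndicator(led, threshold_celsius=60.0)
for reading in (25.0, 61.5, 40.0):
    indicator.on_temperature(reading)
```

The fake stays trivial because the boundary sits at the driver; mocking registers or bus traffic below it would be far more involved.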

When I only have the business logic, no hardware specific functionality or code is included (it is an onion architecture). That code has to run on the hardware. This is ensured by the compiler and static analyzers (i.e. that the code does not contain undefined behavior).

My main motivation to not run it on the hardware is to shorten the feedback cycle. When I'm developing I want to see it working, and I want that fast. Also running it fast and as parallel as I need is of advantage in the CI system.

A test on the real hardware is needed nevertheless. I haven't done a lot of automation here; most tests have been done manually. Because all the business logic is known to work, those tests are quite short. When doing automated tests with the target, I'll test the whole system. That means I don't instrument the code on the target but use the external interfaces the target has and uses.


ursfassler commented on June 22, 2024

Closing the issue because I don't see cucumber-cpp developing in this direction. Maybe an implementation in plain C would suit the needs better?

Please reopen and comment if you don't agree.

