ezward / esp32camerarover2 Goto Github PK

EzRover is a framework for building and programming inexpensive differential drive robots. The framework includes closed loop speed control, pose estimation and go to goal behavior. Behaviors can be added in JavaScript. The first hardware instantiation uses an ESP32cam, an L9110S DC Motor Driver and a cheap Robot Car chassis.

License: MIT License

HTML 4.57% C 16.91% C++ 22.96% Shell 0.48% CSS 1.18% JavaScript 53.89%

esp32camerarover2's People

Contributors

Stargazers

Watchers

Forkers

cgrrty sichitong timyou chaumezz mikewellc alexbulavin flyinggh

esp32camerarover2's Issues

Implement rover as access point and compare latency

Why
As discussed in issue #8, we would like to reduce the video latency between the rover and the gamepad in order to make FPV mode more usable. One potential latency is the protocol between the rover and the web application. The Open.HD project has addressed this issue in part by writing code to create a custom peer to peer connection between RaspberryPi endpoints using usb wifi adapters' radios. Espressif had created a similar peer to peer protocol between Esp32 endpoints called EspNow. However, EspNow is not suitable for transferring video because it limits a message to 250 bytes. Instead, we would like to see if removing the router from our current setup reduces latency.

Our current wifi setup configures the rover's Esp32Cam as a wifi station that connects to a wifi access point. The client also connects to that wifi access point, then uses the Esp32Cam's ip address to load the web application. This presents two issues; first we have to know that ip address that the router assigned to the Esp32Cam. Second and more importantly to this ticket, websocket packets must transit the wifi router to get from rover to client browser. We want to see if removing that wifi router from the middle can reduce latency. As a potential side benefit, it may simplify connecting to the rover.

What
Implement a wifi configuration where the rover's Esp32Cam is configured as an access point. The device that will run the web browser will then connect directly to that access point. This will eliminate the intervening wifi router. Note we may need to implement a wifi configuration where the rover's Esp32Cam is both an access point and a station, so that it can serve the web application, serve the camera status endpoints and join the websocket connection.

Once this configuration is working, measure the video latency between the rover and the web client.

Where can I see the demo video？

Where can I see the demo video

Implement mDNS to assign a .local dns name

See https://floatingintheclouds.com/espmdns/

See EloquentEsp32 example https://github.com/eloquentarduino/EloquentEsp32cam/blob/main/src/traits/HasMDNS.h

Multicast DNS is a protocol that allows a device to register a DNS name on the local network. This makes it easy to find your device even if you don't know the IP address. You can define a name, like myespcam and it will show up on the network as myespcam.local. That would allow us to access the robot application on http://myespcam.local:8080

Implement basic authentication for the rover's web ui and web socket api

Currently any one can control the robot if they have or can guess its IP address. We want to add optional basic authentication such that the user can configure username and password that is compiled into the firmware and is used for basic authentication. If either the username or the password is empty then no authentication is done. If both the username and password are not empty then they are used as the credentials for the web ui and web socket api as follows:

The use is challenged for a username and password using the browser's default basic authentication dialog when the user opens the web ui.
The web server checks the Authorization header for the web ui GET and the web socket GET as follows;
- If the Authorization header is missing then the request fails with a 401 Not Authorized response
- If the Authorization header exists but does not match the configured username and password then the request fails with a 403 Forbidden response.
- If valid of the Authorization header matches the configured username and password then the requests succeed.

Save calibration values in browser

We currently have to re-input our motor and speed control calibration values each time the rover is started. It would be better to save these in the browser and reload them when the application is started.

When we save calibration values to the rover, also save them in the browser.

This has a forward looking benefit; if we implement a simulator in the browser then it will also have access to saved calibration values.

Redesign Encoder polling code

Rewrite the encoder polling code to act like the mono encoder sketch I create for the DonkeyCar project. See https://github.com/autorope/donkeycar/tree/921-next-generation-odometer-parts/arduino

The mono encoder does a reasonable job of reading encoder pulses while debouncing, all without interrupts. We can leave the interrupt version of the code as-sis.

How to compile the code correctly？

HI
I has been able to successfully compile the code，But the serial port does not output debug information
How to set up to print the following statement like "Setting up..." ...Wifi initialized, running on IP Address:

//
// Include _after_ encoder stuff so SERIAL_DISABLE is correctly set
//
// leave LOG_LEVEL undefined to turn off all logging
// or define as one of ERROR_LEVEL, WARNING_LEVEL, INFO_LEVEL, DEBUG_LEVEL
//
#define LOG_LEVEL DEBUG_LEVEL
#include "log.h"

void setup()
{
    //
    // initialize serial monitor
    // NOTE: if SERIAL_DISABLE is defined, then SERIAL_xxxx calls are all no-ops
    //
    SERIAL_BEGIN(115200);
    SERIAL_DEBUG(true);
    SERIAL_PRINTLN();

    LOG_INFO("Setting up...");

    // 
    // init wifi
    //
    WiFi.mode(WIFI_STA);
    WiFi.begin(ssid, password);
    if (WiFi.waitForConnectResult() != WL_CONNECTED)
    {
        LOG_ERROR("WiFi Failed!\n");
        return;
    }

    SERIAL_PRINT("...Wifi initialized, running on IP Address: ");
    SERIAL_PRINTLN(WiFi.localIP().toString());
    SERIAL_PRINT("ESP Board MAC Address:  ");
    SERIAL_PRINTLN(WiFi.macAddress());

[env]
src_build_flags = 
	-D SERIAL_DISABLE=0
	-D USE_WHEEL_ENCODERS=0
	-D USE_ENCODER_INTERRUPTS=0
    -include arduino.h

[env:esp32cam]
platform = espressif32
board = esp32cam
framework = arduino
monitor_speed = 115200
lib_deps = 
	ESP Async WebServer
	WebSockets

Implement an in-browser simulator

It would be excellent to have a simple simulator so that we can test the software without needing to download it to the rover. To do the we put simulate a motor controller that takes a duty cycle and then we must use this duty cycle to simulate the encoder counts such that the motor 'spins' proportional to the duty cycle. We also need to have rover configuration values that are currently in the c++ firmware to make this work; wheel diameter, axle length, ticks-per-revolution, etc. We also need to simulate the stall value and min and max speeds for the motors. Ideally the simulator configuration would be in a web ui that is shown only if the simulator is running; although we could start by compiling it into the simulator.

Simulated rover configuration
- wheel diameter
- axle length
- ticks-per-revolution for wheel encoders
- stall value for duty cycle (value below which encoder produces zero)
- motor speed (revolutions/sec) at stall value
- maximum motor speed (revolutions/sec) at full duty cycle.

The simulator will be a version of the c++ application compiled to web assembly and executed in the browser. The simulator version of the code will replace real motor drivers and encoders with simulations.

simulated motor and encoder
- input is duty cycle
- duty cycle is applied first against stall value; it is is below stall value that revolutions are zero.
- if it is not below stall duty cycle is used to scale between min and max motor revolutions
- motor revolutions and the time delta are used to produce a tick count for that motor's encoder.

When in simulator mode we won't actually open a web socket to the rover; so how do we simulator this communication? That is TBD. We may simulate the web socket as well.

If we can simulator the motors/encoders/websocket commands then all of the web application should work as it does with the physical esp32cam. So we can see the motor and pose telemetry, we can enter the calibration values, etc.

Add Jsdoc types to JavaScript

We would like to have the JavaScript code in the rover web ui much better documented. This involves reorganizing the code, adding jsdoc and adding higher level explanation in the markdown docs.

Add Jsdoc to all functions and output types.
Enable type checking in Visual Studio using TypeScript to interpret the jsdoc types.
- See https://www.typescriptlang.org/docs/handbook/jsdoc-supported-types.html for what jsdoc is supported.
- Make sure this works in Visual Studio; https://code.visualstudio.com/docs/nodejs/working-with-javascript#_javascript-projects-jsconfigjson
explain the structure and patterns used in the JavaScript in the markdown documentation

Things to note

jsconfig.json is used to tell TypeScript how/where to type check using jsdoc
TypeScript reference comments (TypeScript includes) can be used to make the types of dependencies available in a JavaScript file. https://www.typescriptlang.org/docs/handbook/triple-slash-directives.html

This is being worked on in PR #26

Support 'forward only' motor control mode

Currently each wheel uses two pwm pins, one for forward and one for backward. This is good because it (a) allows us to backup (b) allows us to turn in place. However, if we only support a forward pwm pin on each motor, then we would have two free pins. This would then allow us to read the encoders with those pins. Then we could support use of the serial ports without having to disconnect that encoders.

support compile-time optional drive train mode where we only supply forward pwm to each wheel.
when the firmware is compiled in this mode, the Turtle Control web UI must hide the 'Reverse' button because the rover cannot drive in reverse.
when the firmware is compiled in forward-only mode the Go to goal must use a slightly different 'Turn to Angle' algorithm.
when the firmware is compiled in forward-only mode, the Game Controller support in the web UI must use a slightly different algorithm that does not allow for reverse. Perhaps simply setting any reverse value to zero is adequate.
when the firmware is compiled in forward-only mode, allow the serial port to be activated even if encoders are activated. The encoder pins will be reassigned to the former reverse pwm pins in this mode.

There are failures in unit tests

unit tests are stale
looks like changes to command parameters are causing rover parse errors

Discussion on the Standard of remote Control Protocol

Hi
I see an esp32 project, which realizes the control of drone. The protocol of the remote control part is standard. Can we use it for reference?
https://michalschwarz.github.io/rc-controller/esp32/quadcopter/f450/asgard32/schema/2019/05/07/esp32-drone-v1.0.0.html

Implement bluetooth gamepad connection between Esp32Cam and PS3 controller.

Why
As discussed in issue #8, we would like to reduce the input latency between the rover and the gamepad. Currently we are using a gamepad attached to the web application's browser, so it incurs the latency across the websocket connection. Further, we (in another issue) would like to implement a peer-to-peer FPV video solution, so attaching a gamepad to the browser would not make sense.

What
Implement bluetooth gamepad connection between Esp32Cam and PS3 controller. See these:

We will need to turn the returned values into wheel speed (tank style) inputs from 0 to 1.0. We do this in the javascript code now. This involves a mapping of gamepad axis to command inputs. We should put this into config in a header file; they can be fairly static because we intend to use us a PS3 joystick for now. However, we do want these in constants so we could change them if we implement another gamepad.

See these javascript files for current gamepad implementation.

Compare the resulting input latency between the bluetooth gamepad implementation and the web application's HTML5 Gamepad API/Websocket implementation,

Speed control is broken on Esp32Cam if ENABLE_CAMERA is not defined

There was a previous commit that create a new preprocessor symbol, ENABLE_CAMERA, that must be defined for the camera support to be fully compiled into the rover's C++ code. By default this is defined in platformio.ini.
When it is not defined, the camera api is turned into function that just return failure, exception for getCameraPropertiesJson() which is the method underlying the status/ endpoint. When camera is enabled, it return all the camera properties and an enabled property with value of "true". When camera is disabled, it only returns enabled with value of "false".
So that's the context. The bug is that if ENABLE_CAMERA is not defined, then speed control does NOT work. The issue seems to be the interrupts are not firing, so the encoder never counts any ticks. This is likely because the camera internals may turn on interrupts, but if we don't initialize it, they never get turned on.

One way to confirm this is to disable the camera AND encoder interrupts; to do this, comment out these two lines in platformio.ini;

    -D USE_ENCODER_INTERRUPTS=1
    -D ENABLE_CAMERA=1

This will then make the rover C++ code using polling to read the encoder; this is not very accurate, but if it works it would prove the issue is with interrupts.
If it is interrupts, then we probably have to make a call to enable interrupts ourselves.

Can you provide some photos of the equipment at work

Hi
Can you provide some photos of the equipment at work
😊

Design integrated PCB that can run the framework

Let's start to collect requirements here

on board ESP32Cam functionality (Esp32 + camera)
on board motor driver and power circuitry
we can discuss power handling. I am currently using a 5v USB power bank. This works well to deliver 5v to Esp32Cam and also delivers power to motors. However, motors are rated at a higher power and would work a little better with 6v. Setting up power is one of the biggest pains for beginners, so that is why I chose the USB power bank. So we should keep simplicity in mind.
make sure I2S bus pins are exposes so we can add I2S modules like PCA9685 or OLED display.
form factor should make it easy to mount as a front-facing camera. This is one of the biggest pains with the ESP32Cam; it has no mounting holes and it is hard to get to the reset button. We should try to address those issues in this design.

Implement autonomous mode using Tensorflow.js

See this repo for an example using the Udacity Self Driving Simulator; https://github.com/bakoushin/self-driving-car-javascript

Here is a medium article about this; https://levelup.gitconnected.com/run-a-self-driving-car-using-javascript-and-tensorflow-js-8b9b3f7af23d

Strategy:

This is mediated by the webUI. The user interacts with the web ui in 3 modes; collection mode, training mode and autopilot mode.
There is a server involved the get's the collected data, does the training and returns the resulting model to the client.
The client initiates a websocket connection with the server. Once connected to the server, the collection and training modes are enabled. Autopilot mode is available without a server if there are models stored in local indexedDB; if not then autopilot mode becomes available when the server is connected. Each mode is detailed in its' own section below.
Normal telemetry is displayed in collection mode and autopilot mode.
Training happens in the server (probably nodejs) application. During training the rover can continue to be operated, including running collection or autopilot mode.

Collection Mode

The user chooses collection mode and provides the address of the server and chooses to start a new data set or append to the current data set.
The webui closes it's camera socket
The webui opens a socket to the server
The webui sends the address of the rover to the server
The rover opens a camera socket to the rover
When the user selects to start capture or capture is triggered automatically by a non-zero throttle, the webui sends a message to the server to start recording.
The server chooses the data set or creates a new one depending on what the webui indicated.
We will modify the camera socket so that the rover always sends metadata with the image; steering angle, throttle, angular velocity, linear velocity, pose.x, pose.y, pose.angle, timestamp
The server will save the data as it arrives (naming each image uniquely and in a way that is increasing monotonically, probably saving the metadata in a csv file and including relative path to the image)
The server will forward the data to the webui through it's websocket so it can be displayed.
When the web ui indicates recording should stop, the server stops saving data to the dataset and closes it's websocket connection to the rover. The webui reopens it's camera socket connection to the rover.

Training Mode

The user chooses training mode and provides the address of the server (we will remember this in localstorage so we can provide a good default).
The webui connects to the server (if it is not already) and asks for the list of datasets.
The webui shows the list of datasets.
The user checks the datasets they wish to train on.
The webui sends a message to the server with the list of datasets to train on.
The server starts training and sends status updates to the webui.
The webui shows the status of the training.
When completed, the server saves the model to disk and messages the webui, the webui saves the model in indexedDb

Autopilot Mode

The user chooses autopilot mode and optionally prov
The webui shows the models that are saved in indexedDb
If connected to a server, the webui asks for the list of models persisted to disk and merges this with those saved locally in indexedDB
The user chooses a model.
The pose is reset to (0, 0, 0)
The webui enters the autopilot loop;
- an image and metadata are received from the rover.
- using the image, a steering angle is inferred using the model.
- the steering angle and throttle are sent to the rover.
- the rover uses an inverse kinematic model to calculate wheel velocities to match the desires velocity and steering angle and updates the wheel velocities. All normal telemetry is sent to the client and the client displays the telemetry.
The user chooses to stop autopilot in the webui
- The webui tells the rover to stop
- The webui exits the autopilot loop

Prototype EspNow peer to peer video

What
Implement a peer to peer video protocol from an Esp32Cam to an Esp32 with a TFT display using EspNow. Measure the latency of transmitting a frame from the Esp32Cam to the endpoint. Compare the peer to peer latency with the web app's websocket latency.

Save calibration values on rover

The esp32cam has the ability to write to SD card and to local flash memory. We should save calibration values that are sent to the rover in flash. We then also have to tell the web application what these values are when it asks, so we need an api for this.

save calibration values to flash when the web application sends them to the rover.
implement a web-socket api for asking for the calibration values. Design this so that the response may be used as a push value (at some point the rover may push these values over the web socket rather than waiting to be asked for them).

Smartcar Shield library

Are you aware of https://github.com/platisd/smartcar_shield#software

I like the API https://platisd.github.io/smartcar_shield/index.html

It seems you share some similar aims. I wonder if there is scope for collaborating on the C++ side of things?

See if we can implement OTA updates

If we can set up our partition so it can handle both the camera output, see Issue 18 and OTA, then we can use OTA during development to push new code. This would then allow forward/revers of both wheels and use of the encoders because we would not need to download via the serial rx/tx pins; we could do it with an ota.

This ticket is for a discussion and to capture relevant research.

Improve camera frame rate

Camera frame rate is influences by

image size; smaller images, especially 640x480 and below, produce better frame rates
number of frame buffers allocated. There should be at least 2 buffers allocated. This is only possible for lower resolution modes and with PRAM available.
The clock rate. It turns out a lower clock rate can enhance framerate on low resolutions, perhaps because all the work the sensor does can be done within a single frame. See espressif/esp32-camera#15

NOTE: the way memory is partitioned is important to make sure there is enough available for multiple frame buffers. The ESP32-Cam example sketches us the 'Huge APP' partition scheme. So we should make sure we've made enough memory available for our system to have multiple frame buffers. See https://iotespresso.com/how-to-set-partitions-in-esp32/ See this for how to set partition table in platformio https://community.platformio.org/t/partion-scheme-no-ota-with-platformio/13102/8 See relevant PlatformIO docs https://docs.platformio.org/en/latest/platforms/espressif32.html#partition-tables

Tasks;

experiment with combinations of image size, frame buffers and clock rate to optimize frame rate
- do this with and without factoring in the transfer of the images, so that we see the raw rate and the overhead imparted by transferring to the web UI.
Review partition table settings to make sure they support the configuration. If possible leave memory for OTA, as that would be a much nicer way of programming the board.
Add a compile flag to fix the frame size, number of buffers allocated and the clock rate to the chose parameters
Hide the camera properties UI when the image properties are fixed.
Hard code the camera properties endpoint to return the fixed properties.

Donkeycar uses 160x120, so that might be what we want here. However 640x480 is much better quality for FPV, so maybe we want a way to switch between these two mode.

Add a preprocessor flag ENABLE_ESP32_CAMERA that must be defined for the Esp32Cam camera code to be included in the compile.
Add and endpoint to check for camera support; if camera is supported this returns true, false otherwise
Have the web client check the endpoint at startup and hide or show the camera controls based on the return value. Default should be to hide the camera support.

I did measure the latency

Can we learn something from this project? https://github.com/OpenHD/Open.HD/wiki