Comments (49)
Recompiled everything and can reproduce the error with the following HAL file:
loadrt [KINS]KINEMATICS
loadrt [EMCMOT]EMCMOT servo_period_nsec=[EMCMOT]SERVO_PERIOD num_joints=[KINS]JOINTS
# Connection to the board
loadrt litexcnc extra_modules="toolerator" connections="eth:10.0.0.10"
# Assign to threads
# - LitexCNC
addf EMCO5.read servo-thread
# - MOTMOD
addf motion-command-handler servo-thread
addf motion-controller servo-thread
# - LitexCNC
# addf EMCO5.write servo-thread
The above hal file will run without errors. However, as soon as I enable the write function I get the error:
USRMOT: ERROR: command timeout
emcMotionInit: emcTrajInit failed
Waiting for component 'inihal' to become ready.
It boils down to something which must have changed in the write function which prevents the module from starting. Will further investigate where the write-function fails by shutting down all components and then re-enabling them one by one.
Edit 1:
Found the culprit in litexcnc.c, at the line marked below:
static void litexcnc_write(void *void_litexcnc, long period) {
    litexcnc_t *litexcnc = void_litexcnc;

    // Check whether the write has been initialized AND the read and write functions
    // are in the recommended order (first read, then write). In the first loop
    // we don't write any data to the FPGA, but it is configured. This is required,
    // because the configuration requires the period to be known, which prevents the
    // configuration from being performed before the HAL-loop starts
    if (!litexcnc->write_loop_has_run) {
        // Check whether the read cycle has been run, if not, the order is not correct
        if (!litexcnc->read_loop_has_run) {
            LITEXCNC_WARN("Read and write functions in incorrect order. Recommended order is read first, then write.\n", litexcnc->fpga->name);
        }
        // Configure the FPGA and set flag that the write function has been done once
        litexcnc_config(void_litexcnc, period); // <== this call blocks the start-up and causes the time-out
        litexcnc->write_loop_has_run = true;
        return;
    }
    // ... (rest of the function not shown)
}
Edit 2:
Found the misbehaving module: stepgen. While determining the best pick-off (and thus the best accuracy), it gets into an infinite loop in some cases...
from litex-cnc.
Fantastic will try in the morning
from litex-cnc.
Awesome work!
Do you think it is better to use the pos2vel / position-control in general?
I do not really have a tuned PID setup. It is really just a P-only setup looping back to the FPGA.
I finished 400 little parts today, which I turned on my lathe. Not a single problem with LiteX and the Colorlite board.
from litex-cnc.
This is the interesting bit on line #80
Waiting for component 'inihal' to become ready......................................A configuration error is preventing LinuxCNC from starting.
It's really hard to debug without all the ini & hal files.
from litex-cnc.
Thank you for your response @ozzyrob. It is indeed a configuration error, because the card does connect, the buffers are allocated, and everything is cleaned up afterwards.
Please share your INI and HAL files.
from litex-cnc.
Sure. I should have attached the files...
The thing is, I didn't change anything.
You say you only fixed the overflow bug in the stepgen and changed something for the RIP install?
So that should have nothing to do with my HAL and INI files, right?
Semse.hal.txt
Semse.ini.txt
from litex-cnc.
Let me check one thing. Did you come from the main branch or did you update from branch 11?
If you updated from branch 11: yes only the Rip and overflow have been changed.
If you come from the main branch: the config file should be updated and the firmware rebuilt.
Can you also share your config JSON?
At this moment I'm on holiday; as soon as I'm back I will build your config and investigate.
Also, if there are any logs available from LinuxCNC, please share them. I have a feeling a pin has been renamed somewhere in the process. (Yes, that's on me.)
from litex-cnc.
Yes. I come from the 11-add-externals.
I used the new json format. It works fine.
Only when I pull the last update I cannot start LinuxCNC
from litex-cnc.
If you can describe your system, version of Linuxcnc, Debian version & kernel version, and your method of installing Linuxcnc I can try and replicate.
I would also require your json file and the following mentioned in your ini file
HALFILE = custom.hal
POSTGUI_HALFILE = postgui_call_list.hal
SHUTDOWN = shutdown.hal
If that is OK with you, Pete. I'm unemployed ATM and looking to keep my mind occupied.
from litex-cnc.
All the information should be in the report I attached in the first post above.
I cloned the repo with git clone --single-branch 11-external....
poetry install
poetry shell
pip3 install click
pip3 install yapps
litexcnc install_driver
The custom and other HAL files are empty.
It is a very basic LinuxCNC setup, just for testing.
from litex-cnc.
@ozzyrob: I'm completely okay with that if you want to try and help others. On the other hand, I hope you're in a job again soon.
from litex-cnc.
Cheers Pete
Yeah sorry mate, you're right, was first thing in the morning down under. Just still need the json file so I can build the firmware and flash the fpga.
from litex-cnc.
Sorry.
I am not at home till end of month.
I used the example json.
Just added the index pins to encoders. 3 pwm. 4 stepgens.
from litex-cnc.
I'm going to take a stab at this and suggest it could be a latency issue. Have you tried isolcpus=1 when booting?
The reason I think this is that I tried a simple sample config that uses the hal_speaker component and, as my kernel was non-real-time, I was getting these errors:
waiting for s.joints<0>, s.kinematics_type<0>
waiting for s.joints<0>, s.kinematics_type<0>
waiting for s.joints<0>, s.kinematics_type<0>
waiting for s.joints<0>, s.kinematics_type<0>
waiting for s.joints<0>, s.kinematics_type<0>
waiting for s.joints<0>, s.kinematics_type<0>
USRMOT: ERROR: command timeout
from litex-cnc.
It could be. But as I said, I did not change any other thing.
I can start and use LinuxCNC with LiteX with no problems when I switch back to the "old" drivers
from litex-cnc.
Can confirm the main branch is OK. I do get "Apply time exceeded limits" with 2.9, but at least LinuxCNC loads. Haven't tested 2.8 yet.
With 11-add-external on Buster with Linuxcnc 2.8 & Bookworm with Linuxcnc 2.9 sometimes I'm seeing errors as mentioned by OP, sometimes the system is just "freezing".
from litex-cnc.
When I am back at home I will go through the components step by step to find which one causes my error.
from litex-cnc.
No probs mate, thanks for all your hard work.
Any thoughts on "Apply Time exceeds limits" on the main branch ?
from litex-cnc.
The apply time exceeding limits can be due to:
- the first loop taking too long, because every pre-computation is done in that single loop. Also, in my experience, the ping time of the card is quite long. If the message appears right at start-up, you can ignore it. This behavior can be improved by doing more during the setup phase, and it is independent of the version of LinuxCNC;
- the network lagging from time to time. The network calls (especially reading from the card) are blocking.
I did not experience any lock-up on my machines, though. I'm running on a RPi 4 using isolcpus=1,2,3 for best latency results, although isolcpus=2,3 also yields good results. Generally it is recommended (no source, this is off the top of my head) to isolate a pair.
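To check whether an isolcpus setting was actually picked up by the running kernel, the boot command line can be inspected. A minimal Python sketch, assuming a Linux system; the function name isolated_cpus is made up for this example and is not part of LitexCNC or LinuxCNC:

```python
import re


def isolated_cpus(cmdline: str) -> list:
    """Return the sorted CPU numbers listed in an isolcpus= kernel argument."""
    match = re.search(r"(?:^|\s)isolcpus=(\S+)", cmdline)
    if match is None:
        return []
    cpus = set()
    for token in match.group(1).split(","):
        if re.fullmatch(r"\d+", token):
            # A single CPU number, for example "2".
            cpus.add(int(token))
        elif re.fullmatch(r"\d+-\d+", token):
            # An inclusive range, for example "1-3".
            first, last = (int(part) for part in token.split("-"))
            cpus.update(range(first, last + 1))
        # Anything else (for example a flag such as "domain") is skipped.
    return sorted(cpus)
```

On the target machine this could be called as isolated_cpus(open("/proc/cmdline").read()); an empty list would mean no cores are isolated.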
My response time might be higher due to holiday. My apologies for any inconvenience.
from litex-cnc.
No rush mate, it's all sweet & cruisy Down Under, no need to apologise. Enjoy your holiday.
To be fair I'm testing on Lenovo T530 with a Dual core i5 (it's my favourite, but shhh don't tell my other computers ;) ). The real machine will be a PC with a quad core i5. I just use the laptop as I sit in the living room rather than isolated somewhere else.
So I'll do a bit more testing Tomorrow.
from litex-cnc.
I also have plenty of time for this.
My lathe doesn't need the speeds at which the overflow can occur, and I don't use a RIP install.
I am fine with the version that works for me.
from litex-cnc.
Back from my holiday and sadly: I can reproduce this bug. Starting my machine leads to freezing of LinuxCNC. Seems that some processes are not running any more.
Edit: I have to investigate it further. When disabling litexcnc completely, it still fails to start.
from litex-cnc.
In the LinuxCNC forum someone mentioned that this happens when he adds stepgens to the JSON config.
from litex-cnc.
At this moment I'm thinking my installation is completely broken:
- removed all files from LitexCNC
- created a new machine config from the examples that come with LinuxCNC
- et voila: the real time components fail
Reinstalled LinuxCNC to no avail. Today I'm going to format the image to see whether I can create a working config again....
from litex-cnc.
You think it's corrupting something in the actual Linuxcnc installation ?
from litex-cnc.
https://www.dropbox.com/s/h1v0j1btdzi96ia/VID_20230804_093236.mp4?dl=0
Hey.
I think it is time to show the world that this project is not only bugs and feature requests.
This is with 11-external before the last update
from litex-cnc.
So 11-external was working ?
Looking good.
Have you got any more detail of the X axis ?
I'm trying to come up with something simple for my Myford ML/S7 Frankenstein.
from litex-cnc.
https://www.dropbox.com/s/e3lnlcgugkhil5h/IMG_20230804_094837.jpg?dl=0
https://www.dropbox.com/s/4x0orttlawtoq6c/IMG_20230804_094831.jpg?dl=0
https://www.dropbox.com/s/m911beann6je7yi/IMG_20230804_094825.jpg?dl=0
It is a JMC iHSV57 180 W servo with a 1204 ball screw and a self-made mounting plate.
from litex-cnc.
Cheers, nice solution.
from litex-cnc.
OK rolled back 11-add-external-extensions-to-litexcnc to:
commit ba57141686940a113f1d2394c17f069025eb3770
Author: Peter van Tol <[email protected]>
Date: Wed Jul 12 10:33:57 2023 +0200
pip vs. pip3
Was able to get the config running, but on a quad core Intel Core i5-3470 with 3 cores isolated I was still getting:
Litexcnc: Apply time exceeded limits.litexcnc: Apply time exceeded limits.
Apply time exceeding limits (too long): 69026366277, 69026365867, 69026405879
That was with only LinuxCNC being run from a terminal.
What should the watchdog be set at ?
from litex-cnc.
@OJthe123 : how would you like the idea of creating a "show your machine" page in the documentation?
@ozzyrob : this is one of the problems which has been resolved between your rollback version and the current version. Because I'm experiencing the same problem (reinstalling atm), I hope the problem is fixed. Otherwise I'll make a patch for you.
from litex-cnc.
@Peter-van-Tol : Sure, no problem. What do you need?
I also have the "Apply time.." info. But I cannot say that has any impact on my machine...
But what I noticed, is that the calculated(?) encoder.velocity is 25% higher than the actual servo speed.
I scale it down with the position-scale to fix it at the moment.
Could be calculation error, or really the servo speed is off. I have no other possibility to measure it
from litex-cnc.
Here are my machine files...
semse.7z.zip
from litex-cnc.
Spent today reinstalling LinuxCNC on my Raspberry Pi, but to no avail. Something has apparently changed and prevents the real-time components from starting (i.e. emcTrajInit failed)...
edit: installed the following versions:
- https://forum.linuxcnc.org/9-installing-linuxcnc/47841-installing-linuxcnc-2-8-4-on-raspbian-10-buster-tested-on-raspberry-pi-3b-pi?start=10
- https://forum.linuxcnc.org/9-installing-linuxcnc/39779-rpi4-raspbian-64-bit-linuxcnc?start=180#264347
Both give the same error on my RPi, how is that possible?
from litex-cnc.
Any luck yet ?
Sounds like a real PITA. :(
from litex-cnc.
Just one more success story:
G76 Threading cycle also works
from litex-cnc.
@OJthe123 : Nice
In the meantime, my RPi is showing signs of life again: no errors when starting a simple configuration. Now going to install LitexCNC and rebuild... The error was in the end PEBKAC (configuration error).
from litex-cnc.
Was just going to mention that the code in litexcnc.c is the same in the commit ba57141. And that works apart from the apply time messages.
from litex-cnc.
Took some effort, but have found the error. If you pull the latest version of the branch #11 your LinuxCNC should start up again.
from litex-cnc.
Ok gave it a go, tried with the OP's configs.......damned if I could get rid of the following error. Latency is good, I can run a config using steppers with a 25us base thread on this machine. Ping times are good. Tried isolating various cores (4 core i5)
But after that whinge it does start up, just can't jog.
from litex-cnc.
The following error should be gone with the latest commit. There was a difference between Python (firmware) and C (driver) in determining the pick-off. For slow movements this could be compensated by the PID or pos2vel. However, for faster speeds the difference became too big.
The current commit has been tested on my EMCO5, which shows no following error when trying 1500 mm/min whilst using pos2vel as translation between position and velocity.
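Why a pick-off mismatch only hurts at higher speeds can be illustrated with a small model. The Python sketch below assumes the step generator behaves like a phase accumulator: on every FPGA clock tick an increment is added to a counter, and a step is emitted each time the bit at the pick-off position overflows. The function names, the 40 MHz clock and the pick-off values are assumptions for illustration only and are not taken from the LitexCNC sources.

```python
def increment_for(steps_per_second: float, clock_hz: float, pick_off: int) -> int:
    """Driver side: accumulator increment per clock tick for a requested step rate."""
    return round(steps_per_second * (1 << pick_off) / clock_hz)


def step_rate(increment: int, clock_hz: float, pick_off: int) -> float:
    """Firmware side: step rate that results from a given increment."""
    return increment * clock_hz / (1 << pick_off)


CLOCK_HZ = 40_000_000   # assumed FPGA clock frequency
requested = 10_000.0    # requested steps per second

# Both sides agree on the pick-off: the output matches the request.
matched = step_rate(increment_for(requested, CLOCK_HZ, 32), CLOCK_HZ, 32)

# The driver assumes pick-off 32 while the firmware uses 31:
# the output is off by a factor of two.
mismatched = step_rate(increment_for(requested, CLOCK_HZ, 32), CLOCK_HZ, 31)

print(f"matched: {matched:.1f} steps/s, mismatched: {mismatched:.1f} steps/s")
```

In this model the mismatch is a constant factor on the velocity, so the absolute error grows with speed. That would fit the observation that a feedback loop can hide it during slow moves but not during fast ones.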
from litex-cnc.
Continuing from #29 ...
With the config and hal-files from @OJthe123 I can now re-produce the problem. The difference between my setup and his is mainly the scale. Now that has been sorted out, I can start debugging. Just want to close this issue in a proper manner...
I have suggested that the problem might be with using the pin position-feedback instead of position-prediction. At this moment this seems to resolve the problem in my set-up, at least as far as the following error is concerned. However, with both PID and pos2vel the machine starts to oscillate when the jogging stops.
My observations are:
- the oscillations are proportional to the scale: the larger the scale, the larger the oscillations;
- when using PID, these might be lessened by applying proper tuning;
- looking for a solution in pos2vel as well.
EDIT
Not committed yet, but I got a rock-solid version of pos2vel working at this moment. It is more based on the way LinuxCNC stepgen behaves. Have the feeling that LinuxCNC is tuned on its own stepgen. Upcoming changes will be:
- the stepgen module will not emit the pins period-s and period-s-recip anymore (replaced by internal parameters). This makes the stepgen module less heavy on the processor, as some floating point arithmetic will be removed;
- no more need for higher acceleration for STEPGEN and MOTMOD;
- pos2vel will no longer be a separate module, as it will be included in stepgen;
- stepgen will have an additional parameter velocity-mode. The default is position control (velocity-mode=false), which drives the motor to a commanded position, subject to acceleration and velocity limits. Velocity control (velocity-mode=true) drives the motor at a commanded speed, again subject to accel and velocity limits. NOTE: users who want to continue to use their tuned PID-setup must add setp [NAME].stepgen.##.velocity-mode 1 to their hal-files.
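What "subject to accel and velocity limits" means for velocity control can be sketched in a few lines of Python. This is a generic per-cycle clamp written for illustration, not the code from litexcnc_stepgen.c; the function name limit_velocity and the numbers in the usage lines are assumptions.

```python
def limit_velocity(current, commanded, max_velocity, max_acceleration, period_s):
    """Return the velocity for the next cycle, clamped to both limits."""
    # Clamp the request to the velocity limit first.
    target = max(-max_velocity, min(max_velocity, commanded))
    # Per cycle the velocity may change by at most max_acceleration * period.
    max_step = max_acceleration * period_s
    delta = max(-max_step, min(max_step, target - current))
    return current + delta


# Assumed values: 1 ms servo period, 25 mm/s and 500 mm/s^2 limits.
velocity = 0.0
for _ in range(5):
    velocity = limit_velocity(velocity, 100.0, 25.0, 500.0, 0.001)
    print(velocity)
```

A command far above the limit therefore ramps up in equal increments per servo period and settles at the velocity limit instead of jumping there.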
EDIT 2
Finished the re-write of litexcnc_stepgen.c. Tonight I will test this modification (it is a really big clean-up with loads of enhancements). It does compile, but during the day I have no way to test it on my equipment.
from litex-cnc.
Just committed the changes:
- pos2vel has been removed;
- stepgen now has a position mode (you can select the pin in HAL);
- the documentation has been updated accordingly.
As for advice on position vs PID: if your setup works, there is no need to change. However, the readability and maintainability of the HAL-file will improve when using the position control. The code below is the minimal example for a single axis, which is roughly 50% smaller than a solution with pid or pos2vel.
# STEPGEN - X-AXIS
########################################################################
# - Setup of timings
setp [LITEXCNC](NAME).stepgen.00.position-scale [JOINT_0]SCALE
setp [LITEXCNC](NAME).stepgen.00.steplen 5000
setp [LITEXCNC](NAME).stepgen.00.stepspace 5000
setp [LITEXCNC](NAME).stepgen.00.dir-hold-time 10000
setp [LITEXCNC](NAME).stepgen.00.dir-setup-time 10000
setp [LITEXCNC](NAME).stepgen.00.max-velocity [JOINT_0]MAX_VELOCITY
setp [LITEXCNC](NAME).stepgen.00.max-acceleration [JOINT_0]STEPGEN_MAXACCEL
# setp [LITEXCNC](NAME).stepgen.00.debug 1
# - Connect velocity command
net xpos_cmd joint.0.motor-pos-cmd => [LITEXCNC](NAME).stepgen.00.position-cmd
net xpos_cmd joint.0.motor-pos-fb <= [LITEXCNC](NAME).stepgen.00.position-prediction
# - enable the drive
net xenable joint.0.amp-enable-out => [LITEXCNC](NAME).stepgen.00.enable
I would really appreciate if you would test this latest version, so this issue can be closed as resolved.
from litex-cnc.
You rock dude!
Changed the drivers and built new firmware.
Tested my setup with 3000 mm/min. No errors. Maybe it could go faster, but I did not want to kill my machine in case of a bug in the firmware or driver.
EDIT:
Just for those who will copy & paste: there are two typos in the example above. Corrected version below.
# STEPGEN - X-AXIS
########################################################################
# - Setup of timings
setp [LITEXCNC](NAME).stepgen.00.position-scale [JOINT_0]STEP_SCALE # typo
setp [LITEXCNC](NAME).stepgen.00.steplen 5000
setp [LITEXCNC](NAME).stepgen.00.stepspace 5000
setp [LITEXCNC](NAME).stepgen.00.dir-hold-time 10000
setp [LITEXCNC](NAME).stepgen.00.dir-setup-time 10000
setp [LITEXCNC](NAME).stepgen.00.max-velocity [JOINT_0]MAX_VELOCITY
setp [LITEXCNC](NAME).stepgen.00.max-acceleration [JOINT_0]STEPGEN_MAXACCEL
# setp [LITEXCNC](NAME).stepgen.00.debug 1
# - Connect velocity command
net xpos_cmd joint.0.motor-pos-cmd => [LITEXCNC](NAME).stepgen.00.position-cmd
net xpos_fb joint.0.motor-pos-fb <= [LITEXCNC](NAME).stepgen.00.position-prediction # typo
# - enable the drive
net xenable joint.0.amp-enable-out => [LITEXCNC](NAME).stepgen.00.enable
from litex-cnc.
Sorry guys for being a bit quiet, I've been playing with some 7c81 firmware on a Spartan 6 dev board. When I get the chance I'll set up my machine that I use for testing.
from litex-cnc.
Really happy, Pete. I owe you at least a beer.
Had a play around: no following errors, running the sample code passed, jogging via keyboard passed, MDI passed. The only issue was on shutdown. Starting up again runs fine, but it still quits with the same message:
Shutting down and cleaning up LinuxCNC...
task: 48698 cycles, min=0.000007, max=0.097561, avg=0.009917, 0 latency excursions (> 10x expected cycle time of 0.010000s)
litexcnc/Semse: Watchdog timeout not set. Using default value 0 ns (3 times period).
litexcnc: LitexCNC etherbone driver unloaded
rtapi_app: caught signal 11 - dumping core
:0: exit value: 255
:0: rmmod failed, returned -1
Waited 3 seconds for master. giving up.
Note: Using POSIX realtime
motmod: not loaded
:0: exit value: 255
:0: rmmod failed, returned -1
Note: Using POSIX realtime
trivkins: not loaded
:0: exit value: 255
:0: rmmod failed, returned -1
Note: Using POSIX realtime
homemod: not loaded
:0: exit value: 255
:0: rmmod failed, returned -1
Note: Using POSIX realtime
tpmod: not loaded
:0: exit value: 255
:0: rmmod failed, returned -1
:0: unloadrt failed
Note: Using POSIX realtime
from litex-cnc.
Can confirm...
When I run from a terminal I see the same output, but I still see no effect on the machine:
Shutting down and cleaning up LinuxCNC...
Running HAL shutdown script
task: 603 cycles, min=0.000041, max=0.012258, avg=0.009716, 0 latency excursions (> 10x expected cycle time of 0.010000s)
mb2hal quit_signal DEBUG: signal [15] received
mb2hal quit_cleanup DEBUG: started
mb2hal quit_cleanup DEBUG: unloading HAL module [16] ret[0]
mb2hal quit_cleanup DEBUG: done OK
mb2hal main OK: going to exit!
litexcnc: LitexCNC etherbone driver unloaded
rtapi_app: caught signal 11 - dumping core
free(): invalid pointer
<commandline>:0: exit value: 255
<commandline>:0: rmmod failed, returned -1
Waited 3 seconds for master. giving up.
Note: Using POSIX realtime
motmod: not loaded
<commandline>:0: exit value: 255
<commandline>:0: rmmod failed, returned -1
Note: Using POSIX realtime
trivkins: not loaded
<commandline>:0: exit value: 255
<commandline>:0: rmmod failed, returned -1
<commandline>:0: unloadrt failed
Note: Using POSIX realtime
from litex-cnc.
This error is due to an old loadrt statement in your hal-files. You have now:
loadrt litexcnc
loadrt litexcnc_eth connection_string="192.168.178.150"
This should be combined to the following single statement:
loadrt litexcnc connection_string="eth:192.168.178.150"
Why does this error emerge at this moment? Because the FPGA is reset to its safe state when LinuxCNC is unloaded. This means that litexcnc will send a last message to the FPGA. When the driver is loaded using two separate statements, the etherbone driver has already been unloaded (and its memory freed) by then. Writing to a closed device without allocated memory then leads to a core dump.
I will close this issue, as the original problem has been solved. In another issue I will unpublish the litexcnc_eth component, so it cannot be inadvertently used as a stand-alone component.
from litex-cnc.
@ozzyrob : for the beer, that would be a VB then, please...
But to be honest: the beer would be on me. Thank you for your support, testing and time spent to make this possible and closing this issue.
from litex-cnc.