Consistently drifting towards negative direction / losing steps on positive direction? #100

Open
mrrossdude opened this issue Dec 2, 2024 · 12 comments

@mrrossdude

mrrossdude commented Dec 2, 2024

I upgraded my Makerdreams evo one with a PicoCNC last month, but instantly had issues where the Z carriage was hitting the limit switch on G28 after finishing a job. It would slowly drift upwards during programs and cut higher than it should, and I was getting misaligned cuts on both X and Y at the same time. I gave up as I needed to complete some jobs, so I swapped my original 8-bit board back in and have been running it perfectly fine since. I run it daily cutting aluminium and brass at 4000mm/min with 1000mm/sec^2 accel and have done this for over a year, with zero issues of missed steps or inaccuracy on the original 8-bit controller board.

I recently tried to revisit this, and decided to purchase another PicoCNC and PicoW in case it was a hardware issue. Even with the new board and Pi, the exact same issue persists. If I swap the original 8-bit controller back in, the problem goes away, so it's definitely not an issue with the machine itself, and shouldn't be a problem with the wiring or drivers etc. I'm running DM556 drivers with a separate 48V PSU.

I've set up a test where the spindle carriage moves back and forth, and measured with a dial indicator to try and track the actual distances moved. I've done this on X, Y, and Z individually to eliminate any possibility of stepper wire "crosstalk", in case that's a thing. To keep things the same for the stepper/driver logic signals, I flipped the connector on the Z carriage and unticked the invert direction signal for this axis in the GRBL settings (normally it's the only axis that is flipped). The effect this has had is that instead of drifting upwards, it now drifts down, but I wanted to keep things the same just in case.

I made 100 moves back and forth of 10mm with a half-second dwell between direction changes. I found that Z ended up 0.6mm lower and X ended up a full 2mm over in the negative direction. Strangely, Y was basically perfect, but I definitely noticed it drift before with cuts. It just didn't manifest during testing, so class that as possibly intermittent for now.

Z seemed to be intermittent during the test, where it would hold steady and repeat exactly 10mm back and forth for a few moves, and then drift 0.02mm towards negative each time, then hold steady, repeat etc.
X was the most obvious to track, as each time it would move exactly 10mm negative, and then it would move 9.98mm positive. It was a clear 0.02mm loss with every positive move, which explains the full 2mm change over 100 moves.
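The X result can be sanity-checked with a few lines of arithmetic. This is only a sketch of the measured behaviour (10mm negative, 9.98mm positive, 100 times over); the function name is made up for illustration, and exact fractions are used so rounding doesn't cloud the result:

```python
from fractions import Fraction

def simulate_drift(cycles, neg_move_mm, pos_move_mm):
    """Net position after `cycles` pairs of (negative move, positive move).

    Moves are passed as decimal strings so Fraction keeps them exact.
    """
    position = Fraction(0)
    for _ in range(cycles):
        position -= Fraction(neg_move_mm)  # move in the negative direction
        position += Fraction(pos_move_mm)  # return move, which comes up short
    return position

# Measured on X: a full 10mm negative, but only 9.98mm positive.
x_drift = simulate_drift(100, "10", "9.98")
print(float(x_drift))  # net X offset in mm after 100 cycles
```

A constant 0.02mm shortfall per cycle adds up to the 2mm offset seen on the dial indicator, so the loss looks systematic per reversal rather than random.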

Here's a video showing the testing and results:

https://www.youtube.com/watch?v=1Penjpo-bwU

I've tried measuring the signals going to the drivers with a multimeter, but it's not super reliable and I don't have a scope. I can see what seems to be a 0V-5V changing signal for the DIR/PUL, but occasionally it might show something like 3V or 2V. This might be my meter being too slow. I'm waiting on a cheap logic analyser in case that is able to show me something clearer.

  • I've tried swapping drivers, no change.
  • Triple checked all wiring, swapped stepper and driver cables round etc, no change
  • The issue is the same over both USB and ethernet.
  • Changing senders has no effect.
  • Putting the acceleration and stepper speeds super low makes no difference.
  • Tried different pulse widths (20us/10us/5us/3us etc), no change
  • Tried active low/high signals for all relevant stepper related options, no change

If I change microstepping, the problem persists but reduces in a linear fashion. E.g. I'm currently losing 0.02mm per move at 800 pulses/rev; if I up this to 1600, my distance loss is halved. It's still not acceptable for me though, as I run a lot of patterned jobs with lots of repeated lifts and by the end I can still be millimetres out.
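The linear scaling is what you'd expect if a fixed number of pulses goes missing at each reversal. A rough sketch, assuming 100 steps/mm at 800 pulses/rev (the $100 value posted later in this thread), which implies 8mm of travel per motor revolution; both function names are hypothetical:

```python
def lost_pulses(loss_mm, steps_per_mm):
    """Convert a measured distance loss into a whole number of step pulses."""
    return round(loss_mm * steps_per_mm)

def loss_per_reversal_mm(pulses_lost, pulses_per_rev, travel_per_rev_mm):
    """Distance lost per reversal if `pulses_lost` pulses are dropped each time."""
    return pulses_lost * travel_per_rev_mm / pulses_per_rev

# Assumption: 800 pulses/rev at 100 steps/mm means 8mm of travel per revolution.
TRAVEL_PER_REV_MM = 800 / 100

pulses = lost_pulses(0.02, 100)  # 0.02mm at 100 steps/mm
for pulses_per_rev in (800, 1600, 3200):
    loss = loss_per_reversal_mm(pulses, pulses_per_rev, TRAVEL_PER_REV_MM)
    print(pulses_per_rev, "pulses/rev ->", loss, "mm lost per reversal")
```

Under that assumption the 0.02mm works out to two step pulses per reversal, and doubling the microstepping halves the distance those two pulses represent.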

I checked the documentation for the DM556 drivers I'm using to make sure that the signals are in line with what they expect. They mention that "ENA must be ahead of DIR by at least 5us", "DIR must be ahead of PUL effective edge by 5us" and "low level width not less than 2.5us", but apart from the pulse width change in GRBL I can't see how to affect the other two requirements.
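For reference, the two timing figures quoted above that relate to DIR and PUL can be written down as a simple check against measured values. This is just a sketch: the class and function names are invented here, and only the limits quoted from the DM556 documentation are encoded:

```python
from dataclasses import dataclass

# Limits as quoted from the DM556 documentation in this post.
MIN_DIR_SETUP_US = 5.0  # DIR must lead the effective PUL edge by at least this
MIN_PUL_LOW_US = 2.5    # minimum low-level width of the PUL signal

@dataclass
class MeasuredTiming:
    dir_setup_us: float  # measured time from DIR change to the next PUL edge
    pul_low_us: float    # measured low-level width of PUL

def timing_violations(timing):
    """Return a list of human-readable problems; empty if both limits are met."""
    problems = []
    if timing.dir_setup_us < MIN_DIR_SETUP_US:
        problems.append(f"DIR setup {timing.dir_setup_us}us < {MIN_DIR_SETUP_US}us")
    if timing.pul_low_us < MIN_PUL_LOW_US:
        problems.append(f"PUL low width {timing.pul_low_us}us < {MIN_PUL_LOW_US}us")
    return problems

print(timing_violations(MeasuredTiming(dir_setup_us=1.0, pul_low_us=10.0)))
```

The measured values would have to come from a scope or logic analyser; a multimeter can't resolve microsecond timing.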

If anyone can chime in with some tips or help, I would appreciate it massively.

Tom

@terjeio
Contributor

terjeio commented Dec 2, 2024

Post your settings please ($$ output) so I can check. Your test gcode may also be of help.
I am currently refactoring the step generation code in order to better support step injection and the new RP2350 processor. As a part of that I use a machine simulator that counts the stepper pulses, and I have not seen any missed steps yet. My last run was with 1250 steps/mm and 400 mm/sec^2 accel - via telnet over ethernet.
How are you connecting to the controller?

@mrrossdude
Author

GRBL settings:

$0 = 10.0 (Step pulse time, microseconds)
$1 = 255 (Step idle delay, milliseconds)
$2 = 0 (Step pulse invert, mask)
$3 = 0 (Step direction invert, mask)
$4 = 7 (Invert step enable pin, boolean)
$5 = 0 (Invert limit pins, boolean)
$6 = 1 (Invert probe pin, boolean)
$9 = 1
$10 = 33 (Status report options, mask)
$11 = 0.010 (Junction deviation, millimeters)
$12 = 0.002 (Arc tolerance, millimeters)
$13 = 0 (Report in inches, boolean)
$14 = 70
$15 = 3
$16 = 4
$17 = 0
$18 = 0
$19 = 0
$20 = 1 (Soft limits enable, boolean)
$21 = 1 (Hard limits enable, boolean)
$22 = 13 (Homing cycle enable, boolean)
$23 = 1 (Homing direction invert, mask)
$24 = 20.0 (Homing locate feed rate, mm/min)
$25 = 1000.0 (Homing search seek rate, mm/min)
$26 = 250 (Homing switch debounce delay, milliseconds)
$27 = 0.500 (Homing switch pull-off distance, millimeters)
$28 = 0.100
$29 = 0.0
$30 = 24000.000 (Maximum spindle speed, RPM)
$31 = 0.000 (Minimum spindle speed, RPM)
$32 = 0 (Laser-mode enable, boolean)
$33 = 5000.0
$34 = 0.0
$35 = 0.0
$36 = 100.0
$37 = 7
$39 = 1
$40 = 0
$43 = 1
$44 = 4
$45 = 3
$46 = 0
$62 = 0
$63 = 3
$64 = 0
$65 = 0
$70 = 11
$100 = 100.00000 (X-axis travel resolution, step/mm)
$101 = 100.00000 (Y-axis travel resolution, step/mm)
$102 = 100.00000 (Z-axis travel resolution, step/mm)
$110 = 8000.000 (X-axis maximum rate, mm/min)
$111 = 8000.000 (Y-axis maximum rate, mm/min)
$112 = 3500.000 (Z-axis maximum rate, mm/min)
$120 = 1000.000 (X-axis acceleration, mm/sec^2)
$121 = 1000.000 (Y-axis acceleration, mm/sec^2)
$122 = 500.000 (Z-axis acceleration, mm/sec^2)
$130 = 367.800 (X-axis maximum travel, millimeters)
$131 = 218.800 (Y-axis maximum travel, millimeters)
$132 = 112.000 (Z-axis maximum travel, millimeters)
$300 = grblHAL
$301 = 1
$302 = 192.168.5.1
$303 = 192.168.5.1
$304 = 255.255.255.0
$305 = 23
$307 = 80
$308 = 21
$341 = 0
$342 = 30.0
$343 = 25.0
$344 = 200.0
$345 = 200.0
$346 = 1
$370 = 2
$372 = 0
$384 = 0
$394 = 4.0
$398 = 100
$481 = 0
$484 = 1
$486 = 0
$535=
$650 = 0
$673 = 1.0

Testing G-code (Z in this example) is just a simple repeated movement:
g1 g91 z10 f3500
G04 P0.4
g91 z-10
G04 P0.4
z10
G04 P0.4
g91 z-10
G04 P0.4
(the z10, G04 P0.4, g91 z-10, G04 P0.4 pattern repeats unchanged for the rest of the program, which ends on a z10 move and dwell)
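A back-and-forth program like this doesn't need to be written out by hand. Here is a minimal generator sketch; the function name and defaults are illustrative, and the trailing G90 (back to absolute mode) is an addition that is not in the test file above:

```python
def make_reversal_test(axis="Z", distance_mm=10, feed_mm_min=3500,
                       dwell_s=0.4, cycles=100):
    """Build a relative-mode back-and-forth test as a list of G-code lines."""
    lines = ["G91"]  # relative positioning for all following moves
    for _ in range(cycles):
        lines.append(f"G1 {axis}{distance_mm} F{feed_mm_min}")   # positive move
        lines.append(f"G4 P{dwell_s}")                            # dwell, seconds
        lines.append(f"G1 {axis}{-distance_mm} F{feed_mm_min}")  # negative move
        lines.append(f"G4 P{dwell_s}")
    lines.append("G90")  # assumption: restore absolute mode when done
    return lines

program = make_reversal_test(cycles=3)
print("\n".join(program))
```

Because every cycle ends where it started, any offset left on the dial indicator at the end is accumulated error.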

The only thing in the $$ output that might look wrong is that $2 is currently set to active high, but switching to active low doesn't change anything either.

@mrrossdude
Author

Oh sorry forgot to mention, currently connected via Ethernet within ioSender, but problem persists on USB in both ioSender and UGS.

@terjeio
Contributor

terjeio commented Dec 3, 2024

With the refactored code the pulse count is correct (I have not rechecked the current github code yet). So somehow steps are lost.

They mention that "ENA must be ahead of DIR by at least 5us", "DIR must be ahead of PUL effective edge by 5us" and "low level width not less than 2.5us", but apart from the pulse width change in GRBL I can't see how to affect the other two requirements.

For "ENA must be ahead of DIR by at least 5us" set $1 (step idle delay) to 255 as a test. Normally it should not be needed since there is a hardcoded 2ms delay in the driver. However, your test code contains a lot of G4 P0.4 dwells (0.4 s each) that may cause the machine to lose position if the drivers are disabled during them, more likely in the Z direction due to gravity pulling the spindle down. FYI $37 (steppers to keep enabled) can be used to set, per axis, which drivers should be kept enabled regardless of the $1 setting. IMO at least the Z axis should be if gravity tends to pull it down - it does so in my machine.
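Per-axis settings such as $37 are bitmasks, with bit 0 for X, bit 1 for Y and bit 2 for Z. A small sketch for decoding a value, limited to three axes for simplicity (the helper name is made up):

```python
AXES = "XYZ"  # bit 0 = X, bit 1 = Y, bit 2 = Z

def axes_in_mask(mask):
    """Return the axis letters whose bits are set in a per-axis mask value."""
    return [axis for bit, axis in enumerate(AXES) if mask & (1 << bit)]

print(axes_in_mask(7))  # the $37 value in the settings posted above
print(axes_in_mask(4))  # Z only
```

So a value of 4 would keep only the Z driver enabled, while 7 keeps X, Y and Z enabled.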

For "DIR must be ahead of PUL effective edge by 5us" try setting $29 (Pulse delay) to at least 5.

To be aware of: the sum of $0 and $29 plus minimum time between pulses as required by the drivers will limit the max step frequency and thus the max feed rate. The processor interrupt latency may also affect the max limit, less so for the RP2040 driver since it needs only one, instead of the usual two, interrupts per step pulse.
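As a rough illustration of that limit, here is a simplified model that ignores interrupt latency entirely. The 2.5us minimum gap is taken from the DM556 "low level width" figure quoted earlier, and the function names are hypothetical:

```python
def max_step_rate_hz(step_pulse_us, dir_delay_us, min_gap_us):
    """Upper bound on step frequency if each step needs all three intervals."""
    period_us = step_pulse_us + dir_delay_us + min_gap_us
    return 1_000_000 / period_us

def max_feed_mm_min(step_rate_hz, steps_per_mm):
    """Feed rate reachable at a given step frequency."""
    return step_rate_hz / steps_per_mm * 60

# $0 = 10us pulse, $29 = 5us delay, 2.5us minimum gap (assumed from the DM556 quote)
rate = max_step_rate_hz(10, 5, 2.5)
print(f"{rate:.0f} steps/s, {max_feed_mm_min(rate, 100):.0f} mm/min at 100 steps/mm")
```

With those numbers the bound is around 57 kHz, which at 100 steps/mm is far above the 8000 mm/min maximum rate in the posted settings, so $29 can be raised a fair bit for testing without capping the feed rate.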

@mrrossdude
Author

mrrossdude commented Dec 3, 2024

$1 is at 255 already, and $37 is already set to keep all steppers enabled.

I've tried changing $29, trying 1us increments all the way up to 10us, which hasn't fixed it. (It also gives an error when trying to set higher than 10us).

I've set up a new test where I set the feedrate to 1mm/min, and I am moving 0.1mm back and forth manually against the dial indicator on the X axis. This allows me to see the ten 0.01mm discrete steps being made over 6 seconds. I am moving 0.1mm negative, then 0.1mm positive. There's a definite problem with direction change, and it's happening every single time the direction goes from negative to positive. Here's a quick video showing the results:

https://streamable.com/qnfv6z?src=player-page-share
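The numbers in this slow test follow directly from the settings. A quick sketch, assuming the 100 steps/mm from $100 (the helper name is illustrative):

```python
def slow_move_profile(distance_mm, feed_mm_min, steps_per_mm):
    """Return (step count, move duration in s, seconds per step) for one move."""
    steps = round(distance_mm * steps_per_mm)
    duration_s = distance_mm / feed_mm_min * 60
    return steps, duration_s, duration_s / steps

steps, duration_s, interval_s = slow_move_profile(0.1, 1, 100)
print(steps, "steps over", duration_s, "s, one step every", interval_s, "s")
```

At roughly 0.6 s per step, each individual 0.01mm step can be watched on the dial indicator, so a missing or reversed step right after a direction change is easy to spot.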

I connected up the logic analyser, and the signals look correct to me. There's a clear 9us gap between the DIR change and the pulse signals, so I am very confused:

Both movements:
[Logic analyser capture: 0.1mm left then right, $29 = 10us]

Closeup of first movement, pulse comes after DIR change:
[Logic analyser capture: 0.1mm left, $29 = 10us, closeup]

Closeup of second movement, pulse comes after DIR change:
[Logic analyser capture: 0.1mm right, $29 = 10us, closeup]
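If the analyser software can export samples, the DIR-to-pulse gap can also be measured in code rather than by eye. The sketch below assumes a time-ordered list of (time in us, DIR level, STEP level) tuples and an active-high step signal; that sample format is made up for illustration and is not any particular tool's export format:

```python
def min_dir_setup_us(samples):
    """Smallest gap between a DIR change and the first STEP rising edge after it.

    `samples` is a time-ordered iterable of (time_us, dir_level, step_level).
    Returns None if no DIR change followed by a step pulse is found.
    """
    gaps = []
    last_dir_change = None
    prev_dir = prev_step = None
    for time_us, dir_level, step_level in samples:
        if prev_dir is not None and dir_level != prev_dir:
            last_dir_change = time_us
        rising = prev_step == 0 and step_level == 1
        if rising and last_dir_change is not None:
            gaps.append(time_us - last_dir_change)
            last_dir_change = None  # only the first pulse after a change counts
        prev_dir, prev_step = dir_level, step_level
    return min(gaps) if gaps else None

# Hand-written example: two DIR changes, each followed 9us later by a pulse.
capture = [(0, 0, 0), (10, 1, 0), (19, 1, 1), (29, 1, 0),
           (100, 0, 0), (109, 0, 1), (119, 0, 0)]
print(min_dir_setup_us(capture))
```

Running this over a long capture would show whether the gap ever drops below the driver's 5us requirement, which is hard to spot when scrolling through the traces manually.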

Others online have mentioned a similar issue affecting MACH3 machines which was fixed when they set a delay between the DIR and PULSE as I originally thought above, but as mentioned I can't get this any higher than 10us in the GRBL settings and it seems like there's already a good enough delay for the driver to cope. It's also strange that it only affects a direction change back to positive.

Any thoughts?

@terjeio
Contributor

terjeio commented Dec 3, 2024

It also gives an error when trying to set higher than 10us

The limit is hardcoded in this line, if you compile yourself you can try changing it to 15 or more.

It's also strange that it only affects a direction change back to positive.

That is likely due to the optocoupler LEDs in the drivers having different turn-on and turn-off delays.

Any thoughts?

Do you have resistors in series with the pulse and dir signals?

@mrrossdude
Author

mrrossdude commented Dec 4, 2024

Do you have resistors in series with the pulse and dir signals?
I do not.

The limit is hardcoded in this line, if you compile yourself you can try changing it to 15 or more.

I'm trying to compile in VS code but running into errors, I've cloned the repo and in my_machine.h uncommented:

#define BOARD_PICO_CNC
#define ETHERNET_ENABLE
#define WIZCHIP

then changed ADD_ETHERNET to ON in CMakeLists.txt, and changed the 10us hardcoded limit in settings.c to "20" for testing.

On compile though, I'm getting an error stating "hardware/rtc.h: No such file or directory GCC [Ln 42 Col1]"

I assume this is related to the pico sdk library, but the PicoSDK code extension is installed correctly as far as I can see.

@terjeio
Contributor

terjeio commented Dec 4, 2024

On compile though, I'm getting an error stating "hardware/rtc.h: No such file or directory GCC [Ln 42 Col1]"

Replace CMakeLists.txt with the one in this zip:

CMakeLists.zip

@mrrossdude
Author

Appreciated, managed to compile.

Unfortunately changing the hardcoded limit had no effect. I tried multiple delays up to 25us and the issue persists every time on direction change. Logic analyser still shows a definite gap between the DIR and PULSE.

When you mentioned resistors in series on the signal wires - is that something that might help? My electronics knowledge is limited so I'm unsure what you might be leading towards.

@terjeio
Contributor

terjeio commented Dec 4, 2024

Series resistors are something you have to add when driving the optos with voltages > 5V.
Too little current in the opto LED, and also in the transistor, increases the opto response time - not what you want.
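To put rough numbers on the LED current, Ohm's law is enough. The values below are purely illustrative assumptions (a 1.2V LED forward voltage and a 270 ohm resistor inside the driver input), not figures from any datasheet, and the function name is made up:

```python
def opto_led_current_ma(signal_v, led_forward_v=1.2,
                        internal_ohm=270.0, series_ohm=0.0):
    """Estimate opto LED current in mA for a given signal voltage.

    led_forward_v and internal_ohm are assumed example values only.
    """
    total_ohm = internal_ohm + series_ohm
    return (signal_v - led_forward_v) / total_ohm * 1000

for volts in (3.3, 5.0):
    print(f"{volts}V signal -> {opto_led_current_ma(volts):.1f} mA")

# A higher signal voltage needs an external series resistor to limit the current.
print(f"24V with 2k series -> {opto_led_current_ma(24.0, series_ohm=2000.0):.1f} mA")
```

The point is only the trend: a lower signal voltage or an unnecessary series resistor reduces LED current, and less current means a slower opto.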

I'll run a test with a DM420 driver to see if that misbehaves.

@terjeio
Contributor

terjeio commented Dec 5, 2024

My DM420 drivers are all trash - none works correctly (they lose steps) and two of them are very noisy. I got them from China a few years ago in a duplicate delivery so likely they are bad. I tested with 3 different controllers, all behaved the same. The controllers used were a PicoCNC, a prototype board for the new RP2350 (not Pico2) and a T41BB5X Pro (Teensy 4.1) - all from Phil.
I retested the same controllers, and one more (Arduino Due) with 3.3V signalling, using exactly the same settings with an A4988 Pololu-style stepper driver - no lost steps anymore and smooth running.

FYI my router has a Teensy 4.1 (on a T41U5XBB board from Phil) and DQ542MA stepper drivers - no problems with that combo. I guess I should get some additional drivers for bench testing...

@mrrossdude
Author

I appreciate the effort, many thanks.

Since all the drivers I've got here and swapped out for troubleshooting are the cheapest possible Chinese ones (less than £15 each), I've ordered some (hopefully) higher quality DM556T from StepperOnline just to try and rule it out a bit more. Considered getting proper Leadshine as many would recommend but they're difficult to source relatively quickly and some people seem to think the StepperOnline ones are the same product.

Either way, should be arriving tomorrow so I can hopefully get some good results and will update accordingly.
