[net] FTP not working, seems to crash or otherwise corrupt ktcp #1535

Open
skiselev opened this issue Feb 1, 2023 · 12 comments
Labels
bug Defect in the product

Comments

@skiselev

skiselev commented Feb 1, 2023

Description

  • I loaded the ELKS-0.6.0 Minix floppy image, configured an NE2000-compatible NIC, and launched ktcp
  • The NIC is detected correctly
  • I can ping the ELKS system from my Linux system
  • When trying to ftp to my Linux system, I get a "Connection timed out" message. This is followed by repeated "eth rcv oflow (0x91), keep 1" messages. At this point I can no longer ping the system
  • The network card works fine under MS-DOS, using an 8-bit patched packet driver and the mTCP suite

Configuration

How to reproduce ?

  • The problem happens most of the time. I was able to connect to the FTP server successfully once (out of ~10 attempts), and even then the connection failed when I tried to get a file.
  • Steps to reproduce the problem:
    • Boot to ELKS
    • Edit /bootopts, add "netirq=2 netport=0x260" line (this is the configuration of my NIC)
    • Reboot the system so that the new configuration takes effect
    • Run "ktcp 192.168.1.47 192.168.1.254 255.255.255.0 &"
    • On my Linux system, run "ping 192.168.1.47" and confirm that the system replies to ping
    • Run "ftp 192.168.1.7" (that is my Linux system with FTP daemon running)
    • Observe the error message

Raw data

  • ELKS boot
    ELKS-boot_on_Micro_8088
  • Error when trying to establish FTP connection
    ELKS-ftp_eth_errors
  • Error getting a file using FTP
    ELKS-ftp_get_error

Additional information
None

@skiselev skiselev added the bug Defect in the product label Feb 1, 2023
@ghaerr
Owner

ghaerr commented Feb 1, 2023

Hello @skiselev,

Is your testing image from the v0.6.0 release images, or the latest source? There have been significant changes made to the NE2K driver since v0.6.0. I have attached an image built from the most recent source in case this is not what you're using:
fd1440.img.zip

Otherwise, it may be that you need to specially indicate that you're running an 8-bit NE2K or change the RAM buffer size, even though the driver appears to be recognizing 8-bit, as indicated by your boot screenshot. Have you read the Wiki Networking article that describes the (newer) configuration procedure in /bootopts? The last number in the ne0= line specifies special flags that may not be auto-recognized by the driver init code. I am not completely up to date on where those flags may be documented; we may need to look at the variable net_flags in the NE2K driver source to determine them.

Thank you!

@toncho11
Contributor

toncho11 commented Feb 19, 2023

Hi @skiselev ,

Are you using the card you designed, https://github.com/skiselev/isa8_eth ?
If so, does it work with the latest builds of ELKS?
The latest build is available under "Actions" on the ELKS project page. You need to select one of the last "runs" generated by the latest pull requests. Each run/PR generates a lot of images that can be used for testing.

@toncho11
Contributor

@skiselev please try the new 0.7 release of ELKS.

@ConiKost
Contributor

I would like to continue this issue here. I am also using Sergey's isa8_eth on my NuXT v2.0, with ELKS 0.8.1.

Right after launching "net start", I am getting a lot of messages without doing anything.

ne0: Rcv oflow (0x91), keep 0

The number (0x91 here) varies. I also saw 0x90, 0x94 and 0xb4.
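
A note on the status values: all four reported values (0x90, 0x91, 0x94, 0xb4) have bit 0x10 set. If the number in the message is the raw DP8390/NE2000 interrupt status register (an assumption; the thread does not say which register the driver prints), bit 0x10 is the overwrite warning, which would fit the "Rcv oflow" text. A minimal Python sketch that decodes a value under that assumption (the `decode_isr` helper is made up for illustration and is not part of ELKS):

```python
# Bit names follow the DP8390/NE2000 interrupt status register (ISR) layout.
# Assumption: the hex value in the "Rcv oflow" message is the raw ISR byte.
ISR_BITS = {
    0x01: "PRX (packet received)",
    0x02: "PTX (packet transmitted)",
    0x04: "RXE (receive error)",
    0x08: "TXE (transmit error)",
    0x10: "OVW (receive ring overwrite warning)",
    0x20: "CNT (tally counter overflow)",
    0x40: "RDC (remote DMA complete)",
    0x80: "RST (reset status)",
}


def decode_isr(value):
    """Return the names of all bits set in an 8-bit status value, low bit first."""
    return [name for bit, name in sorted(ISR_BITS.items()) if value & bit]


if __name__ == "__main__":
    # The four values reported in this thread.
    for status in (0x90, 0x91, 0x94, 0xB4):
        print(f"0x{status:02x}: " + ", ".join(decode_isr(status)))
```

Read this way, the values differ only in the bits that accompany the overflow (packet received, receive error, counter overflow), not in the overflow condition itself.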

@ghaerr
Owner

ghaerr commented Dec 28, 2024

@Mellvik, do you have any thoughts on why @ConiKost might be having this issue? I am not familiar with the NE2K NIC driver enough to know why this might happen, other than perhaps an incorrectly configured NIC card, or perhaps this has been enhanced or fixed in TLVC?

@Mellvik
Contributor

Mellvik commented Dec 29, 2024

@ConiKost: It seems to me there may be a mismatch between the actual NIC buffer size and what the driver thinks the NIC has. Initially, after a restart, small packets (e.g. an incoming ping or the ftp dir command) work; the latter only because the directory is small. If the directory had many more files, it would 'crash' right away.

The boot messages don't display the flags, but I think we can narrow down the possible causes by manipulating the flags via the bootopts file. I've been using this card with ELKS and later TLVC for a long time, so it should be a good match. There are no big changes that I can remember in the current TLVC version except buffering.

I suggest setting the flags to 0x84 (verbose, 4k buffer), then leave the ELKS system alone after starting ktcp, and use ping from your linux system for debugging.

  1. Does ping (no options) run fine for hundreds of pings?
  2. Increase ping packet size (-s 256) and repeat. Continue with larger packets if it works with the current size. Use steps of 256.

The rcv oflow messages simply tell us that the driver is receiving packets but ktcp is not reading them. Ping involves very little of ktcp, which lets us focus on the lower levels for now.
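
The ping test suggested above can be scripted from the Linux side. The following is only a sketch: it builds the command lines (default size first, then `-s 256`, `-s 512`, and so on) without running them. The count of 200 pings per size, the 1472-byte upper limit (the largest payload that fits in one unfragmented frame at a 1500-byte MTU), and the example address are assumptions, not values given in this thread.

```python
import shlex


def ping_sweep_commands(address, start=256, stop=1472, step=256, count=200):
    """Build one Linux ping command per payload size, smallest first.

    The first command uses ping's default payload size; the rest add
    -s with sizes from `start` up to `stop` in increments of `step`.
    """
    commands = [["ping", "-c", str(count), address]]
    for size in range(start, stop + 1, step):
        commands.append(["ping", "-c", str(count), "-s", str(size), address])
    return commands


if __name__ == "__main__":
    # Placeholder address: substitute the address given to ktcp.
    for argv in ping_sweep_commands("192.168.1.47"):
        print(shlex.join(argv))
```

Run the printed commands one at a time and stop at the first size where replies are lost or the overflow messages start; that size is the interesting data point.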

@ConiKost
Contributor

> [..] other than perhaps an incorrectly configured NIC card, or perhaps this has been enhanced or fixed in TLVC?

I don't think I have an incorrectly configured NIC. The MAC address is identified correctly and the adapter comes up correctly. I can ping that interface without any problems.

> I suggest setting the flags to 0x84 (verbose, 4k buffer),

I've set 0x84 and it is now active. Using mtr or ping is stable. But I am now getting more error messages like: icmp: unrecognized ICMP type 5 and ne0: RX-error, status 0x21. Also, those oflow messages still show up, but not always.

@ConiKost
Contributor

Small update: it seems that after a longer time (>30-60 min), the network just stops working. Pings no longer work. A net stop and net start does not help; I need to reboot, and then the network starts working again.

@Mellvik
Contributor

Mellvik commented Dec 29, 2024

> I've set 0x84 and it is now active. Using mtr or ping is stable. But I am now getting more error messages like: icmp: unrecognized ICMP type 5 and ne0: RX-error, status 0x21. Also, those oflow messages still show up, but not always.

I agree with you, the NIC is configured correctly. The change to flags 0x84 is for testing. BTW I'm using the same card with a V20 based system with flags 0x80.

Are you saying that a regular ping (no options, which means 64-byte packets at 1-second intervals) runs fine, even for a long time? Now that the buffer on the NIC is only 4k, there will likely be some rcv overflows, but not caused by a 1-second ping.
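
To put rough numbers on the 4k buffer: the NE2000's receive ring is organized in 256-byte pages, and each stored packet occupies whole pages. The sketch below estimates how many unread pings fit before the ring overflows. Several values are assumptions, not taken from the ELKS driver: that 6 of the 16 pages are set aside for transmit, that the NIC stores a 4-byte header plus the frame including its 4-byte checksum, and that the ring can be filled completely (so the result is an upper bound). The payload is what ping's `-s` sets; the default of 56 bytes gives the 64-byte ICMP packet mentioned above.

```python
PAGE_SIZE = 256    # ring buffer page size in bytes
NIC_HEADER = 4     # per-packet header stored by the NIC (assumed)
ETH_HEADER = 14    # Ethernet header
IP_HEADER = 20     # IPv4 header without options
ICMP_HEADER = 8    # ICMP echo header
FCS = 4            # frame checksum, assumed to be stored with the frame
MIN_FRAME = 64     # minimum Ethernet frame size including the checksum


def pages_for_ping(payload):
    """Ring pages occupied by one received echo request with this payload size."""
    frame = max(ETH_HEADER + IP_HEADER + ICMP_HEADER + payload + FCS, MIN_FRAME)
    stored = NIC_HEADER + frame
    return -(-stored // PAGE_SIZE)  # ceiling division


def packets_that_fit(payload, buffer_bytes=4096, tx_pages=6):
    """Upper bound on unread pings the receive ring can hold (tx_pages is assumed)."""
    rx_pages = buffer_bytes // PAGE_SIZE - tx_pages
    return rx_pages // pages_for_ping(payload)


if __name__ == "__main__":
    for payload in (56, 256, 1024):
        print(payload, pages_for_ping(payload), packets_that_fit(payload))
```

Under these assumptions a default ping takes one page, while a `-s 1024` ping takes five, so only two of the larger packets fit at once. A one-second ping alone should therefore not fill the ring unless ktcp stops reading, but bursts of other traffic on a busy network could.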

Are there other systems on this network? You may want to look at what's going on on the network by using
tcpdump -v -i eth0 (or whatever the name of the interface is). If there is a lot of broadcast traffic (some network components are very chatty these days), that may affect your system.

To look at only the traffic to/from your ELKS system, run
tcpdump -vvv -i eth? host 192.168.0.47
It would be interesting to see what you get if you use this command, then boot your ELKS system, then open an FTP session from the ELKS system.

Also, since ktcp has a fixed IP address and your network most likely runs DHCP, there may be an address collision biting you. Maybe you could change the ELKS system IP address to 192.168.0.253 - which is less likely to be chosen by a DHCP server.

@ConiKost
Contributor

> Are you saying that a regular ping (no options, which means 64-byte packets at 1-second intervals) runs fine, even for a long time? Now that the buffer on the NIC is only 4k, there will likely be some rcv overflows, but not caused by a 1-second ping.

Yes, this is correct. But I don't think it's caused by the ping packets themselves. The messages appear more or less often, but not on every ping.

> Are there other systems on this network? You may want to look at what's going on on the network by using
> tcpdump -v -i eth0 (or whatever the name of the interface is). If there is a lot of broadcast traffic (some network components are very chatty these days), that may affect your system.

Yes, this is also correct. There are plenty of other things on the network, and also broadcast traffic.

> It would be interesting to see what you get if you use this command, then boot your ELKS system, then open an FTP session from the ELKS system.

Sorry, I currently don't have access to that network, so it will have to wait.

> Also, since ktcp has a fixed IP address and your network most likely runs DHCP, there may be an address collision biting you. Maybe you could change the ELKS system IP address to 192.168.0.253 - which is less likely to be chosen by a DHCP server.

It does not change anything. After "net start" with a static IP address configuration, those messages start to appear.

@Mellvik
Contributor

Mellvik commented Dec 31, 2024

@ConiKost - can you post the boot messages from your ELKS system?

@Mellvik
Contributor

Mellvik commented Dec 31, 2024

A screenshot (or a listing of some) of the pings would also be useful. Try ping -i 0.5 -s 1024 <address>.
