11 years, 9 months ago.
EthernetInterface does not appear to be robust
Hi
I have an application where I continually use socket connections, both to scrape a webpage and to update a server.
I put the scraping in one thread and the updating in another thread. After a few hours the mbed would lock up.
I have reproduced the problem using the code below in a single thread (the main one). After just under 4 hours of activity a lock up occurred - I suppose it is random so it could lock up in more or less time. I will run the code again to see how much it varies.
The point is that the EthernetInterface is unusable like this. Is it possible that this could be reproduced (and then debugged) by the mbed team?
Thanks Daniel
PS This is based on the sample code as a starting point. I have updated all the libraries to the latest versions.
Import program: TCPSocket_HelloWorldTest
Test to demonstrate that TCP sockets lock up
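For anyone trying to reproduce this, it helps to know how many connection cycles completed before the lock-up and roughly how long that took. Below is a minimal host-side sketch of such a soak loop in C++11. It is not the code from the imported program: `run_soak`, `SoakStats` and the `attempt` callback are hypothetical names, and `attempt` stands in for one connect/send/receive/close cycle. On the mbed itself a plain function pointer and the mbed `Timer` class would likely replace `std::function` and `std::chrono`.

```cpp
#include <cassert>
#include <chrono>
#include <cstdio>
#include <functional>

// Summary of a soak run (hypothetical helper type, not part of the mbed SDK).
struct SoakStats {
    unsigned long attempts = 0;     // cycles started and finished
    unsigned long failures = 0;     // cycles where attempt() returned false
    double elapsed_seconds = 0.0;   // wall time up to the last finished cycle
};

// Calls `attempt` up to `max_attempts` times and counts failures.
// `attempt` is an assumption: it represents one socket cycle and returns
// true on success. A progress line is printed every `report_every` attempts
// (0 disables printing), so the last line on the console shows roughly
// where a lock-up happened.
SoakStats run_soak(const std::function<bool()>& attempt,
                   unsigned long max_attempts,
                   unsigned long report_every)
{
    assert(attempt && "attempt callback must be set");
    using clock = std::chrono::steady_clock;
    const clock::time_point start = clock::now();

    SoakStats stats;
    while (stats.attempts < max_attempts) {
        if (!attempt()) {
            ++stats.failures;
        }
        ++stats.attempts;
        stats.elapsed_seconds =
            std::chrono::duration<double>(clock::now() - start).count();

        if (report_every != 0 && stats.attempts % report_every == 0) {
            std::printf("attempt %lu, failures %lu, %.1f s\n",
                        stats.attempts, stats.failures,
                        stats.elapsed_seconds);
        }
    }
    return stats;
}
```

Printing after every cycle rather than only at the end is deliberate: if the board hangs inside a cycle, nothing after that point will ever be printed.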
4 Answers
11 years, 9 months ago.
Hi Daniel, for the networking libraries we are not yet in the robustness testing phase. There is still a small queue of bug fixes to be applied to the Ethernet driver from NXP, and we still have to add support for certain features.
For example, in the last couple of days (apart from releasing support for the new Freedom board from Freescale) I have been working on adding IGMP support for multicast join requests, where I hit a lwIP bug: https://savannah.nongnu.org/bugs/?38165
Anyway, it is coming along and slowly improving. If you want to help, all our source code is open and released under an Apache v2 license. Pull requests are welcome: https://github.com/mbedmicro/mbed
Cheers, Emilio
11 years, 9 months ago.
Hi Emilio
Thanks for the response. I'm keen to help make the stack robust. I think this would really benefit people like myself who want to develop applications with long uptimes, rather than demos, and value availability over throughput.
Is there a way to find out what the known issues/bugs are? For example, the queue of fixes to the NXP driver. Last time I helped debug an LPC driver (not the NXP one), it suspiciously locked up after a few hours as well.
Also, how are you capturing test cases? I'd like to write some for the robustness testing phase, so networking can become as trusted as other parts of the SDK.
Regards Daniel
> I'm keen to help make the stack robust.
Thanks Daniel.
> Is there a way to find out what the known issues/bugs are?
At the moment, the closest thing we have to a public issue tracker is this forum: http://mbed.org/forum/bugs-suggestions
That is clearly a suboptimal solution. The main problems with it are:
- There is no way to get a clear list of open issues
- There is no way to filter the issues by component (rtos, tcp/ip, USB device, DSP, NXP drivers, Freescale drivers, etc). There are tags, but a tag search returns results from the whole site, not only from the "Bugs & Suggestions" forum.
Of course, we do have an internal issue tracker where we try to mirror the forum posts in actual tickets, but we have no way to provide a public view of it.
The mbed web team has been busy adding a lot of other features, and the addition of a proper issue tracker has kept getting postponed.
Now that the mbed SDK is completely open source, having a public issue tracker is growing in importance. I hope the mbed web team will manage to get its development scheduled in.
Perhaps, as a temporary solution, we could actually use the mbed github issue tracker: https://github.com/mbedmicro/mbed/issues?state=open
> For example, the queue of fixes to the NXP driver. Last time I helped debug an LPC driver (not the NXP one), it suspiciously locked up after a few hours as well.
I summarized the sources of the code used in our TCP/IP stack in this document: https://github.com/mbedmicro/mbed/blob/master/libraries/doc/net/source.txt
The NXP LPC repository is under maintenance (no web view), but it is still possible to clone it: http://sw.lpcware.com/ (git clone http://git.lpcware.com/lwip_lpc.git)
There were a lot of changes affecting example programs, but in the end there was only one fix related to the Ethernet driver: https://github.com/mbedmicro/mbed/commit/a80058dc5fac3c1d7569209cc0d441eef4bdebc3#L2R401
I summarized the architecture of the main 3 TCP/IP threads here: https://github.com/mbedmicro/mbed/blob/master/libraries/doc/net/doc.txt
Cheers, Emilio
posted by 03 Mar 2013
11 years, 9 months ago.
Daniel, if you want some more help, let me know. I could try building your sample with the offline compiler and send you some binaries and/or sources that build offline. If the problem reproduces with these binaries, we could connect GDB, the GNU Debugger, and take a peek at what is going on when the program stops running.
-Adam
Hi Adam
It would be good to try something offline with the debugger. I did think about this, read a little bit, and saw a few weeks ago that you were working on getting the new EthernetInterface sample code to compile offline.
I'll revisit my test code, check that Emilio's latest changes are included, then send you a link.
Thanks Daniel
posted by 28 Mar 2013
11 years, 8 months ago.
ping <ip> -s 4800 locks my mbed in just 3 packets. Are there any rules/filters that can be put into the Ethernet MAC to drop packets over a certain length? Also, holding down the reload button in my browser with the IP of the mbed loaded locks my device.
Maybe for reliability reasons it would be best to use an external chip to handle all the networking, like the W5200 or something similar. For my remote real-time AC line monitor, lock-ups are not an option.
just my 2c
I did some testing with the W5200 and it locks up within 4-6 hours without doing a thing; I have to reset it. When it does work, it also locks up if I hold F5 to reload the page, even if it's not serving a thing. Large ping packets lock it up as well. I wonder where the code that responds to ICMP requests is. I should start by deleting that or limiting the payload.
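A note on the packet-length question above: on standard Ethernet the MTU is 1500 bytes, so a 4800-byte ping payload arrives as several IP fragments rather than as one oversized frame. A pure length filter would therefore probably not catch it; the receive path would have to recognise fragments instead. The following is only a rough sketch of that check, assuming untagged Ethernet II frames. `is_ipv4_fragment` is a hypothetical helper, not something from the mbed or lwIP sources, and where it would hook into the driver is left open.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns true if `frame` is an Ethernet II frame carrying an IPv4 packet
// that is a fragment (More Fragments flag set, or non-zero fragment offset).
// Assumption: no VLAN tag, so the IPv4 header starts at byte 14.
bool is_ipv4_fragment(const std::uint8_t* frame, std::size_t len)
{
    assert(frame != nullptr || len == 0);

    const std::size_t kEthHeader = 14;    // dst MAC + src MAC + EtherType
    const std::size_t kMinIpHeader = 20;  // IPv4 header without options
    if (len < kEthHeader + kMinIpHeader) {
        return false;
    }

    // EtherType 0x0800 means IPv4; anything else is not inspected.
    if (frame[12] != 0x08 || frame[13] != 0x00) {
        return false;
    }

    const std::uint8_t* ip = frame + kEthHeader;
    if ((ip[0] >> 4) != 4) {              // IP version nibble
        return false;
    }

    // Bytes 6-7 of the IPv4 header: 3 flag bits, then a 13-bit offset.
    const std::uint16_t flags_frag =
        static_cast<std::uint16_t>((ip[6] << 8) | ip[7]);
    const bool more_fragments = (flags_frag & 0x2000) != 0;
    const bool has_offset = (flags_frag & 0x1FFF) != 0;
    return more_fragments || has_offset;
}
```

Dropping every frame for which this returns true would discard all fragmented traffic, not just large pings, so it is a blunt test aid rather than a fix. lwIP also has an `IP_REASSEMBLY` configuration option; whether turning it off avoids this particular lock-up has not been tested here.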
posted by 04 Apr 2013
After a reset, the code ran for just over 9 hours before lock up. I've restarted it again this morning.
posted by Daniel Peter 27 Feb 2013
After another reset, it ran for just over 6 hours. I've restarted it again.
posted by Daniel Peter 27 Feb 2013