6 years, 7 months ago.

BLE connect() "hangs"

I'm using an NRF52832 device with Mbed (5.8, #675528b). The device acts as a central connecting to several peers, and the problem I am having is that connect() frequently does not result in a connection.

The scenario is that I have scanned, found a discoverable BLE peer and would like to connect to it to find out what it is. However, quite often, the BLE::Instance().gap().connect() call never results in a connection; invoking it returns BLE_ERROR_NONE and yet my onConnection() and onDisconnection() callbacks never, ever, get called. I can't cancel the connection attempt as disconnect() requires a handle and connect() doesn't return a handle except to one of the callbacks. So I’m stuffed; subsequent calls to connect() return BLE_ERROR_INVALID_STATE.

Any ideas how I get out of this situation? The complete code that shows the problem (connect() issued on line 263) is attached.

As an experiment, I've tried passing in the timeout structures that connect() takes (see below; I originally had both as NULL), but that doesn't help, though I've no idea if I've chosen sensible numbers.

// TODO are these correct?
static const Gap::ConnectionParams_t connectionParams = {6 /* minConnectionInterval */,
                                                         6 /* maxConnectionInterval */,
                                                         0 /* slaveLatency */,
                                                        100 /* connectionSupervisionTimeout (10 ms units) */};

// Gap scanning parameters to minimise connection time (from https://os.mbed.com/docs/v5.8/reference/gap.html)
static const GapScanningParams connectionScanParams(GapScanningParams::SCAN_INTERVAL_MAX /* interval */,
                                                    GapScanningParams::SCAN_WINDOW_MAX /* window */,
                                                    3 /* timeout */,
                                                    false /* active scanning */);
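On the "are these correct?" question, here is a rough sanity check rather than anything taken from Mbed itself. It assumes the usual Bluetooth LE units and ranges: connection intervals in 1.25 ms units (6 to 3200), slave latency 0 to 499, supervision timeout in 10 ms units (10 to 3200), and a timeout longer than (1 + latency) * maxInterval * 2. `ConnParams`, `intervalToUs`, `timeoutToUs` and `isValid` are hypothetical names; `ConnParams` is only a stand-in with the same field order as Gap::ConnectionParams_t.

```cpp
#include <cstdint>

// Stand-in for Gap::ConnectionParams_t (assumed: same four fields, same order).
struct ConnParams {
    uint16_t minConnectionInterval;        // 1.25 ms units
    uint16_t maxConnectionInterval;        // 1.25 ms units
    uint16_t slaveLatency;                 // number of connection events
    uint16_t connectionSupervisionTimeout; // 10 ms units
};

// Work in microseconds so the 1.25 ms unit needs no fractions.
constexpr uint64_t intervalToUs(uint16_t units) {
    return static_cast<uint64_t>(units) * 1250u;
}

constexpr uint64_t timeoutToUs(uint16_t units) {
    return static_cast<uint64_t>(units) * 10000u;
}

// Range checks plus the rule that the supervision timeout must exceed
// (1 + latency) * maxInterval * 2.
constexpr bool isValid(const ConnParams &p) {
    return p.minConnectionInterval >= 6
        && p.maxConnectionInterval <= 3200
        && p.minConnectionInterval <= p.maxConnectionInterval
        && p.slaveLatency <= 499
        && p.connectionSupervisionTimeout >= 10
        && p.connectionSupervisionTimeout <= 3200
        && timeoutToUs(p.connectionSupervisionTimeout)
               > (1u + p.slaveLatency) * intervalToUs(p.maxConnectionInterval) * 2u;
}
```

Under those assumptions, {6, 6, 0, 100} means a 7.5 ms connection interval with a 1 s supervision timeout, and it passes the check, so the values themselves look legal even if they are not what causes the hang.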

Note that I've asked Nordic about this on their forums but they say this is not a problem they've ever seen and sent me here instead.

3 Answers

6 years, 6 months ago.

Hi Rob,

Many thanks for attaching the code; it helps a lot.

I believe your issue lies in the hard-coded address type passed to the connect function; the connect operation will not work if the address you try to connect to is not a random static address. You can get the address type of the peer from the advertisement packet; it is in a field named address type.

Please also note that a connection attempt made while another connection operation is ongoing will fail, and the number of parallel connections is limited to 3 (on Nordic targets) for a central.

Last but not least, a connection operation will time out at some point (30s IIRC); you can be notified of this by registering a timeout event handler with onTimeout and looking for the timeout event.

At some point we will add a public function that cancels the ongoing connection operation. As it is now, you can use sd_ble_gap_connect_cancel, but this is not portable.

Accepted Answer

Hi Vincent, and thanks for checking through my code! I had wondered about the address type; I was following the example in the GAP README.md BLE_LEDBlinker example in hard-coding the address type. I've modified my code to use the peer address type that I receive but unfortunately it doesn't help; there are a few devices with address type RANDOM_PRIVATE_RESOLVABLE around but they all always disconnect immediately and so are not a problem. I've also added an onTimeout() but it is never, ever, called (I've tried connecting both with and without the suggested connectionParams and connectionScanParams in case that makes a difference). I've reattached the new code above: what can I be doing wrong!?

posted by Rob Meades 04 May 2018

Apologies, I had misunderstood. Re-reading your post I see that I have to _check_ that the device has a RANDOM_STATIC address before I call connect(), so the example and the GAP README.md are correct. Anyway, the effect is the same: all the devices I can see say that they do indeed have a RANDOM_STATIC address type, yet my connect() call still never times out for some devices.

posted by Rob Meades 04 May 2018

Also apologies for the uncorrected misspoolings above: I am unable to Edit comments for some reason...

posted by Rob Meades 04 May 2018

I made investigations and found some issues:

  • The update for SoftDevice 14 does not support multiple central links; earlier, up to 3 concurrent links were supported. Not sure why it has changed; I guess it will be addressed for 5.9.
  • It's best to pass scanning params to connect so that a timeout can be set. If scanning params are not passed, the existing scanning params are used. The default scanning params have an infinite timeout. In other words, it is possible that the connection process never ends.
  • A connection timeout will most likely be reported with `Gap::TIMEOUT_SRC_SCAN` as the source, not `Gap::TIMEOUT_SRC_CONN`. The reason is the way the connection process works: first the local device scans for the specified peer; once an advertisement packet is received, the local device sends a connection request within the same advertising interval. As soon as the connection request is received, the connection is considered established; there's no need to wait for a response. In other words, a timeout for connection establishment can only happen during scanning.
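The last two points could be handled along these lines. This is only a sketch with stand-in types, not Mbed code: `TimeoutSource` stands in for Gap::TimeoutSource_t, and `ConnectAttemptTracker` and `Address` are hypothetical names. It assumes the application keeps at most one connect() outstanding and does no other scanning in the meantime, so any timeout that arrives while an attempt is pending (most likely with the scan source) is treated as that attempt failing.

```cpp
#include <array>
#include <cstdint>

// Stand-in for Gap::TimeoutSource_t; only the two sources named above are modelled.
enum class TimeoutSource { SCAN, CONN };

using Address = std::array<uint8_t, 6>;

// Tracks a single outstanding connection attempt so that a timeout,
// which carries no peer information, can be tied back to a peer.
class ConnectAttemptTracker {
public:
    // Call just before connect(); returns false if an attempt is already pending.
    bool begin(const Address &peer) {
        if (pending_) {
            return false;
        }
        peer_ = peer;
        pending_ = true;
        return true;
    }

    // Call from the connection callback.
    void onConnected() { pending_ = false; }

    // Call from the timeout callback. Returns true and fills failedPeer
    // if the timeout ended the pending attempt; the source is ignored
    // because either one means the attempt is over.
    bool onTimeout(TimeoutSource source, Address &failedPeer) {
        (void) source;
        if (!pending_) {
            return false; // no connect() in flight, nothing to clean up
        }
        failedPeer = peer_;
        pending_ = false;
        return true;
    }

    bool isPending() const { return pending_; }

private:
    Address peer_{};
    bool pending_ = false;
};
```

In a real application, begin() would guard the call to connect() made with scanning params that have a finite timeout, and onTimeout() would be called from the handler registered with onTimeout.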

Here's a gist of your application: https://gist.github.com/pan-/be66551b1e5faea0a4b715ab7e4078ae

posted by Vincent (pan-) Coubard 04 May 2018

"The update for SoftDevice 14 does not support multiple central links; earlier, up to 3 concurrent links were supported. Not sure why it has changed; I guess it will be addressed for 5.9."

Vincent, is this a change in the SoftDevice or our glue code between Mbed BLE API and SoftDevice API?

posted by Marcus Chang 04 May 2018

Those are excellent investigations, thanks Vincent. Taking all of those on board, I can now make my BLE application behave well. Note that I always limited my number of simultaneous connection attempts to 1 for simplicity and, in fact, if you don't do this there's no way of tying a timeout event to a connection, since the timeout callback doesn't get told which connection has not been made. It might be worth considering fixing this in the API when the number of simultaneous connections heads north of 1 once more.

posted by Rob Meades 04 May 2018

Hi Rob - A fix has been made for the number of concurrent connections and that will be available in v5.9 (expected later this month): https://github.com/ARMmbed/mbed-os/commit/ed20b17d296ac4238bd768686a21271cd513c4ad

posted by Ralph Fulchiero 05 Jun 2018

6 years, 6 months ago.

We don't appear to have any guidelines on the values for connectionParams, but in our example in Gap.h we use {50, 100, 0, 600}, and our scanning_params are {100, 100, 0, false}. We ran a test with these values but still sometimes see failures like the ones you have reported. Could you run with these values to see if you notice a difference?

For reference: https://github.com/ARMmbed/mbed-os/blob/master/features/FEATURE_BLE/ble/Gap.h

-Ralph, Team Mbed

Thanks for responding, it's good to know it's not just me. I've tried with those parameters but it hasn't helped I'm afraid. Is there any logging I can switch on in the lower layers of BLE that might help diagnose the issue?

posted by Rob Meades 01 May 2018

6 years, 6 months ago.

Hi Rob,

We have updated the Nordic port quite significantly recently. Would you be able to test your issue against our master branch? Or the feature-nrf528xx branch?

Kind regards,

Don

OK, I have switched to headrev of the branch feature-nrf528xx but the problem still remains I'm afraid. What else can I do?

posted by Rob Meades 03 May 2018

Actually, according to GitHub, the branch feature-nrf528xx wasn't updated recently (last update 26 March: commit 0c14ebb @marcuschangarm "Updated target MTB_LAIRD_BL652 to use SDK 14.2"). Should I be looking somewhere else for your recent updates?

posted by Rob Meades 03 May 2018

In case it is relevant, while poking at this problem, I've raised a separate issue that GenericGattClient::DiscoveryControlBlock terminate() is being called twice which throws a hard fault if you compile with debug on: https://github.com/ARMmbed/mbed-os/issues/6806.

posted by Rob Meades 03 May 2018

The feature-nrf528xx branch has already been merged to master.

posted by Marcus Chang 03 May 2018