Zone timer dump from Envisa Link 4

pos42
Posts: 37
Joined: Sat Mar 04, 2017 8:19 am

Re: Zone timer dump from Envisa Link 4

Post by pos42 »

@mikep

LED status update???
Break backward compatibility???

The API spec is a little unclear, and I have not dug into it that deeply. It says:
--snip--
The POLL command will also reset the Envisalink's network watchdog timer. If there is no communications with the Envisalerts servers for a period of 20 minutes, the Envisalink will reboot.
--snip--

It could be that the timer is reset by ANY communication sent or received... Or not... I have not checked...

I send a poll request "000" on a regular basis for two reasons:
# 1 To know whether the session is OK or whether I have to reconnect
# 2 To keep the Envisa Link from rebooting, which it will do if no poll is sent within 20 minutes
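
For reference, a minimal Python sketch of how such a poll could be framed on the wire. It assumes the TPI framing used by the DSC version of the Envisalink (three-digit command, optional data, two-hex-digit checksum over the ASCII bytes, CR LF terminator). `tpi_frame` is a hypothetical helper name, so check the framing against the TPI document for your firmware.

```python
def tpi_frame(command: str, data: str = "") -> bytes:
    """Frame a TPI command: command + data + checksum + CR LF.

    Assumption: the checksum is the sum of the ASCII codes of the command
    and data characters, truncated to 8 bits, written as two uppercase
    hex digits.
    """
    payload = (command + data).encode("ascii")
    checksum = sum(payload) & 0xFF
    return payload + f"{checksum:02X}".encode("ascii") + b"\r\n"


POLL = tpi_frame("000")           # keepalive poll
STATUS_REPORT = tpi_frame("001")  # status report request
```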

Therefore I have decided to poll every 30s so that a session failure is detected reasonably fast. At the moment I consider the session dead if I have not received a poll response within two poll intervals. My question was whether I should send a few more poll requests and wait a little longer for a poll response before I consider the session dead and try to reconnect to the Envisa Link.
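
The timing rule described here (poll every 30s, session dead after two silent intervals) could be sketched in Python roughly like this. The class name `PollWatch`, its methods, and the defaults are illustrative assumptions, not part of any Envisalink or Fibaro API. The clock is injectable so the logic can be tried without waiting.

```python
import time


class PollWatch:
    """Track poll requests and replies to decide when a session is dead.

    All names and defaults here are illustrative assumptions.
    """

    def __init__(self, interval=30.0, max_missed=2, clock=time.monotonic):
        self.interval = interval      # seconds between polls
        self.max_missed = max_missed  # silent intervals tolerated
        self.clock = clock
        now = clock()
        self.last_sent = now - interval  # so the first poll is due at once
        self.last_reply = now

    def poll_due(self):
        """True when at least one interval has passed since the last poll."""
        return self.clock() - self.last_sent >= self.interval

    def mark_sent(self):
        self.last_sent = self.clock()

    def mark_reply(self):
        self.last_reply = self.clock()

    def is_dead(self):
        """True when no reply has arrived for more than max_missed intervals."""
        return self.clock() - self.last_reply > self.interval * self.max_missed
```

Raising `max_missed` to, say, 4 would delay the reconnect to about two minutes of silence, which is still far below the 20-minute watchdog.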

By "LED status update", do you mean using "001" instead of the poll "000"? Will it also reset the watchdog timer? And why would that be lighter than a poll? I think it would be the opposite, as "001" returns more data.

But maybe I misunderstood your comments..

Regards
Peo
mikep
Posts: 138
Joined: Wed May 30, 2012 1:49 pm

Re: Zone timer dump from Envisa Link 4

Post by mikep »

Sorry, I was a bit terse because I didn't think my comments were very important (or were going to change anything). I was just noting that zonetimerdump seemed like an odd choice: it's supposed to be solicited and it's big, so it's a bit of a clunky solution. And that the envisalink sending a poll wouldn't work; it could send a poll response but that seemed a bit clunky too. An LED status update is already asynchronous and innocuous, so it might be a better choice to check the liveness of a connection.

Your poll mechanism sounds fine to me. It needs to be more often than the socket receive timeout, which is probably less than 20 minutes... I too send follow-up polls on an unresponsive connection before deciding it's dead, but honestly this is TCP rather than UDP so it's probably not necessary. A dead session is dead.

What's weird about your situation is that you send the polls, get no response, attempt to reconnect, but still get the unsolicited zonetimerdump. That's a bit odd - perhaps you're not closing the socket after deciding it's dead before opening the next? And maybe not waiting long enough for the poll response? k-man pointed out that it's the new connect request that causes the envisalink to send the zonetimerdump on the old connection so it seems like it must still be open and active for you to get the timerdump.
DscServer for android/linux/windows: https://sites.google.com/site/mppsuite/dscserver
pos42

Re: Zone timer dump from Envisa Link 4

Post by pos42 »

I will take a look at LED status update. Not because I think it is better, but because I might get more usable info... But will that also reset the timer counting down to the reboot????


I agree with what you say in the last paragraph about the zone dump. I can come up with a few ideas...
- Maybe there is still a session but a bug in the Envisa Link means the poll reply won't be sent back
- Maybe a Fibaro Home Center problem means the poll reply won't be received by my sandboxed Lua service

BUT.... Since I get the zone dump just milliseconds after I finally manage to reconnect (see the log I posted earlier), the zone dump *CANNOT* have been received on the earlier session and written to my log. My app restarts every time it tries to reconnect. When it does, all earlier sockets are simply **gone**. So I am not 100% sure K-Man is correct. But I could have missed something... :) It for sure looks like the zone dump is sent directly in the new session after auth is ok without me asking/polling for it. It is the FIRST packet received in the new session after auth is done and session is ok. Weird... But I can live with it...

To try to catch this, I will....
- Keep polling every 30s a few more times before I consider a session dead and start over
- When I consider a session dead, I will try a TCP close anyway (I do not do that today). Maybe the other end receives it. Not getting a poll reply does not prove that my packets don't get through.

If it is a bug in the Envisa Link, it does not seem that serious... :)

/Peo
mikep

Re: Zone timer dump from Envisa Link 4

Post by mikep »

I might be misunderstanding... but I think we're not really sync'd here.

The LED status update was a suggestion for K-Man, not for us as API users. We should continue to send polls as before.

I believe the "milliseconds after you connect" is explained by K-Man's algorithm. In the original design, if a connect request was received while another client was already connected, the NEW request would be rejected (connection reset) immediately. He's changed the design now to send a message (in this case a zonetimerdump, but it could be anything, like, um, an LED status update :)) to the ORIGINAL channel to see if that one is still open/active. It will likely arrive milliseconds after your NEW connection request. If the ORIGINAL is still active he'll reject the NEW connect request as before. If the ORIGINAL is NOT active he'll accept the NEW connection request.
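
The arbitration described above can be sketched in a few lines of Python. This only illustrates the logic as described in this post, not the actual Envisalink firmware; `arbitrate` and `probe` are hypothetical names.

```python
def arbitrate(original, probe):
    """Decide what to do with a NEW connect request.

    original: the currently connected client, or None if there is none.
    probe:    callable that sends an unsolicited message (for example a
              zonetimerdump) to `original` and returns True if the send
              shows the channel is still open.
    """
    if original is None:
        return "accept"
    if probe(original):
        return "reject"   # ORIGINAL still active: refuse the NEW request
    return "accept"       # ORIGINAL is dead: let the NEW client in
```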

So I think you're seeing traffic on your original connection since you haven't closed it. If you close it before trying to reconnect you may find the envisalink a little more responsive. As to why the polls aren't getting through I would look to the network but it's hard to tell from afar.
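
A minimal Python sketch of "close before reconnect", assuming a plain TCP client. `close_quietly` and `reconnect` are hypothetical helper names, and the `connect` parameter exists only so the connect step can be swapped out.

```python
import socket


def close_quietly(sock):
    """Shut down and close a socket, ignoring errors from a dead peer."""
    if sock is None:
        return
    try:
        # Ask the TCP stack to send FIN so the other end can see the close.
        sock.shutdown(socket.SHUT_RDWR)
    except OSError:
        pass  # already disconnected or closed; nothing more to signal
    finally:
        sock.close()


def reconnect(old_sock, host, port, timeout=10.0,
              connect=socket.create_connection):
    """Close the old session first, then open a new one."""
    close_quietly(old_sock)
    return connect((host, port), timeout=timeout)
```

If the shutdown does reach the module, the old channel is visibly closed by the time the new connect request arrives.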
K-Man
Posts: 141
Joined: Fri Jun 01, 2012 1:08 pm

Re: Zone timer dump from Envisa Link 4

Post by K-Man »

pos42 wrote: It for sure looks like the zone dump is sent directly in the new session after auth is ok without me asking/polling for it. It is the FIRST packet received in the new session after auth is done and session is ok.

Your connection over the TPI is just a pipe, like any other UNIX pipe. When the IP socket fails the new connection inherits the FD at the end of the pipe and anything in it goes to the new client. Simple.
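
The pipe behaviour can be reproduced with an ordinary OS pipe in Python. This is only an analogy using `os.pipe`, not the Envisalink internals, and the message text is made up for the example.

```python
import os

# A pipe keeps buffering what is written to it, whoever reads it later.
read_fd, write_fd = os.pipe()

# The module writes a message while the old client is no longer reading.
os.write(write_fd, b"zonetimerdump for the old client\r\n")

# A new client takes over the read end and gets the stale message first.
first_packet = os.read(read_fd, 1024)
print(first_packet)

os.close(read_fd)
os.close(write_fd)
```

A client should therefore tolerate an unsolicited message as the first packet of a new session.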
pos42

Re: Zone timer dump from Envisa Link 4

Post by pos42 »

It happened again today, just after 7 CET: the Envisalink became unresponsive after approximately a week. This time I did not get a zone dump. The Envisalink was just gone for a few minutes. Again, the switch does not show any problem. And the home automation that talks to the Envisalink did not lose any other connection.

Is there a way to see the Envisalink uptime? I want to see if it has rebooted, as this is one of a few things to look at when trying to find the problem. A long shot, but... I have blocked the Envisalink from calling "home" in my firewall. Could the Envisalink eventually eat memory when it cannot connect and finally reboot due to a watchdog or similar? Just thinking out loud... My Envisalink has a ....111 firmware.

Also, I use a fixed IP address via DHCP for the EnvisaLink. This setup is normally not a problem for any service, but I think I must ask.

Of course, the Envisalink does not have to be the issue here. I am looking everywhere, of course.... :)

Tnx
/Peo
K-Man

Re: Zone timer dump from Envisa Link 4

Post by K-Man »

We found a potential scenario whereby the TPI connection locks up for a couple of minutes and needs a socket reset to recover. You need a pretty noisy network to cause it to happen.

If you want to try some beta code email Eyezon support and ask for the BETA update. You will have to turn off your firewall and let the module "call home" and get new firmware.

The Envisalink is programmed to reboot if it doesn't see any traffic from our servers or from the TPI in a 20-minute period.
pos42

Re: Zone timer dump from Envisa Link 4

Post by pos42 »

Hi @K-Man

Tnx for the reply.

I have seen what is written in the API about 20 minutes, and that is why I send a poll req every 30s.


I have seen that the lock-up lasts a few minutes every time it happens. So *maybe* that could be it... Worth trying!

Noisy network... Well... I have 10 virtual machines, 1 VM host, 1 fw, 2 APs, around 15 clients, envisalink, DSC caller card, a few PoE cameras, 2 mini PoE powered switches, a Cisco 2960, a NAS, a few Raspberry Pis. Don't know if that could be noisy enough.


On a scale of 1-10, how stable is the beta? I am asking for your personal opinion here :) And is it possible to back out? Just want to know the risk...

PS
I have opened up for "call home"

Tnx
Peo
pos42

Re: Zone timer dump from Envisa Link 4

Post by pos42 »

@K-Man

Now I have firmware version 121 instead of 111. Is that the release you are referring to?

Tnx
Peo
pos42

Re: Zone timer dump from Envisa Link 4

Post by pos42 »

Hi again


Now I have been using firmware 121 for approx 10 days. No socket problems so far. It is looking good...


/Peo