abcdefg
Thanks for the help Giuseppe
I've made those changes and I have another log file for you but I can still see keepalive mentioned in there ...
Giuseppe Corti
Hi abcdefg,
I confirm that it seems that the <keepalive>N</keepalive> setting was not applied.
Please could you check the lightstreamer_conf.xml configuration file, the <keepalive>N</keepalive> should be just after <lightstreamer_conf>.
If all seems correct to you, please repeat the test case with this additional log setting (I warn you that this is very verbose):
<logger name="LightstreamerLogger.requests" level="INFO"/>
In addition I am wondering if you could arrange a network sniff.
Regards,
Giuseppe
abcdefg
Okay I am very confused now.
Here is the first bit of my lightstreamer_conf.xml file:
<?xml version="1.0"?>
<lightstreamer_conf>
<keepalive>N</keepalive>
<logger name="LightstreamerLogger.requests" level="INFO"/>
<!--
The following elements, subelements and attributes are consulted
by Lightstreamer Server to setup its own configuration. Elements with
an empty or blank value are ignored and considered as not defined at all.
Elements described as "cumulative" can be inserted multiple times in the
same place.
I've attached another log of a failed connection. But it looks exactly the same to me. It doesn't look like these configuration changes are having any effect. I restarted the Lightstreamer server prior to the test, just to be sure.
There are no other <logger> tags inside lightstreamer_conf.xml, but there are a lot of them inside lightstreamer_log_conf.xml. One of them is exactly the same as the above command. I'm not sure if statements in this file will also take effect.
Please could you confirm that there are no network intermediaries (proxies or something similar) in the middle that can buffer in some cases the replies?
I don't think so.
In addition I am wondering if you could arrange a network sniff.
Shouldn't be a problem, I'll probably work on this tomorrow.
Giuseppe Corti
Hi abcdefg,
I am very sorry to have been inaccurate in my instructions.
Indeed, the logger configuration belongs in the lightstreamer_log_conf.xml file, not in lightstreamer_conf.xml, and the suggested level was also wrong.
In lightstreamer_log_conf.xml, please change the level for <logger name="LightstreamerLogger.requests" level="WARN"/> to DEBUG.
Anyway, in this log it is clear that the <keepalive>N</keepalive> setting was successful. You may notice that there are no more rows in the log file like this:
17:01:30.930 [SERVER POOLED THREAD 1] INFO LightstreamerLogger.connections - Reused HTTP connection on "Lightstreamer HTTP Server" from 10.9.8.133:59985
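A quick way to double-check this on a large log is to count the rows that report a reused connection. The following is only a minimal Swift sketch: the function name `reusedConnectionLines` is made up for illustration, the two markers are copied from the row quoted above, and the second sample row is a placeholder rather than real server output.

```swift
import Foundation

/// Returns the log rows that report a reused (kept-alive) HTTP connection.
/// Both markers are copied from the log row quoted in this thread.
func reusedConnectionLines(in log: String) -> [String] {
    return log
        .components(separatedBy: .newlines)
        .filter { line in
            line.contains("LightstreamerLogger.connections")
                && line.contains("Reused HTTP connection")
        }
}

// Sample input: the first row is the one quoted in this thread; the second
// is a made-up placeholder, not real server output.
let sampleLog = """
17:01:30.930 [SERVER POOLED THREAD 1] INFO LightstreamerLogger.connections - Reused HTTP connection on "Lightstreamer HTTP Server" from 10.9.8.133:59985
(placeholder) some unrelated log row
"""

let reused = reusedConnectionLines(in: sampleLog)
print("reused-connection rows: \(reused.count)")
```

If the count is zero for a log taken after the restart, the <keepalive>N</keepalive> setting is most likely in effect.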
Unfortunately, in this case too we could not figure out exactly why the replies to some bind requests are not sent to the client.
Some steps by which we can proceed with the investigation are:
- Please could you provide us with the details about the client library in use in these tests (library type and version)?
- Repeat the test case with the extra log settings (in lightstreamer_log_conf.xml):
<logger name="LightstreamerLogger.requests" level="DEBUG"/>
<logger name="LightstreamerMonitorText" level="TRACE"/>
and if possible, a network sniff.
- Repeat the test case with a Lightstreamer server 6.0.x.
Thank you,
Giuseppe
abcdefg
Thanks for the clarification, that makes sense now.
I've added those extra log settings and attached the updated server log.
I also have a Wireshark .pcapng which shows what the messages look like from the client side. I'm not sure that my title for this thread is correct: the bind_session.txt requests look like they did receive responses. If so, that in turn would mean that my Postman setup is not representative for some reason.
I'll admit I'm still trying to make heads or tails of the Wireshark output.
Please could you provide us with the details about the client library in use in these tests (library type and version)?
I am using
pod 'Lightstreamer_OS_X_Client'
which is at version 2.0.0. I'm testing in an OSX project so as to keep things simple, but I had the same results when I was using Lightstreamer_iOS_Client, and the non-unified API as well.
Giuseppe Corti
Hi abcdefg,
First, please note that the OS X Client API (Unified Version) 2.0.x is compatible with Lightstreamer Server only from version 6.0.1 onward.
Most likely this is not the cause of the issue, since it also happens with other libraries, but in general, using a client library that is not supported by the Lightstreamer server version in use exposes you to unpredictable behavior.
That said, the new log and the client side dump offer us some clues to explain the situation.
In fact, we often find errors of this type at the IP level:
"Header checksum: 0x0000 [incorrect, should be 0x6ec5 (maybe caused by "IP checksum offload"?)]"
These happen in the ack messages from the client to the server.
At this point I can imagine a scenario like this: the server does not receive the ack for its reply to create_session, so the session remains stuck and the following bind_session request does not receive any reply.
Regards,
Giuseppe
abcdefg
That is interesting; however, a 0x0000 checksum might apparently just be a quirk of how Wireshark captures the packets.
https://ask.wireshark.org/questions/13577/header-checksum-error-incorrect-should-be
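For anyone who wants to check this by hand: the value Wireshark prints after "should be" is the standard IPv4 header checksum, that is, the ones' complement of the ones' complement sum of the header's 16-bit words, with the checksum field itself counted as zero. Below is a minimal Swift sketch of that calculation. The function name `ipv4HeaderChecksum` and the sample header bytes are assumptions for illustration only (a textbook-style header, not one taken from the captures in this thread).

```swift
/// Computes the IPv4 header checksum over `header`, treating the
/// checksum field (bytes 10 and 11) as zero, as the sender would.
func ipv4HeaderChecksum(_ header: [UInt8]) -> UInt16 {
    precondition(header.count >= 20 && header.count % 2 == 0,
                 "an IPv4 header is at least 20 bytes and a multiple of 4")
    var sum: UInt32 = 0
    for i in stride(from: 0, to: header.count, by: 2) {
        if i == 10 { continue } // skip the checksum field itself
        sum += UInt32(header[i]) << 8 | UInt32(header[i + 1])
    }
    // Fold the carries back into the low 16 bits (ones' complement sum).
    while sum > 0xFFFF {
        sum = (sum & 0xFFFF) + (sum >> 16)
    }
    return ~UInt16(truncatingIfNeeded: sum)
}

// Illustrative 20-byte header whose checksum field is still 0x0000,
// which is what a capture can show when the NIC fills it in later.
let sampleHeader: [UInt8] = [
    0x45, 0x00, 0x00, 0x73, 0x00, 0x00, 0x40, 0x00,
    0x40, 0x11, 0x00, 0x00, 0xc0, 0xa8, 0x00, 0x01,
    0xc0, 0xa8, 0x00, 0xc7,
]

let expected = ipv4HeaderChecksum(sampleHeader)
print("checksum should be 0x" + String(expected, radix: 16))
```

With checksum offload, the capture is taken before the network card writes the real value, so a 0x0000 field on outgoing packets does not by itself mean the packets are corrupt on the wire.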
Giuseppe Corti
Could you take a sniff server side?
abcdefg
No problem, I've attached.
In the server-side capture, it looks like the header checksums are OK in both directions, so I'm guessing that this is just a recording quirk.
Giuseppe Corti
Hi abcdefg,
You are right, these new captures show a correct situation from the network point of view.
However, a strong suspicion remains that the sessions are getting stuck in some way.
The best way to proceed with the investigation is to reproduce the case with a server 6; if the issue persists, I will provide you with specific instructions.
Regards,
Giuseppe
abcdefg
I think I might have isolated the problem.
Our servers run Lightstreamer 5.1.1. In 5.1.2 you added the <use_enriched_content_type> property which changes the Content-Type of messages from text/plain to text/enriched. Our servers don't have this flag set and Wireshark confirms that the Content-Type is text/plain not text/enriched.
From my testing, when the server sends text/plain messages, the Flash client will respond and connect, but the iOS client won't.
I have tested this with our 5.1.1 production server, with a 5.1.2 server, and with 6.0.3 server with <use_enriched_content_type>N</use_enriched_content_type> and it seems to be completely consistent and reproducible.
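To make the same check without reading the capture by eye, the media type can be pulled out of the response head as text. This is only a rough Swift sketch: `contentType(ofResponseHead:)` is a hypothetical helper, and the sample response heads are made up for illustration rather than copied from the server.

```swift
import Foundation

/// Extracts the media type from the Content-Type header of a raw HTTP
/// response head, lowercased and without parameters such as charset.
/// Returns nil when no Content-Type header is present.
func contentType(ofResponseHead head: String) -> String? {
    for line in head.components(separatedBy: .newlines) {
        let parts = line.split(separator: ":", maxSplits: 1)
        guard parts.count == 2,
              parts[0].trimmingCharacters(in: .whitespaces).lowercased() == "content-type"
        else { continue }
        // Drop any parameters after the first ';'.
        let value = parts[1].split(separator: ";").first ?? ""
        return value.trimmingCharacters(in: .whitespaces).lowercased()
    }
    return nil
}

// Made-up response heads for illustration only.
let plainHead = "HTTP/1.1 200 OK\r\nContent-Type: text/plain; charset=iso-8859-1\r\n\r\n"
let enrichedHead = "HTTP/1.1 200 OK\r\nContent-Type: text/enriched\r\n\r\n"

print(contentType(ofResponseHead: plainHead) ?? "none")
print(contentType(ofResponseHead: enrichedHead) ?? "none")
```

Run against the text of a captured response, this tells the two cases apart: text/plain when the flag is absent or N, text/enriched when it is Y.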
Giuseppe Corti
Hi abcdefg,
This is quite strange, since these recent results do not agree with my previous considerations. In the log files I have checked so far, the client library always replied (with a bind request) to the server's create_session reply.
Indeed as stated in the in-line comments of the <use_enriched_content_type> parameter, it is generally useful to work around possible buffering by network intermediaries (which is not your case).
That said, if the <use_enriched_content_type>Y</use_enriched_content_type> setting helps in your case you should consider the option of a Lightstreamer server upgrade.
Regards,
Giuseppe
abcdefg
Could you just confirm for me that you can NOT reproduce this behaviour
Server:
Lightstreamer 6.0.3
Server setting:
<use_enriched_content_type>N</use_enriched_content_type>
Client:
Lightstreamer_OS_X_Client
Client code:
override func viewDidLoad() {
    super.viewDidLoad()
    LSLightstreamerClient.setLoggerProvider(LSConsoleLoggerProvider())
    let client = LSLightstreamerClient(serverAddress: "https://....:443/", adapterSet: ".....")
    client.addDelegate(self)
    client.connect()
}

// Delegate callback: log every connection status change.
func client(client: LSLightstreamerClient, didChangeStatus status: String) {
    // Pass the status as an argument so it is never interpreted as a format string.
    NSLog("%@", status)
}
Output:
CONNECTING
CONNECTED:STREAM-SENSING
CONNECTED:HTTP-STREAMING
CONNECTED:HTTP-POLLING
DISCONNECTED:WILL-RETRY
CONNECTING
CONNECTED:HTTP-POLLING
DISCONNECTED:WILL-RETRY
CONNECTING
CONNECTED:HTTP-POLLING
DISCONNECTED:WILL-RETRY
CONNECTING
CONNECTED:HTTP-POLLING
DISCONNECTED:WILL-RETRY
CONNECTING
...
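The pattern in this output (the session is established, then dropped and retried over and over) can also be flagged programmatically instead of by reading the console. A minimal Swift sketch follows, assuming the status strings are collected into an array as they arrive; the names `retryCount` and `looksLikeRetryLoop` and the threshold of 3 are arbitrary choices for illustration, not part of the Lightstreamer API.

```swift
/// Status string reported when the client drops the session and
/// schedules a retry (copied from the output above).
let willRetry = "DISCONNECTED:WILL-RETRY"

/// Counts how many times the client gave up and scheduled a retry.
func retryCount(in statuses: [String]) -> Int {
    return statuses.filter { $0 == willRetry }.count
}

/// True when the client retried at least `threshold` times.
/// The default threshold is an arbitrary, illustrative value.
func looksLikeRetryLoop(_ statuses: [String], threshold: Int = 3) -> Bool {
    return retryCount(in: statuses) >= threshold
}

// The status sequence from the output above.
let observed = [
    "CONNECTING", "CONNECTED:STREAM-SENSING", "CONNECTED:HTTP-STREAMING",
    "CONNECTED:HTTP-POLLING", "DISCONNECTED:WILL-RETRY",
    "CONNECTING", "CONNECTED:HTTP-POLLING", "DISCONNECTED:WILL-RETRY",
    "CONNECTING", "CONNECTED:HTTP-POLLING", "DISCONNECTED:WILL-RETRY",
    "CONNECTING", "CONNECTED:HTTP-POLLING", "DISCONNECTED:WILL-RETRY",
    "CONNECTING",
]

print("retries: \(retryCount(in: observed)), loop: \(looksLikeRetryLoop(observed))")
```

In the test project, the delegate callback could append each status to such an array and log a warning once the check returns true.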
Giuseppe Corti
Hi abcdefg,
I can confirm that I can NOT reproduce the behaviour. My test conditions:
- Lightstreamer server 6.0.3 listening on localhost:8080 (no SSL)
- client OS X version 2.0.1
- <use_enriched_content_type>N</use_enriched_content_type>
But the results of your tests continue to confuse me, especially the server-side network sniff I analyzed.
It may be that between the sniff and the Lightstreamer server there is some software component such as antivirus, firewall or similar?
Regards,
Giuseppe
abcdefg
Hmm. After returning to the problem a week later, I can't reproduce my own results.
I'm getting the behaviour I stated for the 5.1.2 server, but not for the 6.0.3 any more. (I think perhaps I just got confused between the various test cases I'm cycling through.)
I understand that you no longer "officially" support 5.x servers; however, our overcomplicated server structure will make an upgrade a difficult task.
I believe you should be able to reproduce this issue using a 5.1.2 server, and I've uploaded a test case which you can run. It should be exactly the same as your test, except with a slightly older version of Lightstreamer.
If you can reproduce this in any way at all, even using an outdated server, it will make me happy.
It may be that between the sniff and the Lightstreamer server there is some software component such as antivirus, firewall or similar?
Ever since I understood that the <use_enriched...> flag was involved, I've been able to reproduce the scenario using localhost:8080 like you. I'm not really using my remote test server anymore. So I believe I've removed this possibility.
Dario Crivelli
Before delving into reproducing the case, we need some clarifications.
Do you confirm that the <use_enriched_content_type> flag is important and that with <use_enriched_content_type> set to Y your testcase works also in Server 5.1.2 ?
In fact, we notice that in the configuration you sent the <use_enriched_content_type> is N.
So, can we assume that, by moving from 5.1.1 to 5.1.2 (which is straightforward) and setting <use_enriched_content_type> to Y, you have fixed your issue, as far as your operations are concerned?
Do you mean that with Server 6.0.3 your testcase works even when <use_enriched_content_type> is N?
According to the logs included in your testcase, we see that the testcase still involves an OSX client library 2.0.1 trying to connect to a Server 5.1.2, which is not supported.
Is this the extension to 5.1.2 that you would like to be applied?
abcdefg
Do you confirm that the <use_enriched_content_type> flag is important and that with <use_enriched_content_type> set to Y your testcase works also in Server 5.1.2 ?
Yep.
In fact, we notice that in the configuration you sent the <use_enriched_content_type> is N.
So, can we assume that, by moving from 5.1.1 to 5.1.2 (which is straightforward) and setting <use_enriched_content_type> to Y, you have fixed your issue, as far as your operations are concerned?
Locally yes. I'm trying to avoid server upgrades because our deployment scripts are ... less than fully understood.
Do you mean that with Server 6.0.3 your testcase works even when <use_enriched_content_type> is N?
I just checked again to make sure. Yes.
According to the logs included in your testcase, we see that the testcase still involves an OSX client library 2.0.1 trying to connect to a Server 5.1.2, which is not supported.
Is this the extension to 5.1.2 that you would like to be applied?
That's the behaviour I would like to test, yes, thanks.
Since the Flex client connects, but the iOS/OSX one doesn't, I thought I might be looking at a client bug. It sounds like this isn't the case. On the off-chance that it is, then a fix would be great, otherwise screw it, I'll get the wheels turning for an upgrade.
Thanks to both of you by the way.
Giuseppe Corti
Hi abcdefg,
As already said, the 2.0.x versions of the OS X and iOS clients are compatible with Lightstreamer server version 6.0.1 and later. Using them with version 5.x could result in unpredictable behavior.
In any case, I cannot officially support you in using the OS X 2.0.1 client with a server 5. From my point of view, even create_session should not work.
You should consider the 2.0 client only if you're planning to go to server 6. Otherwise you should fall back to the client version 1.2.4.
By the way, I can confirm that we have replicated your test case.
Regards,
Giuseppe
Giuseppe Corti
Hi abcdefg,
We eventually found the likely culprit of the strange behaviors you have observed.
It appears to be the padding agreed between the client and the server.
As already mentioned, client 2.0 and server 5 are not compatible, so they could not agree on a proper padding setting. The enriched content type configuration forced (as a side effect) a padding acceptable to the iOS/OS X client 2.0, and that was the trick.
Just to confirm my hypothesis, could you please try a 5.1.1 server with this configuration
<padding>2048</padding>
and OS X 2.0 client?
Regards,
Giuseppe
abcdefg
Yep, that does the trick!
<use_enriched_content_type>N</use_enriched_content_type>
<padding>2048</padding>
2016-09-28 10:26:30.144 OSXConnectionTesting3[42934:5980634] CONNECTING
2016-09-28 10:26:30.293 OSXConnectionTesting3[42934:5980634] CONNECTED:STREAM-SENSING
2016-09-28 10:26:30.302 OSXConnectionTesting3[42934:5980634] CONNECTED:HTTP-STREAMING
Thanks once again for your help.
Phew!