ahmedsmart4tech
03-Sep-14 13:40:10,314|INFO |LightstreamerMonitorText |Timer-1 |Total threads = 101, Total heap = 3760193536 (free = 491165288), Sessions = 71 (max = 82), New sessions = [+0, -0], Connections = 77 (max = 112), New connections = [+0, -0], In-pool threads = 38, Active threads = 12, Available threads = 26, Queued tasks = 130, Pool queue wait = 31, NIO write queue = 61, NIO write queue wait = 8663, NIO write selectors = 4, NIO total selectors = 32, Subscribed items = 398, Inbound throughput = 72.24 updates/s (pre-filtered = 72.24), Outbound throughput = 248.84 updates/s (115.53 kbit/s, max = 9159.48), Lost updates = 0 (total = 0), Total bytes sent = 1014192688, Client messages throughput = 0 msgs/s (0 kbit/s, max = 0), Total messages handled = 0, Extra sleep = 93, Notify delay = 0
03-Sep-14 13:40:10,314|INFO |LightstreamerLogger.requests |SERVER POOLED THREAD 6 |Polling (941) to session: Se953ab502aa51527T2437106 on "Lightstreamer HTTP Server" from 87.109.105.249:1131
03-Sep-14 13:40:10,314|INFO |LightstreamerLogger.requests |SERVER POOLED THREAD 6 |Polling (24241) to session: S6d93ae988a324ee4T5319413 on "Lightstreamer HTTP Server" from 94.99.88.231:53393
03-Sep-14 13:40:10,314|INFO |LightstreamerLogger.requests |SERVER POOLED THREAD 6 |Serving request: /lightstreamer/control.txt --> LS_session=S989df9e39abc498aT3910958&LS_op=add&LS_table=81&LS_id=ts146&LS_mode=COMMAND&LS_schema=TradePrice%20TradeVolume%20TradeTime&LS_data_adapter=PROXY_MarketBeat&LS_Snapshot=true on "Lightstreamer HTTP Server" from 41.209.249.5:11750
03-Sep-14 13:40:10,314|INFO |LightstreamerLogger.requests |SERVER POOLED THREAD 3 |Serving request: /lightstreamer/control.txt --> LS_session=S9fc7681542f5e19T3317880&LS_op=delete&LS_table1=17 on "Lightstreamer HTTP Server" from 37.104.60.191:63433
03-Sep-14 13:40:10,314|INFO |LightstreamerLogger.requests |SERVER POOLED THREAD 3 |Controlling session: S9fc7681542f5e19T3317880 on "Lightstreamer HTTP Server" from 37.104.60.191:63433
03-Sep-14 13:40:18,520|WARN |LightstreamerLogger.scheduler |Timer-1 |Current delay for tasks scheduled on thread pool SNAPSHOT is 16 seconds.
03-Sep-14 13:40:18,520|WARN |LightstreamerLogger.scheduler |Timer-1 |If the delay issue persists, taking a JVM full thread dump may be the best way to investigate the issue.
03-Sep-14 13:40:18,520|INFO |LightstreamerMonitorText |Timer-1 |Total threads = 102, Total heap = 3760193536 (free = 481006256), Sessions = 71 (max = 82), New sessions = [+0, -0], Connections = 78 (max = 112), New connections = [+1, -0], In-pool threads = 38, Active threads = 12, Available threads = 26, Queued tasks = 183, Pool queue wait = 214, NIO write queue = 4, NIO write queue wait = 8194, NIO write selectors = 4, NIO total selectors = 32, Subscribed items = 398, Inbound throughput = 19.62 updates/s (pre-filtered = 12.92), Outbound throughput = 367.9 updates/s (163.92 kbit/s, max = 9159.48), Lost updates = 0 (total = 0), Total bytes sent = 1014353784, Client messages throughput = 0 msgs/s (0 kbit/s, max = 0), Total messages handled = 0, Extra sleep = 89, Notify delay = 0
03-Sep-14 13:40:25,603|INFO |LightstreamerLogger.requests |SERVER POOLED THREAD 7 |Serving request: /lightstreamer/control.txt --> LS_session=S9fc7681542f5e19T3317880&LS_op=add&LS_table=19&LS_id=sy30&LS_mode=MERGE&LS_schema=OpenPrice%20DailyHighPrice%20DailyLowPrice%20ClosePrice%20TradedVolume&LS_data_adapter=PROXY_MarketBeat&LS_Snapshot=true on "Lightstreamer HTTP Server" from 37.104.60.191:63433
03-Sep-14 13:40:25,603|INFO |LightstreamerLogger.requests |SERVER POOLED THREAD 7 |Polling (70262) to session: Sddb1da9f03ac590eT4616827 on "Lightstreamer HTTP Server" from 188.54.176.28:49875
03-Sep-14 13:40:25,603|WARN |LightstreamerLogger.scheduler |Timer-1 |Current delay for tasks scheduled on thread pool EVENTS is 15 seconds.
03-Sep-14 13:40:34,152|WARN |LightstreamerLogger.scheduler |Timer-1 |If the delay issue persists, taking a JVM full thread dump may be the best way to investigate the issue.
03-Sep-14 13:40:34,152|WARN |LightstreamerLogger.scheduler |Timer-1 |Current delay for tasks scheduled on thread pool SNAPSHOT is 32 seconds.
03-Sep-14 13:40:34,152|WARN |LightstreamerLogger.scheduler |Timer-1 |If the delay issue persists, taking a JVM full thread dump may be the best way to investigate the issue.
03-Sep-14 13:40:34,152|WARN |LightstreamerLogger.scheduler |Timer-1 |Current delay for tasks scheduled on thread pool PUMP is 15 seconds.
03-Sep-14 13:40:34,152|WARN |LightstreamerLogger.scheduler |Timer-1 |If the delay issue persists, taking a JVM full thread dump may be the best way to investigate the issue.
03-Sep-14 13:40:34,152|INFO |LightstreamerLogger.requests |SERVER POOLED THREAD 7 |Polling (671) to session: S505844d25dd3fe2cT3245212 on "Lightstreamer HTTP Server" from 176.224.50.73:51677
03-Sep-14 13:40:49,862|INFO |LightstreamerLogger.requests |FOR PUMPS PARKING DESTROYER|Closed session S6d93ae988a324ee4T5319413 with internal cause code: 39 (Timeout).
03-Sep-14 13:40:49,862|INFO |LightstreamerLogger.requests |SERVER POOLED THREAD 7 |Polling (622) to session: S16dc896b1a7dfb5cT3753084 on "Lightstreamer HTTP Server" from 188.50.217.17:3090
03-Sep-14 13:40:49,862|INFO |LightstreamerLogger.requests |SERVER POOLED THREAD 7 |Serving request: /lightstreamer/control.txt --> LS_session=Se8b3d70b4480147aT0424792&LS_op=add&LS_table=9&LS_id=ix1&LS_mode=MERGE&LS_schema=IndexClose%20HighValue%20LowValue%20IndexClose%20TradedVolume&LS_data_adapter=PROXY_MarketBeat&LS_Snapshot=true on "Lightstreamer HTTP Server" from 37.106.246.202:27063
03-Sep-14 13:40:46,196|INFO |LightstreamerLogger.requests |SERVER POOLED THREAD 3 |Sync error: Found a closing session on "Lightstreamer HTTP Server" from 94.99.88.231:53393
I have gotten this problem twice now; I don't know why it happens or how to solve it.
Dario Crivelli
The first log suggests that the JVM process is suffering a CPU shortage, since we can see that all the internal thread pools are unable to dequeue their tasks in time.
Moreover, the "Extra sleep" statistic from the Internal Monitor log shows that the threads that issued Thread.sleep were scheduled, on average, about 90 ms later than expected. Note that this figure is normally 0, and we have found it to be much lower even in problematic cases.
Have you traced the process CPU usage? Could you confirm this suspicion in some way?
Are there other processes on the host that may compete with the Server JVM for CPU?
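If it helps, a very rough way to sample the Server JVM's CPU usage over time could be something like the following sketch (it assumes the third-party psutil package is available and that the JVM's PID is passed as the only argument; standard tools such as top are just as good):

    import sys, time
    import psutil  # third-party package, used here only for illustration

    pid = int(sys.argv[1])       # PID of the Lightstreamer Server JVM
    proc = psutil.Process(pid)
    while True:
        # cpu_percent(interval=N) blocks for N seconds and returns the CPU usage
        # of the process over that interval (it can exceed 100% on multi-core hosts)
        print('%s  CPU = %.1f%%' % (time.strftime('%H:%M:%S'), proc.cpu_percent(interval=5)))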
A possible cause of high CPU usage is frequent garbage collection, triggered either by memory shortage or by very intense allocation activity; in fact, the second log snippet clearly shows a GC activity issue.
However, the latter may just be a consequence of earlier problems. We can find no evidence of memory shortage in the first log, also because only a couple of samples of the Internal Monitor log are available.
To analyze the memory requirements of your usage scenario, we should collect many samples of the "Free" and "Total heap" statistics (with the LightstreamerMonitorText logger set to TRACE) while the Server is behaving normally.
By following the changes in the used (i.e. total - free) heap, we can estimate the rate at which memory is allocated and collected.
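Just as an illustration, assuming the Monitor log lines keep the same format as the ones you posted, a small script along these lines could extract the used heap from each sample and estimate the allocation rate between consecutive samples (the log file name is taken as the only argument):

    import re, sys
    from datetime import datetime

    # Matches Monitor lines such as:
    # 03-Sep-14 13:40:10,314|INFO |LightstreamerMonitorText |... Total heap = 3760193536 (free = 491165288), ...
    PATTERN = re.compile(r'^(\d{2}-\w{3}-\d{2} \d{2}:\d{2}:\d{2}),\d+\|.*LightstreamerMonitorText'
                         r'.*Total heap = (\d+) \(free = (\d+)\)')

    prev_ts, prev_used = None, None
    with open(sys.argv[1]) as log:
        for line in log:
            m = PATTERN.search(line)
            if not m:
                continue
            ts = datetime.strptime(m.group(1), '%d-%b-%y %H:%M:%S')
            used = int(m.group(2)) - int(m.group(3))   # used heap = total - free
            if prev_ts is not None and ts > prev_ts:
                delta_mb_s = (used - prev_used) / (ts - prev_ts).total_seconds() / (1024.0 * 1024.0)
                # a positive delta approximates the allocation rate between samples,
                # while a large negative delta usually marks a garbage collection
                print('%s  used = %d MB  delta = %+.1f MB/s' % (ts, used // (1024 * 1024), delta_mb_s))
            prev_ts, prev_used = ts, used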
Obviously, you could gather better measurements by acting at the system level (that is, outside of Lightstreamer's scope) and having the JVM log its own GC statistics, by configuring the proper settings in the launch script.
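For example, on the HotSpot JVMs of that generation GC logging can typically be enabled by appending options like the following to the Java invocation in the launch script (the exact flags depend on the JVM version in use):

    -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xloggc:gc.log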
So, you should monitor the whole lifecycle of the Server process, then see if there are any linear relations between:
- the CPU usage by the JVM process;
- the GC activity in terms of collected memory;
- the number of sessions and/or the "Outbound throughput".
This should allow you to understand if the problem is due to an unsustainable level of activity.
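As a minimal sketch of what checking such a linear relation could look like (it assumes you have already extracted two time-aligned series of samples, for instance CPU usage and Outbound throughput):

    # Pearson correlation between two aligned series of samples; a value close
    # to +1 suggests the two quantities grow together, i.e. a linear relation.
    def pearson(xs, ys):
        n = float(len(xs))
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        vx = sum((x - mx) ** 2 for x in xs) ** 0.5
        vy = sum((y - my) ** 2 for y in ys) ** 0.5
        return cov / (vx * vy)

    # example with made-up numbers:
    cpu = [20, 35, 50, 80, 95]         # % CPU of the JVM process, one sample per interval
    out = [100, 180, 240, 400, 480]    # "Outbound throughput" in updates/s, same intervals
    print(pearson(cpu, out))           # close to 1.0 -> strong linear relation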
Giuseppe Corti
Hi Ahmed,
I checked the server log and matched the information against the screenshots.
Indeed, the trouble on the Lightstreamer Server seems to start at 12:21:59. From then on, the used memory, which had always been quite stable, begins to grow at a rate of about 300 MB every 2 seconds.
This goes on until the JVM is forced into significant GC operations, eventually expanding the total heap up to 12 GB. During these moments the server undergoes short freezes (a few seconds) that caused the disconnection of some clients, though not all of them.
After about half an hour, the situation seems to return to normal, with a more regular and appropriate memory consumption.
In the log, the only activity of some importance during the period of the issue is the reconnection of a client (IP address: 41.209.249.5) that, upon restarting, performs several subscriptions, including one repeated 60 times:
LS_session=S245abe733d79e813T2157492&
LS_op=add&
LS_table=X&
LS_id=ts1%20ts2%20ts3%20ts4%20ts5%20ts6%20ts7%20ts8%20ts9%20ts98%20ts131%20ts157%20ts132%20ts137%20ts138%20ts144%20ts146%20ts91%20ts166%20ts130%20ts86%20ts153%20ts94%20ts169%20ts83%20ts87%20ts97%20ts10%20ts11%20ts12%20ts13%20ts14%20ts15%20ts16%20ts17%20ts18%20ts19%20ts20%20ts21%20ts22%20ts23%20ts24%20ts25%20ts26%20ts27%20ts28%20ts29%20ts30%20ts31%20ts32%20ts33%20ts34%20ts122%20ts35%20ts36%20ts37%20ts38%20ts39%20ts40%20ts41%20ts80%20ts127%20ts114%20ts117%20ts155%20ts158%20ts161%20ts163%20ts168%20ts42%20ts123%20ts43%20ts44%20ts45%20ts46%20ts47%20ts124%20ts152%20ts135%20ts88%20ts156%20ts162%20ts164%20ts167%20ts170%20ts48%20ts49%20ts50%20ts51%20ts52%20ts53%20ts54%20ts55%20ts56%20ts125%20ts57%20ts58%20ts59%20ts60%20ts61%20ts62%20ts63%20ts64%20ts65%20ts66%20ts99%20ts134%20ts67%20ts113%20ts107%20ts101%20ts102%20ts115%20ts116%20ts151%20ts126%20ts136%20ts142%20ts160%20ts96%20ts68%20ts69%20ts70%20ts71%20ts72%20ts73%20ts74%20ts75%20ts76%20ts121%20ts139%20ts154%20ts77%20ts165%20ts95%20ts78%20ts79%20ts103%20ts81%20ts104%20ts82%20ts100%20ts105%20ts108%20ts106%20ts111%20ts110%20ts112%20ts109%20ts118%20ts119%20ts120%20ts133%20ts129%20ts128%20ts84%20ts85%20ts141%20ts140%20ts90%20ts143%20ts89%20ts148%20ts147%20ts149%20ts93%20ts159%20ts145%20ts150%20ts92%20&
LS_mode=COMMAND&
LS_schema=TradePrice%20TradeVolume%20TradeTime&
LS_data_adapter=PROXY_MarketBeat&
LS_Snapshot=true on "Lightstreamer HTTP Server" from 41.209.249.5:20622
Is this normal behavior for your application?
I cannot be sure that this is related to the onset of the problem; it could simply be a coincidence, with the higher memory consumption instead due to an increase in the throughput of the data flowing from the Remote Adapter into the Server.
Could you check whether there was a considerable increase in the flow of data from the adapter to the server around that time? Do you have any instruments to measure it?
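For instance, if you have no other instrument at hand, the "Inbound throughput" figure already present in the Monitor log (same format as the lines quoted in the first post) gives a rough measure of the flow from the adapter; a throwaway script like the following sketch could extract it over time:

    import re, sys

    # Matches e.g.: "... |LightstreamerMonitorText |... Inbound throughput = 72.24 updates/s (pre-filtered = 72.24) ..."
    PATTERN = re.compile(r'^(\S+ \S+)\|.*LightstreamerMonitorText'
                         r'.*Inbound throughput = ([\d.]+) updates/s')

    with open(sys.argv[1]) as log:
        for line in log:
            m = PATTERN.search(line)
            if m:
                # timestamp and update rate received from the Data Adapter
                print('%s  %s updates/s' % (m.group(1), m.group(2)))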
Thank you,
Giuseppe
Giuseppe Corti
Hi Ahmed,
I can confirm that this second case too, on Sept 10 at 14:54:36, seems to have the same trigger. A client (this time from a different IP) reconnects and submits the following subscription about 40 times:
LS_session=S46d48811f64ddd75T5440006&
LS_op=add&
LS_table=29&
LS_id=ts1%20ts2%20ts3%20ts4%20ts5%20ts6%20ts7%20ts8%20ts9%20ts98%20ts131%20ts157%20ts132%20ts137%20ts138%20ts144%20ts146%20ts91%20ts166%20ts130%20ts86%20ts153%20ts94%20ts169%20ts83%20ts87%20ts97%20ts10%20ts11%20ts12%20ts13%20ts14%20ts15%20ts16%20ts17%20ts18%20ts19%20ts20%20ts21%20ts22%20ts23%20ts24%20ts25%20ts26%20ts27%20ts28%20ts29%20ts30%20ts31%20ts32%20ts33%20ts34%20ts122%20ts35%20ts36%20ts37%20ts38%20ts39%20ts40%20ts41%20ts80%20ts127%20ts114%20ts117%20ts155%20ts158%20ts161%20ts163%20ts168%20ts42%20ts123%20ts43%20ts44%20ts45%20ts46%20ts47%20ts124%20ts152%20ts135%20ts88%20ts156%20ts162%20ts164%20ts167%20ts170%20ts48%20ts49%20ts50%20ts51%20ts52%20ts53%20ts54%20ts55%20ts56%20ts125%20ts57%20ts58%20ts59%20ts60%20ts61%20ts62%20ts63%20ts64%20ts65%20ts66%20ts99%20ts134%20ts67%20ts113%20ts107%20ts101%20ts102%20ts115%20ts116%20ts151%20ts126%20ts136%20ts142%20ts160%20ts96%20ts68%20ts69%20ts70%20ts71%20ts72%20ts73%20ts74%20ts75%20ts76%20ts121%20ts139%20ts154%20ts77%20ts165%20ts95%20ts78%20ts79%20ts103%20ts81%20ts104%20ts82%20ts100%20ts105%20ts108%20ts106%20ts111%20ts110%20ts112%20ts109%20ts118%20ts119%20ts120%20ts133%20ts129%20ts128%20ts84%20ts85%20ts141%20ts140%20ts90%20ts143%20ts89%20ts148%20ts147%20ts149%20ts93%20ts159%20ts145%20ts150%20ts92%20&
LS_mode=COMMAND&
LS_schema=TradePrice%20TradeVolume%20TradeTime&
LS_data_adapter=PROXY_MarketBeat&LS_Snapshot=true
Please note that the 40 subscriptions relate to the same Lightstreamer session and seem to be part of the initialization procedure.
Checking the log, we found that this same subscription appears several times during the day; even when repeated a smaller number of times (from 4 to 10), it results in a significant increase in memory, which is, however, subsequently reabsorbed by the Server.
Could you please help me understand whether this item group (ts1%20ts2%20ts3...) involves a very large snapshot, perhaps with very large fields too, for example items in COMMAND mode with many associated keys?
Furthermore, I think you should investigate your client code and try to prevent the repeated subscriptions. I would not rule out that removing them will mitigate the issue, if not eliminate it completely.
As a last resort, you may consider taking a heap dump during one of these crises.
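For the record, with the JDK tools of that era a heap dump can usually be taken with a command along these lines (the exact syntax depends on your JDK; the resulting file can then be inspected with a heap analysis tool such as Eclipse MAT):

    jmap -dump:live,format=b,file=heap.hprof <pid-of-the-server-jvm>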
Regards,
Giuseppe