Should a Server instance fail for any reason, the client libraries will try to recover by reconnecting to the same address.
For that to work, a new instance has to be started right away, which you can arrange at the Operating System level; we don't provide any ready-made script for this task.
On the other hand, a Load Balancer ensures that, at the time one Server instance fails, others are alive and can be reached immediately at the same address.
As for the snapshots, I confirm that the current snapshot is kept in the JVM heap for every item for which isSnapshotAvailable returns true.
It is not possible to set a limit on the memory used to keep the snapshots.
Only for items in DISTINCT mode, the getDistinctSnapshotLength callback sets the number of events kept, on a per-item basis (this is not needed for MERGE mode, where only the latest state of each field is kept).
For items in COMMAND mode, on the other hand, the snapshot consists of all the keys that have been added and not yet deleted, and it cannot be limited, as that would compromise data correctness.
However, the snapshot size also grows with the number of fields and with the size of the field values, both of which depend only on the nature of the data.
If you can estimate the size needed to keep the various snapshots, you can put a limit on the overall number of subscribed items, by acting on the Metadata Adapter.
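
As a rough illustration of that last point, here is a minimal sketch of a counter that caps the number of subscribed items, starting from a heap budget and an estimated snapshot size per item. The class SubscribedItemLimiter and all of its methods are hypothetical names made up for this example and are not part of the Lightstreamer API. In a Java Metadata Adapter I would expect tryAcquire to be called from the callback that is notified of new subscriptions (notifyNewTables, refusing the request when it returns false) and release from the one notified when they are closed (notifyTablesClose), but please check the Metadata Adapter documentation for your Server version before relying on that.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper, not part of the Lightstreamer API. */
final class SubscribedItemLimiter {
    private final int maxItems;
    private final AtomicInteger current = new AtomicInteger();

    SubscribedItemLimiter(int maxItems) {
        if (maxItems < 0) {
            throw new IllegalArgumentException("maxItems must be >= 0");
        }
        this.maxItems = maxItems;
    }

    /**
     * Derives the cap from a heap budget and an estimated snapshot size
     * per item. Both figures are your own estimates (assumptions).
     */
    static SubscribedItemLimiter fromBudget(long heapBudgetBytes, long estimatedBytesPerItem) {
        if (heapBudgetBytes < 0 || estimatedBytesPerItem <= 0) {
            throw new IllegalArgumentException("budget must be >= 0 and item size > 0");
        }
        long cap = heapBudgetBytes / estimatedBytesPerItem;
        return new SubscribedItemLimiter((int) Math.min(cap, Integer.MAX_VALUE));
    }

    /** Reserves room for count items atomically; returns false if the cap would be exceeded. */
    boolean tryAcquire(int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count must be >= 0");
        }
        while (true) {
            int now = current.get();
            if (count > maxItems - now) {
                return false; // not enough room left; nothing is reserved
            }
            if (current.compareAndSet(now, now + count)) {
                return true;
            }
        }
    }

    /** Gives back room previously reserved with tryAcquire. */
    void release(int count) {
        current.addAndGet(-count);
    }

    int inUse() {
        return current.get();
    }
}
```

Note that this sketch counts item subscriptions rather than distinct items. Since several sessions subscribing to the same item share one snapshot, the count is a conservative overestimate; counting distinct items would require a reference-counted set of item names instead of a plain counter.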