Our Server
Sorry for the late reply. Your system looks very similar to ours, but we have more like 2.5k logins per month (i.e., “sessions”), and somewhat more active users week to week.
We have a single server running both Solr and Tomcat. That server has:
- 64 GiB RAM total
- 32 GiB Solr max heap (though it doesn’t use all of this)
- 4 GiB Tomcat max heap (and it uses all of it)
- 28 GiB “cached” memory (the memory-mapped index files)
The “cached” memory is the operating system using what would otherwise be “free” memory to keep a copy of files that processes have accessed or may access; here that is mainly the files of the Solr index. Because the OS uses “free” RAM for this caching, you shouldn’t force Java to claim a large heap up front (by setting a large minimum heap size), since the OS can’t use a Java process’s heap to cache files. Monitoring programs still report memory cached on behalf of a process as part of that process’s footprint, but since it can be shared between many processes, it is sometimes counted as “shared” memory instead.
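For what it’s worth, this is roughly how that looks in Solr’s startup settings. This is just a sketch: the file path and the exact -Xms/-Xmx values are illustrative, and SOLR_JAVA_MEM is simply the standard way to override the default heap flags in solr.in.sh.

# In solr.in.sh (path varies by install, e.g. /etc/default/solr.in.sh):
# large max heap, small min heap -- the JVM can grow when it needs to,
# but the OS keeps the rest of RAM free for caching the index files.
SOLR_JAVA_MEM="-Xms1g -Xmx32g"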
In solrconfig.xml, we’ve set up a few caches. These are our current settings:
<query>
  <filterCache class="solr.FastLRUCache"
               maxRamMB="2048"
               showItems="5"/>
  <documentCache class="solr.LRUCache"
                 size="8192"/>
</query>
These caches live in Solr’s heap memory, so the heap will expand as needed as Solr fills them.
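If you want to check whether those caches are actually earning their keep, each core’s mbeans handler reports hit ratios and evictions. A sketch, assuming the core is named documents and Solr is on the default port:

# Cache statistics for one core (filterCache, documentCache, etc.)
curl 'http://localhost:8983/solr/documents/admin/mbeans?stats=true&cat=CACHE&wt=json'
# A hitratio near 1.0 and few evictions suggest the sizes above are adequate.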
Optimizing the Index
As far as general slowness goes, another university had such a problem after upgrading. We ended up recommending that they try to shrink the number of segments in the index by increasing the maximum allowable size of an individual segment, since their newer, slower index had more segments than their older, faster one. In solrconfig.xml, add an element:
<indexConfig>
  <mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
    <int name="maxMergeAtOnce">10</int>
    <int name="segmentsPerTier">10</int>
    <int name="maxMergedSegmentMB">100000</int>
  </mergePolicyFactory>
  <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
    <int name="maxThreadCount">1</int>
    <int name="maxMergeCount">6</int>
  </mergeScheduler>
</indexConfig>
I think the most pertinent part of that snippet is maxMergedSegmentMB, which is set to 100 GB. (Most of the settings in that snippet are defaults, or near-defaults.) After making these changes, they forced an optimize on the documents index by issuing a request to:
https://localhost:8983/solr/documents/update?optimize=true&maxSegments=10
If that doesn’t shrink the number of segments much, or if performance doesn’t improve much, you could try re-indexing with that <indexConfig> setting.
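To see how many segments a core actually has before and after the optimize, the admin UI has a Segments Info screen, and recent Solr versions also expose a per-core segments handler. A sketch, again assuming a core named documents on the default port:

# List the segments of one core
curl 'http://localhost:8983/solr/documents/admin/segments?wt=json'
# Each entry in the response is one segment, with its size and document count.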
I didn’t expect this to improve their query performance by more than a factor of two or so, but it seems like it fixed the extreme slowness they had at the time.
The Patient Index Issue
As far as the patient-slave index being empty goes, that doesn’t sound right! EMERSE’s demographic charts shouldn’t work if the patient-slave index is not populated. If they are working, I’d check whether EMERSE is actually looking only at the patient index. There are two properties in emerse.properties that tell EMERSE which index names are used for what:
solr.patientUpdateCollection=patient
solr.patientSearchCollection=patient-slave
The “update collection” is the one that is written to when EMERSE pushes the PATIENT table to the index. The “search collection” is the one the patient demographic charts read from. If these are set to the same index (e.g., both set to patient), then EMERSE would work even though the slave index is empty. This is actually a reasonably okay way to configure EMERSE; the only problem is that when EMERSE pushes the PATIENT table, it first drops all the records in the patient index and then pushes them all back in. While that push is in progress, EMERSE searches that use the patient index won’t work, or will return incorrect results. However, since this push typically happens only at night (as usually configured), it shouldn’t affect operations much.
If those settings are set correctly, maybe we could have a call to look into it. I’d be very interested in how it’s working!