ICGC Portal server requirements

We are setting up our production portal environment and plan to use VMs as portal servers. Those VMs have a maximum memory limit of 64GB.

Based on the “Portal Mirror Specification” document, the current ICGC production setup has some servers with 128GB RAM:

  1. 1 Varnish server node
  2. 15 Elasticsearch nodes
  3. 32 HDFS nodes

I am wondering whether we can use VMs with 64GB memory to achieve the same performance.

Since Elasticsearch and HDFS are big data applications, I assume they scale horizontally, so roughly two 64GB nodes could do the work of one 128GB server. Does that assumption hold in ICGC’s experience?

For the Varnish server, there doesn’t appear to be a scalable cluster solution. There is an HA solution that replicates the cache between two servers, but this is not what we are looking for. There are a few potential solutions:

  1. Just use a smaller cache and hope performance is good enough.
  2. Use a physical server. This is more costly and inconvenient.
  3. Put the caching backend on SSD.

In ICGC’s experience, is a 64GB Varnish cache server good enough? Has ICGC tried using SSD for the cache?

Hey Brady,

I believe we use a 64GB server in production for Varnish. However, we usually hover around 10GB in usage. I cannot remember the cache size ever being an issue.

It all really comes down to how much traffic you expect.

Thanks, Dusan. I now recall that you or someone else said this during a meeting. On our side, we plan to use a 64GB VM for Varnish for now. We may add a physical server if it becomes too popular :slight_smile:

Hi Brady,

Regarding Elasticsearch memory configuration, here is a good write-up: https://www.elastic.co/guide/en/elasticsearch/guide/current/heap-sizing.html. 64GB of memory is actually a good fit: give slightly less than 32GB to Elasticsearch, and the remaining 32GB or so is left for OS filesystem caching.

The standard recommendation is to give 50% of the available memory to the Elasticsearch heap, while leaving the other 50% free.

It’s also important not to assign more than 32GB to Elasticsearch. At ICGC DCC, we run two Elasticsearch nodes per 128GB physical machine, and each node uses close to 32GB of memory as recommended.
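To make the sizing rule above concrete, here is a minimal sketch in Python. The function name `suggest_heap_gb` and the 31GB cap are assumptions chosen for illustration (the advice in this thread is only "slightly lower than 32GB"); check the linked write-up for the exact threshold on your JVM.

```python
def suggest_heap_gb(total_ram_gb: float, cap_gb: float = 31.0) -> float:
    """Suggest an Elasticsearch heap size in GB.

    Rule of thumb from the discussion: give half of the machine's RAM
    to the heap, but stay below 32GB. The 31GB default cap is an
    illustrative assumption, not an official value.
    """
    if total_ram_gb <= 0:
        raise ValueError("total_ram_gb must be positive")
    return min(total_ram_gb / 2, cap_gb)


if __name__ == "__main__":
    for ram in (16, 64, 128):
        heap = suggest_heap_gb(ram)
        # Whatever is not heap stays available to the OS filesystem cache.
        print(f"{ram}GB RAM: {heap:g}GB heap, {ram - heap:g}GB for the OS")
```

For a 128GB machine a single node would leave most of the memory outside the heap, which is consistent with running two nodes per physical machine as described above.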

Hi Junjun,

Thanks for the comment and link. Based on your recommendation, we should be OK using 64GB memory servers. How about the number of servers? Since we will be running a single Elasticsearch instance per node, should we deploy twice the number of Elasticsearch servers used at ICGC DCC? Is that necessary at the current dataset size?

Brady

Hi Brady,

The short answer is no: I don’t think it’s necessary to run 30 ES data nodes at the beginning.

Below is the long answer. At ICGC, in Release 22 the index size is close to 150GB (300GB with replicas). We run 30 ES data nodes, and the index is divided into 15 shards (with one replica each), so each node hosts only one shard copy and about 10GB of data. An ES node with 32GB of memory should be capable of handling more data. Let’s say you run 1/4 as many ES data nodes as we have, which rounds to 8 nodes. With the same sharding and replica settings, each node will host about 4 times more data. This shouldn’t be a problem at all. Would it be slower? Probably, but maybe not by much. The number of concurrent queries matters for performance as well.
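The arithmetic above can be checked with a short Python sketch. The function name `per_node_load` is hypothetical, and the calculation assumes equally sized shards spread as evenly as possible across nodes, which real clusters only approximate.

```python
import math


def per_node_load(primary_gb, primary_shards, replicas, nodes):
    """Return (average GB per node, GB on the busiest node).

    Assumes all shards are the same size and that shard copies are
    spread as evenly as possible across the data nodes.
    """
    shard_gb = primary_gb / primary_shards
    copies = primary_shards * (1 + replicas)   # primaries plus replicas
    avg_gb = shard_gb * copies / nodes
    busiest_gb = math.ceil(copies / nodes) * shard_gb
    return avg_gb, busiest_gb


if __name__ == "__main__":
    # Figures quoted in this thread: ~150GB index, 15 shards, one replica.
    for nodes in (30, 8):
        avg, busiest = per_node_load(150, 15, 1, nodes)
        print(f"{nodes} nodes: avg {avg:g}GB per node, busiest node {busiest:g}GB")
```

With 8 nodes, 30 shard copies do not divide evenly, so some nodes would hold 4 copies and others 3 under these assumptions; the "about 4 times more data" estimate refers to the average.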

As Elasticsearch is designed to scale horizontally, you can always start with a smaller number of nodes and add more when necessary. One important decision is the number of shards. If you configure 5 shards for an index (the ES default setting) and there are 30 ES data nodes, only 5 of them will get a shard (assuming no replica is configured); the other 25 ES nodes will sit idle. So even if you start with a fairly small number of ES nodes, you should configure more shards. At the beginning, each node will host more shards; when more nodes are added, shards will be automatically moved to the new nodes, resulting in fewer shards per node. That said, if we do have to adjust the number of shards when new nodes are added, it’s possible to reconfigure and re-index all documents.
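As a rough illustration of the shard-count point, the Python sketch below counts how many data nodes would hold no shard for a given configuration, then prints an index settings body with more shards. The helper name `idle_nodes` is hypothetical; `number_of_shards` and `number_of_replicas` are the standard Elasticsearch index settings, and the values 15 and 1 simply mirror the ICGC figures quoted in this thread.

```python
import json


def idle_nodes(primary_shards, replicas, nodes):
    """Number of data nodes left without any shard copy.

    Each shard copy lives on exactly one node, so at most
    primary_shards * (1 + replicas) nodes can hold data.
    """
    copies = primary_shards * (1 + replicas)
    return max(0, nodes - copies)


if __name__ == "__main__":
    # 5 shards, no replica, 30 nodes: most nodes sit idle.
    print("idle with 5 shards, 0 replicas, 30 nodes:", idle_nodes(5, 0, 30))
    # 15 shards with one replica keeps a cluster busy at 8 or at 30 nodes.
    print("idle with 15 shards, 1 replica, 8 nodes:", idle_nodes(15, 1, 8))
    print("idle with 15 shards, 1 replica, 30 nodes:", idle_nodes(15, 1, 30))

    # Settings body to send when creating the index (values are examples).
    index_body = {
        "settings": {
            "number_of_shards": 15,
            "number_of_replicas": 1,
        }
    }
    print(json.dumps(index_body, indent=2))
```

The number of primary shards is set when the index is created, which is why it pays to choose it with future nodes in mind rather than the initial cluster size.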

Junjun

Junjun,

Thank you very much for the detailed explanation of how the scaling works. That is really helpful!

It is great to have all this useful information on the discussion board so future newbies can benefit from our discussion.

Brady