ETL services run on virtual machines or host OS?

kibri · March 2, 2017, 10:59pm

Do you run the ETL Spark and HDFS nodes on virtual machines or on the host OS?

andricDu · March 3, 2017, 3:52pm

We have both a bare metal cluster and a virtualized cluster.

Our production cluster runs on bare metal: ~30 data nodes with 12-core/24-thread CPUs and 128GB RAM, running Debian.

Our test cluster is completely virtualized: ~10 data nodes with 16 vCPUs and 64GB RAM each, running Ubuntu.

Edit for extra note:
Our virtualized cluster runs on its own dedicated VLAN, separate from the other VMs we run.