...

When comparing D and E (with and without indexing), turning indexing off should increase performance. Since this is not the case, I'm guessing that an I/O bottleneck is hit even earlier (perhaps replication over the network?), so that indexing does not slow down the ingest process at all.

Node network I/O performance

The physical hosts have a 1 Gb/s network connection, but I measured only ~10 MB/s when pushing one file from one VM to another over the network. This is probably because multiple VMs share the I/O channel of one physical host.
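To put the measured rate in perspective, here is a rough calculation. It assumes that the link rate means one gigabit per second and it ignores protocol overhead; both are assumptions of this sketch, not measurements from the cluster.

```python
# Compare the measured VM-to-VM transfer rate with the nominal link rate.
# Assumption: the link is 1 gigabit per second (1e9 bits/s), no protocol overhead.
LINK_BITS_PER_SECOND = 1_000_000_000
MEASURED_MB_PER_SECOND = 10.0  # approximate figure from the text


def link_rate_mb_per_s(bits_per_second: float) -> float:
    """Convert a link rate in bits/s to decimal megabytes per second."""
    return bits_per_second / 8 / 1_000_000


def utilisation(measured_mb_s: float, bits_per_second: float) -> float:
    """Fraction of the nominal link rate that the measured rate represents."""
    return measured_mb_s / link_rate_mb_per_s(bits_per_second)


if __name__ == "__main__":
    nominal = link_rate_mb_per_s(LINK_BITS_PER_SECOND)
    share = utilisation(MEASURED_MB_PER_SECOND, LINK_BITS_PER_SECOND)
    print(f"nominal: {nominal:.0f} MB/s, measured share: {share:.0%}")
```

Under these assumptions the nominal rate is 125 MB/s, so the measured ~10 MB/s is only about 8% of what the link could carry.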

Node HDD performance

ubuntu@ubuntu:/data$ sync; time sudo bash -c "(dd if=/dev/zero of=bf bs=8k count=500000; sync)"

...

real 2m34.033s
user 0m0.060s
sys 0m5.590s
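The timing above can be turned into an approximate sequential write rate. This is a minimal sketch that assumes the whole "real" time (including the trailing sync) counts as write time and that dd's "k" suffix means 1024 bytes.

```python
# Estimate sequential write throughput from the dd run above.
BLOCK_SIZE_BYTES = 8 * 1024      # bs=8k (dd's "k" suffix multiplies by 1024)
BLOCK_COUNT = 500_000            # count=500000
REAL_SECONDS = 2 * 60 + 34.033   # "real 2m34.033s", includes the final sync


def throughput(total_bytes: int, seconds: float) -> tuple:
    """Return (MB/s, MiB/s) for the given byte count and elapsed time."""
    bytes_per_second = total_bytes / seconds
    return bytes_per_second / 1e6, bytes_per_second / 2**20


if __name__ == "__main__":
    total = BLOCK_SIZE_BYTES * BLOCK_COUNT
    mb_s, mib_s = throughput(total, REAL_SECONDS)
    print(f"{total} bytes in {REAL_SECONDS:.3f} s: {mb_s:.1f} MB/s ({mib_s:.1f} MiB/s)")
```

With these numbers the run wrote about 4.1 GB at roughly 26.6 MB/s (25.4 MiB/s).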

CPU

The following is the output of 'cat /proc/cpuinfo' on one VM:

vendor_id : GenuineIntel
cpu family : 6
model : 6
model name : QEMU Virtual CPU version 0.9.1
stepping : 3
cpu MHz : 2266.804
cache size : 32 KB
fpu : yes
fpu_exception : yes
cpuid level : 4
wp : yes
flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl pni hypervisor
bogomips : 4533.60
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 6
model name : QEMU Virtual CPU version 0.9.1
stepping : 3
cpu MHz : 2266.804
cache size : 32 KB
fpu : yes
fpu_exception : yes
cpuid level : 4
wp : yes
flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl pni hypervisor
bogomips : 4533.60
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:

Load balancing

Load balancing is done by an Apache server with mod_jk enabled, running on a dedicated VM, together with a jk_workers.properties file in which the individual nodes are configured as mod_jk workers. This results in a simple round-robin load balancing mechanism.

The jk_workers.properties file is currently being generated via a shell script: 

https://github.com/futures/scc-cluster-install/blob/master/fedora-node.sh#L44

Example:

To balance between 7 nodes, the jk_workers.properties file could look like this:

https://gist.github.com/fasseg/7138008
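The gist and the shell script linked above are the actual sources. Purely as an illustration of the general shape of such a file, here is a small sketch that renders one ajp13 worker per node behind a single lb worker. The host addresses, the worker names, the balancer name and the AJP port 8009 are hypothetical and not taken from the cluster setup.

```python
# Sketch: render a mod_jk workers file for a list of backend nodes.
# The host names, the AJP port 8009 and the worker names are hypothetical.
from typing import List


def render_workers(hosts: List[str], port: int = 8009, balancer: str = "loadbalancer") -> str:
    """Return workers file text: one ajp13 worker per host behind one lb worker."""
    names = [f"node{i}" for i in range(1, len(hosts) + 1)]
    lines = [f"worker.list={balancer}"]
    for name, host in zip(names, hosts):
        lines += [
            f"worker.{name}.type=ajp13",
            f"worker.{name}.host={host}",
            f"worker.{name}.port={port}",
            f"worker.{name}.lbfactor=1",  # equal weight for every node
        ]
    lines += [
        f"worker.{balancer}.type=lb",
        f"worker.{balancer}.balance_workers={','.join(names)}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Seven hypothetical node addresses, matching the 7-node example in the text.
    print(render_workers([f"192.0.2.{i}" for i in range(1, 8)]), end="")
```

Giving every worker the same lbfactor asks the balancer to spread requests evenly across the nodes, which matches the simple round-robin behaviour described above.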

Results

Test Utility

BenchTool: https://github.com/futures/benchtool

...