IOPS improvements by asqasq · Pull Request #35 · zrlio/crail

asqasq · 2017-11-28T09:43:18Z

This PR contains changes to improve the number of IOPS:

Adapt to new DaRPC API
Additional Crail benchmark to measure IOPS with multiple namenodes
Additional HDFS benchmark to measure namenode IOPS
Move Crail namenode statistics out of the fast path into a separate class, which can be instantiated instead of the original one when we want to take measurements
Added properties to tune mempool
Request DaRPC version 1.4

Note: This PR has to be merged together with the corresponding DaRPC PR.

PepperJo · 2017-11-28T10:32:17Z

client/src/main/java/com/ibm/crail/tools/CrailBenchmark.java

+		double latency = 0.0;
+		if (executionTime > 0) {
+			latency = 1000000.0 * executionTime / ops;
+		}


I know this logic has been used elsewhere in this file, but can we please change this to not use doubles and millis to calculate execution time? Instead, use System.nanoTime() and do the calculations with long where it makes sense; for latency etc. we of course need double.

I am not exactly sure why calculating in nanoseconds is better than in milliseconds (both calls return long, but we need double for the division).

If we change that, it should be everywhere to keep it consistent. I'll create an issue, as I think that this is a bigger cleanup step.

The problem with currentTimeMillis is that it returns wall-clock time, i.e. it is not monotonic, and that leads to obvious problems: e.g. if you are running an NTP service, the time can change while you run a benchmark. That is why it is recommended to use System.nanoTime().

OK, sounds good. I will change all benchmark calculations in a new PR.
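
For reference, a minimal sketch of the nanoTime-based timing discussed in this thread. The class and method names are hypothetical and not part of CrailBenchmark. The idea is to keep the elapsed time as a long taken from the monotonic clock and to convert to double only for the final division:

```java
/** Hypothetical helper (not part of Crail): benchmark timing on a monotonic clock. */
final class MonotonicTimingSketch {

    /** Average latency per operation in microseconds; 0.0 if there is nothing to average. */
    static double latencyMicros(long elapsedNanos, long ops) {
        if (elapsedNanos <= 0 || ops <= 0) {
            return 0.0;
        }
        return (elapsedNanos / 1000.0) / ops;
    }

    /** Operations per second; 0.0 if no time elapsed. */
    static double opsPerSecond(long elapsedNanos, long ops) {
        if (elapsedNanos <= 0 || ops <= 0) {
            return 0.0;
        }
        return ops * 1_000_000_000.0 / elapsedNanos;
    }

    public static void main(String[] args) {
        final long ops = 1_000_000;
        long sink = 0;

        // System.nanoTime() is only meaningful as a difference between two readings.
        long start = System.nanoTime();
        for (long i = 0; i < ops; i++) {
            sink += i; // stand-in for the benchmarked operation
        }
        long elapsedNanos = System.nanoTime() - start;

        System.out.println("sink " + sink);
        System.out.printf("latency [us] %.3f, ops/s %.0f%n",
                latencyMicros(elapsedNanos, ops), opsPerSecond(elapsedNanos, ops));
    }
}
```

Unlike System.currentTimeMillis(), the difference of two System.nanoTime() readings is not affected by wall-clock adjustments during the run.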

PepperJo · 2017-11-28T10:32:39Z

client/src/main/java/com/ibm/crail/tools/CrailBenchmark.java

+		}
+		long end = System.currentTimeMillis();
+		double executionTime = ((double) (end - start));
+		double latency = executionTime*1000.0 / ((double) batch);


See comment above

PepperJo · 2017-11-28T10:34:07Z

client/src/main/java/com/ibm/crail/tools/CrailBenchmark.java

 		String benchmarkTypes = "write|writeAsync|readSequential|readRandom|readSequentialAsync|readMultiStream|"
-				+ "createFile|createFileAsync|createMultiFile|getKey|getFile|getFileAsync|enumerateDir|browseDir|"
+				+ "createFile|createFileAsync|createMultiFile|getKey|getFile|getFileAsync|getMultiFile"
+				+ "getMultiFileAsync|enumerateDir|browseDir|"


Can we make this more generic? Lots of strings are duplicated, with possible spelling errors.
I would like to see something like a map from string -> benchmark method.

I agree with this, there should be no need for string duplication.

I'd prefer to do this as a cleanup step and will create an issue.
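
A minimal sketch of the suggested map from string to benchmark method, under the assumption that each benchmark can be wrapped in a Runnable (the real benchmark methods take arguments, so this is a simplification). The class name and the registered bodies are hypothetical:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical registry (not part of Crail): benchmark name -> benchmark body. */
final class BenchmarkRegistrySketch {

    // LinkedHashMap keeps registration order, so the usage string is stable.
    private final Map<String, Runnable> benchmarks = new LinkedHashMap<>();

    /** Registers a benchmark; each name may be used only once. */
    void register(String name, Runnable body) {
        if (benchmarks.putIfAbsent(name, body) != null) {
            throw new IllegalArgumentException("duplicate benchmark: " + name);
        }
    }

    /** Usage string derived from the registered names, so the two cannot drift apart. */
    String usage() {
        return String.join("|", benchmarks.keySet());
    }

    /** Runs the named benchmark; returns false if the name is unknown. */
    boolean run(String name) {
        Runnable body = benchmarks.get(name);
        if (body == null) {
            return false;
        }
        body.run();
        return true;
    }

    public static void main(String[] args) {
        BenchmarkRegistrySketch registry = new BenchmarkRegistrySketch();
        registry.register("getFile", () -> System.out.println("running getFile"));
        registry.register("getMultiFile", () -> System.out.println("running getMultiFile"));

        String type = args.length > 0 ? args[0] : "getFile";
        if (!registry.run(type)) {
            System.out.println("Usage: <" + registry.usage() + ">");
        }
    }
}
```

Because the usage text is generated from the map keys, a slip such as the one that appears in the hunk above, where getMultiFile seems to be followed by getMultiFileAsync without a separating "|", cannot occur.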

PepperJo · 2017-11-28T10:34:52Z

hdfs/src/main/java/com/ibm/crail/hdfs/tools/HdfsIOBenchmark.java

 	public static void usage(){
 		System.out.println("Usage:");
-		System.out.println("hdfsbench <readSequentialDirect|readSequentialHeap|readRandomDirect|readRandomHeap|writeSequentialHeap> <size> <iterations> <path>");
+		System.out.println("hdfsbench <readSequentialDirect|readSequentialHeap|readRandomDirect|readRandomHeap|writeSequentialHeap|getFile|getFileIOPS> <size> <iterations> <path>");


Again, please make this more generic.

PepperJo · 2017-11-28T10:35:38Z

hdfs/src/main/java/com/ibm/crail/hdfs/tools/HdfsIOBenchmark.java

+		}
+		long end = System.currentTimeMillis();
+		double iops = ((double)loop) / (end - start) * (double)1000.0;
+		double executionTime = ((double) (end - start));


Again, see above.

PepperJo · 2017-11-28T10:36:29Z

rpc-darpc/src/main/java/com/ibm/crail/namenode/rpc/darpc/DaRPCConstants.java

 	public static final String NAMENODE_DARPC_CLUSTERSIZE_KEY = "crail.namenode.darpc.clustersize";
 	public static int NAMENODE_DARPC_CLUSTERSIZE = 128;	

+	public static final String NAMENODE_DARPC_MEMPOOL_HUGEPAGEPATH_KEY = "crail.namenode.darpc.mempool.hugepagepath";


I was under the impression that we don't want this multilevel config anymore, e.g. everything should be darpc.X.

PepperJo · 2017-11-28T10:39:17Z

rpc-darpc/src/main/java/com/ibm/crail/namenode/rpc/darpc/DaRPCServiceDispatcher.java

+	protected AtomicLong renameOps;
+	protected AtomicLong getOps;
+	protected AtomicLong locationOps;
+	protected AtomicLong errorOps;


I prefer putting the Atomics in the stats class instead of making them protected.
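
A minimal sketch of what keeping the counters inside a stats class could look like. The class and its method names are hypothetical. Only the four counter names are taken from the hunk above, and the real dispatcher may track more:

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stats holder (not the PR's class): counters stay private. */
final class DispatcherStatsSketch {

    private final AtomicLong renameOps = new AtomicLong();
    private final AtomicLong getOps = new AtomicLong();
    private final AtomicLong locationOps = new AtomicLong();
    private final AtomicLong errorOps = new AtomicLong();

    // The dispatcher reports events through methods instead of touching fields.
    void recordRename() { renameOps.incrementAndGet(); }
    void recordGet() { getOps.incrementAndGet(); }
    void recordLocation() { locationOps.incrementAndGet(); }
    void recordError() { errorOps.incrementAndGet(); }

    long renameOps() { return renameOps.get(); }
    long getOps() { return getOps.get(); }
    long locationOps() { return locationOps.get(); }
    long errorOps() { return errorOps.get(); }

    /** Sum of all recorded operations, including errors. */
    long totalOps() {
        return renameOps.get() + getOps.get() + locationOps.get() + errorOps.get();
    }

    public static void main(String[] args) {
        DispatcherStatsSketch stats = new DispatcherStatsSketch();
        stats.recordGet();
        stats.recordGet();
        stats.recordError();
        System.out.println("getOps " + stats.getOps() + ", totalOps " + stats.totalOps());
    }
}
```

With this shape, a subclass or wrapper of the dispatcher only needs a reference to the stats object, and the fields do not have to be protected.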

patrickstuedi

I think we should take out the code for the statistics that we only used internally (e.g., DaRPCServiceDispatcherStats.java)

asqasq · 2017-12-13T21:23:47Z

I removed the IOPS thread. Crail now uses the simple memory pool. I created two cleanup issues based on PepperJo's comments. Please have a look at the newest version.

PepperJo

One minor comment.

PepperJo · 2017-12-14T08:58:11Z

rpc-darpc/src/main/java/com/ibm/crail/namenode/rpc/darpc/DaRPCNameNodeServer.java

 		}
 		DaRPCServiceDispatcher darpcService = new DaRPCServiceDispatcher(service);
-		this.namenodeServerGroup = DaRPCServerGroup.createServerGroup(darpcService, clusterAffinities, -1, DaRPCConstants.NAMENODE_DARPC_MAXINLINE, DaRPCConstants.NAMENODE_DARPC_POLLING, DaRPCConstants.NAMENODE_DARPC_RECVQUEUE, DaRPCConstants.NAMENODE_DARPC_SENDQUEUE, DaRPCConstants.NAMENODE_DARPC_POLLSIZE, DaRPCConstants.NAMENODE_DARPC_CLUSTERSIZE);
+		if (!DaRPCConstants.NAMENODE_DARPC_STATS.isEmpty()) {


Maybe it makes sense to treat crail.namenode.darpc.stats as a boolean or similar. I can see this being misinterpreted otherwise and set to "crail.namenode.darpc.stats false" or "crail.namenode.darpc.stats no".
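
A minimal sketch of strict boolean handling for such a property. The helper below is hypothetical and is not how the PR parses crail.namenode.darpc.stats. Note that the standard Boolean.parseBoolean maps every string other than "true" (ignoring case) to false, so a value like "yes" would silently disable the feature; the sketch rejects such values instead:

```java
import java.util.Locale;

/** Hypothetical helper (not part of Crail): accepts only "true" or "false". */
final class StrictBooleanSketch {

    /** Parses a boolean property value; throws on anything other than true/false. */
    static boolean parseStrictBoolean(String key, String value) {
        String normalized = value == null ? "" : value.trim().toLowerCase(Locale.ROOT);
        switch (normalized) {
            case "true":
                return true;
            case "false":
                return false;
            default:
                throw new IllegalArgumentException(
                        "property " + key + " must be true or false, got: " + value);
        }
    }

    public static void main(String[] args) {
        String key = "crail.namenode.darpc.stats";
        System.out.println(parseStrictBoolean(key, "true"));
        System.out.println(parseStrictBoolean(key, "false"));
        try {
            parseStrictBoolean(key, "no");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```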

PepperJo

One minor comment.

PepperJo · 2018-01-18T11:40:12Z

rpc-darpc/src/main/java/com/ibm/crail/namenode/rpc/darpc/DaRPCNameNodeServer.java

+				DaRPCConstants.NAMENODE_DARPC_MEMPOOL_ALIGNMENT,
+				DaRPCConstants.NAMENODE_DARPC_MEMPOOL_ALLOC_LIMIT
+				);
 		String _clusterAffinities[] = DaRPCConstants.NAMENODE_DARPC_AFFINITY.split(",");


I would prefer doing all parsing in the DaRPCConstants.

I did not add any parsing here. If you mean the split(), this is old code. I would prefer cleaning up existing code in a separate PR.

Sounds good.

PepperJo suggested changes Nov 28, 2017

patrickstuedi suggested changes Nov 28, 2017

asqasq force-pushed the iopsimprovements branch from 5512a02 to 4140ce6 on December 13, 2017 13:55

PepperJo approved these changes Dec 14, 2017

asqasq added 6 commits January 11, 2018 11:12

IOPS improvements, new Crail benchmarks to measure multi-namenode IOPS and new HDFS benchmarks to measure IOPS.

39740af

Use simple memory pool.

64feb09

Use simple mempool with hugepages.

bbed231

Minor fix.

c05d788

NAMENODE_DARPC_STATS is now a boolean.

180a976

Added a limit property to the mempool properties.

52aea73

asqasq force-pushed the iopsimprovements branch from 8d8935a to 52aea73 on January 17, 2018 23:43

patrickstuedi approved these changes Jan 18, 2018

PepperJo approved these changes Jan 18, 2018
