-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Static cluster/session management in CQL backend #384
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@pkendall64 Would appreciate you taking a look at this to make sure the updates here seem reasonable. One of the main performance improvements came from switching back to table truncation (which I know you had originally) in clearStorage. As mentioned above the errors that occurred when making this switch were resolved by ensuring commitlog directory was deleted. Not sure why this is required for CQL and not Thrift. |
|
@sjudeng instead of one build target now there are four travis builds for cql. How long does each of the lines 43-46 take? let me see if i can find that in the build. |
|
15, 10, 16 and 14 minutes. OK, great. https://travis-ci.org/JanusGraph/janusgraph/builds/249466821?utm_source=github_status&utm_medium=notification |
|
@amcp Yep. The Travis jobs for CQL (like Cassandra and HBase) are split by test execution and package. Test timing in Travis can vary considerably based on time of day and other external factors. Also by keeping individual Travis job times below 15 minutes they can get the full benefit of travis_retry, which will retry up to three times but only within the overall 50 minute job timeout. |
amcp
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor questions
| private static final int DEFAULT_PORT = 9042; | ||
|
|
||
| private static Cluster cluster; | ||
| private static Map<String,Session> sessions = new ConcurrentHashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are not running tests in parallel anymore as per the xml file. why does this need to be concurrent? As on line 117 there is only one keyspace variable for this store manager so what else will go in the sessions map? I searched the original file and this file and I do not see any other uses of the sessions map. Finally, initializeSession is private and it seems to be the only method that mutates the session map.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sjudeng I get it, its static. You are considering the case of multiple cql store manager instances in the same JVM. When did this come up in testing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@amcp No it might not be a problem in testing. Do you think it might cause issues to use concurrent data structure and/or synchronized modifier here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no problem here.
| sessions.put(keyspaceName, cluster.connect()); | ||
| final Configuration configuration = getStorageConfig(); | ||
| final Map<String, Object> replication = Match(configuration.get(REPLICATION_STRATEGY)).of( | ||
| Case($("SimpleStrategy"), strategy -> HashMap.<String, Object>of("class", strategy, "replication_factor", configuration.get(REPLICATION_FACTOR))), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the dollar sign for? Do we need to use it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The $ turns a predicate into a pattern and most of the vavr APIs take patterns in the 0.9 version.
| final Future<Seq<ResultSet>> result = Future.sequence( | ||
| Iterator.ofAll(cluster.getMetadata().getKeyspace(this.keyspace).getTables()) | ||
| .map(table -> Future.fromJavaFuture(sessions.get(this.keyspace).executeAsync(truncate(this.keyspace, table.getName()))))); | ||
| result.await(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no timeout for await?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above
|
Yeah, looks good to me. I have to wonder whether it would be better to use a wrapper to provide the caching in the test area rather than baking it into the main code directly though. |
|
@pkendall64 That's not a bad idea. Probably would be a little ugly but at least it would contain it to test code. I'll take a look. The suggestion to use the singleton cluster/session comes from http://www.datastax.com/dev/blog/4-simple-rules-when-using-the-datastax-drivers-for-cassandra. But now that I think about it if there needs to be support for connecting to different Cassandra cluster's in the same JVM this would require more updates than provided here and would probably get pretty messy. |
…per to manage cluster and session statically in CQL tests. Signed-off-by: sjudeng <sjudeng@users.noreply.github.com>
|
@sjudeng maybe for testing it is useful to connect to be able to connect to different Cassandra clusters, but I think we can just convert the same-jvm-multiple-cluster-cql support issue into an issue on GitHub and move on with these changes? |
|
@amcp I think it's moot now ... I just pushed updates that refactor to use a wrapper class during testing (thanks @pkendall64). As expected it's a little ugly (needed some reflection on StandardStoreManager) but almost all updates are in test code only. I think it makes a lot more sense this way as it allows clients to build in caching themselves if necessary (e.g. caching CQLStoreManager instances by physical cluster and/or keyspace). |
amcp
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking forward to getting test time down.
Static cluster/session management in CQL backend
Static cluster/session management in CQL backend
Makes the following updates to fix #383. With these updates the full (default+TinkerPop) test suite runtime for the CQL module is decreased from 15.1 hours to 3.3 hours.