# Exporting Tool for Amazon Keyspaces
The exporting tool offloads an Amazon Keyspaces table to HDFS/FS.

# Build this project
To build and use this library, execute the following mvn command:
```
mvn install package
```

# Quick start
Before running the tool, verify the capacity mode of the source table. The table should be provisioned with at least 3,000 RCUs,
or be configured for on-demand mode. We recommend setting the driver page size in the application.conf file to 2,500.
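As a sketch of that setting, assuming the tool reads the standard DataStax Java driver 4.x configuration format, the page size would be set in application.conf like this:

```
# application.conf — driver configuration (DataStax Java driver 4.x format assumed)
datastax-java-driver {
  # Fetch up to 2,500 rows per page, as recommended above
  basic.request.page-size = 2500
}
```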
Run the tool in the terminal with the following command:

`java -cp "AmazonKeyspacesExportTool-1.0-SNAPSHOT-fat.jar" com.amazon.aws.keyspaces.Runner HDFS_FOLDER SOURCE_QUERY [--recover]`

RECOVER OPTION – you can use the `--recover` option if the tool failed with
`Cassandra timeout during read query at consistency LOCAL_QUORUM (2 responses were required but only 0 replica responded)`.
The failed state will be saved in a state.ser file and renamed after it is processed.
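For example, a failed run could be resumed by repeating the original invocation with the `--recover` flag appended (the HDFS folder and query below are hypothetical placeholders, not values from this project):

```
# Resume a previously failed export; the tool picks up the persisted
# state from state.ser and continues from where the failure occurred.
java -cp "AmazonKeyspacesExportTool-1.0-SNAPSHOT-fat.jar" \
  com.amazon.aws.keyspaces.Runner /export/my-table "SELECT * FROM ks.tbl" --recover
```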

# Validation
You can validate the parquet files on HDFS/FS by using Apache Spark (spark-shell).
For example,
```
val parquetFileDF = spark.read.parquet("file:///keyspace-name/table-name")
// Preview the exported rows to confirm the data was written correctly
parquetFileDF.show()
```