diff --git a/.flake8rc b/.flake8rc index 60c91a7..b820fa8 100644 --- a/.flake8rc +++ b/.flake8rc @@ -1,3 +1,3 @@ [flake8] -ignore = E124, W293, E261, W291, W292, F403, E303, F405 +ignore = E124, W293, E261, W291, W292, F403, E303, F405, W504 max-line-length = 80 diff --git a/.travis.yml b/.travis.yml index 2024dc6..9665bf8 100644 --- a/.travis.yml +++ b/.travis.yml @@ -27,6 +27,12 @@ install: # - pip install . # - pip install coverage - pip install keras + - pip install radical.utils==0.72 + - pip install radical.saga==0.72 + - pip install radical.pilot==0.72 + - pip install radical.entk==0.72 + - pip install tifffile + - pip install pandas - pip install flake8 - pip install pylint # - pip install codecov diff --git a/README.md b/README.md index ec3ed2c..732f456 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,92 @@ -# Rivers (Arctic hydrology) - -We provide a classification algorithm for ice surface features from high-resolution imagery. This algorithm was developed by convolutional neural network training to detect regions of large and small rivers and to distinguish them from crevasses and non-channelized slush. We also provide a detection algorithm to extract polyline water features from the classified high-probability river areas. - +# Rivers (Arctic hydrology) + +We provide a classification algorithm for ice surface features from high-resolution imagery. This algorithm was developed by convolutional neural network training to detect regions of large and small rivers and to distinguish them from crevasses and non-channelized slush. We also provide a detection algorithm to extract polyline water features from the classified high-probability river areas. + +## Prerequisites - all available on bridges via the commands below +- Linux +- Python 3 +- CPU and NVIDIA GPU + CUDA CuDNN + +## Software Dependencies - these will be installed automatically with the installation below. +- numpy +- scipy +- tifffile +- keras >= 1.0 +- tensorboardX==1.8 +- opencv-python +- rasterio +- affine + +## Installation +Preliminaries: +These instructions are specific to XSEDE Bridges but other resources can be used if cuda, python3, and a NVIDIA P100 GPU are present, in which case 'module load' instructions can be skipped, which are specific to Bridges. + +For Unix or Mac Users: +Login to bridges via ssh using a Unix or Mac command line terminal. Login is available to bridges directly or through the XSEDE portal. Please see the [Bridges User's Guide](https://portal.xsede.org/psc-bridges). + +For Windows Users: +Many tools are available for ssh access to bridges. Please see [Ubuntu](https://ubuntu.com/tutorials/tutorial-ubuntu-on-windows#1-overview), [MobaXterm](https://mobaxterm.mobatek.net) or [PuTTY](https://www.chiark.greenend.org.uk/~sgtatham/putty/) + +### PSC Bridges +Once you have logged into bridges, you can follow one of two methods for installing iceberg-rivers. + +#### Method 1 (Recommended): + +The lines below following a '$' are commands to enter (or cut and paste) into your terminal (note that all commands are case-sensitive, meaning capital and lowercase letters are differentiated.) Everything following '#' are comments to explain the reason for the command and should not be included in what you enter. Lines that do not start with '$' or '[rivers_env] $' are output you should expect to see. + +```bash +$ pwd +/home/username +$ cd $SCRATCH # switch to your working space. +$ mkdir Rivers # create a directory to work in. +$ cd Rivers # move into your working directory. +$ module load cuda # load parallel computing architecture. +$ module load python3 # load correct python version. +$ virtualenv rivers_env # create a virtual environment to isolate your work from the default system. +$ source rivers_env/bin/activate # activate your environment. Notice the command line prompt changes to show your environment on the next line. +[rivers_env] $ pwd +/pylon5/group/username/Rivers +[rivers_env] $ export PYTHONPATH=/rivers_env/lib/python3.5/site-packages # set a system variable to point python to your specific code. (Replace with the results of pwd command above. +[rivers_env] $ pip install iceberg_rivers.search # pip is a python tool to extract the requested software (iceberg_rivers.search in this case) from a repository. (this may take several minutes). +``` + +#### Method 2 (Installing from source; recommended for developers only): + +```bash +$ git clone https://github.com/iceberg-project/Rivers.git +$ module load cuda +$ module load python3 +$ virtualenv rivers_env +$ source rivers_env/bin/activate +[rivers_env] $ export PYTHONPATH=/rivers_env/lib/python3.5/site-packages +[rivers_env] $ pip install . --upgrade +``` + +#### To test +```bash +[iceberg_rivers] $ deactivate # exit your virtual environment. +$ interact -p GPU-small --gres=gpu:p100:1 # request a compute node. This package has been tested on P100 GPUs on bridges, but that does not exclude any other resource that offers the same GPUs. (this may take a minute or two or more to receive an allocation). +$ cd $SCRATCH/Rivers # make sure you are in the same directory where everything was set up before. +$ module load cuda # load parallel computing architecture, as before. +$ module load python3 # load correct python version, as before. +$ source rivers_env/bin/activate # activate your environment, no need to create a new environment because the Rivers tools are installed and isolated here. +[iceberg_rivers] $ iceberg_rivers.detect --help # this will display a help screen of available usage and parameters. +``` +## Prediction +- Download a pre-trained model at: + +You can download to your local machine and use scp, ftp, rsync, or Globus to [transfer to bridges](https://portal.xsede.org/psc-bridges). + +Rivers predicting is executed in three steps: +First, follow the environment setup commands under 'To test' above. Then create tiles from an input GeoTiff image and write to the output_folder. The scale_bands parameter (in pixels) depends on the trained model being used. The default scale_bands is 299 for the pre-trained model downloaded above. If you use your own model the scale_bands may be different. +```bash +[iceberg_rivers] $ iceberg_rivers.tiling --scale_bands=299 --input_image= --output_folder=./test +``` +Then, detect rivers on each tile and output counts and confidence for each tile. +```bash +[iceberg_rivers] $ iceberg_rivers.predicting --input_image= --model_architecture=UnetCntWRN --hyperparameter_set=A --training_set=test_vanilla --test_folder=./test --model_path=./ --output_folder=./test_image +``` +Finally, mosaic all the tiles back into one image +```bash +[iceberg_rivers] $ iceberg_rivers.mosaic --input_folder=./test_image +``` diff --git a/src/classification/model.py b/src/classification/model.py new file mode 100644 index 0000000..4447511 --- /dev/null +++ b/src/classification/model.py @@ -0,0 +1,143 @@ +import numpy as np +import os +import skimage.io as io +import skimage.transform as trans +import numpy as np +from keras.models import * +from keras.layers import * +from keras.optimizers import * +from keras.callbacks import ModelCheckpoint, LearningRateScheduler +from keras import backend as k +from keras.models import Model +from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D, concatenate, Conv2DTranspose, BatchNormalization, Dropout +from keras.optimizers import Adam +from keras.utils import plot_model + +# Custom IoU metric +def mean_iou(y_true, y_pred): + prec = [] + for t in np.arange(0.5, 1.0, 0.05): + y_pred_ = tf.to_int32(y_pred > t) + score, up_opt = tf.metrics.mean_iou(y_true, y_pred_, 2) + K.get_session().run(tf.local_variables_initializer()) + with tf.control_dependencies([up_opt]): + score = tf.identity(score) + prec.append(score) + return K.mean(K.stack(prec), axis=0) + +def recall_m(y_true, y_pred): + true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1))) + possible_positives = K.sum(K.round(K.clip(y_true, 0, 1))) + recall = true_positives / (possible_positives + K.epsilon()) + return recall + +def precision_m(y_true, y_pred): + true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1))) + predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1))) + precision = true_positives / (predicted_positives + K.epsilon()) + return precision + +def f1_m(y_true, y_pred): + precision = precision_m(y_true, y_pred) + recall = recall_m(y_true, y_pred) + return 2*((precision*recall)/(precision+recall+K.epsilon())) + +def unet(n_classes=1, im_sz=224, n_channels=3, n_filters_start=32, growth_factor=2, upconv=True): + droprate=0.25 + n_filters = n_filters_start + inputs = Input((im_sz, im_sz, n_channels)) + #inputs = BatchNormalization()(inputs) + conv1 = Conv2D(n_filters, (3, 3), activation='relu', padding='same')(inputs) + conv1 = Conv2D(n_filters, (3, 3), activation='relu', padding='same')(conv1) + pool1 = MaxPooling2D(pool_size=(2, 2))(conv1) + #pool1 = Dropout(droprate)(pool1) + + n_filters *= growth_factor + pool1 = BatchNormalization()(pool1) + conv2 = Conv2D(n_filters, (3, 3), activation='relu', padding='same')(pool1) + conv2 = Conv2D(n_filters, (3, 3), activation='relu', padding='same')(conv2) + pool2 = MaxPooling2D(pool_size=(2, 2))(conv2) + pool2 = Dropout(droprate)(pool2) + + n_filters *= growth_factor + pool2 = BatchNormalization()(pool2) + conv3 = Conv2D(n_filters, (3, 3), activation='relu', padding='same')(pool2) + conv3 = Conv2D(n_filters, (3, 3), activation='relu', padding='same')(conv3) + pool3 = MaxPooling2D(pool_size=(2, 2))(conv3) + pool3 = Dropout(droprate)(pool3) + + n_filters *= growth_factor + pool3 = BatchNormalization()(pool3) + conv4_0 = Conv2D(n_filters, (3, 3), activation='relu', padding='same')(pool3) + conv4_0 = Conv2D(n_filters, (3, 3), activation='relu', padding='same')(conv4_0) + pool4_1 = MaxPooling2D(pool_size=(2, 2))(conv4_0) + pool4_1 = Dropout(droprate)(pool4_1) + + n_filters *= growth_factor + pool4_1 = BatchNormalization()(pool4_1) + conv4_1 = Conv2D(n_filters, (3, 3), activation='relu', padding='same')(pool4_1) + conv4_1 = Conv2D(n_filters, (3, 3), activation='relu', padding='same')(conv4_1) + pool4_2 = MaxPooling2D(pool_size=(2, 2))(conv4_1) + pool4_2 = Dropout(droprate)(pool4_2) + + n_filters *= growth_factor + conv5 = Conv2D(n_filters, (3, 3), activation='relu', padding='same')(pool4_2) + conv5 = Conv2D(n_filters, (3, 3), activation='relu', padding='same')(conv5) + + n_filters //= growth_factor + if upconv: + up6_1 = concatenate([Conv2DTranspose(n_filters, (2, 2), strides=(2, 2), padding='same')(conv5), conv4_1]) + else: + up6_1 = concatenate([UpSampling2D(size=(2, 2))(conv5), conv4_1]) + up6_1 = BatchNormalization()(up6_1) + conv6_1 = Conv2D(n_filters, (3, 3), activation='relu', padding='same')(up6_1) + conv6_1 = Conv2D(n_filters, (3, 3), activation='relu', padding='same')(conv6_1) + conv6_1 = Dropout(droprate)(conv6_1) + + n_filters //= growth_factor + if upconv: + up6_2 = concatenate([Conv2DTranspose(n_filters, (2, 2), strides=(2, 2), padding='same')(conv6_1), conv4_0]) + else: + up6_2 = concatenate([UpSampling2D(size=(2, 2))(conv6_1), conv4_0]) + up6_2 = BatchNormalization()(up6_2) + conv6_2 = Conv2D(n_filters, (3, 3), activation='relu', padding='same')(up6_2) + conv6_2 = Conv2D(n_filters, (3, 3), activation='relu', padding='same')(conv6_2) + conv6_2 = Dropout(droprate)(conv6_2) + + n_filters //= growth_factor + if upconv: + up7 = concatenate([Conv2DTranspose(n_filters, (2, 2), strides=(2, 2), padding='same')(conv6_2), conv3]) + else: + up7 = concatenate([UpSampling2D(size=(2, 2))(conv6_2), conv3]) + up7 = BatchNormalization()(up7) + conv7 = Conv2D(n_filters, (3, 3), activation='relu', padding='same')(up7) + conv7 = Conv2D(n_filters, (3, 3), activation='relu', padding='same')(conv7) + conv7 = Dropout(droprate)(conv7) + + n_filters //= growth_factor + if upconv: + up8 = concatenate([Conv2DTranspose(n_filters, (2, 2), strides=(2, 2), padding='same')(conv7), conv2]) + else: + up8 = concatenate([UpSampling2D(size=(2, 2))(conv7), conv2]) + up8 = BatchNormalization()(up8) + conv8 = Conv2D(n_filters, (3, 3), activation='relu', padding='same')(up8) + conv8 = Conv2D(n_filters, (3, 3), activation='relu', padding='same')(conv8) + conv8 = Dropout(droprate)(conv8) + + n_filters //= growth_factor + if upconv: + up9 = concatenate([Conv2DTranspose(n_filters, (2, 2), strides=(2, 2), padding='same')(conv8), conv1]) + else: + up9 = concatenate([UpSampling2D(size=(2, 2))(conv8), conv1]) + conv9 = Conv2D(n_filters, (3, 3), activation='relu', padding='same')(up9) + conv9 = Conv2D(n_filters, (3, 3), activation='relu', padding='same')(conv9) + + conv10 = Conv2D(n_classes, (1, 1), activation='sigmoid')(conv9) + + model = Model(inputs=inputs, outputs=conv10) + + model.compile(optimizer=Adam(lr = 1e-5), loss = 'binary_crossentropy', metrics = ['accuracy',f1_m,precision_m, recall_m]) + model.summary() + + return model + diff --git a/src/classification/mosaic.m b/src/classification/mosaic.m index 884df6d..3d8ca55 100644 --- a/src/classification/mosaic.m +++ b/src/classification/mosaic.m @@ -1,76 +1,64 @@ -% Author: Samira Daneshgar-Asl -% License: MIT -% Copyright: 2018-2019 - -clear all -clc - -[FileName, PathName] = uigetfile('*.tif', 'Select the 8 bit image file with 3 bands:'); -File = fullfile(PathName, FileName); -[I, R] = geotiffread(File); -img=I; -clear I; - -patch_size=800; - -desired_row_size=(patch_size/2)*ceil(size(img,1)/(patch_size/2)); -desired_col_size=(patch_size/2)*ceil(size(img,2)/(patch_size/2)); - -image = zeros(desired_row_size,desired_col_size,size(img,3),'int16'); -image(1:size(img,1),1:size(img,2),:) = img; - -a=1:patch_size/2:size(image, 1); -b=1:patch_size/2:size(image, 2); -row = 1:patch_size/2:a(1,end-1); -col = 1:patch_size/2:b(1,end-1); -A=zeros(a(1,end)+(patch_size/2)-1,b(1,end)+(patch_size/2)-1,'single'); -B=zeros(a(1,end)+(patch_size/2)-1,b(1,end)+(patch_size/2)-1,'single'); - -if isunix - path = 'data/WV_predicted'; -elseif ispc - path = 'data\WV_predicted'; -else - path = '' - disp 'Something went wrong'; -end - -files = dir(fullfile(path, '*.tif')); -files_ordered = natsortfiles({files.name}); -totalFiles = numel(files); - -k=1; -for i=1:size(row,2) - for j=1:size(col,2) - I=imread(fullfile('data\WV_predicted',files_ordered{1,k})); - A(row(i):row(i)+patch_size-1,col(j):col(j)+patch_size-1)=I; - B=max(A,B); - if k~=totalFiles - k=k+1; - else - end - end -end - -B(sum(image,3)==0)=0; -xi = [col(1,1)-.5, col(1,end)+patch_size-1+.5]; -yi = [row(1,1)-.5, row(1,end)+patch_size-1+.5]; -[xlimits, ylimits] = intrinsicToWorld(R, xi, yi); -subR = R; -subR.RasterSize = size(B); -subR.XLimWorld = sort(xlimits); -subR.YLimWorld = sort(ylimits); -info = geotiffinfo(File); -writeFileName=[strtok(FileName, '.'),'-mask-predicted.tif']; -geotiffwrite(writeFileName,B,subR,'GeoKeyDirectoryTag',info.GeoTIFFTags.GeoKeyDirectoryTag,'TiffTags',struct('Compression',Tiff.Compression.None)) - - - - - - - - - - - +% Author: Samira Daneshgar-Asl, Ioannis Paraskevakos +% License: MIT +% Copyright: 2018-2019 + +function mosaic(FileName, FilePath, PredictedPath, WriteDir) + File = fullfile(FilePath,FileName); + [img, R] = geotiffread(File); + if isunix + path = strcat(PredictedPath, '/data/predicted_tiles/', strtok(FileName, '.')); + elseif ispc + path = strcat(PredictedPath, '\data\predicted_tiles\', strtok(FileName, '.')); + else + path = ''; + disp 'Something went wrong'; + end + + if ~exist(WriteDir, 'dir') + mkdir(WriteDir); + end + + patch_size=800; + + desired_row_size=(patch_size/2)*ceil(size(img,1)/(patch_size/2)); + desired_col_size=(patch_size/2)*ceil(size(img,2)/(patch_size/2)); + + image = zeros(desired_row_size,desired_col_size,size(img,3),'int8'); + image(1:size(img,1),1:size(img,2),:) = img; + + a=1:patch_size/2:size(image, 1); + b=1:patch_size/2:size(image, 2); + row = 1:patch_size/2:a(1,end-1); + col = 1:patch_size/2:b(1,end-1); + A=zeros(a(1,end)+(patch_size/2)-1,b(1,end)+(patch_size/2)-1,'single'); + B=zeros(a(1,end)+(patch_size/2)-1,b(1,end)+(patch_size/2)-1,'single'); + + files = dir(fullfile(path, '*.tif')); + files_ordered = natsortfiles({files.name}); + totalFiles = numel(files); + + k=1; + for i=1:size(row,2) + for j=1:size(col,2) + I=imread(fullfile(path, files_ordered{1,k})); + A(row(i):row(i)+patch_size-1,col(j):col(j)+patch_size-1)=I; + B=max(A,B); + if k~=totalFiles + k=k+1; + else + end + end + end + + B(sum(image,3)==0)=0; + xi = [col(1,1)-.5, col(1,end)+patch_size-1+.5]; + yi = [row(1,1)-.5, row(1,end)+patch_size-1+.5]; + [xlimits, ylimits] = intrinsicToWorld(R, xi, yi); + subR = R; + subR.RasterSize = size(B); + subR.XLimWorld = sort(xlimits); + subR.YLimWorld = sort(ylimits); + info = geotiffinfo(File); + writeFileName=fullfile(WriteDir,strcat(strtok(FileName, '.'),'-mask-predicted.tif')); + geotiffwrite(writeFileName,B,subR,'GeoKeyDirectoryTag',info.GeoTIFFTags.GeoKeyDirectoryTag,'TiffTags',struct('Compression',Tiff.Compression.None)) +end diff --git a/src/classification/multipagetiff.m b/src/classification/multipagetiff.m deleted file mode 100644 index ad2879b..0000000 --- a/src/classification/multipagetiff.m +++ /dev/null @@ -1,29 +0,0 @@ -% Author: Samira Daneshgar-Asl -% License: MIT -% Copyright: 2018-2019 - -% when we have an 8-bit image with 3 bands and we want to save it as a multi-page -clear all -clc - -CurrentDir=pwd; -WriteDir = fullfile(pwd, '8bit-3bands Multi-Page Images'); -if ~exist(WriteDir, 'dir') - mkdir(WriteDir); -end - -ReadDir = fullfile(pwd, '8bit-3bands Images'); -files1 =dir(ReadDir); -files1(1:2) = []; -totalFiles = numel(files1); - -for i =1:totalFiles - Fileaddress{i,1}=strcat(ReadDir,'\',files1(i).name); - file{i} = geotiffread(Fileaddress{i,1}); - cd(CurrentDir) - writeFileName = strcat(WriteDir,'\multipage-',num2str(files1(i).name)); - saveastiff(file{i},writeFileName); - cd(ReadDir) % return to actualFile folder -end - - diff --git a/src/classification/predict.py b/src/classification/predict.py index e4bccb4..8cf2780 100644 --- a/src/classification/predict.py +++ b/src/classification/predict.py @@ -1,29 +1,16 @@ """ -Author: Samira Daneshgar-Asl +Authors: Samira Daneshgar-Asl, Ioannis Paraskevakos License: MIT Copyright: 2018-2019 """ +import os import math +import argparse import numpy as np import tifffile as tiff -import os -import sys - -from train_unet import weights_path, get_model, normalize, PATCH_SZ, N_CLASSES - -if not os.path.exists('data/WV_predicted'): - os.makedirs('data/WV_predicted') - -image = normalize(tiff.imread('8bit-3bands Multi-Page Images/name of the multi-page WV image.tif').transpose([1, 2, 0])) -wind_row, wind_col = 800,800 # dimensions of the image -windowSize = 800 -stepSize=400 -desired_row_size=stepSize*math.ceil(image.shape[0]/stepSize) -desired_col_size=stepSize*math.ceil(image.shape[1]/stepSize) -img = np.zeros((desired_row_size,desired_col_size,image.shape[2]), dtype=image.dtype) -img[:image.shape[0],:image.shape[1]] = image +from train_unet import get_model, normalize, PATCH_SZ, N_CLASSES def predict(x, model, patch_sz=160, n_classes=2): @@ -35,7 +22,8 @@ def predict(x, model, patch_sz=160, n_classes=2): npatches_horizontal = math.ceil(img_width / patch_sz) extended_height = patch_sz * npatches_vertical extended_width = patch_sz * npatches_horizontal - ext_x = np.zeros(shape=(extended_height, extended_width, n_channels), dtype=np.float32) + ext_x = np.zeros(shape=(extended_height, extended_width, n_channels), + dtype=np.float32) # fill extended image with mirrors: ext_x[:img_height, :img_width, :] = x for i in range(img_height, extended_height): @@ -53,7 +41,8 @@ def predict(x, model, patch_sz=160, n_classes=2): patches_array = np.asarray(patches_list) # predictions: patches_predict = model.predict(patches_array, batch_size=4) - prediction = np.zeros(shape=(extended_height, extended_width, n_classes), dtype=np.float32) + prediction = np.zeros(shape=(extended_height, extended_width, n_classes), + dtype=np.float32) for k in range(patches_predict.shape[0]): i = k // npatches_horizontal j = k % npatches_vertical @@ -62,27 +51,58 @@ def predict(x, model, patch_sz=160, n_classes=2): prediction[x0:x1, y0:y1, :] = patches_predict[k, :, :, :] return prediction[:img_height, :img_width, :] + # generating sliding window def sliding_window(img, stepSize, windowSize): for y in range(0, img.shape[0], stepSize): for x in range(0, img.shape[1], stepSize): - yield (x, y, img[y:y + windowSize, x:x + windowSize,:]) - + yield (x, y, img[y:y + windowSize, x:x + windowSize, :]) + + def main(): - outPath = "data/WV_predicted" - i=0 - for(x,y, window) in sliding_window(img, stepSize, windowSize): + parser = argparse.ArgumentParser() + parser.add_argument('-i', '--input', type=str, + help='Path and Filename of the 3-Band Multipage WV \ + Image', required=True) + parser.add_argument('-w', '--weights_path', type=str, + help='Path to the weights') + parser.add_argument('-o', '--output_folder', type=str, default='./', + help='Path where output will be stored.') + + + args = parser.parse_args() + model = get_model() + model.load_weights(args.weights_path) + _, tail = os.path.split(args.input) + getName = tail.split('-multipage.tif') + outPath = args.output_folder + "data/predicted_tiles/" + getName[0] + if not os.path.exists(outPath): + os.makedirs(outPath) + + image = normalize(tiff.imread(args.input).transpose([1, 2, 0])) + wind_row, wind_col = 800, 800 # dimensions of the image + windowSize = 800 + stepSize = 400 + + desired_row_size = stepSize * math.ceil(image.shape[0] / stepSize) + desired_col_size = stepSize * math.ceil(image.shape[1] / stepSize) + img = np.zeros((desired_row_size, desired_col_size, image.shape[2]), + dtype=image.dtype) + img[:image.shape[0], :image.shape[1]] = image + i = 0 + for(x, y, window) in sliding_window(img, stepSize, windowSize): if window.shape[0] != wind_row or window.shape[1] != wind_col: continue - t_img = img[y:y+wind_row,x:x+wind_col,:]# the image which has to be predicted - mask = predict(t_img, model, patch_sz=PATCH_SZ, n_classes=N_CLASSES).transpose([2,0,1]) - cnt=str(i) - imagename="image"+cnt+".tif" - fullpath = os.path.join(outPath,imagename) + # the image which has to be predicted + t_img = img[y:y + wind_row, x:x + wind_col, :] + mask = predict(t_img, model, patch_sz=PATCH_SZ, + n_classes=N_CLASSES).transpose([2, 0, 1]) + cnt = str(i) + imagename = cnt + ".tif" + fullpath = os.path.join(outPath, imagename) tiff.imsave(fullpath, mask) - i=i+1 - + i += 1 + + if __name__ == '__main__': - model = get_model() - model.load_weights(weights_path) - main() \ No newline at end of file + main() diff --git a/src/classification/tmp b/src/classification/tmp deleted file mode 100644 index d00491f..0000000 --- a/src/classification/tmp +++ /dev/null @@ -1 +0,0 @@ -1 diff --git a/src/entk_script/entk_script.py b/src/entk_script/entk_script.py new file mode 100644 index 0000000..7a7d35f --- /dev/null +++ b/src/entk_script/entk_script.py @@ -0,0 +1,220 @@ +""" +Seals Use Case EnTK Analysis Script +========================================================== + +This script contains the EnTK Pipeline script for the Rivers Use Case + +Author: Ioannis Paraskevakos +License: MIT +Copyright: 2018-2019 +""" + +import argparse +import os +import pandas as pd + +from radical.entk import Pipeline, Stage, Task, AppManager + + +def generate_discover_pipeline(path, env_path): + ''' + This function takes as an input a path on Bridges and returns a pipeline + that will provide a file for all the images that exist in that path. + ''' + pipeline = Pipeline() + pipeline.name = 'Disc' + stage = Stage() + stage.name = 'Disc-S0' + # Create Task 1, training + task = Task() + task.name = 'Disc-T0' + task.pre_exec = ['module load python3 cuda/9.0 gdal/2.2.1', + 'source %s/bin/activate' % env_path, + 'export PYTHONPATH=%s/lib/python3.5/site-packages' % env_path] + task.executable = 'python' # Assign executable to the task + task.arguments = ['image_disc.py', '%s' % path, '--filename=images.csv', + '--filesize'] + task.download_output_data = ['images.csv'] + task.upload_input_data = ['image_disc.py'] + task.cpu_reqs = {'processes': 1, 'threads_per_process': 1, + 'process_type': None, 'thread_type': 'OpenMP'} + stage.add_tasks(task) + # Add Stage to the Pipeline + pipeline.add_stages(stage) + + return pipeline + + +def generate_pipeline(name, image, image_size, env_path, model_path): + + ''' + This function creates a pipeline for an image that will be analyzed. + + :Arguments: + :name: Pipeline name, str + :image: image path, str + :image_size: image size in MBs, int + :device: Which GPU device will be used by this pipeline, int + ''' + # Create a Pipeline object + entk_pipeline = Pipeline() + entk_pipeline.name = name + # Create a Stage object + stage0 = Stage() + stage0.name = '%s-S0' % name + # Create Task 1, training + task0 = Task() + task0.name = '%s-T0' % stage0.name + task0.pre_exec = ['module load python3 cuda/9.0 gdal/2.2.1', + 'source %s/bin/activate' % env_path, + 'export PYTHONPATH=%s/lib/python3.5/site-packages' % env_path] + task0.executable = 'iceberg_rivers.tiling' # Assign executable to the task + # Assign arguments for the task executable + task0.arguments = ['--input', image, '--output', './tiles/'] + task0.cpu_reqs = {'processes': 1, 'threads_per_process': 1, + 'process_type': None, 'thread_type': 'OpenMP'} + + stage0.add_tasks(task0) + # Add Stage to the Pipeline + entk_pipeline.add_stages(stage0) + # Create a Stage object + stage1 = Stage() + stage1.name = '%s-S1' % name + # Create Task 1, training + task1 = Task() + task1.name = '%s-T1' % stage1.name + task1.pre_exec = ['module load python3 cuda/9.0 gdal/2.2.1', + 'source %s/bin/activate' % env_path, + 'export PYTHONPATH=%s/lib/python3.5/site-packages' % env_path] + task1.executable = 'iceberg_rivers.predict' # Assign executable to the task + # Assign arguments for the task executable + task1.arguments = ['--input', './tiles', '-o', './predicted/', '-w', model_path] + task1.link_input_data = ['$Pipeline_%s_Stage_%s_Task_%s/tiles >' % + (entk_pipeline.name, stage0.name, task0.name) + + 'tiles'] + task1.cpu_reqs = {'processes': 1, 'threads_per_process': 1, + 'process_type': None, 'thread_type': 'OpenMP'} + task1.gpu_reqs = {'processes': 1, 'threads_per_process': 1, + 'process_type': None, 'thread_type': 'OpenMP'} +# task1.tag = task0.name + + stage1.add_tasks(task1) + # Add Stage to the Pipeline + entk_pipeline.add_stages(stage1) + # Create a Stage object + stage2 = Stage() + stage2.name = '%s-S2' % (name) + # Create Task 1, training + task2 = Task() + task2.name = '%s-T2' % stage2.name + task2.pre_exec = ['module load python3 cuda/9.0 gdal/2.2.1', + 'source %s/bin/activate' % env_path, + 'export PYTHONPATH=%s/lib/python3.5/site-packages' % env_path] + task2.executable = 'iceberg_rivers.mosaic' # Assign executable to the task + # Assign arguments for the task executable + task2.arguments = ["-iw", image, '-i', './predicted', '-o', './'] + task2.link_input_data = ['$Pipeline_%s_Stage_%s_Task_%s/predicted >' % + (entk_pipeline.name, stage1.name, task1.name) + + 'predicted'] + task2.cpu_reqs = {'processes': 1, 'threads_per_process': 1, + 'process_type': None, 'thread_type': 'OpenMP'} +# task2.tag = task1.name + + stage2.add_tasks(task2) + # Add Stage to the Pipeline + entk_pipeline.add_stages(stage2) + return entk_pipeline + + +def args_parser(): + + ''' + Argument Parsing Function for the script. + ''' + parser = argparse.ArgumentParser(description='Executes the Seals ' + + 'pipeline for a set of images') + + parser.add_argument('-c', '--cpus', type=int, default=1, + help='The number of CPUs required for execution') + parser.add_argument('-g', '--gpus', type=int, default=1, + help='The number of GPUs required for execution') + parser.add_argument('-ip', '--input_dir', type=str, + help='Images input directory on the selected resource') + parser.add_argument('-m', '--model', type=str, + help='Which model will be used') + parser.add_argument('-p', '--project', type=str, + help='The project that will be charged') + parser.add_argument('-q', '--queue', type=str, + help='The queue from which resources are requested.') + parser.add_argument('-r', '--resource', type=str, + help='HPC resource on which the script will run.') + parser.add_argument('-w', '--walltime', type=int, + help='The amount of time resources are requested in' + + ' minutes') + parser.add_argument('--env', type=str, help='Path to conda environment ' + + ' on Bridges') + parser.add_argument('--uname', type=str, help='RabiitMQ user name') + parser.add_argument('--pwd', type=str, help='RabbitMQ password') + parser.add_argument('--host', type=str, help='RabbitMQ hostname') + parser.add_argument('--port', type=int, help='RabbitMQ port') + parser.add_argument('--name', type=str, + help='name of the execution. It has to be a unique' + + ' value') + + return parser.parse_args() + + +if __name__ == '__main__': + + args = args_parser() + + res_dict = {'resource': args.resource, + 'walltime': args.walltime, + 'cpus': args.cpus, + 'gpus': args.gpus, + 'schema': 'gsissh', + 'project': args.project, + 'queue': args.queue} + try: + # Create Application Manager + appman = AppManager(port=args.port, hostname=args.host, + username=args.uname,password=args.pwd, + name=args.name, autoterminate=False, + write_workflow=True) + + # Assign resource manager to the Application Manager + appman.resource_desc = res_dict + # Create a task that discovers the dataset + disc_pipeline = generate_discover_pipeline(args.input_dir, args.env) + appman.workflow = set([disc_pipeline]) + + # Run + appman.run() + print('Run Discovery') + images = pd.read_csv('images.csv') + + print('Images Found:', len(images)) + # Create a single pipeline per image + pipelines = list() + + for idx in range(0, len(images)): + p1 = generate_pipeline(name='P%03d' % idx, + image=images['Filename'][idx], + image_size=images['Size'][idx], + env_path=args.env, + model_path=args.model) + pipelines.append(p1) + # Assign the workflow as a set of Pipelines to the Application Manager + appman.workflow = set(pipelines) + + # Run the Application Manager + appman.run() + + print('Done') + except Exception as e: + # Something unexpected happened in the code above + print('Caught Error: %s' % e) + finally: + # Now that all images have been analyzed, release the resources. + print('Closing resources') + appman.resource_terminate() diff --git a/src/entk_script/image_disc.py b/src/entk_script/image_disc.py new file mode 100644 index 0000000..bf8d20c --- /dev/null +++ b/src/entk_script/image_disc.py @@ -0,0 +1,52 @@ +""" +Image Discovery Kernel +========================================================== +This script takes as input a path and returns a dataframe +with all the images and their size. +Author: Ioannis Paraskevakos +License: MIT +Copyright: 2018-2019 +""" +from glob import glob +import argparse +import os +import math +import pandas as pd + + +def image_discovery(path, filename='list.csv', filesize=False): + """ + This function creates a dataframe with image names and size from a path. + :Arguments: + :path: Images path, str + :filename: The filename of the CSV file containing the dataframe. + Default Value: list.csv + :filesize: Whether or not the image sizes should be inluded to the + dataframe. Default value: False + """ + + filepaths = glob(path + '/*.tif') + if filesize: + dataset_df = pd.DataFrame(columns=['Filename', 'Size']) + for filepath in filepaths: + filesize = int(math.ceil(os.path.getsize(filepath) / 1024 / 1024)) + dataset_df.loc[len(dataset_df)] = [filepath, filesize] + else: + dataset_df = pd.DataFrame(columns=['Filename']) + for filepath in filepaths: + dataset_df.loc[len(dataset_df)] = [filepath] + + dataset_df.to_csv(filename, index=False) + + +if __name__ == '__main__': + parser = argparse.ArgumentParser() + parser.add_argument('path', help='Path to a remote resource where data \ + are') + parser.add_argument('--filename', type=str, default='list.csv', + help='Name of the output CSV file') + parser.add_argument('--filesize', help='Include the filesize to the \ + output CSV', action='store_true') + args = parser.parse_args() + + image_discovery(args.path, args.filename, args.filesize) diff --git a/src/entk_script/tmp b/src/entk_script/tmp deleted file mode 100644 index d00491f..0000000 --- a/src/entk_script/tmp +++ /dev/null @@ -1 +0,0 @@ -1 diff --git a/src/mosaic/mosaic_unet.py b/src/mosaic/mosaic_unet.py new file mode 100644 index 0000000..769dbf1 --- /dev/null +++ b/src/mosaic/mosaic_unet.py @@ -0,0 +1,70 @@ +""" +Authors: Samira Daneshgar-Asl +License: MIT +Copyright: 2019-2020 +""" +import os +import numpy as np +import argparse +import math +from osgeo import gdal +from os import listdir + +#loading image +def load_image(path): + ds = gdal.Open(path) + img_proj = ds.GetProjection() + img_geotrans = ds.GetGeoTransform() + img_data = ds.ReadAsArray(0,0, ds.RasterXSize, ds.RasterYSize) + del ds + image = np.array(img_data,dtype=img_data.dtype) + return img_proj,img_geotrans,image + +#writing mosaic +def write_mosaic(filename,img_proj,img_geotrans,img_data): + driver = gdal.GetDriverByName("GTiff") + bands, (ysize, xsize) = 1,img_data.shape + ds = driver.Create(filename, xsize, ysize, bands, gdal.GDT_Float32) + ds.SetProjection(img_proj) + ds.SetGeoTransform(img_geotrans) + ds.GetRasterBand(1).WriteArray(img_data) + +def args_parser(): + parser = argparse.ArgumentParser(description="generates mosaic") + parser.add_argument('-iw', '--input_WV', type=str, + help='Path and name of the WorldView image') + parser.add_argument('-i', '--input', type=str, + help='Input predicted masks folder') + parser.add_argument('-t', '--tile_size', type=int, default=224, + help='Tile size') + parser.add_argument('-s', '--step', type=int, default=112, + help='Step size') + parser.add_argument('-o', '--output_folder', type=str, default='./', + help='Folder where output mosaic will be stored') + return parser.parse_args() + +if __name__ == '__main__': + args = args_parser() + + masks_path = 'predicted_tiles/' + args.input + '/' + list = sorted(os.listdir(masks_path),key=lambda x: int(os.path.splitext(x)[0])) + + out_path = 'predicted_mosaic/' + args.output_folder + if not os.path.exists(out_path): + os.makedirs(out_path) + + proj, geotrans, image = load_image(args.input_WV) + desired_row_size = args.step * (math.ceil(image.shape[1]/ args.step)+1) + desired_col_size = args.step * (math.ceil(image.shape[2]/ args.step)+1) + mask = np.zeros((desired_row_size,desired_col_size),dtype=np.float64) + + k=0 + for j in range(0, mask.shape[1]-(args.tile_size-args.step), args.step): + for i in range(0, mask.shape[0]-(args.tile_size-args.step), args.step): + mask_name = list[k] + mask_proj, mask_geotranse, mask_tile= load_image(masks_path + mask_name) + mask[i:i + args.tile_size, j:j + args.tile_size]=np.maximum(mask_tile[:,:],mask[i:i + args.tile_size, j:j + args.tile_size]) + k+=1 + write_mosaic(out_path+ "%s_predicted.tif"%args.input, proj, geotrans, mask[0:image.shape[1], 0:image.shape[2]]) + + diff --git a/src/utils/multipagetiff.m b/src/utils/multipagetiff.m new file mode 100644 index 0000000..9b2a35f --- /dev/null +++ b/src/utils/multipagetiff.m @@ -0,0 +1,29 @@ +% Author: Samira Daneshgar-Asl +% License: MIT +% Copyright: 2018-2019 + +% when we have an 8-bit image with 3 bands and we want to save it as a multi-page + +function multipagetiff(ReadDir, WriteDir) + + if ~exist(WriteDir, 'dir') + mkdir(WriteDir); + end + + image_files = dir(fullfile(ReadDir, '*.tif')); + totalFiles = numel(image_files); + + for i =1:totalFiles + ReadImage = image_files(i).name; + if isunix + image = geotiffread(strcat(ReadDir,'/',ReadImage)); + writeFileName = strcat(WriteDir,'/',strtok(ReadImage,'.'), '-multipage.tif'); + elseif ispc + image = geotiffread(strcat(ReadDir,'\',ReadImage)); + writeFileName = strcat(WriteDir,'\',strtok(ReadImage,'.'), '-multipage.tif'); + else + disp 'Something went wrong'; + end + + saveastiff(image,char(writeFileName)); +end diff --git a/src/classification/saveastiff.m b/src/utils/saveastiff.m similarity index 100% rename from src/classification/saveastiff.m rename to src/utils/saveastiff.m diff --git a/src/utils/tmp b/src/utils/tmp deleted file mode 100644 index d00491f..0000000 --- a/src/utils/tmp +++ /dev/null @@ -1 +0,0 @@ -1