A deep learning model based on the U-Net architecture for image segmentation on a dataset of human images.
The model uses the U-Net architecture proposed by Ronneberger et al. in the paper "U-Net: Convolutional Networks for Biomedical Image Segmentation"; the paper can be accessed through this arXiv link.
As described in the paper, the network architecture consists of a contracting path and an expansive path.
The contracting path consists of the repeated application of two 3x3 unpadded convolutions, each followed by a ReLU, and a 2x2 max pooling operation with stride 2 for downsampling. At each downsampling step, the number of feature channels is doubled.
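To make the size bookkeeping of the contracting path concrete, here is a minimal sketch in plain Python that traces the channel count and spatial size level by level. The function name `contracting_path` and its parameters are hypothetical and not part of this repository; the 572x572 input tile and the 64 starting channels are the values used in the paper.

```python
def contracting_path(size, base_channels=64, levels=4):
    """Trace (channels, spatial size) after the convolutions at each level.

    Each level applies two unpadded 3x3 convolutions (each trims 2 pixels
    from the size) and then a 2x2 max pooling with stride 2 (halves it).
    """
    trace = []
    channels = base_channels
    for _ in range(levels):
        size -= 4        # two unpadded 3x3 convolutions: -2 pixels each
        trace.append((channels, size))
        size //= 2       # 2x2 max pooling, stride 2
        channels *= 2    # feature channels double after each downsampling
    size -= 4            # bottom of the "U": two more 3x3 convolutions
    trace.append((channels, size))
    return trace


trace = contracting_path(572)
for channels, size in trace:
    print(f"{channels:5d} channels, {size}x{size}")
```

For a 572x572 tile the trace ends at 1024 channels on a 28x28 feature map, which matches the bottom of the architecture figure in the paper.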
Every step in the expansive path consists of an upsampling of the feature map followed by a 2x2 convolution ("up-convolution") that halves the number of feature channels, a concatenation with the correspondingly cropped feature map from the contracting path, and two 3x3 convolutions, each followed by a ReLU.
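The cropping is needed because the unpadded convolutions make the contracting feature maps larger than the upsampled ones they are concatenated with. The sketch below, again in plain Python, works out how many border pixels are cropped at each level. The function name `expansive_path` is hypothetical, and the input sizes (a 28x28 bottleneck and skip maps of 64, 136, 280 and 568 pixels) assume the 572x572 input tile used in the paper.

```python
def expansive_path(bottleneck_size, skip_sizes, bottleneck_channels=1024):
    """Trace (channels, crop per border, output size) for each expansive step.

    skip_sizes lists the spatial sizes of the contracting-path feature maps,
    deepest level first.
    """
    steps = []
    size, channels = bottleneck_size, bottleneck_channels
    for skip in skip_sizes:
        size *= 2                  # 2x2 up-convolution doubles the spatial size
        channels //= 2             # ... and halves the feature channels
        crop = (skip - size) // 2  # pixels cropped from each border of the skip map
        size -= 4                  # two unpadded 3x3 convolutions after the concatenation
        steps.append((channels, crop, size))
    return steps


steps = expansive_path(28, [64, 136, 280, 568])
for channels, crop, size in steps:
    print(f"{channels:4d} channels, crop {crop:2d} px per border, output {size}x{size}")
```

With these inputs the last step yields 64 channels on a 388x388 map, so the segmentation output is smaller than the 572x572 input tile.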
At the final layer, a 1x1 convolution maps each 64-component feature vector to the desired number of classes. In total, the network has 23 convolutional layers.
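The figure of 23 layers can be checked with a short tally. This is only an arithmetic sketch; the variable names are illustrative, and it assumes the four-level layout from the paper, with the up-convolutions counted as convolutional layers.

```python
levels = 4                      # number of downsampling / upsampling steps

contracting = 2 * levels        # two 3x3 convolutions per contracting level
bottleneck = 2                  # two 3x3 convolutions at the bottom of the "U"
expansive = (1 + 2) * levels    # one 2x2 up-convolution + two 3x3 convolutions per level
final = 1                       # final 1x1 convolution to the class scores

total = contracting + bottleneck + expansive + final
print(total)
```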

Image Source: U-Net: Convolutional Networks for Biomedical Image Segmentation
For seamless tiling of the output segmentation map, the authors note that the input tile size must be chosen such that every 2x2 max-pooling operation is applied to a layer with an even x- and y-size.
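One way to check this constraint for a candidate tile size is to walk through the contracting path and test the size before each pooling step. The helper below is a hypothetical sketch, not code from this repository, and assumes the four-level architecture with unpadded 3x3 convolutions.

```python
def is_valid_tile(size, levels=4):
    """Return True if every 2x2 max pooling sees an even, positive size."""
    for _ in range(levels):
        size -= 4                        # two unpadded 3x3 convolutions
        if size <= 0 or size % 2 != 0:
            return False                 # pooling would hit an odd or empty layer
        size //= 2                       # 2x2 max pooling, stride 2
    return size - 4 > 0                  # the bottleneck convolutions must still fit


valid = [s for s in range(560, 600) if is_valid_tile(s)]
print(valid)
```

The 572-pixel tile used in the paper passes this check, while a nearby size such as 570 does not.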
The dataset was obtained from Kaggle and can be freely downloaded from this link: Person Segmentation. It comprises images and their corresponding segmentation masks. The data comes from Supervisely. More information about the dataset can be found in this blog.