confusion on field of view and model inference time #8

Hi RolandGao, nice work! I see you've run a lot of experiments on the backbone settings, but I still have some questions after reading your paper.

- First, you calculate that a FOV of 4095 is needed to see the bottom-right pixel when training on Cityscapes (1024x2048), and you verified that the backbone should be exp48 [ (1,1) + (1,2) + 4 * (1,4) + 7 * (1,14) ] with a FOV of 3807. But I see the same backbone is also used when training on CamVid (720x960). Why not use a shallower backbone there? I am training on my own dataset with an image resolution of 512x512; do I need to modify the backbone architecture? Can you give some advice?
- Second, I tested the inference time of RegSeg. I noticed that the speed is no better than other real-time architectures because of the split and dilated convolutions, even though the model has low GFLOPs. In our application, what we care about is speed, so is there any strategy to improve it?
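
To make the first question concrete, here is a rough sketch of how I understand the FOV arithmetic. It uses the standard receptive-field recurrence for a chain of convolutions, which may not be exactly how the paper counts it, so I am not claiming it reproduces the 3807 figure. The helper names (`receptive_field`, `full_coverage_fov`) and the toy layer list are my own, not from the RegSeg code. I am also assuming that 4095 comes from 2 * 2048 - 1, which would put the target at 1023 for a 512x512 input.

```python
from typing import Iterable, Tuple

# Hypothetical layer description: (kernel_size, stride, dilation).
# For a block with split dilations such as (1, 14), I assume the larger
# dilation determines the field of view.
Layer = Tuple[int, int, int]


def receptive_field(layers: Iterable[Layer]) -> int:
    """Receptive field (in input pixels, along one axis) of a conv chain."""
    rf = 1    # a single pixel sees itself
    jump = 1  # distance in input pixels between adjacent output pixels
    for kernel, stride, dilation in layers:
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf


def full_coverage_fov(side: int) -> int:
    """FOV needed for a corner pixel to see the opposite edge.

    Assumption: this is where 4095 = 2 * 2048 - 1 comes from.
    """
    return 2 * side - 1


# Toy stack, NOT the RegSeg backbone: a stride-2 3x3 conv, a plain 3x3
# conv, and a 3x3 conv with dilation 4.
toy = [(3, 2, 1), (3, 1, 1), (3, 1, 4)]
print(receptive_field(toy))                              # 23
print(full_coverage_fov(2048), full_coverage_fov(512))   # 4095 1023
```

If this reading is right, the question becomes whether a backbone with a FOV near 1023 would be enough for 512x512 images, or whether the extra dilated blocks still help.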
