<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Yubinnzeng's Bloc</title>
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<meta property="og:type" content="website">
<meta property="og:title" content="Yubinnzeng's Bloc">
<meta property="og:url" content="https://t5eng.github.io/index.html">
<meta property="og:site_name" content="Yubinnzeng's Bloc">
<meta property="og:locale" content="en_US">
<meta property="article:author" content="Yubinn Zeng">
<meta name="twitter:card" content="summary">
<link rel="alternate" href="/atom.xml" title="Yubinnzeng's Bloc" type="application/atom+xml">
<link rel="shortcut icon" href="/favicon.png">
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/typeface-source-code-pro@0.0.71/index.min.css">
<link rel="stylesheet" href="/css/style.css">
<link rel="stylesheet" href="/fancybox/jquery.fancybox.min.css">
<meta name="generator" content="Hexo 5.4.0"></head>
<body>
<div id="container">
<div id="wrap">
<header id="header">
<div id="banner"></div>
<div id="header-outer" class="outer">
<div id="header-title" class="inner">
<h1 id="logo-wrap">
<a href="/" id="logo">Yubinnzeng's Bloc</a>
</h1>
</div>
<div id="header-inner" class="inner">
<nav id="main-nav">
<a id="main-nav-toggle" class="nav-icon"></a>
<a class="main-nav-link" href="/">Home</a>
<a class="main-nav-link" href="/archives">Archives</a>
</nav>
<nav id="sub-nav">
<a id="nav-rss-link" class="nav-icon" href="/atom.xml" title="RSS Feed"></a>
<a id="nav-search-btn" class="nav-icon" title="Search"></a>
</nav>
<div id="search-form-wrap">
<form action="//google.com/search" method="get" accept-charset="UTF-8" class="search-form"><input type="search" name="q" class="search-form-input" placeholder="Search"><button type="submit" class="search-form-submit"></button><input type="hidden" name="sitesearch" value="https://t5eng.github.io"></form>
</div>
</div>
</div>
</header>
<div class="outer">
<section id="main">
<article id="post-SuperResolution by VDSR, PerceptualSR, SubpixelConvSR,ESRGAN" class="h-entry article article-type-post" itemprop="blogPost" itemscope itemtype="https://schema.org/BlogPosting">
<div class="article-meta">
<a href="/2021/09/17/SuperResolution%20by%20VDSR,%20PerceptualSR,%20SubpixelConvSR%EF%BC%8CESRGAN/" class="article-date">
<time class="dt-published" datetime="2021-09-17T12:26:48.045Z" itemprop="datePublished">2021-09-17</time>
</a>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="p-name article-title" href="/2021/09/17/SuperResolution%20by%20VDSR,%20PerceptualSR,%20SubpixelConvSR%EF%BC%8CESRGAN/">SuperResolution by VDSR, PerceptualSR, SubpixelConvSR,ESRGAN</a>
</h1>
</header>
<div class="e-content article-entry" itemprop="articleBody">
<p>The following notes are based on my own reading of the papers.</p>
<p>VDSR: arXiv:1511.04587v2 [cs.CV] 11 Nov 2016</p>
<p>Perceptual Loss SR: arXiv:1603.08155v1 [cs.CV] 27 Mar 2016</p>
<p>Subpixel SR: arXiv:1609.05158v2 [cs.CV] 23 Sep 2016</p>
<p>SRGAN: arXiv:1609.04802v5 [cs.CV] 25 May 2017</p>
<p>ESRGAN: arXiv:1809.00219v2 [cs.CV] 17 Sep 2018</p>
<p>Four representative works that improve deep-learning-based super-resolution.</p>
<hr>
<h2 id="Abstract"><a href="#Abstract" class="headerlink" title="Abstract"></a>Abstract</h2><p>SRCNN showed that an end-to-end deep CNN can outperform traditional super-resolution methods.</p>
<p>The four papers covered here optimize the original super-resolution network from four different angles: network architecture, loss function, upsampling layer, and GANs.</p>
<hr>
<h2 id="VeryDeepSuperResolution"><a href="#VeryDeepSuperResolution" class="headerlink" title="VeryDeepSuperResolution"></a>VeryDeepSuperResolution</h2><p>The original SRCNN used only 3 layers; FSRCNN used 15 and was already very hard to train, converging slowly. VDSR adds residual connections and gradient clipping, enabling a 20-layer network trained with a very high learning rate, along with a larger receptive field (41x41) and support for multiple upscale factors in a single network.</p>
<h3 id="Residual-Learning"><a href="#Residual-Learning" class="headerlink" title="Residual Learning"></a>Residual Learning</h3><p><a target="_blank" rel="noopener" href="https://ws4.sinaimg.cn/large/006tKfTcgy1g119z3tje0j31c40bqjzt.jpg"><img src="https://ws4.sinaimg.cn/large/006tKfTcgy1g119z3tje0j31c40bqjzt.jpg" alt="image-20190313120326694"></a></p>
<p>The input is interpolated up to the HR resolution, then passed through 20 conv layers to produce a residual r; adding the skip connection x gives the SR output, so r = y - x. The loss is the Euclidean distance between x + r and y.</p>
<h3 id="Gradient-Clipping"><a href="#Gradient-Clipping" class="headerlink" title="Gradient Clipping"></a>Gradient Clipping</h3><p>Conventional clipping uses a fixed range [−θ, θ]; the drawback is that even with a larger learning rate, gradients are still confined to [−θ, θ]. VDSR instead proposes adjustable clipping to [−θ/lr, θ/lr], so the clipping range adapts dynamically to the learning-rate schedule.</p>
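The adjustable clipping rule can be sketched in a few lines of NumPy (a sketch of the idea only, not VDSR's actual training code):

```python
import numpy as np

def clip_gradient(grad, theta=0.01, lr=0.1):
    """Adjustable gradient clipping from VDSR: clip each gradient to
    [-theta/lr, theta/lr], so the clipping range widens automatically
    as the learning rate is annealed."""
    bound = theta / lr
    return np.clip(grad, -bound, bound)

g = np.array([5.0, -0.2, 0.05])
print(clip_gradient(g, theta=0.01, lr=0.1))    # bound 0.1  -> [ 0.1  -0.1   0.05]
print(clip_gradient(g, theta=0.01, lr=0.001))  # bound 10.0 -> [ 5.   -0.2   0.05]
```

With a high learning rate the effective step size stays bounded; as the lr decays, the clipping range relaxes.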
<h3 id="Multi—Scale"><a href="#Multi—Scale" class="headerlink" title="Multi—Scale"></a>Multi—Scale</h3><p>Because the network only handles reconstruction and the upsampling is done by a traditional method, a single network can serve multiple scale factors.</p>
<h3 id="Result"><a href="#Result" class="headerlink" title="Result"></a>Result</h3><p><a target="_blank" rel="noopener" href="https://ws3.sinaimg.cn/large/006tKfTcgy1g11atcmtsrj31bu0g4dp9.jpg"><img src="https://ws3.sinaimg.cn/large/006tKfTcgy1g11atcmtsrj31bu0g4dp9.jpg" alt="image-20190313123237150"></a></p>
<p>Surprisingly, the 20-layer network's runtime is even faster than the 3-layer SRCNN's.</p>
<hr>
<h2 id="PerceptualSR"><a href="#PerceptualSR" class="headerlink" title="PerceptualSR"></a>PerceptualSR</h2><p>PSNR (Peak Signal-to-Noise Ratio) is the standard metric for comparing image similarity; it decreases monotonically as MSE grows, which is why many CV tasks use the per-pixel MSE against the ground truth as the loss. Perceptual loss instead uses the activation outputs of several intermediate VGG layers as the loss, with good results in both style transfer and super-resolution.</p>
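For reference, PSNR is 10·log10(peak²/MSE), so a larger pixel error always means a lower PSNR; a minimal NumPy version:

```python
import numpy as np

def psnr(x, y, peak=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, peak].
    Larger MSE means lower PSNR."""
    mse = np.mean((x - y) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

a = np.zeros((4, 4))
print(psnr(a, a + 0.1))  # MSE 0.01 -> 20.0 dB
print(psnr(a, a + 0.2))  # MSE 0.04 -> ~13.98 dB (larger error, lower PSNR)
```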
<h3 id="Feature-Reconstruction-Loss"><a href="#Feature-Reconstruction-Loss" class="headerlink" title="Feature Reconstruction Loss"></a>Feature Reconstruction Loss</h3><p><a target="_blank" rel="noopener" href="https://ws1.sinaimg.cn/large/006tKfTcgy1g11ctpmiabj30ii02qjrl.jpg"><img src="https://ws1.sinaimg.cn/large/006tKfTcgy1g11ctpmiabj30ii02qjrl.jpg" alt="image-20190313134211674"></a></p>
<p>Here φ<sub>j</sub>(y) denotes the activation output of the j-th layer of VGG.</p>
<p>This effectively borrows VGG as a feature extractor: the feature loss is the <strong>Euclidean norm</strong> of the difference between the images' higher-level feature maps rather than a pixel-level distance, preserving the images' semantics and spatial structure, but not their color, texture, or exact shape.</p>
<h3 id="Style-Reconstruction-Loss"><a href="#Style-Reconstruction-Loss" class="headerlink" title="Style Reconstruction Loss"></a>Style Reconstruction Loss</h3><p>First define a Gram matrix:</p>
<p><a target="_blank" rel="noopener" href="https://ws1.sinaimg.cn/large/006tKfTcgy1g11dojzc0jj30oi03mq3b.jpg"><img src="https://ws1.sinaimg.cn/large/006tKfTcgy1g11dojzc0jj30oi03mq3b.jpg" alt="image-20190313141150026"></a></p>
<p>It returns a C<sub>j</sub> x C<sub>j</sub> matrix whose (c, c′) entry is the sum of the element-wise product of channels c and c′ of φ<sub>j</sub>(y), which amounts to a channel-wise covariance.</p>
<p><a target="_blank" rel="noopener" href="https://ws1.sinaimg.cn/large/006tKfTcgy1g11e7ssmdfj30fu01s0su.jpg"><img src="https://ws1.sinaimg.cn/large/006tKfTcgy1g11e7ssmdfj30fu01s0su.jpg" alt="image-20190313143020157"></a></p>
<p>The style loss is the <strong>Frobenius norm</strong> of the difference between the two images' Gram matrices.</p>
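The two quantities above can be sketched with NumPy (a sketch assuming channels-first layout; the paper normalizes the Gram matrix by C·H·W):

```python
import numpy as np

def gram_matrix(feat):
    """Gram matrix of a feature map feat with shape (C, H, W):
    G[c, c'] = sum over spatial positions of feat[c] * feat[c'],
    normalized by C*H*W."""
    c, h, w = feat.shape
    f = feat.reshape(c, h * w)
    return f @ f.T / (c * h * w)

def style_loss(feat_a, feat_b):
    """Squared Frobenius norm of the difference of the two Gram matrices."""
    diff = gram_matrix(feat_a) - gram_matrix(feat_b)
    return np.sum(diff ** 2)

rng = np.random.default_rng(0)
f1 = rng.standard_normal((8, 16, 16))
print(gram_matrix(f1).shape)  # (8, 8), independent of the spatial size
print(style_loss(f1, f1))     # 0.0 for identical feature maps
```

Because the Gram matrix discards spatial positions, the style loss compares which channels co-activate, not where.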
<h3 id="More-Specific"><a href="#More-Specific" class="headerlink" title="More Specific"></a>More Specific</h3><p><a target="_blank" rel="noopener" href="https://ws4.sinaimg.cn/large/006tKfTcgy1g169htpdtdj311k0csadz.jpg"><img src="https://ws4.sinaimg.cn/large/006tKfTcgy1g169htpdtdj311k0csadz.jpg" alt="image-20190317193438543"></a></p>
<p>The paper uses VGG16 as the loss network; the image transform net in front of it can be swapped for various architectures.</p>
<p>For style transfer, ys, y_hat, and yc are all fed to the loss network.</p>
<p><a target="_blank" rel="noopener" href="https://ws4.sinaimg.cn/large/006tKfTcgy1g16w2bk0r6j30wc02uwex.jpg"><img src="https://ws4.sinaimg.cn/large/006tKfTcgy1g16w2bk0r6j30wc02uwex.jpg" alt="image-20190317234812204"></a></p>
<p>For the SR task, only the SR output is used as y_hat and the HR image as yc, with the <strong>relu2_2</strong> layer used to compute the feature loss.</p>
<h3 id="Experiment"><a href="#Experiment" class="headerlink" title="Experiment"></a>Experiment</h3><p>The paper only shows x4 and x8 results; presumably it trails other methods at x2 and x3.</p>
<p>Images are Gaussian-blurred (σ = 1.0) before downsampling.</p>
<p>Each convolution is followed by spatial batch normalization.</p>
<p>The output of VGG16's second layer is used as φ(y) to compute the feature loss; the transform network is modified from SRCNN.</p>
<p>Batch size 4, trained for 200k steps.</p>
<p>Adam, lr = 1e-3.</p>
<p>At test time, histogram matching is applied before computing PSNR (which may inflate it?).</p>
<h3 id="Result-1"><a href="#Result-1" class="headerlink" title="Result"></a>Result</h3><p>Excellent detail and edge quality.</p>
<p>Zoomed in, some color patches spill beyond their original boundaries.</p>
<hr>
<h2 id="SubpixelConvSR-ESPCN"><a href="#SubpixelConvSR-ESPCN" class="headerlink" title="SubpixelConvSR (ESPCN)"></a>SubpixelConvSR (ESPCN)</h2><p>Earlier SR work performed upsampling with fractional (transposed) convolutions; this paper proposes a new upsampling method: sub-pixel rearrangement.</p>
<h3 id="Network-Architecture"><a href="#Network-Architecture" class="headerlink" title="Network Architecture"></a>Network Architecture</h3><p><a target="_blank" rel="noopener" href="https://ws2.sinaimg.cn/large/006tKfTcgy1g174h2fsrcj317y0b2jwi.jpg"><img src="https://ws2.sinaimg.cn/large/006tKfTcgy1g174h2fsrcj317y0b2jwi.jpg" alt="image-20190318132622948"></a></p>
<p>Fully convolutional, with the channel count increased layer by layer.</p>
<p>With c input channels and upscale factor r, the final conv layer has c·r² output channels, i.e. an output of shape <strong>[h, w, c·r²]</strong>.</p>
<p>The sub-pixel conv layer rearranges pixels (no computation required) to convert this feature map into <strong>[h·r, w·r, c]</strong>, performing the upsampling; the paper names this the periodic shuffling operator (PS). Because the step involves no arithmetic, it is very fast.</p>
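A NumPy sketch of a periodic shuffling operator, assuming channels-last layout with the r² sub-pixel channels ordered as (r, r, c) (real implementations differ in this ordering convention):

```python
import numpy as np

def pixel_shuffle(x, r):
    """Periodic shuffling (PS): rearrange an (H, W, C*r^2) feature map
    into an (H*r, W*r, C) image. Pure memory reordering, no arithmetic."""
    h, w, crr = x.shape
    c = crr // (r * r)
    x = x.reshape(h, w, r, r, c)    # split the channel dim into (r, r, c)
    x = x.transpose(0, 2, 1, 3, 4)  # interleave: (h, r, w, r, c)
    return x.reshape(h * r, w * r, c)

x = np.arange(2 * 2 * 4, dtype=float).reshape(2, 2, 4)
y = pixel_shuffle(x, 2)
print(y.shape)  # (4, 4, 1)
```

Each low-resolution position contributes an r x r block of output pixels, one per sub-pixel channel.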
<p>Mirroring SRCNN's 3-layer structure: layer 1 conv(5x5, c, 64), layer 2 conv(3x3, 64, 32), layer 3 conv(3x3, 32, <strong>c·r²</strong>), layer 4 subpixel(r).</p>
<h3 id="Experiment-1"><a href="#Experiment-1" class="headerlink" title="Experiment"></a>Experiment</h3><p>Training images are (17r x 17r) patches, Gaussian-blurred before downsampling.</p>
<p>tanh is used as the activation function.</p>
<p>Trained for 100 epochs with lr = 0.01, multiplied by 0.1 per epoch.</p>
<p>Trained on a K2 GPU: 3 hours on 91-images; a second model trained on ImageNet for 7 days.</p>
<p>PSNR is computed in MATLAB (considered more accurate).</p>
<h3 id="Result-2"><a href="#Result-2" class="headerlink" title="Result"></a>Result</h3><p>With ReLU activations (as in SRCNN), comparing training on 91-images versus ImageNet shows that ESPCN keeps improving on the larger dataset (+0.33 dB) while SRCNN barely does (+0.07 dB).</p>
<p><a target="_blank" rel="noopener" href="https://ws2.sinaimg.cn/large/006tKfTcgy1g177vdw0o4j319u080go1.jpg"><img src="https://ws2.sinaimg.cn/large/006tKfTcgy1g177vdw0o4j319u080go1.jpg" alt="image-20190318151055556"></a></p>
<p>On a GPU, super-resolving 1080p takes <strong>0.038s</strong> per frame (arguably real-time?), versus <strong>0.435s</strong> for SRCNN.</p>
<h2 id="Enhanced-SuperResolution-GAN-ESRGAN"><a href="#Enhanced-SuperResolution-GAN-ESRGAN" class="headerlink" title="Enhanced SuperResolution GAN(ESRGAN)"></a>Enhanced SuperResolution GAN(ESRGAN)</h2><p>An upgraded SRGAN that tackles super-resolution with adversarial training. It is similar in spirit to perceptual loss, except the loss network here is trainable; the results again achieve <strong>better perceptual quality</strong> despite a <strong>lower PSNR</strong>, and at very large upscale factors they outperform PSNR-oriented methods.</p>
<h3 id="前作SRGAN"><a href="#前作SRGAN" class="headerlink" title="前作SRGAN"></a>Its predecessor: SRGAN</h3><p>Introduces a <strong>mean-opinion score</strong> (MOS): 26 raters score each image from 1 to 5 and the scores are averaged.</p>
<p>GAN setup: the generator uses a ResNet; the discriminator uses VGG.</p>
<p><a target="_blank" rel="noopener" href="https://ws1.sinaimg.cn/large/006tKfTcgy1g17c13mpx2j30se0ionfc.jpg"><img src="https://ws1.sinaimg.cn/large/006tKfTcgy1g17c13mpx2j30se0ionfc.jpg" alt="image-20190318173207720"></a></p>
<p>From a manifold perspective: projecting patches into a 2-D space shows that the GAN's outputs lie close to the distribution of real images, while outputs trained with an MSE objective drift away from it, which is why they look overly smooth.</p>
<h4 id="SRGAN-Loss"><a href="#SRGAN-Loss" class="headerlink" title="SRGAN Loss"></a>SRGAN Loss</h4><p>The loss is the typical GAN min-max objective (a binary-classification loss, with the discriminator trained to maximize it and the generator to minimize it).</p>
<p><a target="_blank" rel="noopener" href="https://ws1.sinaimg.cn/large/006tKfTcgy1g17c61avdqj30om04a750.jpg"><img src="https://ws1.sinaimg.cn/large/006tKfTcgy1g17c61avdqj30om04a750.jpg" alt="image-20190318175244115"></a></p>
<p><a target="_blank" rel="noopener" href="https://ws3.sinaimg.cn/large/006tKfTcgy1g188yi3ju6j30j605mmxs.jpg"><img src="https://ws3.sinaimg.cn/large/006tKfTcgy1g188yi3ju6j30j605mmxs.jpg" alt="image-20190319124717727"></a></p>
<p>Here the content loss can be the usual MSE or a VGG-based perceptual loss, and the adversarial loss is the discriminator's output.</p>
<p><a target="_blank" rel="noopener" href="https://ws1.sinaimg.cn/large/006tKfTcgy1g18a3yud9wj30h203udg4.jpg"><img src="https://ws1.sinaimg.cn/large/006tKfTcgy1g18a3yud9wj30h203udg4.jpg" alt="image-20190319132712222"></a></p>
<p>The paper argues the adversarial loss pulls generated images toward the distribution of natural images.</p>
<p>Instead of the cross-entropy term log[1 − D(G(input))], the generator minimizes −log(D(G(input))).</p>
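A quick numerical illustration (mine, not from the paper) of why the −log D(G(·)) form helps: its gradient with respect to the discriminator score stays large early in training, when D confidently rejects generated samples:

```python
import numpy as np

# Gradient magnitude of the two generator objectives w.r.t. the
# discriminator score d = D(G(input)):
#   log(1 - d) has derivative -1/(1 - d) -> flat as d -> 0 (saturates)
#   -log(d)    has derivative -1/d       -> large when d is small
d = np.array([0.01, 0.1, 0.5])
grad_saturating = 1.0 / (1.0 - d)  # |d/dd log(1 - d)|
grad_nonsaturating = 1.0 / d       # |d/dd -log(d)|
print(grad_saturating)     # roughly [1.01  1.11  2.  ]
print(grad_nonsaturating)  # [100.  10.   2.]
```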
<h3 id="Experiment-2"><a href="#Experiment-2" class="headerlink" title="Experiment"></a>Experiment</h3><p>350k images randomly sampled from ImageNet, cropped into 96x96 patches. Images are Gaussian-blurred before downsampling. LR inputs are preprocessed to [0,1] and HR to [−1,1]; MSE is also computed on [−1,1], and the VGG loss is scaled by 1/12.75.</p>
<p>batch size = 16</p>
<p>optimizer: Adam, beta1=0.9</p>
<p>lr = 1e-4, 1e6 iterations</p>
<p>The generator is initialized from a pretrained SRResNet.</p>
<p>At test time, a 4-pixel border is cropped (edge pixels upsample poorly).</p>
<h4 id="Result-3"><a href="#Result-3" class="headerlink" title="Result"></a>Result</h4><p><a target="_blank" rel="noopener" href="https://ws1.sinaimg.cn/large/006tKfTcgy1g18dy21lhbj30rq0ca42t.jpg"><img src="https://ws1.sinaimg.cn/large/006tKfTcgy1g18dy21lhbj30rq0ca42t.jpg" alt="image-20190319153954384"></a></p>
<p>By MOS, SRGAN scores highest, even though its PSNR is only 29.40 dB (Set5) and 26.02 dB (Set14).</p>
<p>Contrary to earlier findings, the paper reports that deeper residual networks further improve quality, at the cost of longer training.</p>
<h3 id="ESRGAN本体"><a href="#ESRGAN本体" class="headerlink" title="ESRGAN本体"></a>ESRGAN itself</h3><p>Building on SRGAN:</p>
<ol>
<li>Proposes the Residual-in-Residual Dense Block, which is easier to train; removes batch norm and uses residual scaling (the residual branch output is multiplied by a scale factor before being added to the input).</li>
<li>Switches to a relativistic GAN discriminator.</li>
<li>Computes the VGG loss on feature maps before the activation.</li>
</ol>
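Residual scaling from point 1 amounts to a one-line change to a residual block (a sketch; `f` stands in for the dense conv block, and 0.2 is an illustrative scale, not necessarily the paper's exact value):

```python
import numpy as np

def residual_scaled_block(x, f, scale=0.2):
    """Residual scaling: the residual branch output is multiplied by a
    small constant before being added back to the input, which stabilizes
    training of very deep residual-in-residual stacks."""
    return x + scale * f(x)

x = np.ones(4)
out = residual_scaled_block(x, lambda t: t * 10.0, scale=0.2)
print(out)  # [3. 3. 3. 3.]
```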
</div>
<footer class="article-footer">
<a data-url="https://t5eng.github.io/2021/09/17/SuperResolution%20by%20VDSR,%20PerceptualSR,%20SubpixelConvSR%EF%BC%8CESRGAN/" data-id="cktocv4vk0008yutkcm72etz1" data-title="SuperResolution by VDSR, PerceptualSR, SubpixelConvSR,ESRGAN" class="article-share-link">Share</a>
<ul class="article-tag-list" itemprop="keywords"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0/" rel="tag">深度学习</a></li></ul>
</footer>
</div>
</article>
<article id="post-SuperResolution by FSRCNN" class="h-entry article article-type-post" itemprop="blogPost" itemscope itemtype="https://schema.org/BlogPosting">
<div class="article-meta">
<a href="/2021/09/17/SuperResolution%20by%20FSRCNN/" class="article-date">
<time class="dt-published" datetime="2021-09-17T11:04:08.947Z" itemprop="datePublished">2021-09-17</time>
</a>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="p-name article-title" href="/2021/09/17/SuperResolution%20by%20FSRCNN/">SuperResolution by FSRCNN</a>
</h1>
</header>
<div class="e-content article-entry" itemprop="articleBody">
<p>The following notes are based on my own reading of the paper.</p>
<p>arXiv:1608.00367v1 [cs.CV] 1 Aug 2016</p>
<p>SRCNN is the pioneering work on CNN-based super-resolution; although it is no longer state-of-the-art, new papers still trot it out for a round of criticism.</p>
<hr>
<h2 id="论文简介"><a href="#论文简介" class="headerlink" title="论文简介"></a>Overview</h2><p>SRCNN was the first to use a CNN for super-resolution and beat traditional methods, but it is computationally expensive and not real-time. FSRCNN builds on SRCNN with a new network architecture.</p>
<h2 id="srcnn的短板"><a href="#srcnn的短板" class="headerlink" title="srcnn的短板"></a>SRCNN's weaknesses</h2><h4 id="预处理过程"><a href="#预处理过程" class="headerlink" title="预处理过程"></a>Preprocessing</h4><p>SRCNN's first step enlarges the input by bicubic interpolation by a factor of n (the SR factor), so every subsequent non-linear mapping operation costs n² times more computation.</p>
<h4 id="非线性映射过程"><a href="#非线性映射过程" class="headerlink" title="非线性映射过程"></a>Non-linear mapping</h4><p>More mapping layers and larger kernels improve quality, but also raise the computational cost.</p>
<h2 id="fsrcnn的改进"><a href="#fsrcnn的改进" class="headerlink" title="fsrcnn的改进"></a>FSRCNN's improvements</h2><h4 id="引入deconvolution层"><a href="#引入deconvolution层" class="headerlink" title="引入deconvolution层"></a>A deconvolution output layer</h4><p>With a deconv layer as the output, the whole mapping stage runs in low-resolution space, cutting computation by a factor of n².</p>
<h4 id="引入shrinking和expending"><a href="#引入shrinking和expending" class="headerlink" title="引入shrinking和expending"></a>Shrinking and expanding layers</h4><p>Shrinking reduces the channel count and expanding restores it, so several mapping convolutions can be stacked without much extra computation.</p>
<h4 id="沙漏状的architecture"><a href="#沙漏状的architecture" class="headerlink" title="沙漏状的architecture"></a>An hourglass-shaped architecture</h4><p>A stack of convolutions implements the end-to-end super-resolution task.</p>
<h4 id="速度很快"><a href="#速度很快" class="headerlink" title="速度很快"></a>Speed</h4><p>40x faster than SRCNN-Ex; FSRCNN-s runs in real time on a CPU with quality no worse than SRCNN.</p>
<h4 id="迁移学习"><a href="#迁移学习" class="headerlink" title="迁移学习"></a>Transfer learning</h4><p>A trained network only needs its deconv layer fine-tuned to support other scale factors.</p>
<p><a target="_blank" rel="noopener" href="https://ws1.sinaimg.cn/large/006tKfTcgy1g0egsscjvgj314i0fen81.jpg"><img src="https://ws1.sinaimg.cn/large/006tKfTcgy1g0egsscjvgj314i0fen81.jpg" alt="fsrcnnArchitecture"></a></p>
<h2 id="Trick"><a href="#Trick" class="headerlink" title="Trick"></a>Trick</h2><ol>
<li>The network is trained at scale factor 3; once converged, fine-tuning only the deconv layer yields other scale factors (e.g. 2 and 4).</li>
<li>Augmentation: downscaling by factors from 0.9 to 0.6 plus rotations, enlarging the dataset 20x.</li>
<li>Input images are cropped into patches of about 10 pixels, sized according to the scale factor (by convention?).</li>
<li>The authors built their own dataset, General-100, containing few flat color regions and thus well suited to SR training (though the cropped patches still contain plenty of flat areas). Training uses General-100 plus 91-images.</li>
<li>BSD500 images are JPEG-compressed and therefore unsuitable for SR.</li>
<li>5x5 kernels for feature extraction, 9x9 for the deconv layer.</li>
<li>PReLU activations, said to avoid dead kernels.</li>
<li>Each layer's weights are initialized with its own mean and stdev; weights and biases use different learning rates.</li>
<li>Only MSE is used as the loss; the current state of the art uses perceptual and adversarial losses.</li>
</ol>
<h2 id="实验结果"><a href="#实验结果" class="headerlink" title="实验结果"></a>Results</h2><p><a target="_blank" rel="noopener" href="https://ws3.sinaimg.cn/large/006tKfTcgy1g0ehush4tmj31260bgadw.jpg"><img src="https://ws3.sinaimg.cn/large/006tKfTcgy1g0ehush4tmj31260bgadw.jpg" alt="result"></a></p>
<h2 id="Conclusion"><a href="#Conclusion" class="headerlink" title="Conclusion"></a>Conclusion</h2><p>Applies ConvNets to super-resolution; very fast, with a lean architecture, and the paper is easy to follow.</p>
<p>The official source, however, is aging Caffe code, and reproducing the results took me a month (mostly my own fault…). I had overlooked that each layer uses different initialization parameters, that the data must be cut into patches, and that the network is trained at scale factor 3 and then fine-tuned to other scale factors.</p>
<p>It was once the state-of-the-art super-resolution network, but Fei-Fei Li's group later showed that SR quality is not fully captured by PSNR: replacing the MSE loss with a perceptual loss yields outputs with lower PSNR but better perceptual quality.</p>
</div>
<footer class="article-footer">
<a data-url="https://t5eng.github.io/2021/09/17/SuperResolution%20by%20FSRCNN/" data-id="cktocv4vi0005yutk5gnvfbd9" data-title="SuperResolution by FSRCNN" class="article-share-link">Share</a>
<ul class="article-tag-list" itemprop="keywords"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0/" rel="tag">深度学习</a></li></ul>
</footer>
</div>
</article>
<article id="post-tfRecord 搬运" class="h-entry article article-type-post" itemprop="blogPost" itemscope itemtype="https://schema.org/BlogPosting">
<div class="article-meta">
<a href="/2021/09/17/tfRecord%20%E6%90%AC%E8%BF%90/" class="article-date">
<time class="dt-published" datetime="2021-09-17T11:03:26.346Z" itemprop="datePublished">2021-09-17</time>
</a>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="p-name article-title" href="/2021/09/17/tfRecord%20%E6%90%AC%E8%BF%90/">tfRecord 搬运</a>
</h1>
</header>
<div class="e-content article-entry" itemprop="articleBody">
<h2 id="概述"><a href="#概述" class="headerlink" title="概述"></a>Overview</h2><p>The recently uploaded <a target="_blank" rel="noopener" href="https://github.com/ycszen/TensorFlowLaboratory/tree/master/mcnn">mcnn</a> contains a complete data read/write example for reference.</p>
<p>The official TensorFlow docs list three ways to read data:</p>
<p><strong>Feeding</strong>: Python code supplies the data at every step of a running TensorFlow program.</p>
<p><strong>Reading from files</strong>: an input pipeline reads the data from files at the start of the TensorFlow graph.</p>
<p><strong>Preloaded data</strong>: constants or variables in the graph hold all the data (only practical for small datasets).</p>
<p>For small datasets you would typically load everything into memory and feed it to the network in batches (tip: combining this with <code>yield</code> keeps it tidy; try it yourself, I won't belabor it). For large datasets that approach is too memory-hungry, so it is better to use TensorFlow's <code>queue</code>, i.e. the second method, reading from files. For specific formats such as CSV the official docs have coverage; here I introduce a general, efficient method that the docs cover less: TensorFlow's standard format, <code>TFRecords</code>.</p>
<p>Too long; didn't read? For the full source, head straight to my <a target="_blank" rel="noopener" href="https://github.com/ycszen/tf_lab/blob/master/reading_data/example_tfrecords.py">github</a>, and remember to star it.</p>
<h2 id="TFRecords"><a href="#TFRecords" class="headerlink" title="TFRecords"></a>TFRecords</h2><p>TFRecords is really just a binary file format. It is less human-readable than other formats, but it makes better use of memory, is easier to copy and move, and needs no separate label file (you will see why in a moment)… in short, it has plenty of advantages, so let's put it to use.</p>
<p>A TFRecords file contains <code>tf.train.Example</code> protocol buffers (each holding a <code>Features</code> field). We write code that takes our data, fills it into an <code>Example</code> protocol buffer, serializes the buffer to a string, and writes it to the TFRecords file with <code>tf.python_io.TFRecordWriter</code>.</p>
<p>To read the data back, use <code>tf.TFRecordReader</code> with the <code>tf.parse_single_example</code> parser, which decodes the <code>Example</code> protocol buffers into tensors.</p>
<p>Now, let's begin the data-reading journey.</p>
<h2 id="生成TFRecords文件"><a href="#生成TFRecords文件" class="headerlink" title="生成TFRecords文件"></a>Generating a TFRecords file</h2><p>We use <code>tf.train.Example</code> to define the record format and <code>tf.python_io.TFRecordWriter</code> to write it out.</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br></pre></td><td class="code"><pre><span class="line">import os</span><br><span class="line">import tensorflow as tf</span><br><span class="line">from PIL import Image</span><br><span class="line"></span><br><span class="line">cwd = os.getcwd()</span><br><span class="line"></span><br><span class="line">'''</span><br><span class="line">My data directory is laid out as follows:</span><br><span class="line">0 -- img1.jpg</span><br><span class="line">     img2.jpg</span><br><span class="line">     img3.jpg</span><br><span class="line">     ...</span><br><span class="line">1 -- img1.jpg</span><br><span class="line">     img2.jpg</span><br><span class="line">     ...</span><br><span class="line">2 -- ...</span><br><span class="line">The 0, 1, 2... here are the class labels, i.e. the classes below;</span><br><span class="line">classes is a list I defined for my own data -- adapt it to your layout</span><br><span class="line">...</span><br><span class="line">'''</span><br><span class="line">classes = ['0', '1', '2']  # one subdirectory name per class</span><br><span class="line">writer = tf.python_io.TFRecordWriter("train.tfrecords")</span><br><span class="line">for index, name in enumerate(classes):</span><br><span class="line">    class_path = cwd + "/" + name + "/"</span><br><span class="line">    for img_name in os.listdir(class_path):</span><br><span class="line">        img_path = class_path + img_name</span><br><span class="line">        img = Image.open(img_path)</span><br><span class="line">        img = img.resize((224, 224))</span><br><span class="line">        img_raw = img.tobytes()  # convert the image to raw bytes</span><br><span class="line">        example = tf.train.Example(features=tf.train.Features(feature={</span><br><span class="line">            "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[index])),</span><br><span class="line">            'img_raw': tf.train.Feature(bytes_list=tf.train.BytesList(value=[img_raw]))</span><br><span class="line">        }))</span><br><span class="line">        writer.write(example.SerializeToString())  # serialize to a string</span><br><span class="line">writer.close()</span><br></pre></td></tr></table></figure>
<p>For the definitions and details of <code>Example</code> and <code>Feature</code>, I recommend the official API docs.</p>
<p>Basically, an <code>Example</code> contains a <code>Features</code>, which holds a dict of <code>Feature</code> entries (no s); each <code>Feature</code> in turn holds a <code>FloatList</code>, <code>BytesList</code>, or <code>Int64List</code>.</p>
<p>And that's it: all the relevant information lives in one file, which is why no separate label file is needed, and reading it back is easy too.</p>
<p>Here is a small reading example:</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">for serialized_example in tf.python_io.tf_record_iterator("train.tfrecords"):</span><br><span class="line">    example = tf.train.Example()</span><br><span class="line">    example.ParseFromString(serialized_example)</span><br><span class="line"></span><br><span class="line">    image = example.features.feature['img_raw'].bytes_list.value</span><br><span class="line">    label = example.features.feature['label'].int64_list.value</span><br><span class="line">    # do any preprocessing here</span><br><span class="line">    print(image, label)</span><br></pre></td></tr></table></figure>
<h2 id="使用队列读取"><a href="#使用队列读取" class="headerlink" title="使用队列读取"></a>Reading with a queue</h2><p>Once the TFRecords file exists, TF reads it efficiently via a queue (<code>queue</code>).</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line">def read_and_decode(filename):</span><br><span class="line">    # build a queue from the list of filenames</span><br><span class="line">    filename_queue = tf.train.string_input_producer([filename])</span><br><span class="line"></span><br><span class="line">    reader = tf.TFRecordReader()</span><br><span class="line">    _, serialized_example = reader.read(filename_queue)  # returns the filename and the file</span><br><span class="line">    features = tf.parse_single_example(serialized_example,</span><br><span class="line">                                       features={</span><br><span class="line">                                           'label': tf.FixedLenFeature([], tf.int64),</span><br><span class="line">                                           'img_raw': tf.FixedLenFeature([], tf.string),</span><br><span class="line">                                       })</span><br><span class="line"></span><br><span class="line">    img = tf.decode_raw(features['img_raw'], tf.uint8)</span><br><span class="line">    img = tf.reshape(img, [224, 224, 3])</span><br><span class="line">    img = tf.cast(img, tf.float32) * (1. / 255) - 0.5</span><br><span class="line">    label = tf.cast(features['label'], tf.int32)</span><br><span class="line"></span><br><span class="line">    return img, label</span><br></pre></td></tr></table></figure>
<p>During training we can then use it like this:</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line">img, label = read_and_decode("train.tfrecords")</span><br><span class="line"></span><br><span class="line"># shuffle_batch randomly shuffles the inputs</span><br><span class="line">img_batch, label_batch = tf.train.shuffle_batch([img, label],</span><br><span class="line">                                                batch_size=30, capacity=2000,</span><br><span class="line">                                                min_after_dequeue=1000)</span><br><span class="line">init = tf.initialize_all_variables()</span><br><span class="line"></span><br><span class="line">with tf.Session() as sess:</span><br><span class="line">    sess.run(init)</span><br><span class="line">    threads = tf.train.start_queue_runners(sess=sess)</span><br><span class="line">    for i in range(3):</span><br><span class="line">        val, l = sess.run([img_batch, label_batch])</span><br><span class="line">        # process val and l as needed, e.g.</span><br><span class="line">        # l = to_categorical(l, 12)</span><br><span class="line">        print(val.shape, l)</span><br></pre></td></tr></table></figure>
<p>And with that, efficient file-based data reading in TensorFlow is just about covered.</p>
<p>Hmm? Wait… what do you mean "just about"? Right, there are still a few caveats:</p>
<p>First, a TensorFlow graph keeps state (<code>state</code>), which is what lets <code>TFRecordReader</code> remember its position in the tfrecord file and always return the next record. This requires initializing the whole graph before use; here we did that with <code>tf.initialize_all_variables()</code>.</p>
<p>Second, TensorFlow queues behave much like ordinary queues, except that their <code>operation</code>s and <code>tensor</code>s are symbolic (<code>symbolic</code>) and only execute when <code>sess.run()</code> is called.</p>
<p>Third, <code>TFRecordReader</code> keeps dequeuing filenames from the queue until the queue is empty.</p>
<h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>Summary</h2><ol>
<li>Generate the tfrecord file</li>
<li>Define a <code>record reader</code> to parse the tfrecord file</li>
<li>Construct a batch generator (<code>batcher</code>)</li>
<li>Build the remaining operations</li>
<li>Initialize all operations</li>
<li>Start the <code>QueueRunner</code></li>
</ol>
</div>
<footer class="article-footer">
<a data-url="https://t5eng.github.io/2021/09/17/tfRecord%20%E6%90%AC%E8%BF%90/" data-id="cktocv4vm000dyutk9m60080j" data-title="tfRecord 搬运" class="article-share-link">Share</a>
<ul class="article-tag-list" itemprop="keywords"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0/" rel="tag">深度学习</a></li></ul>
</footer>
</div>
</article>
<article id="post-NonLocal Neural Network备忘" class="h-entry article article-type-post" itemprop="blogPost" itemscope itemtype="https://schema.org/BlogPosting">
<div class="article-meta">
<a href="/2021/09/17/NonLocal%20Neural%20Network%E5%A4%87%E5%BF%98/" class="article-date">
<time class="dt-published" datetime="2021-09-17T11:02:47.413Z" itemprop="datePublished">2021-09-17</time>
</a>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="p-name article-title" href="/2021/09/17/NonLocal%20Neural%20Network%E5%A4%87%E5%BF%98/">NonLocal Neural Network Notes</a>
</h1>
</header>
<div class="e-content article-entry" itemprop="articleBody">
<h2 id="论文简介"><a href="#论文简介" class="headerlink" title="论文简介"></a>Paper overview</h2><p>The paper is inspired by non-local means: the response at each position is computed as a weighted sum over all positions. The non-local block is designed so it can be inserted between layers of an existing network architecture, and it performs strongly on video classification.</p>
<h2 id="intro"><a href="#intro" class="headerlink" title="intro"></a>intro</h2><p>For sequential data (speech), long-distance dependencies are built on recurrence (e.g. LSTM); for images, they are built on the large receptive fields created by stacking convolutional layers.</p>
<p>Both convolution and recurrence operate on the current local neighborhood, so capturing long-distance dependencies requires repeatedly traversing the data. This is computationally inefficient, hard to train, and makes bidirectional information flow difficult.</p>
<p>It is related to self-attention: each local output takes a weighted average over all positions (self-attention computes this in an embedding space; the Transformer model proposed in that paper went on to displace RNNs).</p>
<p>The authors' pitch: the network performs impressively, and it can serve as a basic module inserted into existing networks.</p>
<h2 id="原理"><a href="#原理" class="headerlink" title="原理"></a>Method</h2><p><a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4uh90.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4uh90.jpg" alt="image-20190114133153895"></a></p>
<p><strong>yi</strong> is the non-local mean output</p>
<p><strong>i</strong> is the index of the position being computed</p>
<p><strong>j</strong> indexes all positions</p>
<p><strong>f(xi,xj)</strong> measures the similarity between the two points (the paper offers 4 forms: Gaussian, Embedded Gaussian, Dot Product, Concatenation)</p>
<p><strong>g(xj)</strong> is the response value at position j</p>
<p><strong>C(x)</strong> is the normalization factor, C(x) = sum_j f(xi,xj)</p>
<h2 id="NonLocal-Block"><a href="#NonLocal-Block" class="headerlink" title="NonLocal Block"></a>NonLocal Block</h2><p><a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4uWhq.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4uWhq.jpg" alt="image-20190114133936809"></a></p>
<p><strong>yi</strong> is the non-local mean output</p>
<p><strong>Wz</strong> is the trainable weight of the NL layer</p>
<p><strong>xi</strong> enters through a residual connection; initializing Wz to 0 lets the block be inserted into an existing model without affecting its output.</p>
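<p>The non-local operation above can be sketched in plain NumPy. This is a minimal sketch assuming the Gaussian similarity f(xi, xj) = exp(xi·xj) and g as the identity; the function name and shapes are illustrative, not from the paper's code:</p>

```python
import numpy as np

def non_local(x):
    """Non-local operation: y_i = (1 / C(x_i)) * sum_j f(x_i, x_j) g(x_j).

    x: (N, C) array of N positions with C channels.
    Uses the Gaussian similarity f(x_i, x_j) = exp(x_i . x_j),
    g(x_j) = x_j, and C(x_i) = sum_j f(x_i, x_j).
    """
    sim = np.exp(x @ x.T)                            # f(x_i, x_j), shape (N, N)
    weights = sim / sim.sum(axis=1, keepdims=True)   # divide by C(x)
    return weights @ x                               # weighted sum of responses

np.random.seed(0)
x = np.random.randn(6, 4)   # 6 positions, 4 channels
y = non_local(x)            # same shape as x, each row mixes all positions
```

Because the weights of each row sum to 1, feeding a constant input returns it unchanged, which is consistent with the residual-friendly design described above.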
<h2 id="实验"><a href="#实验" class="headerlink" title="实验"></a>Experiments</h2><p>Based on the Conv2D and Inflated-3D versions of ResNet-50, comparing performance after inserting 1, 5, or 10 non-local blocks.</p>
<p>C2D can use ResNet-50's pretrained weights directly; for I3D, each temporal slice of a kernel can be initialized from the 2D version and divided by t (the inflation factor).</p>
<p>Because Inflated-3D is computationally expensive, only every other resBlock is inflated.</p>
<h2 id="Trick"><a href="#Trick" class="headerlink" title="Trick"></a>Trick</h2><p>When computing the non-local mean with Embedded Gaussian, choosing the channel count as half the input channels, plus one extra pooling layer, cuts the computation further to about 1/4.</p>
<p>Frames are randomly cropped, and <strong>32 frames</strong> are randomly sampled from each run of 64 consecutive frames as one clip.</p>
<p>8 GPUs, 8 clips per GPU, i.e. a mini-batch of 64 clips.</p>
<p>Trained for 400k iterations, lr=0.01, reduced by an order of magnitude every 150k.</p>
<p>Optimizer: momentum=0.9, weight decay=0.0001, with 0.5 dropout on the final global pooling layer.</p>
<p>BatchNorm is applied to the final 1x1x1 layer of each non-local block, with that layer's weights initialized to 0, guaranteeing the block can be inserted into any already-trained network.</p>
<h2 id="实验结果"><a href="#实验结果" class="headerlink" title="实验结果"></a>Results</h2><p><a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4u5cT.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4u5cT.jpg" alt="image-20190114143358883"></a></p>
<p>The 4 non-local weight computation methods perform about equally well.</p>
<p><a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4uTuF.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4uTuF.jpg" alt="image-20190114143420341"></a></p>
<p>Adding 1 non-local block at different positions gives similar results.</p>
<p><a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4u43V.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4u43V.jpg" alt="image-20190114143455820"></a></p>
<p>Deeper networks (ResNet-101) can also be further improved with non-local blocks.</p>
<p>The gain from non-local blocks is not just added depth (R50 + 5-block > R101 base).</p>
<p><a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4uqE9.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4uqE9.jpg" alt="image-20190114143622712"></a></p>
<p>Non-local attention over both space &amp; time works best (at higher compute cost?).</p>
<p><a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4uIjU.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4uIjU.jpg" alt="image-20190114143709720"></a></p>
<p>Compared with I3D, it reduces computation while improving accuracy.</p>
<p><a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4u7B4.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4u7B4.jpg" alt="image-20190115221557322"></a></p>
<p>Comparing C2D, Inflated-3D, and non-local Inflated-3D: non-local blocks are complementary to conv3D, and together they give the best results.</p>
<p><a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4uHHJ.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4uHHJ.jpg" alt="image-20190115220819622"></a></p>
<p>With 128-frame clips, results are better still (4x the compute?). Training starts from the 32-frame model, with lr=0.0025. On these longer clips, NL-I3D still improves over I3D.</p>
</div>
<footer class="article-footer">
<a data-url="https://t5eng.github.io/2021/09/17/NonLocal%20Neural%20Network%E5%A4%87%E5%BF%98/" data-id="cktocv4vh0003yutk9vpz4akh" data-title="NonLocal Neural Network备忘" class="article-share-link">Share</a>
<ul class="article-tag-list" itemprop="keywords"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0/" rel="tag">深度学习</a></li></ul>
</footer>
</div>
</article>
<article id="post-ML 入门:归一化、标准化和正则化" class="h-entry article article-type-post" itemprop="blogPost" itemscope itemtype="https://schema.org/BlogPosting">
<div class="article-meta">
<a href="/2021/09/17/ML%20%E5%85%A5%E9%97%A8%EF%BC%9A%E5%BD%92%E4%B8%80%E5%8C%96%E3%80%81%E6%A0%87%E5%87%86%E5%8C%96%E5%92%8C%E6%AD%A3%E5%88%99%E5%8C%96/" class="article-date">
<time class="dt-published" datetime="2021-09-17T11:02:27.164Z" itemprop="datePublished">2021-09-17</time>
</a>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="p-name article-title" href="/2021/09/17/ML%20%E5%85%A5%E9%97%A8%EF%BC%9A%E5%BD%92%E4%B8%80%E5%8C%96%E3%80%81%E6%A0%87%E5%87%86%E5%8C%96%E5%92%8C%E6%AD%A3%E5%88%99%E5%8C%96/">ML Basics: Normalization, Standardization, and Regularization</a>
</h1>
</header>
<div class="e-content article-entry" itemprop="articleBody">
<p>Because I keep confusing the terms Normalization and Regularization, I am reposting an article. Original: <a target="_blank" rel="noopener" href="https://zhuanlan.zhihu.com/p/29957294">https://zhuanlan.zhihu.com/p/29957294</a></p>
<h2 id="0x01-归一化-Normalization"><a href="#0x01-归一化-Normalization" class="headerlink" title="0x01 归一化 Normalization"></a><strong>0x01 Normalization</strong></h2><p>Normalization generally maps data into a specified range, removing the differing units and scales of different feature dimensions.</p>
<p>Common target ranges are [0, 1] and [-1, 1]; the most common method is <strong>Min-Max normalization</strong>:</p>
<h2 id="Min-Max-归一化"><a href="#Min-Max-归一化" class="headerlink" title="Min-Max 归一化"></a><strong>Min-Max Normalization</strong></h2><p><a target="_blank" rel="noopener" href="http://www.zhihu.com/equation?tex=x_%7Bnew%7D=%5Cfrac%7Bx-x_%7Bmin%7D%7D%7Bx_%7Bmax%7D-x_%7Bmin%7D%7D"><img src="http://www.zhihu.com/equation?tex=x_%7Bnew%7D=%5Cfrac%7Bx-x_%7Bmin%7D%7D%7Bx_%7Bmax%7D-x_%7Bmin%7D%7D" alt="x_{new}=\frac{x-x_{min}}{x_{max}-x_{min}}"></a></p>
<p>For example, to judge whether a person is healthy, we collect many body metrics: height, weight, red blood cell count, white blood cell count, and so on.</p>
<p>A person is 180cm tall, weighs 70kg, and has a white blood cell count of <a target="_blank" rel="noopener" href="http://www.zhihu.com/equation?tex=7.50%C3%9710%5E%7B9%7D/L"><img src="http://www.zhihu.com/equation?tex=7.50%C3%9710%5E%7B9%7D/L" alt="7.50×10^{9}/L"></a>, etc.</p>
<p>When comparing two people, the white blood cell count would dominate and <strong>mask the other features</strong>; after normalization this problem goes away.</p>
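<p>The formula above can be sketched in a few lines of NumPy. The feature values below are made up for illustration:</p>

```python
import numpy as np

# Height (cm), weight (kg), white blood cell count (1e9/L) for three people.
# The raw columns live on very different scales.
features = np.array([
    [180.0, 70.0, 7.50],
    [160.0, 50.0, 6.20],
    [175.0, 90.0, 9.10],
])

# Min-Max normalization per column: x_new = (x - x_min) / (x_max - x_min)
x_min = features.min(axis=0)
x_max = features.max(axis=0)
normalized = (features - x_min) / (x_max - x_min)
```

After this, every column lies in [0, 1], so no single metric dominates a distance comparison.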
<h2 id="0x02-标准化-Normalization"><a href="#0x02-标准化-Normalization" class="headerlink" title="0x02 标准化 Normalization"></a><strong>0x02 Standardization</strong></h2><p><strong>Here we should highlight a translation issue, which was discussed in the Udacity subtitle group:</strong></p>
<blockquote>
<p>归一化 (normalization) and 标准化 (standardization) receive the same English renderings; understand (or translate) the term according to its usage (or its formula)</p>
</blockquote>
<p>Below we look at the most common standardization method: <strong>Z-Score standardization</strong>.</p>
<h2 id="Z-Score-标准化"><a href="#Z-Score-标准化" class="headerlink" title="Z-Score 标准化"></a><strong>Z-Score Standardization</strong></h2><p><a target="_blank" rel="noopener" href="http://www.zhihu.com/equation?tex=x_%7Bnew%7D=%5Cfrac%7Bx-%5Cmu+%7D%7B%5Csigma+%7D"><img src="http://www.zhihu.com/equation?tex=x_%7Bnew%7D=%5Cfrac%7Bx-%5Cmu+%7D%7B%5Csigma+%7D" alt="x_{new}=\frac{x-\mu }{\sigma }"></a></p>
<p>where <a target="_blank" rel="noopener" href="http://www.zhihu.com/equation?tex=%5Cmu"><img src="http://www.zhihu.com/equation?tex=%5Cmu" alt="\mu"></a> is the <strong>mean</strong> of the sample data and <a target="_blank" rel="noopener" href="http://www.zhihu.com/equation?tex=%5Csigma"><img src="http://www.zhihu.com/equation?tex=%5Csigma" alt="\sigma"></a> is its <strong>standard deviation (std)</strong>.</p>
<p><a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4K9De.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4K9De.jpg" alt="img"></a></p>
<p>The figure above shows a scatter sequence being standardized: original, subtract the mean, then divide by the standard deviation.</p>
<p>The result is clearly a distribution with <strong>mean 0 and variance 1</strong>; the cost-function figure below gives a better sense of what standardization does.</p>
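<p>The subtract-mean, divide-by-std recipe in one short NumPy sketch (synthetic data, for illustration only):</p>

```python
import numpy as np

np.random.seed(42)
data = np.random.randn(1000) * 5.0 + 20.0   # sample with mean ~20, std ~5

# Z-score standardization: x_new = (x - mu) / sigma
z = (data - data.mean()) / data.std()
```

By construction the standardized sample has mean 0 and standard deviation 1, which is exactly the "mean 0, variance 1" distribution described above.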
<p><a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4KiEd.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4KiEd.jpg" alt="img"></a></p>
<p>The goal of machine learning is simply to keep minimizing the loss function. In the figure above, <a target="_blank" rel="noopener" href="http://www.zhihu.com/equation?tex=J(w,b)"><img src="http://www.zhihu.com/equation?tex=J(w,b)" alt="J(w,b)"></a> is the objective function we optimize.</p>
<p>It is easy to see that <strong>standardization makes it easier to find the optimal parameters</strong> <a target="_blank" rel="noopener" href="http://www.zhihu.com/equation?tex=w"><img src="http://www.zhihu.com/equation?tex=w" alt="w"></a> <strong>and</strong> <a target="_blank" rel="noopener" href="http://www.zhihu.com/equation?tex=b"><img src="http://www.zhihu.com/equation?tex=b" alt="b"></a> <strong>and to compute the minimum of</strong> <a target="_blank" rel="noopener" href="http://www.zhihu.com/equation?tex=J(w,b)"><img src="http://www.zhihu.com/equation?tex=J(w,b)" alt="J(w,b)"></a><strong>, thereby accelerating convergence</strong>.<sup>[1]</sup></p>
<p><em>Note: the figure above is from Andrew Ng's lecture notes</em></p>
<h2 id="Batch-Normalization"><a href="#Batch-Normalization" class="headerlink" title="Batch Normalization"></a><strong>Batch Normalization</strong></h2><p>In machine learning, <strong>the most common home of standardization</strong> is the neural network's <strong>BN layer (Batch Normalization)</strong>, so let us briefly discuss how BN works and what it does; to go deeper you can <a href="http://link.zhihu.com/?target=https://arxiv.org/abs/1502.03167">read the paper</a>.</p>
<p>We know that standardizing data during preprocessing speeds up convergence; likewise, standardization inside a neural network also <strong>accelerates convergence</strong>, with further benefits:</p>
<ul>
<li>It has a regularizing effect (Batch Normalization regularizes the model)</li>
<li>It improves the model's generalization (advantageous to the generalization of the network)</li>
<li>It allows higher learning rates and thus faster convergence (Batch Normalization enables higher learning rates)</li>
</ul>
<p>The principle is to <strong>standardize activations to reduce internal covariate shift</strong>, the drift in the distributions of internal variables, thereby <strong>making the algorithm more robust</strong>.<sup>[2]</sup></p>
<p>Batch Normalization consists of two parts: <strong>scale and shift</strong>, and <strong>learning the scale and shift parameters (training a BN network)</strong>. The algorithm steps are as follows:</p>
<p><a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4KCHH.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4KCHH.jpg" alt="img"></a></p>
<p>The BN parameters <a target="_blank" rel="noopener" href="http://www.zhihu.com/equation?tex=%5Cgamma"><img src="http://www.zhihu.com/equation?tex=%5Cgamma" alt="\gamma"></a> and <a target="_blank" rel="noopener" href="http://www.zhihu.com/equation?tex=%5Cbeta+"><img src="http://www.zhihu.com/equation?tex=%5Cbeta+" alt="\beta "></a> are then trained; for reasons of space we stop here, but interested readers can consult <a href="http://link.zhihu.com/?target=https://arxiv.org/abs/1502.03167">the paper</a>.</p>
<h2 id="0x03-正则化-Regularization"><a href="#0x03-正则化-Regularization" class="headerlink" title="0x03 正则化 Regularization"></a><strong>0x03 Regularization</strong></h2><p><strong>Regularization is mainly used to prevent overfitting and reduce network error.</strong></p>
<p>Regularization generally takes the following form:</p>
<p><a target="_blank" rel="noopener" href="http://www.zhihu.com/equation?tex=J(w,b)=+%5Cfrac%7B1%7D%7Bm%7D+%5Csum_%7Bi=1%7D%5E%7Bm%7DL(f(x),y)+%5Clambda+R(f)"><img src="http://www.zhihu.com/equation?tex=J(w,b)=+%5Cfrac%7B1%7D%7Bm%7D+%5Csum_%7Bi=1%7D%5E%7Bm%7DL(f(x),y)+%5Clambda+R(f)" alt="J(w,b)= \frac{1}{m} \sum_{i=1}^{m}L(f(x),y)+\lambda R(f)"></a></p>
<p>where the first term is the <strong>empirical risk</strong>, the second is the <strong>regularization term</strong>, and <a target="_blank" rel="noopener" href="http://www.zhihu.com/equation?tex=%CE%BB%E2%89%A50"><img src="http://www.zhihu.com/equation?tex=%CE%BB%E2%89%A50" alt="λ≥0"></a> is a coefficient balancing the two.</p>
<p>A model whose empirical risk (term 1) is small may be complex (many non-zero parameters), in which case the model-complexity term (term 2) will be large.</p>
<p><strong>The role of regularization is to select a model in which both the empirical risk and the model complexity are small</strong>.<sup>[3]</sup></p>
<p>Common regularizers include the <strong>L1 penalty</strong> and the <strong>L2 penalty</strong> as well as <strong>Dropout</strong>; of these, the <strong>L2 penalty</strong> controls overfitting better than the <strong>L1 penalty</strong>.</p>
<h2 id="范数"><a href="#范数" class="headerlink" title="范数"></a><a target="_blank" rel="noopener" href="http://www.zhihu.com/equation?tex=L_%7Bp%7D"><img src="http://www.zhihu.com/equation?tex=L_%7Bp%7D" alt="L_{p}"></a><strong> Norms</strong></h2><p>Why is it called "L1" regularization? If there are L1 and L2 penalties, are there L3, L4, and so on?</p>
<p>First, some background: the L in <a target="_blank" rel="noopener" href="http://www.zhihu.com/equation?tex=L_%7Bp%7D"><img src="http://www.zhihu.com/equation?tex=L_%7Bp%7D" alt="L_{p}"></a> regularization refers to the <a target="_blank" rel="noopener" href="http://www.zhihu.com/equation?tex=L_%7Bp%7D"><img src="http://www.zhihu.com/equation?tex=L_%7Bp%7D" alt="L_{p}"></a> norm, defined as:</p>
<p><a target="_blank" rel="noopener" href="http://www.zhihu.com/equation?tex=L_%7B0%7D"><img src="http://www.zhihu.com/equation?tex=L_%7B0%7D" alt="L_{0}"></a> norm: <a target="_blank" rel="noopener" href="http://www.zhihu.com/equation?tex=%5Cleft+%7C+w+%5Cright+%7C_%7B0%7D+=+%23(i)+with++x_%7Bi%7D+%5Cneq+0"><img src="http://www.zhihu.com/equation?tex=%5Cleft+%5C%7C+w+%5Cright+%5C%7C_%7B0%7D+=+%5C%23(i)%5C+with+%5C+x_%7Bi%7D+%5Cneq+0" alt="\left \| w \right \|_{0} = \#(i)\ with \ x_{i} \neq 0"></a><em>(the number of non-zero elements)</em></p>
<p><a target="_blank" rel="noopener" href="http://www.zhihu.com/equation?tex=L_%7B1%7D"><img src="http://www.zhihu.com/equation?tex=L_%7B1%7D" alt="L_{1}"></a> norm: <a target="_blank" rel="noopener" href="http://www.zhihu.com/equation?tex=%5Cleft+%7C+w+%5Cright+%7C_%7B1%7D+=+%5Csum_%7Bi+=+1%7D%5E%7Bd%7D%5Clvert+x_i%5Crvert"><img src="http://www.zhihu.com/equation?tex=%5Cleft+%5C%7C+w+%5Cright+%5C%7C_%7B1%7D+=+%5Csum_%7Bi+=+1%7D%5E%7Bd%7D%5Clvert+x_i%5Crvert" alt="\left \| w \right \|_{1} = \sum_{i = 1}^{d}\lvert x_i\rvert"></a><em>(the sum of the absolute values of the elements)</em></p>
<p><a target="_blank" rel="noopener" href="http://www.zhihu.com/equation?tex=L_%7B2%7D"><img src="http://www.zhihu.com/equation?tex=L_%7B2%7D" alt="L_{2}"></a> norm: <a target="_blank" rel="noopener" href="http://www.zhihu.com/equation?tex=%5Cleft+%7C+w+%5Cright+%7C_%7B2%7D+=+%5CBigl(%5Csum_%7Bi+=+1%7D%5E%7Bd%7D+x_i%5E2%5CBigr)%5E%7B1/2%7D"><img src="http://www.zhihu.com/equation?tex=%5Cleft+%5C%7C+w+%5Cright+%5C%7C_%7B2%7D+=+%5CBigl(%5Csum_%7Bi+=+1%7D%5E%7Bd%7D+x_i%5E2%5CBigr)%5E%7B1/2%7D" alt="\left \| w \right \|_{2} = \Bigl(\sum_{i = 1}^{d} x_i^2\Bigr)^{1/2}"></a><em>(the Euclidean length)</em></p>
<p><a target="_blank" rel="noopener" href="http://www.zhihu.com/equation?tex=L_%7Bp%7D"><img src="http://www.zhihu.com/equation?tex=L_%7Bp%7D" alt="L_{p}"></a> norm: <a target="_blank" rel="noopener" href="http://www.zhihu.com/equation?tex=%5Cleft+%7C+w+%5Cright+%7C_%7Bp%7D+=+%5CBigl(%5Csum_%7Bi+=+1%7D%5E%7Bd%7D+x_i%5Ep%5CBigr)%5E%7B1/p%7D"><img src="http://www.zhihu.com/equation?tex=%5Cleft+%5C%7C+w+%5Cright+%5C%7C_%7Bp%7D+=+%5CBigl(%5Csum_%7Bi+=+1%7D%5E%7Bd%7D+x_i%5Ep%5CBigr)%5E%7B1/p%7D" alt="\left \| w \right \|_{p} = \Bigl(\sum_{i = 1}^{d} x_i^p\Bigr)^{1/p}"></a></p>
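<p>The norm definitions above are easy to check numerically (a small sketch; the vector is made up, and for the general Lp norm we use |x_i| as is standard):</p>

```python
import numpy as np

w = np.array([3.0, -4.0, 0.0])

l0 = np.count_nonzero(w)        # L0: number of non-zero elements
l1 = np.abs(w).sum()            # L1: sum of absolute values
l2 = np.sqrt((w ** 2).sum())    # L2: Euclidean length

def lp_norm(w, p):
    # General L_p norm: (sum |x_i|^p)^(1/p)
    return (np.abs(w) ** p).sum() ** (1.0 / p)
```

For w = (3, -4, 0) this gives L0 = 2, L1 = 7, and L2 = 5, and lp_norm reduces to L1 and L2 at p = 1 and p = 2.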
<p>In machine learning, if <a target="_blank" rel="noopener" href="http://www.zhihu.com/equation?tex=%5ClVert+w%5CrVert_p"><img src="http://www.zhihu.com/equation?tex=%5ClVert+w%5CrVert_p" alt="\lVert w\rVert_p"></a> is used as the regularization term, we say the task <strong>introduces an</strong> <a target="_blank" rel="noopener" href="http://www.zhihu.com/equation?tex=L_%7Bp%7D"><img src="http://www.zhihu.com/equation?tex=L_%7Bp%7D" alt="L_{p}"></a> <strong>penalty</strong>.</p>
<p><a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4KFUA.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4KFUA.jpg" alt="img"></a></p>
<p><em>The figure above is an illustration from Zhou Zhihua's book "Machine Learning" (机器学习)</em></p>
<h2 id="L1-正则-Lasso-regularizer"><a href="#L1-正则-Lasso-regularizer" class="headerlink" title="L1 正则 Lasso regularizer"></a><strong>L1 Regularization (Lasso regularizer)</strong></h2><p><a target="_blank" rel="noopener" href="http://www.zhihu.com/equation?tex=J(w,b)=%5Cfrac%7B1%7D%7Bm%7D+%5Csum_%7Bi=1%7D%5E%7Bm%7DL(%5Chat%7By%7D,y)+%5Cfrac%7B%5Clambda+%7D%7Bm%7D%5Cleft+%7C+w+%5Cright+%7C_%7B1%7D"><img src="http://www.zhihu.com/equation?tex=J(w,b)=%5Cfrac%7B1%7D%7Bm%7D+%5Csum_%7Bi=1%7D%5E%7Bm%7DL(%5Chat%7By%7D,y)+%5Cfrac%7B%5Clambda+%7D%7Bm%7D%5Cleft+%5C%7C+w+%5Cright+%5C%7C_%7B1%7D" alt="J(w,b)=\frac{1}{m} \sum_{i=1}^{m}L(\hat{y},y)+\frac{\lambda }{m}\left \| w \right \|_{1}"></a></p>
<ul>
<li>Convex, but not differentiable everywhere</li>
<li>Yields sparse solutions (the optimum often lies at a vertex, where only a few components of w are non-zero)</li>
</ul>
<h2 id="L2-正则-Ridge-Regularizer-Weight-Decay"><a href="#L2-正则-Ridge-Regularizer-Weight-Decay" class="headerlink" title="L2 正则 Ridge Regularizer / Weight Decay"></a><strong>L2 Regularization (Ridge Regularizer / Weight Decay)</strong></h2><p><a target="_blank" rel="noopener" href="http://www.zhihu.com/equation?tex=J(w,b)=%5Cfrac%7B1%7D%7Bm%7D+%5Csum_%7Bi=1%7D%5E%7Bm%7DL(%5Chat%7By%7D,y)+%5Cfrac%7B%5Clambda+%7D%7B2m%7D%5Cleft+%7C+w+%5Cright+%7C%5E%7B2%7D_%7B2%7D"><img src="http://www.zhihu.com/equation?tex=J(w,b)=%5Cfrac%7B1%7D%7Bm%7D+%5Csum_%7Bi=1%7D%5E%7Bm%7DL(%5Chat%7By%7D,y)+%5Cfrac%7B%5Clambda+%7D%7B2m%7D%5Cleft+%5C%7C+w+%5Cright+%5C%7C%5E%7B2%7D_%7B2%7D" alt="J(w,b)=\frac{1}{m} \sum_{i=1}^{m}L(\hat{y},y)+\frac{\lambda }{2m}\left \| w \right \|^{2}_{2}"></a></p>
<ul>
<li>Convex and differentiable everywhere</li>
<li>Easy to optimize</li>
</ul>
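<p>The "weight decay" name comes from the gradient of the L2 penalty. A minimal sketch (the numbers are illustrative): the penalty (λ/2m)·||w||² contributes gradient (λ/m)·w, so a gradient step shrinks every weight by a constant factor.</p>

```python
import numpy as np

def l2_regularized_loss(w, data_loss, lam, m):
    # J(w) = data_loss + (lam / (2m)) * ||w||_2^2
    return data_loss + lam / (2 * m) * (w ** 2).sum()

# The gradient of the penalty alone is (lam / m) * w, so a gradient step on
# just the penalty multiplies w by (1 - lr * lam / m): the weights "decay".
w = np.array([1.0, -2.0])
lr, lam, m = 0.1, 0.5, 10
w_after = w - lr * (lam / m) * w   # penalty-only update: w * 0.995
```

Each step pulls the weights toward zero, which is exactly the complexity-shrinking effect the regularization term is meant to have.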
<h2 id="Dropout"><a href="#Dropout" class="headerlink" title="Dropout"></a><strong>Dropout</strong></h2><p><strong>Dropout is mainly used in neural networks: it randomly deactivates some neurons so the model does not over-rely on any single one, which improves robustness and controls overfitting.</strong></p>
<p>Beyond that, Dropout can also be seen as a form of multi-model voting; interested readers can consult <a href="http://link.zhihu.com/?target=https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf">this paper</a>.</p>
<p><a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4KpuD.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4KpuD.jpg" alt="img"></a></p>
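<p>The random-deactivation idea can be sketched as "inverted" dropout in NumPy. This is an illustrative sketch, not the exact formulation of any framework; the drop probability and scaling convention are the common ones:</p>

```python
import numpy as np

def dropout(x, p_drop, training=True):
    """Inverted dropout: randomly zero units during training and rescale
    the survivors by 1/(1 - p_drop) so the expected activation is
    unchanged, letting inference use the plain forward pass."""
    if not training:
        return x
    mask = np.random.rand(*x.shape) >= p_drop   # True = neuron survives
    return x * mask / (1.0 - p_drop)

np.random.seed(1)
x = np.ones((4, 8))
out = dropout(x, p_drop=0.5)   # entries are either 0.0 or 2.0
```

At test time (<code>training=False</code>) the input passes through untouched, which is why the surviving activations are scaled up during training.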
<h2 id="0x04-Reference"><a href="#0x04-Reference" class="headerlink" title="0x04 Reference"></a><strong>0x04 Reference</strong></h2><p>[1] LeCun, Y., Bottou, L., Orr, G., and Muller, K. Efficient backprop. In Orr, G. and K., Muller (eds.), Neural Net-works: Tricks of the trade. Springer, 1998b.</p>
<p>[2] Sergey Ioffe, Christian Szegedy, “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift”, arXiv preprint arXiv:1502.03167, 2015.</p>
<p>[3] Li Hang. Statistical Learning Methods (统计学习方法), pp. 13-14</p>
<p>[4] 聊聊机器学习中的损失函数 <a href="http://link.zhihu.com/?target=http://kubicode.me/2016/04/11/Machine%20Learning/Say-About-Loss-Function/">http://kubicode.me/2016/04/11/Machine%20Learning/Say-About-Loss-Function/</a></p>
<p>[5] Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever and Ruslan Salakhutdinov, “Dropout: A Simple Way to Prevent Neural Networks from Overfitting”, JMLR, 2014.</p>
</div>
<footer class="article-footer">
<a data-url="https://t5eng.github.io/2021/09/17/ML%20%E5%85%A5%E9%97%A8%EF%BC%9A%E5%BD%92%E4%B8%80%E5%8C%96%E3%80%81%E6%A0%87%E5%87%86%E5%8C%96%E5%92%8C%E6%AD%A3%E5%88%99%E5%8C%96/" data-id="cktocv4va0000yutk9o8q5dd7" data-title="ML 入门:归一化、标准化和正则化" class="article-share-link">Share</a>
<ul class="article-tag-list" itemprop="keywords"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0/" rel="tag">深度学习</a></li></ul>
</footer>
</div>
</article>
<article id="post-量子计算简述 - intro" class="h-entry article article-type-post" itemprop="blogPost" itemscope itemtype="https://schema.org/BlogPosting">
<div class="article-meta">
<a href="/2021/09/17/%E9%87%8F%E5%AD%90%E8%AE%A1%E7%AE%97%E7%AE%80%E8%BF%B0%20-%20intro/" class="article-date">
<time class="dt-published" datetime="2021-09-17T11:01:53.967Z" itemprop="datePublished">2021-09-17</time>
</a>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="p-name article-title" href="/2021/09/17/%E9%87%8F%E5%AD%90%E8%AE%A1%E7%AE%97%E7%AE%80%E8%BF%B0%20-%20intro/">A Brief Introduction to Quantum Computing - intro</a>
</h1>
</header>
<div class="e-content article-entry" itemprop="articleBody">
<p>The following is based on my understanding of the course.</p>
<h2 id="量子计算发展史"><a href="#量子计算发展史" class="headerlink" title="量子计算发展史"></a>A brief history of quantum computing</h2><p>In 1982, the physicist Feynman proposed an abstract model demonstrating how a quantum system could perform computation.</p>
<p>In 1985, Deutsch of Oxford proposed the first design blueprint and network model for a quantum computer, defining the quantum Turing machine.</p>
<p>In 1994, Shor showed that integer factorization can be solved quickly on a quantum computer.</p>
<p>In 1996, Grover showed that searching an unstructured database can be sped up on a quantum computer.</p>
<p>In 2001, IBM used a 7-qubit device to factor 15 into 3x5 with Shor's algorithm.</p>
<p>In 2018, IBM built a 50-qubit processor prototype. A nation mastering mature ~50-qubit technology is said to achieve "quantum supremacy" (Quantum Computational Supremacy).</p>
<h2 id="量子位和量子寄存器"><a href="#量子位和量子寄存器" class="headerlink" title="量子位和量子寄存器"></a>Qubits and quantum registers</h2><p>Compared with a classical computer, a quantum machine defines the "<strong>qubit</strong>". A qubit can sit in state "0" or "1" like a classical bit, or in a <strong>superposition</strong> of both. In quantum mechanics a qubit is written in Dirac notation:<br>$$<br>| \psi \rangle = a | 0 \rangle + b | 1 \rangle<br>$$<br>where |0⟩ and |1⟩ are the <strong>basis states</strong>, and a, b are the <strong>probability amplitudes</strong>, satisfying:</p>
<p>$$<br>|a|^2 + |b|^2 = 1<br>$$</p>
<p>For the <strong>Hilbert space</strong> formed by 2 qubits, there are 4 basis states, and the superposition is:<br>$$<br>| \psi \rangle = a_{00} |00 \rangle + a_{01} |01 \rangle + a_{10} |10 \rangle + a_{11} |11 \rangle<br>$$<br>where the parameters are defined as in the single-qubit case.</p>
<p>Generalizing to n qubits, there are 2^n mutually orthogonal basis states, so <strong>linear growth</strong> in the number of qubits yields <strong>exponential growth</strong> in register capacity.</p>
<h2 id="量子逻辑门"><a href="#量子逻辑门" class="headerlink" title="量子逻辑门"></a>Quantum logic gates</h2><p>My own take: qubit computation can be understood through linear algebra.</p>
<p>An operation that applies a (unitary) transformation to the register's superposition state to realize a logical function is called a <strong>quantum logic gate</strong>.</p>
<p>Common gates include: the CNOT gate, Hadamard gate, Pauli X/Y/Z gates, phase gate, π/8 gate, and quantum rotation gates.</p>
<p>A superposed state undergoes a linear transformation in a high-dimensional space (called a <strong>unitary transformation</strong> in quantum mechanics): left-multiplying the state by a <strong>unitary matrix</strong> (the gate) transforms every component of the superposition at once, which is why quantum computation is parallel. One unitary applied to an n-qubit register corresponds to 2^n operations on a classical computer.</p>
<p>A classical computer, by contrast, has partially serial computation, so no amount of added processors can reach 100% parallelism.</p>
<h2 id="量子测量与输出"><a href="#量子测量与输出" class="headerlink" title="量子测量与输出"></a>Measurement and output</h2><p>Measuring a superposed quantum system collapses the wavefunction, turning the superposition into a basis state. An n-qubit register superposes 2^n binary numbers, but the output can only be a single n-bit binary number.</p>
<p>For the n=2 case:<br>$$<br>| \psi \rangle = a_{00} |00 \rangle + a_{01} |01 \rangle + a_{10} |10 \rangle + a_{11} |11 \rangle<br>$$<br>After measurement, the system collapses to one of |00⟩, |01⟩, |10⟩, |11⟩, and the collapse is random. The goal of quantum algorithm design (via unitary transformations) is therefore to make the probability of the desired outcome as large as possible and the probability of unwanted outcomes as small as possible.</p>
<p>In a multi-qubit system, the measurements of individual qubits are relatively independent and follow the usual statistics.</p>
<p>For example, the probability that measuring the first qubit gives 0 is:</p>
<p>$$<br>P(\text{first} = 0) = |a_{00}|^2 + |a_{01}|^2<br>$$</p>
<p>If the first qubit is measured as 0, the probability distribution for measuring the second qubit becomes:</p>
<p>$$<br>P(\text{second} = 0 \mid \text{first} = 0) = \frac{|a_{00}|^2}{|a_{00}|^2 + |a_{01}|^2}<br>$$</p>
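<p>These probability rules are easy to check numerically. A small sketch with an illustrative equal-amplitude 2-qubit state (the amplitudes are made up for the example):</p>

```python
import numpy as np

# Amplitudes a_00, a_01, a_10, a_11 of a normalized 2-qubit state.
amps = np.array([0.5, 0.5, 0.5, 0.5], dtype=complex)
probs = np.abs(amps) ** 2        # Born rule: outcome probability is |a|^2
assert np.isclose(probs.sum(), 1.0)  # normalization |a_00|^2 + ... = 1

# P(first qubit = 0) = |a_00|^2 + |a_01|^2
p_first0 = probs[0] + probs[1]

# After measuring the first qubit as 0, the state collapses onto the
# |00>, |01> subspace and is renormalized, giving the conditional:
p_second0_given_first0 = probs[0] / (probs[0] + probs[1])
```

For the equal superposition, both the marginal and the conditional probability come out to 1/2, as expected from symmetry.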
</div>
<footer class="article-footer">
<a data-url="https://t5eng.github.io/2021/09/17/%E9%87%8F%E5%AD%90%E8%AE%A1%E7%AE%97%E7%AE%80%E8%BF%B0%20-%20intro/" data-id="cktocv4vr000pyutkfkfl7oxk" data-title="量子计算简述 - intro" class="article-share-link">Share</a>
<ul class="article-tag-list" itemprop="keywords"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/%E6%97%A5%E8%AE%B0/" rel="tag">日记</a></li></ul>
</footer>
</div>
</article>
<article id="post-MobileNet概览" class="h-entry article article-type-post" itemprop="blogPost" itemscope itemtype="https://schema.org/BlogPosting">
<div class="article-meta">
<a href="/2021/09/17/MobileNet%E6%A6%82%E8%A7%88/" class="article-date">
<time class="dt-published" datetime="2021-09-17T11:00:29.069Z" itemprop="datePublished">2021-09-17</time>
</a>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="p-name article-title" href="/2021/09/17/MobileNet%E6%A6%82%E8%A7%88/">MobileNet Overview</a>
</h1>
</header>
<div class="e-content article-entry" itemprop="articleBody">
<p>The following is based on my own reading.</p>
<p>arXiv:1704.04861v1 [cs.CV] 17 Apr 2017</p>
<p>arXiv:1801.04381v3 [cs.CV] 2 Apr 2018</p>
<h2 id="论文简介"><a href="#论文简介" class="headerlink" title="论文简介"></a>Paper overview</h2><p>Google was unhappy with the trend of networks growing ever deeper and larger to chase accuracy at the cost of speed. With the compute budgets of mobile and embedded devices in mind, they streamlined the architecture and introduced two hyperparameters, the width multiplier and the resolution multiplier, to trade off accuracy against speed.</p>
<p>MobileNetV2 introduces a new building block, the inverted residual block.</p>
<h2 id="网络结构"><a href="#网络结构" class="headerlink" title="网络结构"></a>Architecture</h2><p>The streamlined architecture cuts MobileNet's computation to about 1/9 of a conventional convolutional network's, and its parameter count to about 1/7.</p>
<p>The figure below shows MobileNet's "depthwise separable convolution":</p>
<p><a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4uB1f.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4uB1f.jpg" alt="image-20180912235700174"></a></p>
<p>A standard convolution is factored into a per-channel convolution whose outputs are stacked (not summed), followed by a 1x1 convolution. This reduces the parameter count while adding an extra BatchNorm and ReLU, increasing nonlinearity.</p>
<p>A sparse computation is not automatically faster than a dense one, but MobileNet concentrates most of its work (95%) in 1x1 convolutions, which can be dispatched directly to general matrix multiply (GEMM) routines; in short, it is fast.</p>
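<p>The ~1/9 figure above can be sanity-checked with the standard cost formulas for the two convolution types (a sketch in plain Python; the layer sizes are illustrative):</p>

```python
def conv_cost(dk, m, n, df):
    # Multiply-adds of a standard convolution: Dk * Dk * M * N * Df * Df
    # (Dk = kernel size, M = input channels, N = output channels,
    #  Df = feature-map size)
    return dk * dk * m * n * df * df

def separable_cost(dk, m, n, df):
    # Depthwise conv (Dk*Dk*M*Df*Df) plus pointwise 1x1 conv (M*N*Df*Df)
    return dk * dk * m * df * df + m * n * df * df

# 3x3 kernels, 512 input and 512 output channels, 14x14 feature map
ratio = separable_cost(3, 512, 512, 14) / conv_cost(3, 512, 512, 14)
# ratio simplifies to 1/N + 1/Dk^2, i.e. roughly 1/9 for 3x3 kernels
```

For 3x3 kernels and many output channels, the ratio is dominated by the 1/Dk² = 1/9 term, matching the savings quoted above.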
<h2 id="新增两个超参数"><a href="#新增两个超参数" class="headerlink" title="新增两个超参数"></a>Two new hyperparameters</h2><h3 id="Width-multiplier"><a href="#Width-multiplier" class="headerlink" title="Width multiplier"></a>Width multiplier</h3><p>Equivalent to shrinking the number of depthwise separable conv kernels; it compresses the channel counts used throughout the computation.</p>
<h3 id="Resolution-multiplier"><a href="#Resolution-multiplier" class="headerlink" title="Resolution multiplier"></a>Resolution multiplier</h3><p>Shrinks the feature-map size, reducing each layer's input resolution; computation falls quadratically.</p>
<h2 id="MobileNetV2"><a href="#MobileNetV2" class="headerlink" title="MobileNetV2"></a>MobileNetV2</h2><p>The paper introduces the notion of a manifold: information in a high-dimensional space can be encoded into a lower-dimensional subspace.</p>
<p>V1 exploits exactly this, using fewer channels to reduce complexity. The width multiplier adjusts the feature map's dimensionality so the manifold of interest fills (saturates? spans?) the feature map. But this sometimes fails: applying ReLU to a 1-D line produces a 2-D ray; in an n-dimensional space it produces a piecewise-linear curve with n joints. (I am not sure my understanding here is correct.)</p>
<p>Two conditions under which a high-dimensional activation space can be mapped into a low-dimensional subspace:</p>
<ol>
<li>If the activation space remains non-zero after ReLU's <strong>nonlinear</strong> transformation, then on that region the transformation corresponds to a <strong>linear</strong> transformation.</li>
<li>ReLU can preserve all the information of the input space only when the activation space can be mapped into a low-dimensional subspace.</li>
</ol>
<h2 id="Inverted-Residual-Block"><a href="#Inverted-Residual-Block" class="headerlink" title="Inverted Residual Block"></a>Inverted Residual Block</h2><p><a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4u09P.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4u09P.jpg" alt="InvertedResidualBlock"></a></p>
<p>The residual connection is there to improve gradient flow when training many layers.</p>
<ul>
<li>With an intermediate depth of 0, the block is the identity function</li>
<li>With an intermediate depth equal to the input depth, the block is a conventional resBlock</li>
<li>It works best when the intermediate depth exceeds the input depth (a higher-dimensional feature representation is sparser, less entangled, and easier to train?)</li>
</ul>
<p>The V2 paper likewise introduces a new hyperparameter for tuning the ratio of intermediate depth to input depth; experiments again show the network is very fast and very accurate.</p>
<h2 id="Conclusions"><a href="#Conclusions" class="headerlink" title="Conclusions"></a>Conclusions</h2><p>The above is my personal summary; some of the conclusions may not be correct.</p>
<p>You are welcome to point out my mistakes.</p>
</div>
<footer class="article-footer">
<a data-url="https://t5eng.github.io/2021/09/17/MobileNet%E6%A6%82%E8%A7%88/" data-id="cktocv4ve0001yutk7hed7nwx" data-title="MobileNet概览" class="article-share-link">Share</a>
<ul class="article-tag-list" itemprop="keywords"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0/" rel="tag">深度学习</a></li></ul>
</footer>
</div>
</article>
<article id="post-PyTorch 常用api (不定期更新)" class="h-entry article article-type-post" itemprop="blogPost" itemscope itemtype="https://schema.org/BlogPosting">
<div class="article-meta">
<a href="/2019/01/25/PyTorch%20%E5%B8%B8%E7%94%A8api%20(%E4%B8%8D%E5%AE%9A%E6%9C%9F%E6%9B%B4%E6%96%B0)/" class="article-date">
<time class="dt-published" datetime="2019-01-24T16:00:00.000Z" itemprop="datePublished">2019-01-25</time>
</a>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="p-name article-title" href="/2019/01/25/PyTorch%20%E5%B8%B8%E7%94%A8api%20(%E4%B8%8D%E5%AE%9A%E6%9C%9F%E6%9B%B4%E6%96%B0)/">PyTorch 常用api (不定期更新)</a>
</h1>
</header>
<div class="e-content article-entry" itemprop="articleBody">
<h1 id="Checkpoint-amp-Loading"><a href="#Checkpoint-amp-Loading" class="headerlink" title="Checkpoint & Loading"></a>Checkpoint & Loading</h1><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Save and load the entire model</span></span><br><span class="line">torch.save(model_object, <span class="string">'model.pkl'</span>) </span><br><span class="line">model = torch.load(<span class="string">'model.pkl'</span>)</span><br><span class="line"><span class="comment"># Save and load only the model parameters (recommended)</span></span><br><span class="line">torch.save(model_object.state_dict(), <span class="string">'params.pkl'</span>) </span><br><span class="line">model_object.load_state_dict(torch.load(<span class="string">'params.pkl'</span>))</span><br></pre></td></tr></table></figure>
<h1 id="Initialization"><a href="#Initialization" class="headerlink" title="Initialization"></a>Initialization</h1><p>参数的初始化其实就是对参数赋值。而我们需要学习的参数其实都是Variable,它其实是对Tensor的封装,同时提供了data,grad等接口,这就意味着我们可以直接对这些参数进行操作赋值了。这就是PyTorch简洁高效所在。</p>
<p>所以我们可以进行如下操作进行初始化,当然其实有其他的方法,但是这种方法是PyTorch作者所推崇的:</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">weight_init</span>(<span class="params">m</span>):</span></span><br><span class="line"><span class="comment"># use isinstance to check what type m is</span></span><br><span class="line"> <span class="keyword">if</span> <span class="built_in">isinstance</span>(m, nn.Conv2d):</span><br><span class="line"> n = m.kernel_size[<span class="number">0</span>] * m.kernel_size[<span class="number">1</span>] * m.out_channels</span><br><span class="line"> m.weight.data.normal_(<span class="number">0</span>, math.sqrt(<span class="number">2.</span> / n))</span><br><span class="line"> <span class="keyword">elif</span> <span class="built_in">isinstance</span>(m, nn.BatchNorm2d):</span><br><span class="line"><span class="comment"># weight and bias in m are Variables, so the parameters can be learned and backpropagated</span></span><br><span class="line"> m.weight.data.fill_(<span class="number">1</span>)</span><br><span class="line"> m.bias.data.zero_()</span><br></pre></td></tr></table></figure>
<p>Another common initialization method:</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">nn.init.kaiming_normal(self.W.weight)</span><br><span class="line">nn.init.constant(self.W[<span class="number">0</span>].bias, <span class="number">0</span>)</span><br></pre></td></tr></table></figure>
<h1 id="Optimization"><a href="#Optimization" class="headerlink" title="Optimization"></a>Optimization</h1><h2 id="局部微调"><a href="#局部微调" class="headerlink" title="局部微调"></a>Partial fine-tuning</h2><p>Sometimes, after loading a pretrained model, we only want to tune the last few layers and leave the others untrained. Not training a layer means not computing gradients for it, and PyTorch's <code>requires_grad</code> makes this control very simple.</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">model = torchvision.models.resnet18(pretrained=<span class="literal">True</span>)</span><br><span class="line"><span class="keyword">for</span> param <span class="keyword">in</span> model.parameters():</span><br><span class="line"> param.requires_grad = <span class="literal">False</span></span><br><span class="line"><span class="comment"># replace the final fully connected layer to train 100 classes</span></span><br><span class="line"><span class="comment"># parameters of newly constructed modules default to requires_grad=True</span></span><br><span class="line">model.fc = nn.Linear(<span class="number">512</span>, <span class="number">100</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment"># optimize only the final classification layer</span></span><br><span class="line">optimizer = optim.SGD(model.fc.parameters(), lr=<span class="number">1e-2</span>, momentum=<span class="number">0.9</span>)</span><br></pre></td></tr></table></figure>
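<p>The effect of freezing can be checked directly via <code>requires_grad</code>. A small sketch with a toy stand-in model (not the resnet18 above; just to keep it self-contained): after freezing everything and swapping in a new head, only the head's weight and bias remain trainable:</p>

```python
import torch.nn as nn
import torch.optim as optim

# toy "backbone + head" stand-in for a pretrained network
model = nn.Sequential(nn.Linear(8, 512), nn.ReLU(), nn.Linear(512, 10))
for param in model.parameters():
    param.requires_grad = False      # freeze everything

model[2] = nn.Linear(512, 100)       # new module: requires_grad=True by default

trainable = [p for p in model.parameters() if p.requires_grad]
# only the new head's weight and bias are left to optimize
optimizer = optim.SGD(model[2].parameters(), lr=1e-2, momentum=0.9)
```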
<h2 id="全局微调"><a href="#全局微调" class="headerlink" title="全局微调"></a>Global fine-tuning</h2><p>Sometimes we want to fine-tune the whole network while giving the replaced layer a different learning rate from the others. In that case we can put the other layers and the new layer into separate parameter groups in the optimizer, each with its own learning rate. For example:</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">ignored_params = <span class="built_in">list</span>(<span class="built_in">map</span>(<span class="built_in">id</span>, model.fc.parameters()))</span><br><span class="line">base_params = <span class="built_in">filter</span>(<span class="keyword">lambda</span> p: <span class="built_in">id</span>(p) <span class="keyword">not</span> <span class="keyword">in</span> ignored_params,</span><br><span class="line"> model.parameters())</span><br><span class="line"></span><br><span class="line">optimizer = torch.optim.SGD([</span><br><span class="line"> {<span class="string">'params'</span>: base_params},</span><br><span class="line"> {<span class="string">'params'</span>: model.fc.parameters(), <span class="string">'lr'</span>: <span class="number">1e-2</span>}</span><br><span class="line"> ], lr=<span class="number">1e-3</span>, momentum=<span class="number">0.9</span>)</span><br></pre></td></tr></table></figure>
<p>Here base_params are trained with a learning rate of 1e-3 and model.fc.parameters with 1e-2; the momentum is shared by both.</p>
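<p>The grouping can be verified by reading back <code>optimizer.param_groups</code>: the optimizer fills the shared defaults (here <code>lr=1e-3</code> and the momentum) into any group that does not override them. A self-contained sketch with a toy two-layer model standing in for the network above:</p>

```python
import torch.nn as nn
import torch.optim as optim

model = nn.Sequential(nn.Linear(8, 16), nn.Linear(16, 4))
fc = model[1]                        # stand-in for model.fc

ignored_params = set(map(id, fc.parameters()))
base_params = [p for p in model.parameters() if id(p) not in ignored_params]

optimizer = optim.SGD(
    [{'params': base_params},                      # uses the default lr=1e-3
     {'params': fc.parameters(), 'lr': 1e-2}],     # overrides the lr
    lr=1e-3, momentum=0.9)

lrs = [g['lr'] for g in optimizer.param_groups]
print(lrs)  # [0.001, 0.01]
```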
</div>
<footer class="article-footer">
<ul class="article-tag-list" itemprop="keywords"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0/" rel="tag">Deep Learning</a></li></ul>
</footer>
</div>
</article>
<article id="post-利用Google Cloud Platform的1年免费期搭建私人梯子" class="h-entry article article-type-post" itemprop="blogPost" itemscope itemtype="https://schema.org/BlogPosting">
<div class="article-meta">
<a href="/2018/09/01/%E5%88%A9%E7%94%A8Google%20Cloud%20Platform%E7%9A%841%E5%B9%B4%E5%85%8D%E8%B4%B9%E6%9C%9F%E6%90%AD%E5%BB%BA%E7%A7%81%E4%BA%BA%E6%A2%AF%E5%AD%90/" class="article-date">
<time class="dt-published" datetime="2018-08-31T16:00:00.000Z" itemprop="datePublished">2018-09-01</time>
</a>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="p-name article-title" href="/2018/09/01/%E5%88%A9%E7%94%A8Google%20Cloud%20Platform%E7%9A%841%E5%B9%B4%E5%85%8D%E8%B4%B9%E6%9C%9F%E6%90%AD%E5%BB%BA%E7%A7%81%E4%BA%BA%E6%A2%AF%E5%AD%90/">Building a private proxy with Google Cloud Platform's one-year free tier</a>
</h1>
</header>
<div class="e-content article-entry" itemprop="articleBody">
<p>Google Cloud Platform (GCP below) is a cloud platform similar to AWS, offering hosting, compute, storage, AI APIs, and other services. New GCP users get a free package of one year or 300 USD of credit, and GCP has servers in Taiwan with ping under 40 ms, on par with the latency to domestic DNS. The steps below show how to set a server up quickly.</p>
<hr>
<h1 id="注册Gmail-Account"><a href="#注册Gmail-Account" class="headerlink" title="注册Gmail Account"></a>Register a Gmail account</h1><p>First you need an existing proxy to register a Gmail account; only then can you start building your own proxy. (Chicken and egg, I know.)</p>
<h1 id="注册GCP"><a href="#注册GCP" class="headerlink" title="注册GCP"></a>Sign up for GCP</h1><ol>
<li><p>Open <a target="_blank" rel="noopener" href="https://cloud.google.com/free/">https://cloud.google.com/free/</a> and choose the GCP free trial</p>
<p><a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4m2rQ.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4m2rQ.jpg" alt="img"></a></p>
</li>
<li><p>Agree to the terms - fill in personal information and an address (anything works) - enter <code>visa/master</code> credit card details.</p>
<p>The card is used to pay for server charges; a 1 USD charge is attempted for verification and then refunded. Thanks to the first-year / 300 USD new-user package, the first year costs nothing.</p>
</li>
<li><p>After signing up, check whether you received the new-user package.</p>
<ol>
<li><p>Open the billing overview page in the console:</p>
<p><a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4nUzT.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4nUzT.jpg" alt="image-20180714185147878"></a></p>
</li>
<li><p>Alternatively, a gift-box icon appears in the top-right corner:</p>
<p><a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4n8ds.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4n8ds.jpg" alt="image-20180714185239439"></a></p>
</li>
</ol>
</li>
</ol>
<h1 id="开始部署服务器"><a href="#开始部署服务器" class="headerlink" title="开始部署服务器"></a>Deploy the server</h1><ol>
<li><p>Create an instance:</p>
<p>Choose a VM instance under <code>compute engine</code></p>
<p><a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4nNWV.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4nNWV.jpg" alt="image-20180714185720274"></a></p>
</li>
<li><p>Configure the instance:</p>
<p>For a proxy alone, a low-spec instance is enough</p>
<p><a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4nGon.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4nGon.jpg" alt="image-20180714185937988"></a></p>
<p>For the boot disk, change the operating system to CentOS:<a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4n3Zj.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4n3Zj.jpg" alt="image-20180714190113826"></a></p>
<p>Firewall configuration:</p>
<p><a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4ntJ0.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4ntJ0.jpg" alt="image-20180714190557536"></a></p>
<p>After creating it, wait a moment and the instance shows as running:</p>
<p><a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4nYiq.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4nYiq.jpg" alt="image-20180714190826609"></a></p>
</li>
</ol>
<h1 id="网络设置"><a href="#网络设置" class="headerlink" title="网络设置"></a>Network settings</h1><ol>
<li>Open the <code>VPC网络</code> (VPC network) settings and make the newly created instance's external IP static, so a server restart does not change the IP and break the proxy.</li>
</ol>
<p><a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4nDeJ.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4nDeJ.jpg" alt="image-20180714191138428"></a></p>
<ol>
<li><p>Set firewall rules to allow access from external hosts:</p>
<ol>
<li><p>Direction: <code>入站</code> (ingress); source IP range: <code>0.0.0.0/0</code>; protocols and ports: <code>全部允许</code> (allow all)</p>
</li>
<li><p>Direction: <code>出站</code> (egress); source IP range: <code>0.0.0.0/0</code>; protocols and ports: <code>全部允许</code> (allow all)</p>
<p>(Create the firewall rule twice: once for ingress, once for egress)</p>
<p><a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4n0L4.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4n0L4.jpg" alt="image-20180714191626974"></a></p>
</li>
</ol>
</li>
</ol>
<h1 id="在实例中安装SS服务"><a href="#在实例中安装SS服务" class="headerlink" title="在实例中安装SS服务"></a>Install the SS service on the instance</h1><ol>
<li>Return to the <code>compute engine</code> page and operate the instance over an SSH session:</li>
</ol>
<p><a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4ndQU.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4ndQU.jpg" alt="image-20180714191954536"></a></p>
<ol>
<li><p>Get root privileges</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ sudo su</span><br></pre></td></tr></table></figure></li>
<li><p>Install SS:</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ wget --no-check-certificate https://raw.githubusercontent.com/teddysun/shadowsocks_install/master/shadowsocksR.sh</span><br><span class="line">chmod +x shadowsocksR.sh</span><br></pre></td></tr></table></figure></li>
<li><p>Configure the SS server settings:</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ ./shadowsocksR.sh 2>&1 | tee shadowsocksR.log</span><br></pre></td></tr></table></figure>
<p>Server port: 8989 recommended<br>Password: your choice<br>Encryption: chacha20 recommended<br>Protocol: origin by default<br>Obfuscation (obfs): plain by default</p>
</li>
<li><p>Return to <code>compute engine</code> and reset the VM instance (i.e. restart it; do not stop or delete it)</p>
</li>
</ol>
<p><strong>At this point, the server and the proxy service are fully set up.</strong></p>
<h1 id="在shadowsocks客户端连接梯子"><a href="#在shadowsocks客户端连接梯子" class="headerlink" title="在shadowsocks客户端连接梯子"></a>Connect a shadowsocks client to the proxy</h1><p>The server address is the instance's external IP address:<a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4nwyF.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4nwyF.jpg" alt="image-20180714192856045"></a></p>
<p>Enter the password, port, and encryption method exactly as configured above</p>
<hr>
<p>References:</p>
<p><a target="_blank" rel="noopener" href="https://shadowsocks.org/en/index.html">shadowsocks official site</a></p>
<p><a target="_blank" rel="noopener" href="https://cloud.google.com/">Google Cloud official site</a></p>
</div>
<footer class="article-footer">
<ul class="article-tag-list" itemprop="keywords"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/%E6%97%A5%E8%AE%B0/" rel="tag">Diary</a></li></ul>
</footer>
</div>
</article>
<article id="post-在macOS平台搭建基于Github Page的Hexo博客" class="h-entry article article-type-post" itemprop="blogPost" itemscope itemtype="https://schema.org/BlogPosting">
<div class="article-meta">
<a href="/2018/09/01/%E5%9C%A8macOS%E5%B9%B3%E5%8F%B0%E6%90%AD%E5%BB%BA%E5%9F%BA%E4%BA%8EGithub%20Page%E7%9A%84Hexo%E5%8D%9A%E5%AE%A2/" class="article-date">
<time class="dt-published" datetime="2018-08-31T16:00:00.000Z" itemprop="datePublished">2018-09-01</time>
</a>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="p-name article-title" href="/2018/09/01/%E5%9C%A8macOS%E5%B9%B3%E5%8F%B0%E6%90%AD%E5%BB%BA%E5%9F%BA%E4%BA%8EGithub%20Page%E7%9A%84Hexo%E5%8D%9A%E5%AE%A2/">Setting up a GitHub Pages Hexo blog on macOS</a>
</h1>
</header>
<div class="e-content article-entry" itemprop="articleBody">
<p>The following is compiled from material found online, plus my own troubleshooting notes</p>
<h2 id="环境配置"><a href="#环境配置" class="headerlink" title="环境配置"></a>Environment setup</h2><ol>
<li><p>Front-end tool <strong>Node.js</strong> - used to generate the static pages</p>
<figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ brew install node</span><br></pre></td></tr></table></figure>
<p>If you don't have Homebrew, enter in Terminal:</p>
<figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"</span><br></pre></td></tr></table></figure></li>
<li><p>Front-end tool <strong>npm</strong> - Node Package Manager</p>
<figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ npm install</span><br></pre></td></tr></table></figure></li>
<li><p><strong>Git</strong><br>Installing Xcode from the Mac App Store bundles the Git tools</p>
<p>otherwise, visit <a target="_blank" rel="noopener" href="https://git-scm.com/download">https://git-scm.com/download</a></p>
</li>
<li><p><strong>Hexo</strong> - the blog framework; generates static pages in one command</p>
<figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ sudo npm install -g hexo</span><br></pre></td></tr></table></figure></li>
</ol>
<h2 id="初始化"><a href="#初始化" class="headerlink" title="初始化"></a>Initialization</h2><ol>
<li><p>Create a hexo directory and run the <code>hexo init</code> command:</p>
</li>
<li><p>Run the init command:</p>
<pre><code>$ hexo init &lt;folder&gt;</code></pre>
<p>Hexo downloads the framework into <code>folder</code>. After initialization, <code>folder</code> contains:</p>
<pre><code>_config.yml
db.json
node_modules
package.json
scaffolds
source
themes</code></pre>
</li>
<li><p>Start the hexo server:</p>
<pre><code>$ hexo server</code></pre>
<p>Open <a target="_blank" rel="noopener" href="http://localhost:4000">http://localhost:4000</a> in a browser; if you see hexo's hello world, the local environment is working.</p>
<p><a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4nX6S.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4nX6S.jpg" alt="img"></a></p>
</li>
</ol>
<h2 id="关联Github"><a href="#关联Github" class="headerlink" title="关联Github"></a>Link to GitHub</h2><ol>
<li><p>Create a new GitHub repository to serve as the GitHub Pages repo. The naming rule is <code>yourname.github.io</code>.</p>
<p><a target="_blank" rel="noopener" href="https://s2.ax1x.com/2019/05/12/E4njOg.jpg"><img src="https://s2.ax1x.com/2019/05/12/E4njOg.jpg" alt="img"></a></p>
</li>
<li><p>Edit the last lines of <code>_config.yml</code> in <code>folder</code>:</p>
<pre><code>deploy:
  type: git
  repo: https://github.com/user_name/yourname.github.io.git
  branch: master</code></pre>
</li>
<li><p>Generate the static pages with hexo:</p>
<pre><code>$ hexo generate</code></pre>
</li>
<li><p>Deploy the site:</p>
<pre><code>$ hexo deploy</code></pre>
<p>If deploy fails, hexo-deployer-git may be missing:</p>
<pre><code>$ npm install hexo-deployer-git --save</code></pre>
<p>A successful deploy prints:</p>
<pre><code>INFO Deploy done: git</code></pre>
<p>The first run asks for your GitHub credentials:</p>
<pre><code>Username for 'https://github.com':
Password for 'https://github.com':</code></pre>
<p>Deploying syncs the <code>folder</code> files to the Git repo; depending on network conditions the upload may fail, so a proxy is recommended. Once it succeeds, opening <code>https://yourname.github.io</code> shows the same Hello World as <code>http://localhost:4000</code>.</p>
</li>
</ol>
<h2 id="安装hexo主题"><a href="#安装hexo主题" class="headerlink" title="安装hexo主题"></a>Install a hexo theme</h2><ol>
<li><p>The <strong><a target="_blank" rel="noopener" href="https://hexo.io/themes/">official hexo themes page</a></strong> lists many third-party themes. Below, <strong>cactus</strong> is used as the example.</p>
</li>
<li><p>In the <code>folder</code> directory, run:</p>
<pre><code>$ git clone https://github.com/probberechts/hexo-theme-cactus.git themes/cactus</code></pre>
<p>This downloads the theme into the <code>cactus</code> folder under <code>themes</code>.</p>
</li>
<li><p>Edit <code>_config.yml</code> in <code>folder</code>:</p>
<pre><code># theme: landscape
theme: cactus</code></pre>
</li>
<li><p>After saving <code>_config.yml</code>, re-run <code>hexo generate</code> and <code>hexo deploy</code> to deploy the static pages with the new theme to git.</p>
</li>
</ol>
</div>
<footer class="article-footer">
<ul class="article-tag-list" itemprop="keywords"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/%E6%97%A5%E8%AE%B0/" rel="tag">Diary</a></li></ul>
</footer>
</div>
</article>
<nav id="page-nav">
<span class="page-number current">1</span><a class="page-number" href="/page/2/">2</a><a class="extend next" rel="next" href="/page/2/">Next »</a>
</nav>
</section>
<aside id="sidebar">
<div class="widget-wrap">
<h3 class="widget-title">Tags</h3>
<div class="widget">
<ul class="tag-list" itemprop="keywords"><li class="tag-list-item"><a class="tag-list-link" href="/tags/%E6%97%A5%E8%AE%B0/" rel="tag">Diary</a></li><li class="tag-list-item"><a class="tag-list-link" href="/tags/%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0/" rel="tag">Deep Learning</a></li></ul>
</div>
</div>
<div class="widget-wrap">
<h3 class="widget-title">Tag Cloud</h3>
<div class="widget tagcloud">
<a href="/tags/%E6%97%A5%E8%AE%B0/" style="font-size: 10px;">Diary</a> <a href="/tags/%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0/" style="font-size: 20px;">Deep Learning</a>
</div>
</div>
<div class="widget-wrap">
<h3 class="widget-title">Archives</h3>
<div class="widget">
<ul class="archive-list"><li class="archive-list-item"><a class="archive-list-link" href="/archives/2021/09/">September 2021</a></li><li class="archive-list-item"><a class="archive-list-link" href="/archives/2019/01/">January 2019</a></li><li class="archive-list-item"><a class="archive-list-link" href="/archives/2018/09/">September 2018</a></li><li class="archive-list-item"><a class="archive-list-link" href="/archives/2018/07/">July 2018</a></li><li class="archive-list-item"><a class="archive-list-link" href="/archives/2018/06/">June 2018</a></li><li class="archive-list-item"><a class="archive-list-link" href="/archives/2018/01/">January 2018</a></li></ul>
</div>
</div>
<div class="widget-wrap">
<h3 class="widget-title">Recent Posts</h3>
<div class="widget">
<ul>
<li>
<a href="/2021/09/17/SuperResolution%20by%20VDSR,%20PerceptualSR,%20SubpixelConvSR%EF%BC%8CESRGAN/">SuperResolution by VDSR, PerceptualSR, SubpixelConvSR,ESRGAN</a>
</li>
<li>
<a href="/2021/09/17/SuperResolution%20by%20FSRCNN/">SuperResolution by FSRCNN</a>
</li>
<li>
<a href="/2021/09/17/tfRecord%20%E6%90%AC%E8%BF%90/">tfRecord (reposted)</a>
</li>
<li>
<a href="/2021/09/17/NonLocal%20Neural%20Network%E5%A4%87%E5%BF%98/">NonLocal Neural Network notes</a>
</li>