<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Lix is Blogging</title>
  <subtitle>A beginner's notes, with little original content</subtitle>
  <link href="/atom.xml" rel="self"/>

  <link href="blog.alexiangli.com/"/>
  <updated>2017-01-01T03:19:51.000Z</updated>
  <id>blog.alexiangli.com/</id>

  <author>
    <name>Lix</name>
    <email>xiangli90@outlook.com</email>
  </author>

  <generator uri="http://hexo.io/">Hexo</generator>

  <entry>
    <title>Learning LIMO EEG</title>
    <link href="blog.alexiangli.com/learn_limo_eeg/"/>
    <id>blog.alexiangli.com/learn_limo_eeg/</id>
    <published>2016-12-31T16:00:00.000Z</published>
    <updated>2017-01-01T03:19:51.000Z</updated>

    <content type="html"><![CDATA[<h1 id="数据准备">Data preparation</h1>
<ol>
<li>Save a matrix of single trials.</li>
<li>Update the EEG.etc field.</li>
</ol>
<p><strong>How do you create the single-trial matrix?</strong></p>
<p><strong>Creating the single-trial matrix for a single dataset</strong><br>
Use the <code>limo_create_single_trials</code> function:</p>
<figure class="highlight matlab"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div></pre></td><td class="code"><pre><div class="line">options = &#123;<span class="string">'format'</span>,<span class="string">'matrix'</span>, ...</div><div class="line">           <span class="string">'datatype'</span>, <span class="string">'channels'</span>, ...</div><div class="line">           <span class="string">'erp'</span>,<span class="string">'off'</span>,...</div><div class="line">           <span class="string">'spec'</span>,<span class="string">'on'</span>,...</div><div class="line">           <span class="string">'ersp'</span>,<span class="string">'off'</span>,...</div><div class="line">           <span class="string">'itc'</span>,<span class="string">'on'</span>,...</div><div class="line">           <span class="string">'rmicacomps'</span>,<span class="string">'on'</span>, ...</div><div class="line">           <span class="string">'erpparams'</span>,[],...</div><div class="line">           <span class="string">'specparams'</span>,[],...</div><div class="line">           <span class="string">'erspparams'</span>,[],...</div><div class="line">           <span class="string">'interp'</span>,<span class="string">'off'</span>, ...</div><div class="line">           <span class="string">'scalp'</span>,<span class="string">'off'</span>,...</div><div class="line">           <span class="string">'recompute'</span>,<span class="string">'on'</span>,...</div><div class="line">           <span class="string">'savetrials'</span>,<span class="string">'on'</span> &#125;;</div><div class="line">           
<span class="comment">% erpparams, specparams, and erspparams correspond to the same-named parameters of std_precomp, which are passed on to std_erp, std_spec, and std_ersp respectively. The default parameters do not automatically interpolate missing channels.</span></div><div class="line">limo_create_single_trials(<span class="string">'my_subject.set'</span>, options&#123;:&#125;);</div></pre></td></tr></table></figure>
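<p>Using the GUI: <code>LIMO GUI &gt; LIMO Tools &gt; create single trials</code></p>
<p><strong>Creating single-trial matrices for multiple datasets</strong></p>
<ul>
<li>Create a text file that lists the dataset file names, then import it to create the single-trial matrices for all datasets in one batch;</li>
<li>or use the EEGLAB STUDY interface.</li>
</ul>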
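<p>The file list for the first option can be generated by a script. The following is a minimal sketch in Python, not part of LIMO EEG: it collects every <code>.set</code> file under a folder and writes one path per line. The helper name <code>write_dataset_list</code>, the folder layout, and the output file name are assumptions made for illustration; check the LIMO documentation for the exact list format it expects.</p>

```python
from pathlib import Path


def write_dataset_list(data_dir, list_file):
    """Write the path of every .set file under data_dir to list_file, one per line."""
    set_files = sorted(Path(data_dir).rglob("*.set"))
    lines = [str(path) for path in set_files]
    # One path per row; an empty folder produces an empty file.
    Path(list_file).write_text("".join(line + "\n" for line in lines), encoding="utf-8")
    return lines
```

<p>A call such as <code>write_dataset_list('data', 'datasets.txt')</code> (both names hypothetical) would produce the text file to import.</p>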
<h1 id="1st-level-analysis-一阶分析">1st level analysis</h1>
<p>The 1st level analysis needs at least one of two kinds of file: a categorical variable or continuous variables.</p>
<blockquote>
<p>LIMO EEG expect (at least for now) that your have your conditions coded separately for each trial - this should be a single vector with all your conditions (basically you can export this from your EEG structure). A continuous might some measures from your stimuli, the response (like coding incorrect responses to remove some variance), or RTs.</p>
<p>Your <strong>categorical variable</strong> is basically your condition code for each epoch. I assume you have an epoched dataset? If so, for each file you need to generate a list with an integer per row corresponding to the order of trials in your epoched file. The <strong>continuous variable</strong> would be the same but it would not bin your data into condition. So for example it would be the degree of brightness for each picture (hence, continuous).</p>
<p>As Bastien point out, any integer works – also if there are some conditions you want to remove you could just put a NaN. Files can be .mat or .txt</p>
<p>Extracted from: <a href="https://sccn.ucsd.edu/pipermail/eeglablist/2014/007978.html" target="_blank" rel="external">https://sccn.ucsd.edu/pipermail/eeglablist/2014/007978.html</a></p>
</blockquote>
<p>The categorical variable can be generated with the <code>limo_read_events.m</code> function:</p>
<figure class="highlight matlab"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div></pre></td><td class="code"><pre><div class="line">markers = &#123;<span class="string">'marker1'</span>, <span class="string">'marker2'</span>&#125;;</div><div class="line"><span class="built_in">cat</span> = limo_read_events(data.set, markers);</div></pre></td></tr></table></figure>
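<p>If the files are prepared outside MATLAB, the quoted advice can be sketched in Python: map each epoch's event label to an integer code in trial order, use NaN for conditions to exclude, and save one value per row. The function names, the labels, and the file name are hypothetical, and this is an illustration rather than LIMO's own code.</p>

```python
import math


def make_categorical(labels, codes):
    """Return one code per epoch, in trial order; labels without a code become NaN."""
    return [float(codes[label]) if label in codes else math.nan for label in labels]


def save_vector(values, path):
    """Write one value per row to a plain-text file."""
    with open(path, "w", encoding="utf-8") as handle:
        for value in values:
            handle.write("NaN\n" if math.isnan(value) else f"{int(value)}\n")


# Four epochs; the 'practice' trials have no code and are therefore excluded.
epoch_labels = ["marker1", "marker2", "practice", "marker1"]
cat = make_categorical(epoch_labels, {"marker1": 1, "marker2": 2})
```

<p>Calling <code>save_vector(cat, 'categorical.txt')</code> then writes the vector as a <code>.txt</code> file, one of the two file types the quote mentions.</p>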
<h1 id="2nd-level-analysis-二阶分析">2nd level analysis</h1>
<p>Create the expected channels and the neighbouring channels:</p>
<p><code>channeighbstructmat = limo_expected_chanlocs(data_set_name, path, neighbour_distance);</code><br>
<code>[neighbours, channeighbstructmat] = limo_get_channeighbstructmat(EEG, neighbdist);</code></p>
]]></content>

    <summary type="html">

      &lt;h1 id=&quot;数据准备&quot;&gt;Data preparation&lt;/h1&gt;
&lt;ol&gt;
&lt;li&gt;Save a matrix of single trials.&lt;/li&gt;
&lt;li&gt;Update the EEG.etc field.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;stro

    </summary>

      <category term="Original" scheme="blog.alexiangli.com/categories/Original/"/>


      <category term="EEG" scheme="blog.alexiangli.com/tags/EEG/"/>

      <category term="LIMO EEG" scheme="blog.alexiangli.com/tags/LIMO-EEG/"/>

      <category term="Matlab" scheme="blog.alexiangli.com/tags/Matlab/"/>

  </entry>

  <entry>
    <title>Learning MNE</title>
    <link href="blog.alexiangli.com/learn_mne/"/>
    <id>blog.alexiangli.com/learn_mne/</id>
    <published>2016-12-31T16:00:00.000Z</published>
    <updated>2017-01-13T17:13:12.000Z</updated>

    <content type="html"><![CDATA[<h1 id="基本的数据分析">Basic data analysis</h1>
<h2 id="数据合并">Merging data</h2>
<p><code>raw1.append(raw2, preload=False)</code><br>
Note: a subject's recording is sometimes saved in several segments; in that case the segments need to be merged into a single dataset.</p>
<h2 id="数据读取">Reading data</h2>
<p>MNE documentation: <a href="http://martinos.org/mne/stable/manual/io.html#importing-eeg-data" target="_blank" rel="external">Importing EEG data</a></p>
<ul>
<li>Brainvision (.vhdr): <code>mne.io.read_raw_brainvision()</code></li>
<li>European data format (.edf): <code>mne.io.read_raw_edf()</code></li>
<li>Biosemi data format (.bdf): <code>mne.io.read_raw_edf()</code></li>
<li>Neuroscan CNT data format (.cnt): <code>mne.io.read_raw_cnt()</code></li>
<li>EGI simple binary (.egi): <code>mne.io.read_raw_egi()</code> Note: the data must first be exported from EGI Netstation as a <code>simple binary file</code></li>
<li>EEGLAB set files (.set): <code>mne.io.read_raw_eeglab()</code> and <code>mne.read_epochs_eeglab()</code></li>
</ul>
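<h2 id="对通道的操作">Working with channels</h2>
<h3 id="选择通道">Selecting channels</h3>
<p><code>raw.pick_types(meg=False, eeg=True, eog=True)</code><br>
Note: restricts the data to the EEG and EOG channels.<br>
<code>raw.pick_types(include = ['Fz'])</code><br>
Note: also keeps the named channels. Called as a method, <code>pick_types</code> modifies the object in place; the function form <code>mne.pick_types(raw.info, ...)</code> is the one that returns channel indices.<br>
<code>raw.set_channel_types(mapping={'EOG 061': 'eeg'})</code><br>
Note: changes a channel's type; here <code>EOG 061</code> becomes an EEG channel.<br>
<code>raw.rename_channels(mapping={'EOG 061': 'EOG'})</code><br>
Note: renames channels.</p>
<h3 id="读取并加载通道位置文件">Reading and loading a channel location file</h3>
<p><a href="http://martinos.org/mne/stable/generated/mne.channels.read_montage.html?highlight=montage#mne.channels.read_montage" target="_blank" rel="external"><code>mne.channels.read_montage()</code></a><br>
Note: the <code>kind</code> and <code>path</code> arguments determine which channel location file is read.</p>
<ul>
<li>kind: the name of a built-in montage, such as <code>standard_1005</code> or <code>standard_1020</code></li>
<li>path: the folder that holds an external channel location file</li>
</ul>
<p><code>data_obj.set_montage()</code><br>
Note: every data object has a method for setting its montage.</p>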
<figure class="highlight python"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div></pre></td><td class="code"><pre><div class="line">montage = mne.channels.read_montage(<span class="string">'standard_1020'</span>)</div><div class="line">raw.set_montage(montage)</div></pre></td></tr></table></figure>
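<h2 id="查看数据">Viewing data</h2>
<p>Every type of data object in MNE has plotting methods, <code>data_obj.plot*</code>.</p>
<h2 id="标记和删除坏道">Marking and removing bad channels</h2>
<p><code>data_obj.info['bads'] = ['bads_channels_label']</code><br>
Note: bad channels are recorded under the <code>bads</code> key of the data object's <code>info</code> attribute. Identify bad channels as early as possible so that they do not affect the later analysis steps.<br>
<code>picks = mne.pick_types(raw.info, meg=False, eeg=True, exclude='bads')</code></p>
<h2 id="自动识别坏道">Detecting bad channels automatically</h2>
<p>Use the Ransac algorithm from the autoreject package:</p>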
<figure class="highlight python"><table><tr><td class="gutter"><pre><div class="line">1</div></pre></td><td class="code"><pre><div class="line"><span class="keyword">from</span> autoreject <span class="keyword">import</span> Ransac</div></pre></td></tr></table></figure>
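<h2 id="恢复在线参考点">Restoring the online reference</h2>
<p><code>mne.add_reference_channels(data_obj, ref_channels, copy=True/False)</code><br>
Note: sometimes the online reference channel has to be restored to the raw data before the reference can be changed. This function adds the online reference channel to the data with all of its values set to 0.</p>
<h2 id="重参考">Re-referencing</h2>
<p><code>mne.set_eeg_reference(data_obj, ref_channels, copy=False/True)</code><br>
Note: <code>ref_channels</code> is a list of channel-name strings. With <code>ref_channels = None</code> an average reference is applied (the default); with several channels, the reference is the mean of those channels. Switching to an average reference sometimes raises an exception. In that case, clear the custom-reference flag with <code>raw.info[&quot;custom_ref_applied&quot;] = False</code> and then apply the average reference.</p>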
<figure class="highlight python"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div></pre></td><td class="code"><pre><div class="line">raw = mne.io.read_raw_brainvision(filepath)</div><div class="line">raw.info[<span class="string">'custom_ref_applied'</span>] = <span class="keyword">False</span></div><div class="line">raw_avg_ref, _ = mne.io.set_eeg_reference(raw, ref_channels=<span class="keyword">None</span>)</div></pre></td></tr></table></figure>
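<p>What these two steps do numerically can be shown without MNE. The following is a minimal sketch in plain Python, assuming the data is a list of channels that each hold a list of samples: restoring the online reference appends an all-zero channel, and the average reference subtracts the across-channel mean at every sample. The function names are hypothetical and this is not MNE's implementation; real analyses should use the MNE functions above.</p>

```python
def add_zero_reference(data):
    """Append an all-zero channel that stands in for the online reference."""
    n_samples = len(data[0])
    return [list(channel) for channel in data] + [[0.0] * n_samples]


def average_reference(data):
    """Subtract the across-channel mean from every channel, sample by sample."""
    n_channels = len(data)
    means = [sum(sample) / n_channels for sample in zip(*data)]
    return [[value - mean for value, mean in zip(channel, means)] for channel in data]


recorded = [[2.0, 4.0], [4.0, 8.0]]       # two channels, two samples each
with_ref = add_zero_reference(recorded)    # now three channels
rereferenced = average_reference(with_ref)
```

<p>After the average reference, the channels sum to zero at every sample, which is a quick sanity check on real data as well.</p>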
<h2 id="滤波">Filtering</h2>
<ul>
<li><code>mne.filter.band_pass_filter</code> band-pass filter</li>
<li><code>mne.filter.band_stop_filter</code> band-stop filter</li>
<li><code>mne.filter.high_pass_filter</code> high-pass filter</li>
<li><code>mne.filter.low_pass_filter</code> low-pass filter</li>
<li><code>mne.filter.notch_filter</code> notch filter</li>
<li><code>mne.filter.detrend</code> detrending</li>
</ul>
<p>Used as a method of the data object:</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div></pre></td><td class="code"><pre><div class="line">raw.filter(<span class="number">1</span>, <span class="number">40</span>, l_trans_bandwidth=<span class="number">0.5</span>, h_trans_bandwidth=<span class="string">'auto'</span>,</div><div class="line">                   filter_length=<span class="string">'auto'</span>, phase=<span class="string">'zero'</span>, fir_window=<span class="string">'hann'</span>)</div></pre></td></tr></table></figure>
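<h2 id="分段">Epoching</h2>
<p><code>events = mne.find_events(raw)</code><br>
Note: extracts the events from the raw data.</p>
<p><code>mne.make_fixed_length_events</code> creates events at fixed intervals; the resulting events can then be used to cut the data into epochs of equal length.</p>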
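<p>The idea behind fixed-length events can be sketched in plain Python. MNE stores events as rows of [sample, previous value, event id]; the hypothetical helper below, which is not MNE's implementation, places one event every <code>duration</code> seconds and stops early enough for a full epoch to fit after the last event.</p>

```python
def fixed_length_events(n_samples, sfreq, duration, event_id=1):
    """Return [sample, 0, event_id] rows spaced duration seconds apart."""
    step = int(round(duration * sfreq))
    if step == 0:
        raise ValueError("duration is shorter than one sample")
    # The last start leaves room for one complete epoch of length step.
    return [[start, 0, event_id] for start in range(0, n_samples - step + 1, step)]


# 4 s of data at 250 Hz, cut into 1 s pieces.
events = fixed_length_events(n_samples=1000, sfreq=250.0, duration=1.0)
```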
<p>Defining events with a dictionary:</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div></pre></td><td class="code"><pre><div class="line">events_id = &#123;</div><div class="line">    <span class="string">'label1'</span>: <span class="number">1</span>,</div><div class="line">    <span class="string">'label2'</span>: <span class="number">2</span></div><div class="line">&#125;</div></pre></td></tr></table></figure>
<figure class="highlight python"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div></pre></td><td class="code"><pre><div class="line">reject = dict(grad=<span class="number">4000e-13</span>, mag=<span class="number">4e-12</span>, eog=<span class="number">150e-6</span>)</div><div class="line">epochs = mne.Epochs(raw, events, event_id = <span class="number">1</span>, tmin=<span class="number">-0.2</span>, tmax=<span class="number">0.5</span>,</div><div class="line">                    proj = <span class="keyword">True</span>, picks=picks, baseline=(<span class="keyword">None</span>, <span class="number">0</span>),</div><div class="line">                    preload=<span class="keyword">True</span>, reject=reject)</div></pre></td></tr></table></figure>
<p>Use <code>mne.Epochs</code> to cut the data into epochs. Baseline correction is specified through the <code>baseline</code> argument. With <code>(None, None)</code>, the mean of the whole epoch is used as the baseline.</p>
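<p>Baseline correction itself is a small computation: average the samples that fall inside the baseline window and subtract that mean from the whole epoch. The following is a minimal sketch for one channel of one epoch, in which <code>None</code> means "from the first sample" or "to the last sample" as in MNE. The function name and the toy data are hypothetical, and this is not MNE's implementation.</p>

```python
def baseline_correct(samples, times, baseline=(None, 0.0)):
    """Subtract the mean of the samples whose time lies inside the baseline window."""
    start = times[0] if baseline[0] is None else baseline[0]
    stop = times[-1] if baseline[1] is None else baseline[1]
    window = [s for s, t in zip(samples, times) if t >= start and stop >= t]
    mean = sum(window) / len(window)
    return [s - mean for s in samples]


times = [-0.2, -0.1, 0.0, 0.1, 0.2]
epoch = [1.0, 2.0, 3.0, 10.0, 12.0]
corrected = baseline_correct(epoch, times)             # pre-stimulus baseline
whole = baseline_correct(epoch, times, (None, None))   # whole-epoch mean
```

<p>With the pre-stimulus baseline, the first three samples average to zero; with <code>(None, None)</code>, the whole epoch does.</p>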
<h2 id="降采样率">Downsampling</h2>
<p>Every data object has a <code>resample</code> method.</p>
<h2 id="识别和标记坏段">Identifying and marking bad epochs</h2>
<p>Peak-to-peak artifact rejection<br>
Specify the <code>reject</code> thresholds: <code>reject = dict(grad=4000e-13, mag=4e-12, eog=150e-6)</code>.<br>
Then pass them to <code>mne.Epochs()</code> as the <code>reject</code> argument.</p>
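<p>The peak-to-peak criterion compares, for each channel in an epoch, the difference between the largest and smallest sample with the threshold for that channel's type; the epoch is rejected as soon as one channel exceeds it. The following is a minimal sketch in plain Python. The function name and the data layout (a dict that maps channel names to samples) are assumptions made for illustration, not MNE's internals.</p>

```python
def is_bad_epoch(epoch, channel_types, reject):
    """Return True if any channel's peak-to-peak amplitude exceeds its threshold."""
    for name, samples in epoch.items():
        threshold = reject.get(channel_types[name])
        if threshold is None:
            continue  # no criterion was given for this channel type
        if max(samples) - min(samples) > threshold:
            return True
    return False


reject = dict(eog=150e-6)
channel_types = {"EOG": "eog", "Fz": "eeg"}
clean = {"EOG": [10e-6, -20e-6, 30e-6], "Fz": [1e-6, 2e-6, 3e-6]}
blink = {"EOG": [10e-6, -120e-6, 90e-6], "Fz": [1e-6, 2e-6, 3e-6]}
kept = [e for e in (clean, blink) if not is_bad_epoch(e, channel_types, reject)]
```

<p>Only the first epoch survives here, because the EOG channel of the second one swings by 210 µV, which is above the 150 µV threshold.</p>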
<p>Dropping bad epochs<br>
<code>epochs.drop_bad()</code></p>
<p>Automatic detection of bad trials: use the autoreject package.</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div></pre></td><td class="code"><pre><div class="line"><span class="keyword">from</span> autoreject <span class="keyword">import</span> LocalAutoRejectCV, Ransac</div><div class="line">ar = LocalAutoRejectCV(verbose=<span class="string">'tqdm'</span>)</div><div class="line">epochs_clean = ar.fit_transform(epochs)</div></pre></td></tr></table></figure>
<h2 id="独立成分分析">Independent component analysis</h2>
<figure class="highlight python"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div><div class="line">20</div><div class="line">21</div><div class="line">22</div><div class="line">23</div><div class="line">24</div><div class="line">25</div><div class="line">26</div></pre></td><td class="code"><pre><div class="line"><span class="comment"># prepare</span></div><div class="line"><span class="keyword">import</span> mne</div><div class="line"><span class="keyword">from</span> mne.preprocessing <span class="keyword">import</span> ICA</div><div class="line"><span class="keyword">from</span> mne.preprocessing <span class="keyword">import</span> create_eog_epochs, create_ecg_epochs</div><div class="line"></div><div class="line"><span class="comment"># read dataset</span></div><div class="line">raw = mne.io.read_raw_fif(raw_fname, preload=<span class="keyword">True</span>, add_eeg_ref=<span class="keyword">False</span>)</div><div class="line">raw.filter(<span class="number">1</span>, <span class="number">40</span>, n_jobs=<span class="number">2</span>)</div><div class="line">picks_meg = mne.pick_types(raw.info, meg=<span class="keyword">True</span>, eeg=<span class="keyword">False</span>, eog=<span class="keyword">False</span>, stim=<span class="keyword">False</span>, exclude=<span class="string">'bads'</span>)</div><div class="line"></div><div class="line"><span class="comment"># artifact correction</span></div><div class="line"></div><div class="line"><span class="comment">#  fit ica</span></div><div 
class="line">ica = ICA(n_components=<span class="number">25</span>, method=<span class="string">'fastica'</span>, random_state=<span class="number">23</span>)</div><div class="line"></div><div class="line"><span class="comment"># reject segments with extreme artifacts</span></div><div class="line">reject = dict(mag=<span class="number">5e-12</span>, grad=<span class="number">4000e-13</span>)</div><div class="line"></div><div class="line"><span class="comment"># ica fit</span></div><div class="line">ica.fit(raw, picks=picks_meg, decim=decim, reject=reject)</div><div class="line"></div><div class="line"><span class="comment"># plot ica components</span></div><div class="line">ica.plot_components()</div><div class="line"></div><div class="line"><span class="comment"># inspect component properties</span></div><div class="line">ica.plot_properties(raw, picks=<span class="number">0</span>)</div></pre></td></tr></table></figure>
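<p>MNE supports three ICA algorithms: fastica, infomax, and extended-infomax.</p>
<h2 id="剔除眼动伪迹">Removing eye-movement artifacts</h2>
<h3 id="ica">ICA</h3>
<p>If an EOG channel is available: use the correlation coefficient between the EOG signal and each component.</p>
<p>If no EOG channel is available: use the corrmap algorithm.</p>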
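<p>For the case with an EOG channel, the selection rule can be sketched in plain Python: compute the Pearson correlation between the EOG signal and each component time course, then flag the components whose absolute correlation exceeds a threshold. The threshold of 0.8, the function names, and the toy data are assumptions made for illustration only; this is not MNE's implementation.</p>

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def find_eog_components(sources, eog, threshold=0.8):
    """Return the indices of components whose |r| with the EOG exceeds threshold."""
    return [i for i, source in enumerate(sources) if abs(pearson(source, eog)) > threshold]


eog = [0.0, 1.0, 5.0, 1.0, 0.0, 0.0]
sources = [
    [0.1, -0.9, -5.2, -1.1, 0.0, 0.1],   # inverted blink: strongly anti-correlated
    [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],   # oscillation unrelated to the blink
]
bad = find_eog_components(sources, eog)
```

<p>The absolute value matters because the sign of an ICA component is arbitrary, so a blink component can come out inverted, as the first toy source does.</p>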
<h3 id="ssp">SSP</h3>
<p>Requires an EOG channel.</p>
<h2 id="内插坏道">Interpolating bad channels</h2>
<p><code>epoch_clean.interpolate_bads(reset_bads = True/False, mode = 'accurate')</code></p>
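<p>Note: first mark the bad channels and store them under <code>bads</code> in the <code>info</code> object. The data object's <code>interpolate_bads</code> method can then perform the interpolation. The <code>reset_bads</code> argument controls whether the <code>bads</code> entries are removed from <code>info</code> afterwards.</p>
<h2 id="可视化">Visualization</h2>
<h3 id="查看事件相关电位">Viewing event-related potentials</h3>
<h3 id="查看功率谱">Viewing the power spectrum</h3>
<h3 id="查看时頻分析">Viewing time-frequency analyses</h3>
<h2 id="统计检验">Statistical tests</h2>
<p>Non-parametric spatio-temporal cluster permutation test</p>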
<p>Prepare the input data</p>
<p>Set cluster threshold</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div></pre></td><td class="code"><pre><div class="line"><span class="keyword">from</span> scipy.stats <span class="keyword">import</span> t</div><div class="line"><span class="keyword">import</span> numpy <span class="keyword">as</span> np</div><div class="line"></div><div class="line">p_accept = <span class="number">0.01</span></div><div class="line">tail = <span class="number">0.</span> <span class="comment"># for two sided test</span></div><div class="line"></div><div class="line"><span class="comment"># ----------------</span></div><div class="line">p_thresh = p_accept / (<span class="number">1</span> + (tail == <span class="number">0</span>))</div><div class="line">n_samples = len(data)</div><div class="line">threshold = -t.ppf(p_thresh, n_samples - <span class="number">1</span>)</div><div class="line"><span class="keyword">if</span> np.sign(tail) &lt; <span class="number">0</span>:</div><div class="line">    threshold = -threshold</div></pre></td></tr></table></figure>
<p>Make <code>connectivity</code> in <code>spatio_temporal_cluster_stats</code><br>
Build the channel adjacency matrix:</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div></pre></td><td class="code"><pre><div class="line"><span class="keyword">from</span> scipy <span class="keyword">import</span> spatial</div><div class="line"><span class="keyword">import</span> mne</div><div class="line"></div><div class="line">lay = mne.channels.make_eeg_layout(contrast.info)</div><div class="line">neigh = spatial.Delaunay(lay.pos[:, :<span class="number">2</span>]).simplices</div><div class="line">connectivity = mne.surface.mesh_edges(neigh)</div></pre></td></tr></table></figure>
<h1 id="其他高级分析">Other advanced analyses</h1>
<h2 id="并行计算">Parallel computing</h2>
<figure class="highlight python"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div></pre></td><td class="code"><pre><div class="line"><span class="keyword">from</span> mne.parallel <span class="keyword">import</span> parallel_func</div><div class="line"></div><div class="line"><span class="function"><span class="keyword">def</span> <span class="title">run_func</span><span class="params">(subject_id)</span>:</span></div><div class="line">    <span class="keyword">pass</span>  <span class="comment"># per-subject processing code goes here</span></div><div class="line"></div><div class="line"><span class="comment"># run parallel computing</span></div><div class="line">parallel, run_func, _ = parallel_func(run_func, n_jobs=N_JOBS)</div><div class="line">parallel(run_func(subject_id) <span class="keyword">for</span> subject_id <span class="keyword">in</span> range(<span class="number">1</span>, <span class="number">20</span>))</div></pre></td></tr></table></figure>
<h2 id="源定位">Source localization</h2>
<h3 id="make-noise-covariance-matrix">Make noise covariance matrix</h3>
<h3 id="make-digitized-points-of-eeg">Make digitized points of EEG</h3>
<figure class="highlight python"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div></pre></td><td class="code"><pre><div class="line"><span class="keyword">from</span> mne.channels <span class="keyword">import</span> (read_montage,</div><div class="line">                          read_dig_montage)</div><div class="line"></div><div class="line">montage = read_montage(<span class="string">'standard_1020'</span>)</div><div class="line">positions = montage.pos</div><div class="line">labels = montage.ch_names</div><div class="line">labels[<span class="number">2</span>] = <span class="string">'nasion'</span></div><div class="line">digitization = read_dig_montage(hsp=positions,</div><div class="line">                                elp=positions,</div><div class="line">                                point_names=labels,</div><div class="line">                                unit=<span class="string">'mm'</span>)</div><div class="line">raw.set_montage(digitization)</div><div class="line">raw.save()</div></pre></td></tr></table></figure>
<h3 id="make-trans-file">Make trans file</h3>
<p>Coregister an MRI with a subject’s head shape</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><div class="line">1</div></pre></td><td class="code"><pre><div class="line">$ mne coreg <span class="_">-d</span> SUBJECTS_DIR <span class="_">-s</span> SUBJECT <span class="_">-f</span> INST</div></pre></td></tr></table></figure>
<p><code>SUBJECTS_DIR</code> Subjects directory<br>
<code>SUBJECT</code> Subject name<br>
<code>INST</code> FIFF file with digitizer data for coregistration</p>
<p>Tutorial: <a href="http://www.slideshare.net/mne-python/mnepython-coregistration" target="_blank" rel="external">MNE-Python: Coregistration</a></p>
<h3 id="make-the-bem-model">Make the BEM model</h3>
<h3 id="build-the-forward-model-and-inverse-operator">Build the forward model and inverse operator</h3>
<hr>
<p>Hello Florian,</p>
<p>I do source reconstruction with EEG 64 channels and using fsaverage average brain from FreeSurfer too.</p>
<p>I had the same problem. I was trying to get the trans file using mne coreg but my EEG data did not have the dig points. So what I did is <strong>to use the standard 10-20 electrode positions</strong> that I read using montage function, when loading the data.</p>
<p>I <strong>copied the positions of the EEG channels and the fiducial points into the <code>raw.info['dig']</code> structure, saved the new raw file as .fif</strong> and then when I <strong>loaded it in mne coreg</strong> I would <strong>see the digitized points over the fsaverage brain</strong> I previously selected.</p>
<p>Let me know if you don’t know how to do it and I will try to help you but basically you just need to copy the positions into this info structure. <strong>Make sure LPA, RPA and nasion are ordered as dig structure requires</strong>, just in case. You can <strong>load mne sample data set and check how this <code>raw.info['dig']</code> structure has to be</strong>.</p>
<p>Hope it helps although if you need more help I can send you my code or try to be more specific. I am not in the office now and I don’t exactly remember how I did it. But it is basically what I am telling you now.</p>
<p>Marina.</p>
<hr>
<p>In general there is nothing different from MEG and EEG analysis, so<br>
the tutorial would look identical except the initial raw file would<br>
only have EEG data. <strong>The only real additional consideration you have is<br>
that your BEM has a much greater influence on EEG data than on MEG</strong>, so<br>
you may want to think about including special MRI sequences like the<br>
multi-echo FLASH for generating your BEM.</p>
<hr>
<p>use a standard montage with electrodes constrained on a sphere. If you want<br>
to use a BEM model you need to rescale fsaverage. The coreg GUI in mne-python<br>
allows you to do this AFAIK.</p>
<h2 id="连通性">Connectivity</h2>
]]></content>

    <summary type="html">

      Reimplementing in MNE the analysis pipeline I use in EEGLAB.

    </summary>

      <category term="Original" scheme="blog.alexiangli.com/categories/Original/"/>


      <category term="EEG" scheme="blog.alexiangli.com/tags/EEG/"/>

      <category term="Python" scheme="blog.alexiangli.com/tags/Python/"/>

      <category term="MNE" scheme="blog.alexiangli.com/tags/MNE/"/>

  </entry>

  <entry>
    <title>Managing references with zotero</title>
    <link href="blog.alexiangli.com/use-zotero/"/>
    <id>blog.alexiangli.com/use-zotero/</id>
    <published>2016-12-28T16:00:00.000Z</published>
    <updated>2017-01-04T08:22:11.000Z</updated>

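    <content type="html"><![CDATA[<h1 id="什么是-zotero？">What is <a href="https://www.zotero.org/" target="_blank" rel="external">zotero</a>?</h1>
<p>zotero is a free, open-source reference manager. Similar software includes:<br>
Free and open source: JabRef, docear, easybib (apparently closed source)<br>
Commercial (some also offer free or trial tiers): EndNote, mendeley, readcube, colwiz, paperpile (well suited to writing papers in google docs) …</p>
<p>EndNote is said to work best on Windows together with Microsoft Word. I have never used EndNote on Windows, so I cannot comment.<br>
Some also say that JabRef pairs best with LaTeX/BibTeX. I have not used it either and may try it when I get the chance.</p>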
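<h1 id="为什么使用-zotero？">Why use zotero?</h1>
<ul>
<li>Free and open source.</li>
<li>Cross-platform: MacOS/Windows/Linux/Firefox (Extension).</li>
<li>Cloud sync: the 100 MB quota is fairly small, but the data can be synced through another file-sync service instead.</li>
<li>Many plugins; highly customizable.</li>
<li>Supports several office suites: Microsoft Office/LibreOffice/OpenOffice/NeoOffice.</li>
<li>Plain-text workflow: in ODF/RTF documents, citations can be inserted with a <a href="https://zotero-odf-scan.github.io/zotero-odf-scan/" target="_blank" rel="external">marker syntax</a>; the <a href="https://zotero-odf-scan.github.io/zotero-odf-scan/" target="_blank" rel="external">ODF/RTF Scan</a> plugin then scans the document and converts it to the chosen citation style.</li>
<li>Browser plugins for importing citation data: Chrome, Firefox, and Safari are supported.</li>
<li>Supports a large number of citation styles.</li>
<li>Supports note-taking.</li>
<li>Supports groups for team collaboration.</li>
<li>If you like to tinker, you can write plugins to add the features you want.</li>
</ul>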
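<h1 id="为什么不使用-zotero？">Why not use zotero?</h1>
<ul>
<li>You are already used to another reference manager.</li>
<li>You find its cloud storage too small (100 MB), although the data can be synced through another file-sync service.</li>
<li>I cannot think of any other reason not to use zotero, so go ahead and use it.</li>
</ul>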
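<h1 id="使用流程">Workflow</h1>
<p><strong>Register an account</strong><br>
Register on the <a href="https://www.zotero.org/" target="_blank" rel="external">zotero website</a>.</p>
<p><strong>Install the desktop application</strong><br>
Download it from the <a href="https://www.zotero.org/" target="_blank" rel="external">zotero website</a>; all three major operating systems are supported.</p>
<p><strong>Install the browser plugin</strong><br>
The browser plugin is also available from the official website. If you use Chrome and cannot currently get past the firewall to reach the Chrome Web Store, download the plugin package and drag it onto the extensions page. If you can reach the store, install the plugin from there directly.</p>
<p><strong>Get to know the preferences</strong><br>
With any piece of software, it is a good habit to look through the preferences first.<br>
A few zotero preferences are worth changing:</p>
<ul>
<li>Sync &gt; Sync server: log in to your account to sync</li>
<li>Sync &gt; File syncing: turn off attachment syncing</li>
<li>Export &gt; Quick copy: set the format you need</li>
<li>Export &gt; Character encoding: for better compatibility, change the encoding to Unicode (UTF-8)</li>
<li>Cite: install the plugin for your word processor</li>
<li>Advanced &gt; Files and folders: set the data directory to a custom location, which makes it easy to sync the data with a third-party file-sync service</li>
</ul>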
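<p><strong>Add citation data</strong><br>
Citations can be added in several ways:</p>
<ul>
<li>Capture them automatically from web pages with the browser plugin (recommended)</li>
<li>Import pdf files that carry metadata; the data is extracted automatically</li>
<li>Add them manually; many item types are supported</li>
<li>Import citation data files; several formats are supported</li>
</ul>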
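<p><strong>Capture citation data with the browser extension</strong><br>
The Zotero browser extension works on most pages that carry metadata, including journals, books, videos, and ordinary web pages; try it to see what is covered. One caveat: the Zotero desktop app must be running while you capture. Select the target collection first (one of the collections under My Library on the left), because captured items are imported into whichever collection is currently selected in Zotero.</p>
<p><strong>Store copies of attachments</strong><br>
Zotero can copy the PDF that belongs to an item into a single location, which makes syncing and management easier. It can also rename PDF files automatically from their metadata.</p>
<p><strong>Take notes</strong><br>
Zotero supports rich-text notes, either standalone or attached to an item.</p>
<p><strong>Export annotations from PDFs</strong><br>
Once you have highlighted or otherwise annotated a PDF attached to an item, you can extract the annotations in one step as a note under that item, with the position in the PDF preserved. Clicking a position link jumps to the annotation in the PDF. This is one of Zotero's most powerful features, but it is provided by a plugin (ZotFile). Get the plugin from the plugins page on the official site, or search for "zotero" plus the plugin name.</p>
<p><strong>Cross-references / related items</strong><br>
Related items let you manage references more efficiently: while viewing one item you can jump straight to the items related to it. You do have to link the items by hand beforehand.</p>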
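<p><strong>Insert citations quickly</strong><br>
There are several quick ways to get a citation into a document:</p>
<ul>
<li>Copy to the clipboard with a keyboard shortcut, then paste: see "Preferences &gt; Advanced &gt; Shortcuts".</li>
<li>Use "Copy Citation" or "Copy Bibliography" in the Edit menu, then paste.</li>
<li>Right-click the item and choose "Create Bibliography from Selected Item".</li>
<li>Drag the selected item into a text editor window.</li>
</ul>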
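<p><strong>Insert citations in a word processor</strong><br>
First make sure the Zotero plugin for your word processor is installed. Once it is, the Zotero toolbar icons appear in the application. Their location may differ between versions; for example, in Microsoft Word 2015 on the Mac, Zotero is under "Add-ins".<br>
The plugin provides the following:</p>
<ul>
<li>Insert a citation: one or several references at a time.</li>
<li>Add a prefix or suffix to a citation: in (see Author, 2014 for review), "see" is the prefix and "for review" is the suffix.</li>
<li>Suppress the author when inserting a citation, so that only the year is inserted, for cases where the author is already named in the sentence.</li>
<li>Add page numbers to a citation.</li>
<li>Edit an inserted citation or bibliography: edit citation/bibliography</li>
<li>After changing citations, click refresh to update the bibliography</li>
</ul>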
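<p><strong>Create groups for collaboration and sharing</strong><br>
Next to the New Collection button is a New Group button; clicking it opens the Zotero website, where you log in and create the group. Groups can be public or private. A private group can invite members, so a small team or a set of collaborators can manage references together.</p>
<h1 id="结束语">Closing remarks</h1>
<p>With that, Zotero covers almost everything you need for managing references while writing a paper. If you need something more specific, ask on the Zotero forums or write a plugin yourself. Because Zotero is open source, its features should keep growing, and so should its community.</p>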
<h1 id="to-dos">TO-DOs</h1>
<p><strong>Writing papers in plain text</strong></p>
<ul>
<li>LaTeX/BibTeX + zotero</li>
<li>markdown + zotero</li>
<li>org-mode (emacs) + zotero</li>
</ul>
]]></content>

    <summary type="html">

      Notes from my experience using Zotero.

    </summary>

      <category term="Original" scheme="blog.alexiangli.com/categories/Original/"/>


      <category term="zotero" scheme="blog.alexiangli.com/tags/zotero/"/>

      <category term="bibiography" scheme="blog.alexiangli.com/tags/bibiography/"/>

  </entry>

  <entry>
    <title>Google Scholar literature search tips</title>
    <link href="blog.alexiangli.com/use-google-scholar/"/>
    <id>blog.alexiangli.com/use-google-scholar/</id>
    <published>2016-12-13T16:00:00.000Z</published>
    <updated>2016-12-14T10:04:23.000Z</updated>

    <content type="html"><![CDATA[<h1 id="google-scholar-谷歌学术文献检索技巧">Google Scholar literature search tips</h1>
<p>These are my own notes on searching the literature with Google Scholar, written up to share within the lab.</p>
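<h2 id="谷歌学术适合什么样的文献检索需求？">What kinds of searches is Google Scholar good for?</h2>
<ul>
<li>Getting the full text of some papers</li>
<li>Preliminary or exploratory searches in a field</li>
<li>Finding papers by a given author or researcher</li>
<li>Quickly finding a small number of relevant papers</li>
<li>Looking up citation counts, which also include citations from sources that are not peer-reviewed</li>
<li>Finding articles related to a given paper</li>
<li>Subscribing to a researcher's new publications</li>
<li>…</li>
</ul>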
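<h2 id="谷歌学术不适合什么样的文献检索？">What kinds of searches is Google Scholar not good for?</h2>
<ul>
<li>Systematic literature searches, such as those needed for a meta-analysis or a review</li>
<li>Searching by institution</li>
<li>The full text of papers that are not publicly available</li>
<li>Statistical analysis of citation search results</li>
<li>…</li>
</ul>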
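<h2 id="如何确定关键词">Choosing keywords</h2>
<p>Search keywords are usually content words. To search efficiently, settle on specific keywords before you start, and do not search with whole sentences. When choosing keywords, think about synonyms and word forms. Google Scholar expands synonyms and word forms for you to some extent, but having this search strategy in mind still helps you find what you need.</p>
<h2 id="普通检索">Basic search</h2>
<p>For a basic search, just type the keywords. This is simple and suits exploratory searching, where you have no specific target and only want a rough overview. It is, however, a rather inefficient way to retrieve information, so I recommend using qualifiers and search operators.</p>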
<h2 id="高级检索">Advanced search</h2>
<ul>
<li>Time: a year or a range of years</li>
<li>Sorting: by relevance or by date</li>
<li>Title</li>
<li>Author</li>
<li>Journal</li>
</ul>
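<h2 id="如何定位作者？">Finding an author</h2>
<ul>
<li>Use Latin script for English-language papers and Chinese characters for Chinese-language papers</li>
<li>Either the full name or initials work; the full name is more precise, initials match more loosely</li>
<li>Add the qualifier <code>author:</code></li>
<li>Wrap the name in double quotes (ASCII quotes): <code>author:&quot;james gross&quot;</code>, or with an initial, <code>author:&quot;j gross&quot;</code></li>
<li>Or set the search options by hand: use the downward triangle button at the right of the search box</li>
</ul>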
<h2 id="如何限定标题？">Restricting to the title</h2>
<ul>
<li>Use the qualifier <code>intitle:</code> to return only items whose title contains the keyword</li>
<li>Use double quotes: <code>intitle:&quot;emotion regulation&quot;</code></li>
</ul>
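<h2 id="如何限定期刊？">Restricting to a journal</h2>
<p>Google Scholar has no qualifier for journals, but you can type a journal name into the journal field of the advanced search dialog. To search several journals, wrap each journal name in double quotes and separate them with the <code>OR</code> operator, for example <code>&quot;emotion&quot; OR &quot;cognition &amp; emotion&quot;</code>. Alternatively, type <code>(&quot;emotion&quot; OR &quot;cognition &amp; emotion&quot;)</code> straight into the search box; remember the parentheses, which group the terms into one unit.</p>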
<h2 id="如何限定年份">Restricting to a year</h2>
<ul>
<li>Pick the year from the options in the left sidebar.</li>
<li>Type the year into the search box (not recommended).</li>
</ul>
<h2 id="如何检索多个并列的关键词">Searching for several alternative keywords</h2>
<p>Combine <code>OR</code>, <code>()</code>, and <code>&quot; &quot;</code> to build a group of alternative terms, for example <code>(emotion OR mood OR affect)</code>. In this particular case the query works just as well without the <code>()</code>.</p>
<h2 id="如何使用-and-逻辑符">Using the AND operator</h2>
<p><code>AND</code> marks keywords that must appear; combined with <code>OR</code>, it allows much more powerful searches, for example <code>(emotion OR mood OR affect) AND (erp OR eeg OR meg OR fmri OR &quot;functional MRI&quot;)</code>.</p>
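<h2 id="如何限定支持全文获取的文献">Restricting to items with full text</h2>
<ul>
<li>If no <code>pdf</code> label appears to the right of an item, do not assume there is no full-text link. Follow the <code>All n versions</code> link and check whether any version has one.</li>
<li>To return only items with full text, use the qualifier <code>filetype:pdf</code>, for example <code>emotion filetype:pdf</code>.</li>
</ul>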
<h1 id="最后来一个更复杂的例子">Finally, a more complex example</h1>
<p><code>filetype:pdf 2015 (&quot;emotion&quot; OR &quot;emotion review&quot; OR &quot;cognition and emotion&quot; OR &quot;motivation and emotion&quot;) (&quot;emotion regulation&quot; OR &quot;emotion control&quot; OR &quot;cognitive reappraisal&quot; OR &quot;emotion suppression&quot;) AND (&quot;erp&quot; OR &quot;eeg&quot; OR &quot;meg&quot; OR &quot;fmri&quot; OR &quot;functional MRI&quot; OR &quot;electroencephalography&quot; OR &quot;event related potential&quot; OR &quot;magnetoencephalography&quot;)</code></p>
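<p>What it does: returns full-text items from specific emotion-related journals, from 2015, in which the relevant keywords appear anywhere in the text.</p>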
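<p>Long queries like this are easier to maintain when they are assembled from lists of terms. The following is only a sketch in R: <code>or_group</code> is a hypothetical helper written for this post, not something Google Scholar provides, and the term lists are shortened for illustration.</p>

```r
# Hypothetical helper: quote each term and join the terms with OR inside parentheses
or_group <- function(terms) {
  quoted <- paste0('"', terms, '"')   # quotes keep multi-word phrases together
  paste0("(", paste(quoted, collapse = " OR "), ")")
}

journals <- c("emotion", "cognition and emotion")
topics   <- c("emotion regulation", "cognitive reappraisal")
methods  <- c("erp", "eeg", "fmri")

# Same layout as the query above: qualifier, year, journals, topics, AND, methods
query <- paste("filetype:pdf 2015",
               or_group(journals), or_group(topics),
               "AND", or_group(methods))
print(query)
```

<p>The resulting string can be pasted into the search box; adding or removing a term then only means editing one of the vectors.</p>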
<p>Feel free to share and repost, but please credit the source :)</p>
]]></content>

    <summary type="html">

      My own notes on Google Scholar literature search.

    </summary>

      <category term="Original" scheme="blog.alexiangli.com/categories/Original/"/>


      <category term="google scholar" scheme="blog.alexiangli.com/tags/google-scholar/"/>

      <category term="文献检索" scheme="blog.alexiangli.com/tags/%E6%96%87%E7%8C%AE%E6%A3%80%E7%B4%A2/"/>

      <category term="labs" scheme="blog.alexiangli.com/tags/labs/"/>

  </entry>

  <entry>
    <title>Daily summary 2016-12-08</title>
    <link href="blog.alexiangli.com/r-making-contrasts/"/>
    <id>blog.alexiangli.com/r-making-contrasts/</id>
    <published>2016-12-07T16:00:00.000Z</published>
    <updated>2017-03-22T08:23:46.000Z</updated>

    <content type="html"><![CDATA[<h1 id="make-contrasts-in-r">Make contrasts in R<sup class="footnote-ref"><a href="#fn1" id="fnref1">[1]</a></sup></h1>
<p>There are two ways to apply contrasts:</p>
<ul>
<li>Attach the contrasts to the data.frame; they are then used every time that data.frame is used to build a model.</li>
<li>Attach the contrasts to the model; they are then used when that model is tested.</li>
</ul>
<blockquote>
<p>By default, R uses traditional dummy coding (called “treatment contrasts” in R) for any non- ordered factors, and polynomial trend contrasts for any ordered factors. That works out well if you intend to look at regression coefficients.</p>
</blockquote>
<p>By default, R uses treatment contrasts for unordered factors and polynomial trend contrasts for ordered factors. This works well when the goal is to examine regression coefficients.</p>
<blockquote>
<p>Note that traditional dummy coding is fine for regression coefficients, but since traditional dummy codes aren’t orthogonal, it messes things up when you’re just trying to partition variance (i.e. an ANOVA). (Also remember that the default R anova functions use type 1 sums of squares, which is generally not what you want. To get type 3 sums of squares, use the Anova() function from the car package.)</p>
</blockquote>
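<p>The default contrasts are not suitable, however, when you need to partition variance, that is, for an ANOVA. Also, R's ANOVA function <code>aov</code> uses Type I sums of squares, whereas Type III sums of squares are what is usually wanted. <code>car::Anova()</code>, <code>ez::ezANOVA()</code>, and <code>afex::aov_ez()</code> all support Type III sums of squares.</p>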
<blockquote>
<p>R is still using traditional dummy coding (treatment contrasts) behind the scenes here, unlike other stats software (like SPSS) that would switch to effects coding for an ANOVA.</p>
<p>For an ANOVA, you should set your factors to use effects coding, rather than relying on the default treatment codes. You can do that with the <code>contr.sum()</code> function.</p>
<p>As long as they’re orthogonal, they’ll work fine in an ANOVA.</p>
<p>For ANOVAs, effects coding works great, orthogonal contrast coding works great, and traditional dummy coding not so much.</p>
</blockquote>
<p>For an ANOVA, change the contrasts to effects coding or orthogonal contrast coding before running the analysis.</p>
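<p>A minimal sketch of switching to effects coding, using only base R. The data frame <code>d</code>, its factor <code>g</code>, and the response <code>y</code> are made up for illustration. It shows both routes mentioned earlier: attaching the coding to the data frame, and passing it to the model call.</p>

```r
# Toy data (invented): one factor with three levels, four observations each
d <- data.frame(
  g = factor(rep(c("a", "b", "c"), each = 4)),
  y = c(1, 2, 3, 4, 3, 4, 5, 6, 6, 7, 8, 9)
)

# Default for an unordered factor: treatment (dummy) coding
contrasts(d$g)

# Route 1: attach effects coding to the factor in the data frame
contrasts(d$g) <- contr.sum(3)

# Route 2: pass the coding through the contrasts argument of the model call
fit <- lm(y ~ g, data = d, contrasts = list(g = "contr.sum"))

# With sum-to-zero coding and equal group sizes, the intercept is the grand mean
coef(fit)[["(Intercept)"]]
mean(d$y)
```

<p>Either route alone is enough; both appear here only to show the syntax side by side.</p>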
<blockquote>
<p>Anova() won’t show you the individual contrast results, just the overall effect of each factor. You can see the results of each contrast by using the summary() function on the model object.</p>
</blockquote>
<p><code>car::Anova()</code> does not show each individual contrast, only the overall effect of each factor. Pass the model object to <code>summary(modobj)</code> to see the result of each individual contrast.</p>
<blockquote>
<p>If you use aov() instead of lm() to specify the original model, then you’ll need to add a split argument to the summary() call to see the contrast results.</p>
</blockquote>
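<p>If the model was built with <code>aov()</code> rather than <code>lm()</code>, pass a <code>split</code> argument to <code>summary()</code> to see the contrast results: <code>summary(..., split = list(var = list(...), ...))</code>. The <code>split</code> argument does not have to cover every variable or every contrast.</p>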
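<p>As a sketch of the <code>split</code> argument, assuming the same kind of made-up data (<code>d</code>, <code>g</code>, <code>y</code>) and two contrasts whose labels are chosen only for illustration:</p>

```r
# Toy data (invented): one factor with three levels, four observations each
d <- data.frame(
  g = factor(rep(c("a", "b", "c"), each = 4)),
  y = c(1, 2, 3, 4, 3, 4, 5, 6, 6, 7, 8, 9)
)

# Two orthogonal contrasts: a vs b, and (a, b) vs c
contrasts(d$g) <- cbind(a_vs_b = c(-1, 1, 0), ab_vs_c = c(-1, -1, 2))

fit <- aov(y ~ g, data = d)

# Without split: only the omnibus effect of g
summary(fit)

# With split: each label maps to the contrast column(s) it refers to
res <- summary(fit, split = list(g = list("a vs b" = 1, "ab vs c" = 2)))
res
```

<p>Because the two contrasts are orthogonal and the groups are balanced, the sums of squares of the two split rows add up to the sum of squares of <code>g</code>.</p>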
<blockquote>
<p>You need to be careful, though, because the contrasts() function is a sneaky little bastard, as noted above. To apply contrast weights, you’ll need to give it the inverse of your matrix of weights.</p>
</blockquote>
<p>Take care when building custom contrasts, because the <code>contrasts()</code> function has a catch: to apply contrast weights, you have to give it the inverse of the weight matrix.</p>
<p>The steps for building custom contrasts are:</p>
<ol>
<li>Specify the weights for your contrasts (and be sure to check the order of the levels of the factor, so your weights will line up properly).</li>
<li>Create a temporary matrix with each contrast as one row. The top row (for the constant) should be 1/j for j groups. <code>rbind</code> is a convenient way to build this matrix.</li>
<li>Get the inverse of that temporary matrix, using the <code>solve</code> function.</li>
<li>The first column of the inverse will be all 1’s. Drop that first column. The remaining columns are your contrast matrix. You can then pass it when fitting the model through the contrasts argument, <code>contrasts = list(...)</code>.</li>
</ol>
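<p>The four steps can be sketched as follows. The data and the two contrasts (<code>c1</code>, <code>c2</code>) are invented for illustration; only base R functions are used.</p>

```r
# Toy data (invented): group means are a = 2.5, b = 4.5, c = 7.5
d <- data.frame(
  g = factor(rep(c("a", "b", "c"), each = 4)),
  y = c(1, 2, 3, 4, 3, 4, 5, 6, 6, 7, 8, 9)
)
levels(d$g)   # check the level order before writing weights

# Step 1: contrast weights, one weight per level
c1 <- c(-1, 1, 0)        # b minus a
c2 <- c(-0.5, -0.5, 1)   # c minus the mean of a and b

# Step 2: temporary matrix, one contrast per row, constant row 1/j on top (j = 3)
tmp <- rbind(constant = 1 / 3, c1, c2)

# Step 3: invert the temporary matrix
inv <- solve(tmp)

# Step 4: drop the first column (all 1's); the rest is the contrast matrix
cmat <- inv[, -1]

fit <- lm(y ~ g, data = d, contrasts = list(g = cmat))
coef(fit)   # intercept = grand mean, gc1 = b - a, gc2 = c - mean(a, b)
```

<p>With the inverted matrix, the coefficients are directly the intended differences between group means, which is the point of the extra step.</p>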
<blockquote>
<p>If you specify orthogonal contrasts, the regression coefficients for each contrast should just equal the difference between those group means.</p>
</blockquote>
<p>With orthogonal contrasts, the regression coefficient for each contrast should equal the difference between the means of the groups being compared.</p>
<blockquote>
<p>So what happens if you try to run your own contrasts without doing the weird inverse thing? It depends. If your contrasts are orthogonal, then the t-tests and p-values you get for the regression coefficients will all be fine, but your contrast estimates (and corresponding SEs) might not match the difference between group means you expected. If your contrasts are nonorthogonal, then failing to do the weird inverse thing can result in totally garbage estimates and useless t-tests. So you MUST do this inverse thing if you specify nonorthogonal contrasts, but you should probably get in the habbit of doing it anyway for orthogonal ones as well.</p>
</blockquote>
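<p>What happens if you skip the matrix inversion? If the contrasts are orthogonal, the t values and p values for the regression coefficients are fine, but the contrast estimates and their standard errors may not match the difference between the group means being compared. If the contrasts are not orthogonal, the results are invalid. So it is best to invert the weight matrix for all contrasts.</p>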
<blockquote>
<p>you can get j-1 orthogonal contrasts out of a factor with j levels.</p>
<p>Yep. If you want to save time and only specify the contrast(s) you care about, you can do that, and R will come up with some orthogonal contrasts to fill in the rest. What you won’t be able to do is take the inverse of your contrast weights; you can only take the inverse of a square matrix, and if you have fewer than j-1 contrasts, your temporary matrix won’t be square. But remember: as long as your contrasts are orthogonal, your t-tests will all be fine even if you don’t do the inverse thing. So go ahead and just use the contrasts you want directly with the contrasts() function, and be aware that your contrast estimates may not accurately reflect the differences between group means.</p>
</blockquote>
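<p>If you specify fewer than j-1 contrasts, R works out the remaining orthogonal contrasts automatically. In that case the matrix cannot be inverted, but as long as the contrasts are orthogonal the test results can be trusted. Keep in mind that the contrast estimates may not equal the differences between the group means being compared.</p>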
<blockquote>
<p>Note that if you add fewer than j-1 contrasts to the contrasts argument in lm(), it will NOT fill out the remaining contrasts for you. Rather, any group differences other than those represented in your contrast will get lumped into the error term!</p>
</blockquote>
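<p>Note that if fewer than j-1 contrasts are passed to the contrasts argument of <code>lm()</code>, any group differences not represented in those contrasts are absorbed into the error term.</p>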
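<p>The difference between the two behaviours can be seen by counting coefficients. This is a sketch on made-up data; <code>one</code> is a single illustrative contrast for a three-level factor.</p>

```r
# Toy data (invented): one factor with three levels, four observations each
d <- data.frame(
  g = factor(rep(c("a", "b", "c"), each = 4)),
  y = c(1, 2, 3, 4, 3, 4, 5, 6, 6, 7, 8, 9)
)
one <- cbind(b_vs_a = c(-1, 1, 0))   # one contrast only, fewer than j - 1 = 2

# Assigning with contrasts<- fills in the missing column
d_full <- d
contrasts(d_full$g) <- one
ncol(contrasts(d_full$g))
fit_full <- lm(y ~ g, data = d_full)

# The contrasts argument of lm() uses only the column that was supplied
fit_short <- lm(y ~ g, data = d, contrasts = list(g = one))

length(coef(fit_full))    # intercept plus two contrast columns
length(coef(fit_short))   # intercept plus one contrast column
df.residual(fit_short) - df.residual(fit_full)   # the extra df went to the error term
```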
<p><strong>Question: what are orthogonal contrasts, and how do you build them?</strong></p>
<blockquote>
<p><strong>Orthogonal contrasts</strong> are a set of contrasts in which, for any distinct pair, <strong>the sum of the cross-products of the coefficients is zero</strong> (assume sample sizes are equal). Although there are potentially infinite sets of orthogonal contrasts, within any given set there will always be a maximum of exactly k – 1 possible orthogonal contrasts (where k is the number of group means available).</p>
</blockquote>
<blockquote>
<p>There are only k-1 orthogonal comparisons, where k is the number of factor levels.<br>
Comparisons/contrasts orthogonal to each other are statistically independent.<br>
Which of the possible comparisons should we conduct? Well, this very much depends on our hypothesis we have in mind.</p>
</blockquote>
<blockquote>
<p>We need to specify a contrast matrix, showing which comparisons we want to make. A contrast matrix consists of so-called contrast coefficients that (in the end) all have to sum to zero. This means, those things we want to compare have to get the opposite sign (e.g. +1 and –1), while those things we don’t want to compare will receive a value of zero.</p>
</blockquote>
<blockquote>
<p>Orthogonal contrasts are planned, a priori tests that partition the experimental variance cleanly. They are a powerful tool for analyzing data, but they are not appropriate for all experiments. Less restrictive comparisons among treatment means can be performed using various means separation tests, or multiple comparison tests.</p>
</blockquote>
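<ul>
<li>Two contrasts are orthogonal when the sum of the element-wise products of their weight vectors is 0.</li>
<li>A factor with k levels has at most k-1 orthogonal contrasts.</li>
<li>Orthogonal contrasts are statistically independent of each other.</li>
<li>Which contrasts to run depends on the research hypothesis.</li>
<li>Within one contrast, the weights sum to 0: the levels being compared get opposite signs, and all other levels get 0.</li>
</ul>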
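<p>Checking these properties takes a few lines of R. The three weight vectors below are illustrative examples for a factor with three levels.</p>

```r
c1 <- c(-1, 1, 0)    # level 1 vs level 2
c2 <- c(-1, -1, 2)   # levels 1 and 2 vs level 3
c3 <- c(0, -1, 1)    # level 2 vs level 3

sum(c1)        # 0: the weights of a contrast sum to zero
sum(c1 * c2)   # 0: c1 and c2 are orthogonal
sum(c1 * c3)   # -1: c1 and c3 are not orthogonal

# For a whole contrast matrix, every off-diagonal entry of crossprod() should be 0
crossprod(cbind(c1, c2))
```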
<p><strong>Question: which built-in functions does R provide for building contrasts?</strong></p>
<p><code>contr.treatment()</code><br>
<code>contr.helmert()</code><br>
<code>contr.poly()</code><br>
<code>contr.sum()</code><br>
<code>contr.SAS()</code></p>
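<p>Each of these functions takes the number of levels (or the levels themselves) and returns a contrast matrix, so the quickest way to compare them is to print them for a three-level factor:</p>

```r
contr.treatment(3)  # dummy coding, the first level is the reference
contr.sum(3)        # effects (sum-to-zero) coding
contr.helmert(3)    # Helmert coding: level 2 vs level 1, level 3 vs the mean of levels 1 and 2
contr.poly(3)       # orthogonal polynomials: linear and quadratic trends
contr.SAS(3)        # like treatment coding, but the last level is the reference
```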
<hr>
<h1 id="杂项">Miscellaneous</h1>
<blockquote>
<p>calculating post-hoc power are flawed and can produce misleading conclusions. Once a confidence interval has been computed, there is no additional information that a post hoc power calculation can provide. (Hoenig &amp; Heisey 2001)</p>
</blockquote>
<h1 id="翻译">Abstract in my own words</h1>
<blockquote>
<p>We have empirically assessed the distribution of published effect sizes and estimated power by extracting more than 100,000 statistical records from about 10,000 cognitive neuroscience and psychology papers published during the past 5 years. The reported median effect size was d=0.93 (inter-quartile range: 0.64-1.46) for nominally statistically significant results and d=0.24 (0.11-0.42) for non-significant results. Median power to detect small, medium and large effects was 0.12, 0.44 and 0.73, reflecting no improvement through the past half-century. Power was lowest for cognitive neuroscience journals. 14% of papers reported some statistically significant results, although the respective F statistic and degrees of freedom proved that these were non-significant; p value errors positively correlated with journal impact factors. False report probability is likely to exceed 50% for the whole literature. In light of our findings the recently reported low replication success in psychology is realistic and worse performance may be expected for cognitive neuroscience.<sup class="footnote-ref"><a href="#fn2" id="fnref2">[2]</a></sup></p>
</blockquote>
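<p>The authors extracted more than 100,000 statistical records from about 10,000 cognitive neuroscience and psychology papers published in the past five years and assessed the distributions of published effect sizes and estimated statistical power. The median reported effect size was d=0.93 (interquartile range: 0.64-1.46) for nominally significant results and d=0.24 (interquartile range: 0.11-0.42) for non-significant results. Median power to detect small, medium, and large effects was 0.12, 0.44, and 0.73, which indicates that power in these fields has not improved over the past half-century. Power was lowest in cognitive neuroscience journals (the authors looked at medicine, psychology, and cognitive neuroscience). 14% of papers reported some results as statistically significant although the corresponding F statistics and degrees of freedom showed that they were not. (Interestingly,) p value reporting errors correlated positively with journal impact factor. Across the whole literature, the false report probability is likely to exceed 50%. The findings suggest that the low replication rates recently reported in psychology are realistic, and that replication may be even worse in cognitive neuroscience.</p>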
<hr class="footnotes-sep">
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn1" class="footnote-item"><p><a href="http://rstudio-pubs-static.s3.amazonaws.com/65059_586f394d8eb84f84b1baaf56ffb6b47f.html" target="_blank" rel="external">A (sort of) Complete Guide to Contrasts in R</a> <a href="#fnref1" class="footnote-backref">↩</a></p>
</li>
<li id="fn2" class="footnote-item"><p><a href="http://biorxiv.org/content/early/2016/08/25/071530" target="_blank" rel="external">Szucs, D., &amp; Ioannidis, J. P. (2016). Empirical assessment of published effect sizes and power in the recent cognitive neuroscience and psychology literature. bioRxiv, 071530.</a> <a href="#fnref2" class="footnote-backref">↩</a></p>
</li>
</ol>
</section>
]]></content>

    <summary type="html">

      - How to build contrasts in R&lt;br&gt;- Translation of a paper abstract: problems with effect sizes and statistical power in psychology and cognitive neuroscience

    </summary>

      <category term="Daily" scheme="blog.alexiangli.com/categories/Daily/"/>


      <category term="R" scheme="blog.alexiangli.com/tags/R/"/>

      <category term="Statistics" scheme="blog.alexiangli.com/tags/Statistics/"/>

      <category term="Effect-size" scheme="blog.alexiangli.com/tags/Effect-size/"/>

      <category term="Power" scheme="blog.alexiangli.com/tags/Power/"/>

  </entry>

  <entry>
    <title>R: basic data objects</title>
    <link href="blog.alexiangli.com/r-data-type/"/>
    <id>blog.alexiangli.com/r-data-type/</id>
    <published>2016-11-23T16:00:00.000Z</published>
    <updated>2016-11-28T12:26:25.000Z</updated>

    <content type="html"><![CDATA[<h1 id="vector-向量">vector</h1>
<p>A vector consists of elements of a single type.</p>
<h2 id="向量创建">Creating vectors</h2>
<p><code>vector</code> creates a vector of a given length and mode.</p>
<figure class="highlight r"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div></pre></td><td class="code"><pre><div class="line">&gt;&gt;&gt; vector(mode = <span class="string">"character"</span>, length = <span class="number">5</span>)</div><div class="line">[<span class="number">1</span>] <span class="string">""</span> <span class="string">""</span> <span class="string">""</span> <span class="string">""</span> <span class="string">""</span></div></pre></td></tr></table></figure>
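<p><code>as.vector</code> converts its argument to a vector of the given mode. If the result is of an atomic mode, all attributes are dropped, including names. The atomic modes in R are “logical”, “integer”, “numeric” (“double”), “complex”, “character”, and “raw”. As far as I can tell, atomic modes are the most basic modes, from which more complex ones such as “list” and “expression” are built.</p>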
<figure class="highlight r"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div></pre></td><td class="code"><pre><div class="line">&gt;&gt;&gt; x &lt;- c(a = <span class="number">1</span>, b = <span class="number">2</span>)</div><div class="line">a b</div><div class="line"><span class="number">1</span> <span class="number">2</span></div><div class="line">&gt;&gt;&gt; is.vector(x)</div><div class="line">[<span class="number">1</span>] <span class="literal">TRUE</span></div><div class="line">&gt;&gt;&gt; as.vector(x)</div><div class="line">[<span class="number">1</span>] <span class="number">1</span> <span class="number">2</span></div><div class="line">all.equal(x, as.vector(x))</div><div class="line">[<span class="number">1</span>] <span class="string">"names for target but not for current"</span></div></pre></td></tr></table></figure>
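<p><code>c</code> combines its arguments into a vector. All arguments are coerced to a common type, and all attributes except names are dropped. The output type is the highest type among the arguments (NULL &lt; raw &lt; logical &lt; integer &lt; double &lt; complex &lt; character &lt; list &lt; expression).</p>
<p>For example:</p>
<p>Coercion to character.</p>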
<figure class="highlight r"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div></pre></td><td class="code"><pre><div class="line">&gt;&gt;&gt; (out &lt;- c(<span class="number">1</span>, <span class="string">"2"</span>))</div><div class="line">[<span class="number">1</span>] <span class="string">"1"</span> <span class="string">"2"</span></div></pre></td></tr></table></figure>
<p>Coercion to list.</p>
<figure class="highlight r"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div></pre></td><td class="code"><pre><div class="line">&gt;&gt;&gt; (out &lt;- c(<span class="number">1</span>, list(<span class="number">2</span>)))</div><div class="line">[[<span class="number">1</span>]]</div><div class="line">[<span class="number">1</span>] <span class="number">1</span></div><div class="line"></div><div class="line">[[<span class="number">2</span>]]</div><div class="line">[<span class="number">1</span>] <span class="number">2</span></div></pre></td></tr></table></figure>
<h2 id="向量运算">Vector operations</h2>
<p><strong>Indexing</strong><br>
<code>a[1]</code> selects a single element<br>
<code>a[1:2]</code> <code>a[c(1,3)]</code> select several elements<br>
<code>a[a==5]</code> selects with a logical vector<br>
<code>a[-1]</code> <code>a[c(-1,-3)]</code> exclude elements</p>
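<p>A short sketch of these indexing forms on a made-up vector <code>a</code>:</p>

```r
a <- c(5, 8, 5, 2)

a[1]           # 5
a[1:2]         # 5 8
a[c(1, 3)]     # 5 5
a[a == 5]      # 5 5  (keeps the elements where the condition is TRUE)
a[-1]          # 8 5 2
a[c(-1, -3)]   # 8 2
```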
<p><strong>Arithmetic</strong><br>
<code>a*2</code> multiplies the vector by 2; every element is multiplied by 2.<br>
<code>a*b</code> multiplies two vectors element by element. What if <code>a</code> and <code>b</code> differ in length? Try an example:</p>
<figure class="highlight r"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div></pre></td><td class="code"><pre><div class="line">&gt;&gt;&gt; v1 &lt;- c(<span class="number">1</span>,<span class="number">2</span>)</div><div class="line">&gt;&gt;&gt; v2 &lt;- c(<span class="number">3</span>,<span class="number">3</span>,<span class="number">3</span>)</div><div class="line">&gt;&gt;&gt; v3 &lt;- c(<span class="number">4</span>,<span class="number">4</span>,<span class="number">4</span>,<span class="number">4</span>)</div><div class="line">&gt;&gt;&gt; v1*v2</div><div class="line">[<span class="number">1</span>] <span class="number">3</span> <span class="number">6</span> <span class="number">3</span></div><div class="line">Warning message:</div><div class="line">In v1 * v2 : 长的对象长度不是短的对象长度的整倍数</div><div class="line">&gt;&gt;&gt; v1*v3</div><div class="line">[<span class="number">1</span>] <span class="number">4</span> <span class="number">8</span> <span class="number">4</span> <span class="number">8</span></div></pre></td></tr></table></figure>
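<p>In other words, when two vectors of different lengths are multiplied, the shorter one is recycled: its elements are reused from the start until the longer one is exhausted. If the longer length is not a whole multiple of the shorter, R warns that the longer object length is not a multiple of the shorter object length.</p>
<p><code>a%%2</code> gives the remainder of each element divided by 2<br>
<code>a%/%2.4</code> integer-divides each element by 2.4<br>
<code>t(a)</code> transposes <code>a</code></p>
<p><strong>In short: operations on a vector are operations on its elements.</strong></p>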
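<p>A small sketch of these element-wise operators, again on a made-up vector <code>a</code>:</p>

```r
a <- c(5, 8, 11)

a * 2       # 10 16 22
a %% 2      # 1 0 1   (remainder of each element)
a %/% 2.4   # 2 3 4   (integer division of each element)

# t() turns the vector into a matrix with one row
ta <- t(a)
dim(ta)     # 1 3
```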
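<h1 id="matrix-矩阵">matrix</h1>
<h2 id="矩阵创建">Creating matrices</h2>
<p><code>matrix(data = NA, nrow = 1, ncol = 1, byrow = FALSE, dimnames = NULL)</code></p>
<ul>
<li><code>data</code> is a vector; non-atomic objects are coerced with <code>as.vector</code> and lose their attributes.</li>
<li><code>nrow</code> <code>ncol</code> give the number of rows and columns; if neither is given, the result is a one-column matrix.</li>
<li><code>byrow</code> is a logical; if <code>FALSE</code> (the default) the matrix is filled by columns, if <code>TRUE</code> by rows.</li>
<li><code>dimnames</code> is the names attribute of the matrix: <code>NULL</code> or a list of length 2 giving the row names and the column names, in that order. An empty list is treated as <code>NULL</code>. If the list itself has names, they are used as the names of the dimensions.</li>
</ul>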
<figure class="highlight r"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div></pre></td><td class="code"><pre><div class="line">&gt;&gt;&gt; mdat &lt;- matrix(c(<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>,<span class="number">11</span>,<span class="number">12</span>,<span class="number">13</span>), nrow = <span class="number">2</span>, ncol = <span class="number">3</span>, byrow = <span class="literal">TRUE</span>,</div><div class="line">+ dimnames = list(c(<span class="string">"R.1"</span>, <span class="string">"R.2"</span>), c(<span class="string">"C.1"</span>, <span class="string">"C.2"</span>, <span class="string">"C.3"</span>)))</div><div class="line">&gt;&gt;&gt; mdat</div><div class="line">     C.1 C.2 C.3</div><div class="line">row1   <span class="number">1</span>   <span class="number">2</span>   <span class="number">3</span></div><div class="line">row2  <span class="number">11</span>  <span class="number">12</span>  <span class="number">13</span></div></pre></td></tr></table></figure>
<h2 id="矩阵运算">Matrix operations</h2>
<p><code>a[,2]</code> <code>a[1,2]</code> <code>a[c(1,2),]</code> <code>a[-2,]</code> index a matrix or extract a submatrix<br>
<code>rbind(a,b)</code> <code>cbind(a,c)</code> combine two matrices or vectors by rows or by columns</p>
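<p>A short sketch of matrix indexing and binding; the matrices <code>a</code> and <code>b</code> are made up for illustration:</p>

```r
a <- matrix(1:6, nrow = 2)    # filled by column: rows are 1 3 5 and 2 4 6
b <- matrix(7:12, nrow = 2)

a[, 2]         # second column: 3 4
a[1, 2]        # element in row 1, column 2: 3
a[c(1, 2), ]   # rows 1 and 2 (here the whole matrix)
a[-2, ]        # everything except row 2: 1 3 5, returned as a plain vector

dim(rbind(a, b))   # 4 3: b is stacked below a
dim(cbind(a, b))   # 2 6: b is placed to the right of a
```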
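<p><code>a*b</code> is the element-wise product (not the dot product); the two matrices must have the same numbers of rows and columns.<br>
<code>a%*%b</code> is the matrix product; the number of columns of <code>a</code> must equal the number of rows of <code>b</code>.</p>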
<figure class="highlight r"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div></pre></td><td class="code"><pre><div class="line">&gt;&gt;&gt; a &lt;- matrix(<span class="number">1</span>:<span class="number">6</span>, nrow = <span class="number">2</span>)</div><div class="line">&gt;&gt;&gt; b &lt;- matrix(<span class="number">2</span>:<span class="number">7</span>, nrow = <span class="number">3</span>)</div><div class="line">&gt;&gt;&gt; a*b</div><div class="line">Error <span class="keyword">in</span> a * b : non-conformable arrays</div><div class="line">&gt;&gt;&gt; a*t(b)</div><div class="line">     [,<span class="number">1</span>] [,<span class="number">2</span>] [,<span class="number">3</span>]</div><div class="line">[<span class="number">1</span>,]    <span class="number">2</span>    <span class="number">9</span>   <span class="number">20</span></div><div class="line">[<span class="number">2</span>,]   <span class="number">10</span>   <span class="number">24</span>   <span class="number">42</span></div><div class="line">&gt;&gt;&gt; a%*%b</div><div class="line">     [,<span class="number">1</span>] [,<span class="number">2</span>]</div><div class="line">[<span class="number">1</span>,]   <span class="number">31</span>   <span class="number">58</span></div><div class="line">[<span class="number">2</span>,]   <span class="number">40</span>   <span class="number">76</span></div><div class="line">&gt;&gt;&gt; a%*%t(b)</div><div class="line">Error <span class="keyword">in</span> a %*% t(b) : non-conformable arguments</div></pre></td></tr></table></figure>
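<p><code>apply(a, MARGIN, FUN, ...)</code> applies a function over the margins of a matrix or array. If the dimensions have names, <code>MARGIN</code> can also be a character vector of dimension names. Only the matrix case is covered here.</p>
<ul>
<li><code>apply(a, MARGIN = 1, sum)</code> sums each row of matrix a.</li>
<li><code>apply(a, MARGIN = 2, sum)</code> sums each column of matrix a.</li>
<li><code>apply(a, MARGIN = c(1,2), sum)</code> applies <code>sum</code> to each element (every row and column combination), so the result is the original matrix.</li>
</ul>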
<figure class="highlight r"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div></pre></td><td class="code"><pre><div class="line">&gt;&gt;&gt; a &lt;- matrix(<span class="number">1</span>:<span class="number">6</span>, nrow = <span class="number">2</span>)</div><div class="line">&gt;&gt;&gt; apply(a, <span class="number">1</span>, sum)</div><div class="line">[<span class="number">1</span>]  <span class="number">9</span> <span class="number">12</span></div><div class="line">&gt;&gt;&gt; apply(a, <span class="number">2</span>, sum)</div><div class="line">[<span class="number">1</span>]  <span class="number">3</span>  <span class="number">7</span> <span class="number">11</span></div><div class="line">&gt;&gt;&gt; apply(a, c(<span class="number">1</span>,<span class="number">2</span>), sum)</div><div class="line">     [,<span class="number">1</span>] [,<span class="number">2</span>] [,<span class="number">3</span>]</div><div class="line">[<span class="number">1</span>,]    <span class="number">1</span>    <span class="number">3</span>    <span class="number">5</span></div><div class="line">[<span class="number">2</span>,]    <span class="number">2</span>    <span class="number">4</span>    <span class="number">6</span></div></pre></td></tr></table></figure>
<p><code>diag(a)</code> extracts the diagonal of matrix a<br>
<code>diag(1:4)</code> builds a new diagonal matrix<br>
<code>crossprod(a,b)</code> is the matrix cross-product, equivalent to <code>t(a)%*%b</code> but computed more efficiently.</p>
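<p>A quick check of these three functions on two made-up 2 x 2 matrices:</p>

```r
a <- matrix(1:4, nrow = 2)   # rows: 1 3 and 2 4
b <- matrix(5:8, nrow = 2)   # rows: 5 7 and 6 8

diag(a)            # 1 4
dim(diag(1:4))     # 4 4: a diagonal matrix with 1, 2, 3, 4 on the diagonal

cp <- crossprod(a, b)
cp                             # rows: 17 23 and 39 53
all.equal(cp, t(a) %*% b)      # TRUE
```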
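<h1 id="array-数组创建">array: creating arrays</h1>
<p><code>array(data = NA, dim = length(data), dimnames = NULL)</code></p>
<p>An array can hold data with more than two dimensions; a matrix is just the special case of a two-dimensional array. <code>dim</code> gives the length of each dimension. If <code>data</code> is too short, it is recycled to fill the array.</p>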
<figure class="highlight r"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div></pre></td><td class="code"><pre><div class="line">&gt;&gt;&gt; arr &lt;- array(<span class="number">1</span>:<span class="number">12</span>, dim = c(<span class="number">2</span>,<span class="number">2</span>,<span class="number">3</span>))</div><div class="line">&gt;&gt;&gt; arr</div><div class="line">, , <span class="number">1</span></div><div class="line"></div><div class="line">[,<span class="number">1</span>] [,<span class="number">2</span>]</div><div class="line">[<span class="number">1</span>,]    <span class="number">1</span>    <span class="number">3</span></div><div class="line">[<span class="number">2</span>,]    <span class="number">2</span>    <span class="number">4</span></div><div class="line"></div><div class="line">, , <span class="number">2</span></div><div class="line"></div><div class="line">[,<span class="number">1</span>] [,<span class="number">2</span>]</div><div class="line">[<span class="number">1</span>,]    <span class="number">5</span>    <span class="number">7</span></div><div class="line">[<span class="number">2</span>,]    <span class="number">6</span>    <span class="number">8</span></div><div class="line"></div><div class="line">, , <span class="number">3</span></div><div class="line"></div><div class="line">[,<span class="number">1</span>] [,<span class="number">2</span>]</div><div class="line">[<span class="number">1</span>,]    <span class="number">9</span>   <span 
class="number">11</span></div><div class="line">[<span class="number">2</span>,]   <span class="number">10</span>   <span class="number">12</span></div></pre></td></tr></table></figure>
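<h1 id="factor-因子">factor</h1>
<p>A factor is a vector of strings or integers that serves as a discrete variable classifying the elements of another vector of the same length. Unlike an ordinary vector, a factor has a <code>levels</code> attribute. R provides both ordered and unordered factors.</p>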
<figure class="highlight r"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div></pre></td><td class="code"><pre><div class="line">factor(x = character(), levels, labels = levels,</div><div class="line">    exclude = <span class="literal">NA</span>, ordered = is.ordered(x), nmax = <span class="literal">NA</span>)</div></pre></td></tr></table></figure>
<ul>
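<li><code>x</code>: a vector, usually taking a small number of distinct values. Its type must be coercible to character with <code>as.character()</code> and sortable with <code>sort.list</code>.</li>
<li><code>levels</code>: an optional vector of the values that <code>x</code> may take. The default is the unique set of values of <code>as.character(x)</code>, sorted in increasing order. <code>levels</code> may contain fewer values than <code>sort(unique(x))</code>; in that case, any element of <code>x</code> that is not listed in <code>levels</code> is treated as <code>NA</code>.</li>
<li><code>labels</code>: an optional argument that names the factor levels. For example, with <code>labels = &quot;f&quot;</code> the levels are labelled <code>f1, f2, ...</code>.</li>
<li><code>exclude</code>: values to exclude when forming <code>levels</code>. It should have the same type as <code>x</code>; otherwise it is coerced.</li>
<li><code>ordered</code>: whether the levels are ordered. Ordered and unordered factors differ only in their class, but methods and model-fitting functions treat the two very differently.</li>
<li><code>nmax</code>: an upper bound on the number of levels.</li>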
</ul>
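<p><code>factor</code> returns an object of class "factor": a vector of integer codes with the same length as <code>x</code>, plus a "levels" attribute of mode "character". If <code>ordered = TRUE</code>, or if <code>ordered()</code> is used, the object has two classes, <code>c(&quot;ordered&quot;, &quot;factor&quot;)</code>. The interpretation of a factor depends on both its codes and its levels, so take care when comparing factors. Applying <code>as.numeric</code> to a factor is not meaningful, because it returns the internal integer codes rather than the original values, and this can also happen silently through implicit coercion. To recover the original numeric values of a factor <code>f</code>, use <code>as.numeric(levels(f))[f]</code>, which is slightly more efficient than <code>as.numeric(as.character(f))</code>. Levels are sorted by default, but the sort order depends on the locale and may not follow ASCII order. Avoid using <code>NA</code> as a level.</p>
<h2 id="因子的创建和操作">Creating and manipulating factors</h2>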
<figure class="highlight r"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div><div class="line">20</div><div class="line">21</div><div class="line">22</div><div class="line">23</div><div class="line">24</div><div class="line">25</div><div class="line">26</div><div class="line">27</div><div class="line">28</div><div class="line">29</div><div class="line">30</div><div class="line">31</div><div class="line">32</div><div class="line">33</div><div class="line">34</div><div class="line">35</div></pre></td><td class="code"><pre><div class="line">&gt;&gt;&gt; a &lt;- c(<span class="number">1</span>,<span class="number">1</span>,<span class="number">2</span>,<span class="number">2</span>)</div><div class="line">&gt;&gt;&gt; f &lt;- factor(a)</div><div class="line">&gt;&gt;&gt; f</div><div class="line">[<span class="number">1</span>] <span class="number">1</span> <span class="number">1</span> <span class="number">2</span> <span class="number">2</span></div><div class="line">Levels: <span class="number">1</span> <span class="number">2</span></div><div class="line"></div><div class="line"><span class="comment">## Specifying levels explicitly gives the same result</span></div><div class="line">&gt;&gt;&gt; f &lt;- factor(a, levels = c(<span class="string">"1"</span>, <span class="string">"2"</span>))</div><div class="line">&gt;&gt;&gt; f</div><div class="line">[<span class="number">1</span>] <span class="number">1</span> <span class="number">1</span> <span class="number">2</span> <span class="number">2</span></div><div class="line">Levels: <span class="number">1</span> <span class="number">2</span></div><div class="line"></div><div class="line"><span class="comment">## Fewer levels than distinct values in a</span></div><div class="line">&gt;&gt;&gt; f &lt;- factor(a, levels = c(<span class="string">"1"</span>))</div><div class="line">&gt;&gt;&gt; f</div><div class="line">[<span class="number">1</span>] <span class="number">1</span>    <span class="number">1</span>    &lt;<span class="literal">NA</span>&gt; &lt;<span class="literal">NA</span>&gt;</div><div class="line">Levels: <span class="number">1</span></div><div class="line"></div><div class="line"><span class="comment">## Add labels</span></div><div class="line">&gt;&gt;&gt; f &lt;- factor(a, labels = <span class="string">"x"</span>)</div><div class="line">&gt;&gt;&gt; f</div><div class="line">[<span class="number">1</span>] x1 x1 x2 x2</div><div class="line">Levels: x1 x2</div><div class="line"></div><div class="line"><span class="comment">## Make the factor ordered</span></div><div class="line">&gt;&gt;&gt; f &lt;- factor(a, labels = <span class="string">"x"</span>, ordered = <span class="literal">TRUE</span>)</div><div class="line">&gt;&gt;&gt; f</div><div class="line">[<span class="number">1</span>] x1 x1 x2 x2</div><div class="line">Levels: x1 &lt; x2</div><div class="line"></div><div class="line"><span class="comment">## Change the order of the levels</span></div><div class="line">&gt;&gt;&gt; f &lt;- factor(a, levels = c(<span class="string">"2"</span>, <span class="string">"1"</span>), labels = <span class="string">"x"</span>, ordered = <span class="literal">TRUE</span>)</div><div class="line">&gt;&gt;&gt; f</div><div class="line">[<span class="number">1</span>] x2 x2 x1 x1</div><div class="line">Levels: x1 &lt; x2</div></pre></td></tr></table></figure>
<figure class="highlight r"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div><div class="line">20</div><div class="line">21</div><div class="line">22</div><div class="line">23</div><div class="line">24</div><div class="line">25</div><div class="line">26</div><div class="line">27</div><div class="line">28</div><div class="line">29</div><div class="line">30</div><div class="line">31</div><div class="line">32</div><div class="line">33</div><div class="line">34</div><div class="line">35</div><div class="line">36</div><div class="line">37</div><div class="line">38</div><div class="line">39</div><div class="line">40</div><div class="line">41</div><div class="line">42</div><div class="line">43</div><div class="line">44</div><div class="line">45</div><div class="line">46</div><div class="line">47</div><div class="line">48</div><div class="line">49</div><div class="line">50</div></pre></td><td class="code"><pre><div class="line"><span class="comment">## Build a data.frame</span></div><div class="line">&gt;&gt;&gt; c.1 &lt;- rep(c(<span class="string">"f1"</span>, <span class="string">"f2"</span>), each = <span class="number">3</span>)</div><div class="line">&gt;&gt;&gt; c.2 &lt;- <span class="number">1</span>:<span class="number">6</span></div><div class="line">&gt;&gt;&gt; d &lt;- data.frame(c.1, c.2)</div><div class="line">&gt;&gt;&gt; d</div><div class="line">  c.1 c.2</div><div class="line"><span class="number">1</span>  f1   <span class="number">1</span></div><div class="line"><span class="number">2</span>  f1   <span class="number">2</span></div><div class="line"><span class="number">3</span>  f1   <span class="number">3</span></div><div class="line"><span class="number">4</span>  f2   <span class="number">4</span></div><div class="line"><span class="number">5</span>  f2   <span class="number">5</span></div><div class="line"><span class="number">6</span>  f2   <span class="number">6</span></div><div class="line"></div><div class="line"><span class="comment">## Convert c.1 to a factor</span></div><div class="line">&gt;&gt;&gt; (d$c.1 &lt;- factor(c.1))</div><div class="line">[<span class="number">1</span>] f1 f1 f1 f2 f2 f2</div><div class="line">Levels: f1 f2</div><div class="line"></div><div class="line"><span class="comment">## Or use as.factor</span></div><div class="line">&gt;&gt;&gt; as.factor(d$c.1)</div><div class="line">[<span class="number">1</span>] f1 f1 f1 f2 f2 f2</div><div class="line">Levels: f1 f2</div><div class="line"></div><div class="line"><span class="comment">## Make it ordered</span></div><div class="line">&gt;&gt;&gt; factor(d$c.1, levels = c(<span class="string">"f2"</span>, <span class="string">"f1"</span>), ordered = <span class="literal">TRUE</span>)</div><div class="line">[<span class="number">1</span>] f1 f1 f1 f2 f2 f2</div><div class="line">Levels: f2 &lt; f1</div><div class="line"></div><div class="line">&gt;&gt;&gt; factor(d$c.1, levels = c(<span class="string">"f1"</span>, <span class="string">"f2"</span>), ordered = <span class="literal">TRUE</span>)</div><div class="line">[<span class="number">1</span>] f1 f1 f1 f2 f2 f2</div><div class="line">Levels: f1 &lt; f2</div><div class="line"></div><div class="line"><span class="comment">## Or use ordered()</span></div><div class="line">&gt;&gt;&gt; ordered(d$c.1, levels = c(<span class="string">"f1"</span>, <span class="string">"f2"</span>))</div><div class="line">[<span class="number">1</span>] f1 f1 f1 f2 f2 f2</div><div class="line">Levels: f1 &lt; f2</div><div class="line"></div><div class="line">&gt;&gt;&gt; ordered(d$c.1, levels = c(<span class="string">"f2"</span>, <span class="string">"f1"</span>))</div><div class="line">[<span class="number">1</span>] f1 f1 f1 f2 f2 f2</div><div class="line">Levels: f2 &lt; f1</div><div class="line"></div><div class="line"><span class="comment">## Coerce the factor to other modes</span></div><div class="line">&gt;&gt;&gt; as.character(d$c.1)</div><div class="line">[<span class="number">1</span>] <span class="string">"f1"</span> <span class="string">"f1"</span> <span class="string">"f1"</span> <span class="string">"f2"</span> <span class="string">"f2"</span> <span class="string">"f2"</span></div><div class="line"></div><div class="line">&gt;&gt;&gt; as.numeric(d$c.1)</div><div class="line">[<span class="number">1</span>] <span class="number">1</span> <span class="number">1</span> <span class="number">1</span> <span class="number">2</span> <span class="number">2</span> <span class="number">2</span></div><div class="line"></div><div class="line">&gt;&gt;&gt; as.logical(d$c.1)</div><div class="line">[<span class="number">1</span>] <span class="literal">NA</span> <span class="literal">NA</span> <span class="literal">NA</span> <span class="literal">NA</span> <span class="literal">NA</span> <span class="literal">NA</span></div></pre></td></tr></table></figure>
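<h1 id="list-列表">Lists</h1>
<p>A list is one of R's most flexible data types: its elements can be arbitrary objects of different types. This makes lists well suited to packaging the output of a function. List elements can be accessed with <code>$</code>, <code>[]</code>, and <code>[[]]</code>: <code>[]</code> returns a sub-list, whereas <code>$</code> and <code>[[]]</code> return the element itself.</p>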
<p><code>l = list(tag1 = value1, tag2 = value2, ..., tagn = valuen)</code></p>
<figure class="highlight r"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div><div class="line">20</div><div class="line">21</div><div class="line">22</div><div class="line">23</div><div class="line">24</div><div class="line">25</div></pre></td><td class="code"><pre><div class="line">&gt;&gt;&gt; x = list(a = <span class="number">1</span>:<span class="number">10</span>, beta = exp(-<span class="number">3</span>:<span class="number">3</span>),</div><div class="line">+ logic = c(<span class="literal">TRUE</span>,<span class="literal">FALSE</span>,<span class="literal">FALSE</span>,<span class="literal">TRUE</span>))</div><div class="line"></div><div class="line">&gt;&gt;&gt; x</div><div class="line">$a</div><div class="line">[<span class="number">1</span>]  <span class="number">1</span>  <span class="number">2</span>  <span class="number">3</span>  <span class="number">4</span>  <span class="number">5</span>  <span class="number">6</span>  <span class="number">7</span>  <span class="number">8</span>  <span class="number">9</span> <span class="number">10</span></div><div class="line"></div><div class="line">$beta</div><div class="line">[<span class="number">1</span>]  <span class="number">0.04978707</span>  <span class="number">0.13533528</span>  <span class="number">0.36787944</span>  <span class="number">1.00000000</span>  <span class="number">2.71828183</span>  <span class="number">7.38905610</span></div><div class="line">[<span class="number">7</span>] <span class="number">20.08553692</span></div><div 
class="line"></div><div class="line">$logic</div><div class="line">[<span class="number">1</span>]  <span class="literal">TRUE</span> <span class="literal">FALSE</span> <span class="literal">FALSE</span>  <span class="literal">TRUE</span></div><div class="line"></div><div class="line">&gt;&gt;&gt; x$a</div><div class="line">[<span class="number">1</span>]  <span class="number">1</span>  <span class="number">2</span>  <span class="number">3</span>  <span class="number">4</span>  <span class="number">5</span>  <span class="number">6</span>  <span class="number">7</span>  <span class="number">8</span>  <span class="number">9</span> <span class="number">10</span></div><div class="line">&gt;&gt;&gt; x[<span class="number">1</span>]</div><div class="line">$a</div><div class="line">[<span class="number">1</span>]  <span class="number">1</span>  <span class="number">2</span>  <span class="number">3</span>  <span class="number">4</span>  <span class="number">5</span>  <span class="number">6</span>  <span class="number">7</span>  <span class="number">8</span>  <span class="number">9</span> <span class="number">10</span></div><div class="line"></div><div class="line">&gt;&gt;&gt; x[[<span class="number">1</span>]]</div><div class="line">[<span class="number">1</span>]  <span class="number">1</span>  <span class="number">2</span>  <span class="number">3</span>  <span class="number">4</span>  <span class="number">5</span>  <span class="number">6</span>  <span class="number">7</span>  <span class="number">8</span>  <span class="number">9</span> <span class="number">10</span></div><div class="line">&gt;&gt;&gt; x[[<span class="number">1</span>]][<span class="number">1</span>]</div><div class="line">[<span class="number">1</span>] <span class="number">1</span></div><div class="line">&gt;&gt;&gt;</div></pre></td></tr></table></figure>
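<p>In the <code>apply</code> family, <code>lapply</code> applies a function to every element of a list and returns a list. <code>sapply</code> also accepts a list, but by default it simplifies the result to a vector (or matrix) where possible; with <code>simplify = FALSE</code> it returns a list.</p>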
<figure class="highlight r"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div><div class="line">20</div><div class="line">21</div><div class="line">22</div><div class="line">23</div><div class="line">24</div><div class="line">25</div><div class="line">26</div><div class="line">27</div><div class="line">28</div></pre></td><td class="code"><pre><div class="line">&gt;&gt;&gt; lapply(x, mean)</div><div class="line">$a</div><div class="line">[<span class="number">1</span>] <span class="number">5.5</span></div><div class="line"></div><div class="line">$beta</div><div class="line">[<span class="number">1</span>] <span class="number">4.535125</span></div><div class="line"></div><div class="line">$logic</div><div class="line">[<span class="number">1</span>] <span class="number">0.5</span></div><div class="line"></div><div class="line">&gt;&gt;&gt; sapply(x, mean)</div><div class="line">       a     beta    logic</div><div class="line"><span class="number">5.500000</span> <span class="number">4.535125</span> <span class="number">0.500000</span></div><div class="line"></div><div class="line">&gt;&gt;&gt; class(lapply(x,mean))</div><div class="line">[<span class="number">1</span>] <span class="string">"list"</span></div><div class="line"></div><div class="line">&gt;&gt;&gt; class(sapply(x,mean))</div><div class="line">[<span class="number">1</span>] <span class="string">"numeric"</span></div><div class="line"></div><div class="line">&gt;&gt;&gt; is.matrix(sapply(x,mean))</div><div class="line">[<span 
class="number">1</span>] <span class="literal">FALSE</span></div><div class="line"></div><div class="line">&gt;&gt;&gt; is.vector(sapply(x,mean))</div><div class="line">[<span class="number">1</span>] <span class="literal">TRUE</span></div><div class="line"></div><div class="line">&gt;&gt;&gt; is.list(sapply(x,mean,simplify = <span class="literal">FALSE</span>))</div><div class="line">[<span class="number">1</span>] <span class="literal">TRUE</span></div></pre></td></tr></table></figure>
<p>Other list-related functions</p>
<p><code>unlist(x, recursive = TRUE, use.names = TRUE)</code> flattens a list into a vector.</p>
<figure class="highlight r"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div></pre></td><td class="code"><pre><div class="line">&gt;&gt;&gt; unlist(x)</div><div class="line">         a1          a2          a3          a4          a5          a6</div><div class="line"> <span class="number">1.00000000</span>  <span class="number">2.00000000</span>  <span class="number">3.00000000</span>  <span class="number">4.00000000</span>  <span class="number">5.00000000</span>  <span class="number">6.00000000</span></div><div class="line">         a7          a8          a9         a10       beta1       beta2</div><div class="line"> <span class="number">7.00000000</span>  <span class="number">8.00000000</span>  <span class="number">9.00000000</span> <span class="number">10.00000000</span>  <span class="number">0.04978707</span>  <span class="number">0.13533528</span></div><div class="line">      beta3       beta4       beta5       beta6       beta7      logic1</div><div class="line"> <span class="number">0.36787944</span>  <span class="number">1.00000000</span>  <span class="number">2.71828183</span>  <span class="number">7.38905610</span> <span class="number">20.08553692</span>  <span class="number">1.00000000</span></div><div class="line">     logic2      logic3      logic4</div><div class="line"> <span class="number">0.00000000</span>  <span class="number">0.00000000</span>  <span class="number">1.00000000</span></div></pre></td></tr></table></figure>
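<p><code>as.list(x, all.names = FALSE, sorted = FALSE, ...)</code> converts an object to a list.<br>
<code>is.list(x)</code> tests whether an object is a list.</p>
<h1 id="data-frame-数据框">Data frames</h1>
<p>The data.frame is one of the most important data types in R. It looks like a matrix but, like a list, can hold columns of different types, with one constraint: every column must have the same length. Elements of a data.frame are accessed in much the same way as those of a matrix or a list.</p>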
<figure class="highlight r"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div></pre></td><td class="code"><pre><div class="line">data.frame(<span class="keyword">...</span>, row.names = <span class="literal">NULL</span>, check.rows = <span class="literal">FALSE</span>,</div><div class="line">           check.names = <span class="literal">TRUE</span>,</div><div class="line">           stringsAsFactors = default.stringsAsFactors())</div></pre></td></tr></table></figure>
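<p>Arguments</p>
<ul>
<li><code>...</code>: arguments of the form value or tag = value, where value is the data and tag is the column name.</li>
<li><code>row.names</code>: the row names.</li>
<li><code>check.rows</code>: whether to check the rows for consistency of length and names.</li>
<li><code>check.names</code>: whether to check that the column names are syntactically valid and not duplicated, adjusting them with <code>make.names</code> if necessary.</li>
<li><code>stringsAsFactors</code>: whether character vectors are converted to factors. To protect an object from being converted, wrap it in <code>I(a)</code>. The signature shown above comes from R versions before 4.0.0; since R 4.0.0 the default is <code>FALSE</code>.</li>
</ul>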
<figure class="highlight r"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div><div class="line">20</div><div class="line">21</div><div class="line">22</div><div class="line">23</div><div class="line">24</div><div class="line">25</div><div class="line">26</div><div class="line">27</div><div class="line">28</div><div class="line">29</div><div class="line">30</div><div class="line">31</div><div class="line">32</div><div class="line">33</div><div class="line">34</div><div class="line">35</div><div class="line">36</div><div class="line">37</div><div class="line">38</div></pre></td><td class="code"><pre><div class="line">&gt;&gt;&gt; d &lt;- data.frame(c1 = rep(c(<span class="string">"f1"</span>, <span class="string">"f2"</span>), each = <span class="number">3</span>), c2 = <span class="number">1</span>:<span class="number">6</span>)</div><div class="line">&gt;&gt;&gt; d</div><div class="line">  c1 c2</div><div class="line"><span class="number">1</span> f1  <span class="number">1</span></div><div class="line"><span class="number">2</span> f1  <span class="number">2</span></div><div class="line"><span class="number">3</span> f1  <span class="number">3</span></div><div class="line"><span class="number">4</span> f2  <span class="number">4</span></div><div class="line"><span class="number">5</span> f2  <span class="number">5</span></div><div class="line"><span class="number">6</span> f2  <span class="number">6</span></div><div class="line"></div><div class="line">&gt;&gt;&gt; d$c1</div><div class="line">[<span 
class="number">1</span>] f1 f1 f1 f2 f2 f2</div><div class="line">Levels: f1 f2</div><div class="line"></div><div class="line">&gt;&gt;&gt; d[<span class="number">1</span>]</div><div class="line">c1</div><div class="line"><span class="number">1</span> f1</div><div class="line"><span class="number">2</span> f1</div><div class="line"><span class="number">3</span> f1</div><div class="line"><span class="number">4</span> f2</div><div class="line"><span class="number">5</span> f2</div><div class="line"><span class="number">6</span> f2</div><div class="line"></div><div class="line">&gt;&gt;&gt; d[<span class="number">1</span>,<span class="number">2</span>]</div><div class="line">[<span class="number">1</span>] <span class="number">1</span></div><div class="line"></div><div class="line">&gt;&gt;&gt; d[<span class="string">"c1"</span>]</div><div class="line">c1</div><div class="line"><span class="number">1</span> f1</div><div class="line"><span class="number">2</span> f1</div><div class="line"><span class="number">3</span> f1</div><div class="line"><span class="number">4</span> f2</div><div class="line"><span class="number">5</span> f2</div><div class="line"><span class="number">6</span> f2</div><div class="line"></div><div class="line">&gt;&gt;&gt; d[<span class="string">"c1"</span>][<span class="number">1</span>,]</div><div class="line">[<span class="number">1</span>] f1</div><div class="line">Levels: f1 f2</div></pre></td></tr></table></figure>
]]></content>

    <summary type="html">

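      This post summarizes the basic data objects in R: vectors, matrices, multi-dimensional arrays, factors, lists, and data frames. Almost all of the content comes from reading R's built-in documentation.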

    </summary>

      <category term="Original" scheme="blog.alexiangli.com/categories/Original/"/>

      <category term="R" scheme="blog.alexiangli.com/categories/Original/R/"/>


      <category term="R" scheme="blog.alexiangli.com/tags/R/"/>

      <category term="R-basics" scheme="blog.alexiangli.com/tags/R-basics/"/>

  </entry>

  <entry>
    <title>Analyzing EGI data with EEGLAB</title>
    <link href="blog.alexiangli.com/eeg-eeglab-egi/"/>
    <id>blog.alexiangli.com/eeg-eeglab-egi/</id>
    <published>2016-11-20T16:00:00.000Z</published>
    <updated>2017-07-07T07:12:32.000Z</updated>

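    <content type="html"><![CDATA[<p>As a general-purpose EEG analysis toolbox, EEGLAB can in principle handle data recorded by any EEG system. The first question is how to convert and import the data when EEGLAB has no native import for a given EGI (Electrical Geodesics Incorporated) format. The second is how to obtain and load a matching electrode location file. Apart from importing the raw data and loading the electrode locations, the analysis is the same as for data in any other format.</p>
<h1 id="数据导入">Importing data</h1>
<p>Native EGI files from the 2009 version are not compatible with EEGLAB and must be converted to Netstation binary files (Netstation binary simple). For newer versions of EGI files there are two options:</p>
<ul>
<li>use an EEGLAB plugin;</li>
<li>use File-IO.</li>
</ul>
<p>EGI files can also be converted to EDF and then imported into EEGLAB, but this corrupts or loses the events (marks). Converting to Netstation binary files is therefore the better choice.</p>
<p>The <code>pop_readegi</code> function reads EGI version 2 and version 3 data files. In EGI files, events are stored in dedicated EGI data channels. These channels are imported automatically into the EEGLAB event table and then removed from the data. If importing the event channels fails, the event information can be extracted manually (<code>File &gt; Import event info &gt; From data channel</code>).</p>
<p>The current native Netstation files cannot be imported directly into EEGLAB/Matlab. Use the Netstation software to convert the data to Netstation binary files. Netstation can also export to EDF, but the channels that store event information are then lost.</p>
<p>If a single subject's recording is split into several segment files, they can be imported together in EEGLAB with <code>File &gt; Import data &gt; From multiple seg. Netstation files</code>.</p>
<p>Netstation can also export the data as Matlab files, which EEGLAB can then import with <code>File &gt; Import data &gt; From Netstation epoch Matlab files</code>. This route still loses a lot of information.</p>
<p><strong>Summary: export the data from Netstation as binary files first, then import them with EEGLAB.</strong></p>
<h1 id="电极位置文件">Electrode location files</h1>
<p>Electrode location files can be obtained from the <a href="ftp://ftp.egi.com/pub/support/3rdPartySoftwareSupport/BESA/" target="_blank" rel="external">EGI ftp site</a>. EEGLAB also ships with some electrode location files: the <code>GSN*.sfp</code> files in the <code>sample_locs</code> folder under the EEGLAB root directory.</p>
<p>References:</p>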
<ul>
<li><a href="https://sccn.ucsd.edu/wiki/A01:_Importing_Continuous_and_Epoched_Data#Importing_Netstation.2FEGI_files" target="_blank" rel="external">Importing Continuous and Epoched Data: Importing Netstation/EGI files</a></li>
<li><a href="https://sccn.ucsd.edu/wiki/Channel_Location_Files" target="_blank" rel="external">Channel Location Files</a></li>
</ul>
]]></content>

    <summary type="html">

      I recently needed to help a visiting PhD student analyze EEG data recorded with an EGI system, so I am noting down how to import EGI recordings into EEGLAB.

    </summary>

      <category term="Original" scheme="blog.alexiangli.com/categories/Original/"/>

      <category term="EEG" scheme="blog.alexiangli.com/categories/Original/EEG/"/>


      <category term="EEG" scheme="blog.alexiangli.com/tags/EEG/"/>

      <category term="EGI" scheme="blog.alexiangli.com/tags/EGI/"/>

      <category term="EEGLAB" scheme="blog.alexiangli.com/tags/EEGLAB/"/>

  </entry>

  <entry>
    <title>Suit care tips</title>
    <link href="blog.alexiangli.com/life-clean-suit/"/>
    <id>blog.alexiangli.com/life-clean-suit/</id>
    <published>2016-11-20T16:00:00.000Z</published>
    <updated>2016-11-28T12:40:42.000Z</updated>

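    <content type="html"><![CDATA[<p>Note: this content is compiled from:</p>
<ul>
<li><a href="https://www.zhihu.com/question/23243060" target="_blank" rel="external">Zhihu: How should a suit be cared for and cleaned day to day? Answers by 普先生, 赵远方, and others</a></li>
<li><a href="http://www.merino.com/cn/wool/care-instructions/cleaning-a-wool-suit/" target="_blank" rel="external">Merino: Cleaning a wool suit</a></li>
</ul>
<h1 id="清洗">Cleaning</h1>
<ul>
<li><strong>Dry cleaning</strong>. Dry-clean only when necessary. Use a specialized, reliable dry cleaner to reduce the risk of the wool being stained by other dyes or scorched.</li>
<li><strong>Dry-clean as needed, about once a year</strong>. A typical suit lasts at most 4 to 5 years, and even top brands only guarantee that the fabric will not bubble within 6 dry cleanings. Never dry-clean a suit more than three times a year.</li>
<li><strong>Remove stains promptly</strong>. Treat small stains with spot cleaning. As soon as possible after the stain appears, dab it gently with a cloth dampened with water or a little detergent. Do not rub hard, or the stain will work its way into the fabric. When sending the suit to the dry cleaner, point out where the stains are so that none is overlooked and left permanently.</li>
<li><strong>Use a dedicated suit brush to remove dust</strong>. Choose a brush made of real animal hair if possible. Dust makes a suit look tired. When the suit picks up stray fibers or dirt that is hard to brush off, lift it with adhesive tape.</li>
</ul>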
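<h1 id="穿着频率">How often to wear</h1>
<p><strong>Avoid over-wearing: do not wear the same suit for more than two days in a row</strong>. After being worn, a suit is slightly deformed by local tension and needs to "rest", so keep two or three suits in rotation. One jacket can be paired with two pairs of trousers that are worn alternately.</p>
<h1 id="悬挂">Hanging</h1>
<ul>
<li>Hang the suit on a dedicated wooden hanger (wide, with rounded ends, as wide as the shoulders). When the season changes, store it in a bag or box to keep out moisture and dust. Avoid hangers that are too narrow, which ruin the shoulder line.</li>
<li>Trousers can be hung on a combined jacket-and-trouser hanger, or on a trouser hanger with clips: clip the hems and let the trousers hang upside down, which helps keep the crease and the shape over time.<br>
<img src="/img/care-suit/hang-suit.jpg" alt="How to hang suit trousers"></li>
</ul>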
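<h1 id="折叠">Folding</h1>
<p><img src="/img/care-suit/fold-suit-1.jpg" alt="Folding a suit, method 1"><br>
<img src="/img/care-suit/fold-suit-2.jpg" alt="Folding a suit, method 2"></p>
<h1 id="熨烫">Pressing</h1>
<ul>
<li><strong>Never iron directly; steam instead</strong>. Set the temperature, select the steam setting, and steam gently (a high-quality garment steamer is not cheap). Point the steam iron at the suit and give it a generous amount of steam.</li>
<li><strong>Pay special attention to the areas that move most</strong>, such as the elbows, knees, sleeves, and waist. Where the creasing is heavy, let the fabric absorb plenty of steam and then gently pull the creased area flat. Pull evenly in all directions rather than in just one.</li>
<li><strong>Press the trouser crease through a thin cotton cloth</strong>. Lay the cloth over the crease and move the iron slowly. Do not remove the cloth right after pressing; wait until it has cooled.<br>
<img src="/img/care-suit/press-suit.jpg" alt="Tips for pressing a suit"></li>
</ul>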
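<h1 id="防潮">Protecting against damp</h1>
<ul>
<li>Air-dry the suit before putting it into storage.</li>
<li>Ventilate and air the suit from time to time while it is in storage.</li>
<li>Store it in a dry, well-ventilated place.</li>
<li>In a humid storage room, use a desiccant. Sew a small bag from clean white gauze, fill it with lumps of calcium chloride, seal it, and place it in the wardrobe. Never let the desiccant bag touch the clothes, and check regularly that it is still working.</li>
<li>Hang the suit in a moisture-proof garment bag (fabric ones are best).</li>
</ul>
<h1 id="防虫">Protecting against moths</h1>
<ul>
<li>Wrap mothballs in white paper or light-colored gauze and spread them around the edges of the box or wardrobe, or put them in a small cloth bag and hang it inside the wardrobe.</li>
<li>Watch the amount of moth repellent you use. It is enough if you can just smell the mothballs inside the box or wardrobe.</li>
</ul>
<h1 id="去亮光">Removing shine</h1>
<p>A suit that has been worn for a long time (especially one in a smooth-faced fabric) tends to go shiny at the elbows and knees. Fill a basin halfway with clean water, add a few drops of vinegar, wet a towel in it, and wipe the shiny area a few times in one direction to remove the shine.</p>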
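<h1 id="其他">Other tips</h1>
<ul>
<li>Read the care label on the suit carefully and make sure you understand its symbols and instructions.</li>
<li>Do not pile clothes on top of each other, so that the suit does not crease or pick up color from other garments.</li>
<li>When working at a desk, take the jacket off so that the cuffs and elbows do not go shiny from rubbing against the desk for long periods.</li>
<li>Leave the outer pockets stitched shut so that the jacket keeps its shape and lies flat; carry your wallet and phone in the inside pockets.</li>
<li>A wool suit that has been worn for a long time can be left overnight in an environment with 35% - 40% relative humidity (such as a bathroom after a shower) to remove fine wrinkles.</li>
</ul>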
]]></content>

    <summary type="html">

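      For upcoming job interviews I spent "serious money" on a casual suit, which is a bit expensive for a student who has not graduated and has no regular income yet. So I collected some material on suit care from the web and organized it here for later review. Nothing should be taken for granted; everyday skills need to be learned too.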

    </summary>

      <category term="Notes" scheme="blog.alexiangli.com/categories/Notes/"/>


      <category term="西服保养" scheme="blog.alexiangli.com/tags/%E8%A5%BF%E6%9C%8D%E4%BF%9D%E5%85%BB/"/>

      <category term="Life style" scheme="blog.alexiangli.com/tags/Life-style/"/>

  </entry>

  <entry>
    <title>Answers to database interview questions</title>
    <link href="blog.alexiangli.com/sql-interview/"/>
    <id>blog.alexiangli.com/sql-interview/</id>
    <published>2016-11-18T16:00:00.000Z</published>
    <updated>2016-11-28T12:42:11.000Z</updated>

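    <content type="html"><![CDATA[<p>Source: <a href="https://zhuanlan.zhihu.com/p/23713529" target="_blank" rel="external">Zhihu column 学习编程: Common interview questions, database edition</a></p>
<h1 id="存储过程">Stored procedures</h1>
<p>What is a stored procedure? What are its pros and cons?</p>
<p>Stored procedure: a precompiled block of code (for example T-SQL) that implements a series of operations (inserts, deletes, updates, and queries on one or more tables) and can then be called conveniently.<br>
Advantages: relatively efficient execution; less network traffic and therefore more efficient communication; some degree of data security.</p>
<p>My question: how do you write a stored procedure?</p>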
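<h1 id="索引">Indexes</h1>
<p>What is an index? What is it for? What are its pros and cons? Does querying through an index always improve performance?</p>
<p>Index: a structure that <strong>sorts</strong> the values of one or more columns of a table. It <strong>speeds up data retrieval</strong> by letting the database find rows quickly without scanning the whole table.<br>
The basic index types in MySQL are: ordinary index, unique index, primary key index, and full-text index.<br>
Pros and cons:</p>
<ul>
<li>Speeds up data retrieval.</li>
<li>Slows down maintenance operations such as inserts, deletes, and updates (a drawback).</li>
<li>A unique index guarantees the uniqueness of each row.</li>
<li>Gives the query optimizer more efficient execution plans to choose from, improving system performance.</li>
<li>Takes up physical storage and data space (a drawback).</li>
</ul>
<p>When querying through an index, keep its cost in mind. <strong>An index needs storage space and regular maintenance</strong>: whenever rows are added or removed, or an indexed column is modified, the index itself must be updated as well, which costs roughly 4 to 5 extra disk I/Os. Sometimes <strong>an unnecessary index actually makes queries slower</strong>, so using an index does not always improve query performance.</p>
<p>Index lookups suit two situations:</p>
<ul>
<li><strong>Range-based retrieval</strong>, where the result set is typically less than 30% of the rows in the table;</li>
<li><strong>Retrieval through a non-unique index</strong>.</li>
</ul>
<p>My question: how do you create an index?</p>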
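<p>As one possible answer to the question above, here is a minimal sketch using Python's built-in <code>sqlite3</code> module. The <code>users</code> table, the <code>idx_users_email</code> index, and the <code>query_plan</code> helper are hypothetical names chosen for illustration. Other database systems use similar <code>CREATE INDEX</code> syntax but report query plans differently.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users (name, email) VALUES (?, ?)",
    [("user%d" % i, "user%d@example.com" % i) for i in range(1000)],
)
conn.commit()


def query_plan(connection, email):
    """Return SQLite's plan description for a lookup by email."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN SELECT name FROM users WHERE email = ?", (email,)
    ).fetchall()
    # The last column of each row is a human-readable description of the step.
    return " | ".join(row[-1] for row in rows)


# Without an index the lookup has to scan the whole table.
plan_before = query_plan(conn, "user42@example.com")

# An ordinary (non-unique) index on the column used in the WHERE clause.
conn.execute("CREATE INDEX idx_users_email ON users (email)")

# With the index in place the planner can search it instead of scanning.
plan_after = query_plan(conn, "user42@example.com")

print(plan_before)
print(plan_after)
```

<p>Comparing the two printed plans is a quick way to check whether a query really benefits from an index before paying its storage and maintenance cost.</p>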
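<h1 id="事务">Transactions</h1>
<p>Transaction: the basic unit of concurrency control. It is a sequence of operations that are either all executed or not executed at all, forming an indivisible unit of work. The transaction is the unit in which the database maintains consistency: the data is consistent at the end of every transaction.</p>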
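<p>The all-or-nothing behavior can be sketched with Python's built-in <code>sqlite3</code> module. The <code>accounts</code> table and the <code>transfer</code> helper are assumptions made for this example. The sketch relies on the connection being usable as a context manager, which commits when the block succeeds and rolls back when it raises.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(connection, src, dst, amount):
    """Move amount from src to dst; True if committed, False if rolled back."""
    try:
        with connection:  # commit on success, roll back on any exception
            connection.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # If this debit violates the CHECK constraint, the credit above
            # is undone as well: both updates succeed or neither does.
            connection.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(connection):
    return dict(connection.execute("SELECT name, balance FROM accounts"))


ok_small = transfer(conn, "alice", "bob", 30)   # fits within alice's balance
ok_large = transfer(conn, "alice", "bob", 500)  # would overdraw alice
final_balances = balances(conn)
print(ok_small, ok_large, final_balances)
```

<p>The failed transfer leaves both balances exactly as they were after the first one, even though its first <code>UPDATE</code> had already run.</p>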
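<h1 id="并发控制">Concurrency control</h1>
<p>Optimistic and pessimistic locking</p>
<p><strong>Concurrency control</strong> ensures that when several transactions access the same data at the same time, neither the isolation and consistency of the transactions nor the consistency of the database is violated. Optimistic and pessimistic locking are the main techniques for concurrency control.</p>
<ul>
<li>Pessimistic locking is pessimistic concurrency control: it assumes conflicts will happen and blocks every operation that could violate data integrity;</li>
<li>Optimistic locking is optimistic concurrency control: it assumes conflicts will not happen and only checks for integrity violations when the operation is committed.</li>
</ul>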
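<p>A common way to implement the optimistic variant is a version column that is checked at write time. The following is a minimal sketch with Python's built-in <code>sqlite3</code> module. The <code>stock</code> table, its <code>version</code> column, and the helper functions are hypothetical. Pessimistic locking is usually done with statements such as <code>SELECT ... FOR UPDATE</code> in systems like MySQL or PostgreSQL, which SQLite does not offer, so it is not shown here.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (id INTEGER PRIMARY KEY, qty INTEGER, version INTEGER)")
conn.execute("INSERT INTO stock VALUES (1, 10, 0)")
conn.commit()


def read_row(connection, row_id):
    """Return (qty, version) as currently stored."""
    return connection.execute(
        "SELECT qty, version FROM stock WHERE id = ?", (row_id,)
    ).fetchone()


def optimistic_update(connection, row_id, new_qty, seen_version):
    """Write only if nobody changed the row since seen_version was read."""
    with connection:
        cur = connection.execute(
            "UPDATE stock SET qty = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (new_qty, row_id, seen_version),
        )
    # rowcount is 0 when the version no longer matches, i.e. a conflict.
    return cur.rowcount == 1


# Two writers read the same snapshot (version 0).
qty_a, ver_a = read_row(conn, 1)
qty_b, ver_b = read_row(conn, 1)

first_ok = optimistic_update(conn, 1, qty_a - 1, ver_a)   # wins, bumps the version
second_ok = optimistic_update(conn, 1, qty_b - 2, ver_b)  # stale version, rejected
final_row = read_row(conn, 1)
print(first_ok, second_ok, final_row)
```

<p>The rejected writer would normally re-read the row and retry, which is cheap when conflicts are rare and wasteful when they are frequent.</p>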
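<h1 id="删除操作">Delete operations</h1>
<p>What are the differences between drop, delete, and truncate? When should each be used?</p>
<ul>
<li>All three are delete operations;</li>
<li>delete and truncate remove only the data in a table, not the table structure; drop removes the table itself;</li>
<li>Speed, in general: drop &gt; truncate &gt; delete;</li>
<li>delete is a DML statement. The operation is recorded in the rollback segment and takes effect only after the transaction commits; any matching trigger fires when it runs. truncate and drop are DDL statements. They take effect immediately, the original data is not placed in the rollback segment, they cannot be rolled back, and they do not fire triggers. (This describes Oracle-style behavior; some systems, PostgreSQL for example, allow DDL to be rolled back.)</li>
</ul>
<p>Use cases:</p>
<ul>
<li>The table is no longer needed: use drop;</li>
<li>Delete some of the rows: use delete with a where clause;</li>
<li>Keep the table but remove all its data: use truncate.</li>
</ul>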
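<p>Part of this can be tried out with Python's built-in <code>sqlite3</code> module. This is only a sketch under stated assumptions: the <code>logs</code> table and the helper functions are made up for the example, and SQLite has no <code>TRUNCATE</code> statement, so a <code>DELETE</code> without a <code>WHERE</code> clause stands in for "remove all rows but keep the table".</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, level TEXT)")
conn.executemany(
    "INSERT INTO logs (level) VALUES (?)",
    [("info",), ("debug",), ("error",), ("debug",)],
)
conn.commit()


def count_rows(connection):
    return connection.execute("SELECT COUNT(*) FROM logs").fetchone()[0]


def table_exists(connection, name):
    row = connection.execute(
        "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()
    return row[0] == 1


# DELETE with WHERE removes selected rows; as DML it can be rolled back.
conn.execute("DELETE FROM logs WHERE level = 'debug'")
after_delete = count_rows(conn)    # seen inside the still-open transaction
conn.rollback()
after_rollback = count_rows(conn)  # the deleted rows are back

# DELETE without WHERE empties the table but keeps its structure.
conn.execute("DELETE FROM logs")
conn.commit()
after_clear = count_rows(conn)
still_there = table_exists(conn, "logs")

# DROP removes the table definition itself.
conn.execute("DROP TABLE logs")
conn.commit()
after_drop = table_exists(conn, "logs")

print(after_delete, after_rollback, after_clear, still_there, after_drop)
```

<p>On a system that does have <code>TRUNCATE</code>, the middle step would use it instead, with the rollback and trigger differences described above.</p>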
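<h1 id="键">Keys</h1>
<p>What are superkeys, candidate keys, primary keys, and foreign keys?</p>
<p>Superkey: a set of attributes that <strong>uniquely identifies a tuple</strong> in a relation is called a superkey of the relation schema. A single attribute can be a superkey, and so can a combination of several attributes. Superkeys include <strong>candidate keys</strong> and <strong>primary keys</strong>.<br>
Candidate key: a minimal superkey, that is, a superkey with no redundant attributes.<br>
Primary key: the column or combination of attributes in a table that <strong>uniquely and completely identifies</strong> each stored object. <strong>A table can have only one primary key</strong>, and its value cannot be missing, that is, it cannot be Null.<br>
Foreign key: a column in one table that holds the primary key of another table is a foreign key of the first table.</p>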
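<p>To make these definitions concrete, here is a small sketch using Python's built-in <code>sqlite3</code> module. The <code>dept</code> and <code>emp</code> tables and the <code>try_insert</code> helper are hypothetical. In <code>dept</code>, both <code>dept_id</code> and <code>name</code> are candidate keys and <code>dept_id</code> is chosen as the primary key; <code>emp.dept_id</code> is a foreign key. Note that SQLite enforces foreign keys only after <code>PRAGMA foreign_keys = ON</code>.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.execute(
    "CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
)
conn.execute(
    "CREATE TABLE emp ("
    " emp_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " dept_id INTEGER NOT NULL REFERENCES dept (dept_id))"
)
conn.execute("INSERT INTO dept VALUES (1, 'R and D')")
conn.execute("INSERT INTO emp VALUES (100, 'Li', 1)")
conn.commit()


def try_insert(connection, sql, params):
    """Return None on success, or the constraint error message on failure."""
    try:
        with connection:
            connection.execute(sql, params)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


# Same primary key value as an existing row: rejected.
duplicate_pk = try_insert(conn, "INSERT INTO emp VALUES (?, ?, ?)", (100, "Wang", 1))
# Foreign key pointing at a department that does not exist: rejected.
dangling_fk = try_insert(conn, "INSERT INTO emp VALUES (?, ?, ?)", (101, "Zhao", 99))
# New key value and an existing department: accepted.
valid_row = try_insert(conn, "INSERT INTO emp VALUES (?, ?, ?)", (102, "Chen", 1))

print(duplicate_pk, dangling_fk, valid_row)
```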
<h1 id="视图">视图</h1>
<p>什么是视图？视图的使用场景？</p>
<p>视图是一种虚拟的表，具有和物理表相同的功能，可对视图进行增，改，查的操作。视图通常是一个表或者多个表的行或列的子集。相比多表查询，使得我们获取数据更容易。对视图的修改不影响基本表。</p>
<ul>
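<li>Exposing only some of the fields to the people who access the data;</li>
<li>The data comes from different tables but the person querying wants one uniform way to query it. A view can combine the results of a multi-table query, so the person querying reads directly from the view without having to care about the differences between the source tables.</li>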
</ul>
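<p>Both scenarios can be sketched in a few lines with <code>sqlite3</code>. The <code>customers</code> and <code>orders</code> tables and the view name <code>customer_totals</code> are assumptions for the example: the view joins two tables and leaves the <code>phone</code> column out.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER)")
conn.execute("INSERT INTO customers VALUES (1, 'Li', '555-0100'), (2, 'Wang', '555-0101')")
conn.execute("INSERT INTO orders VALUES (1, 1, 30), (2, 1, 20), (3, 2, 5)")

# The view hides the phone column and hides the join from whoever queries it.
conn.execute(
    "CREATE VIEW customer_totals AS "
    "SELECT c.name AS name, SUM(o.total) AS spent "
    "FROM customers AS c JOIN orders AS o ON o.customer_id = c.id "
    "GROUP BY c.id, c.name"
)

rows = conn.execute("SELECT name, spent FROM customer_totals ORDER BY name").fetchall()
columns = [d[0] for d in conn.execute("SELECT * FROM customer_totals").description]
print(columns, rows)
```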
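<h1 id="范式">Normal Forms</h1>
<p>Normal form: the set of relation schemas that meet a certain level, expressing how well organized the relationships among the attributes within a relation are. Put simply, it is the level of design standard that a table's structure meets. A design that satisfies a higher normal form necessarily satisfies the lower ones. As a design standard, the main purpose of normal forms is to avoid data redundancy and insert/delete/update anomalies.</p>
<p>First normal form (1NF): every attribute in a 1NF relation is atomic, meaning that an attribute cannot have sub-attributes. 1NF is the most basic requirement of every relational database (RDB): any table that already exists in a relational database management system necessarily satisfies 1NF. A design that satisfies only 1NF suffers from excessive data redundancy and from insert, delete, and update anomalies. The design standard therefore has to be raised by removing the causes of these problems so that the table satisfies a higher normal form; this is what is called "normalization".</p>
<p>Second normal form (2NF): on top of 1NF, 2NF eliminates partial functional dependencies of non-prime attributes on a key. 2NF requires that an attribute depending on the key depend on the whole key, never on just part of it. Put simply, do not put unrelated things in the same table: replace one large table with several small tables joined together, which reduces the burden of modifications.</p>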
<ul>
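<li>Functional dependency: if, within a table, fixing the value of an attribute (or attribute group) X always determines the value of attribute Y, then Y is functionally dependent on X, written $X \to Y$. There are three kinds of functional dependency: full, partial, and transitive.</li>
<li>Key: let K be an attribute or attribute group of a table. If every attribute other than K is <strong>fully functionally dependent</strong> on K, then K is called a candidate key, or key for short. In other words, if fixing K also fixes the values of all other attributes of the table, then K is a key. A table can have more than one key. In practice, one of them is usually chosen as the primary key for convenience.</li>
<li>Non-prime attribute: an attribute that appears in any key is called a prime attribute; an attribute that appears in no key is a non-prime attribute.</li>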
</ul>
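<p>The definition of a functional dependency translates directly into a check over sample rows. The sketch below is plain Python; the <code>holds</code> function and the enrollment data are invented for illustration, with <code>(student_id, course)</code> assumed to be the key. Keep in mind that a dependency holding in sample data does not prove that it holds for the schema in general, whereas a single counterexample does refute it.</p>

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The same lhs value must always map to the same rhs value.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical enrollment table; (student_id, course) is assumed to be the key.
rows = [
    {"student_id": 1, "name": "Li", "course": "Math", "score": 90},
    {"student_id": 1, "name": "Li", "course": "Physics", "score": 85},
    {"student_id": 2, "name": "Wang", "course": "Math", "score": 70},
]

full_key = holds(rows, ["student_id", "course"], ["score"])  # score depends on the whole key
partial = holds(rows, ["student_id"], ["name"])              # name depends on only part of the key
not_fd = holds(rows, ["student_id"], ["score"])              # student 1 has two different scores
print(full_key, partial, not_fd)
```

<p>Here <code>name</code> is a non-prime attribute that depends on <code>student_id</code> alone, which is exactly the kind of partial dependency that 2NF rules out.</p>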
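<p>To decide whether a table satisfies 2NF, check whether any non-prime attribute is partially functionally dependent on a key. If one is, the table satisfies at most 1NF; if none is, it satisfies 2NF. The procedure is:</p>
<ul>
<li>Step 1: find all keys of the table.</li>
<li>Step 2: from the keys found in step 1, find all prime attributes.</li>
<li>Step 3: every attribute of the table that is not a prime attribute is a non-prime attribute.</li>
<li>Step 4: check whether any non-prime attribute is partially functionally dependent on a key.</li>
</ul>
<p>How can partial functional dependencies be eliminated so that the table satisfies 2NF?</p>
<ul>
<li>Split the large table into two or more smaller tables in such a way that the result meets the higher normal form. This process is called <strong>schema decomposition</strong>. There is more than one way to decompose a schema. <strong>How is schema decomposition carried out?</strong></li>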
</ul>
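<p>As a rough answer to that last question, one common approach is to move each partially dependent attribute into its own table together with the part of the key it depends on, and then confirm that joining the pieces gives back the original rows. The plain-Python sketch below does this for a made-up enrollment table; <code>project</code> and <code>natural_join</code> are helper names introduced here, not standard library functions.</p>

```python
def project(rows, attrs):
    """Project rows onto attrs and drop duplicate tuples, keeping first-seen order."""
    seen = []
    for row in rows:
        item = {a: row[a] for a in attrs}
        if item not in seen:
            seen.append(item)
    return seen

def natural_join(left, right):
    """Join two lists of dicts on whatever attributes they have in common."""
    out = []
    for lrow in left:
        for rrow in right:
            shared = set(lrow).intersection(rrow)
            if all(lrow[a] == rrow[a] for a in shared):
                out.append({**lrow, **rrow})
    return out

# Key is (student_id, course); name depends on student_id only (a partial dependency).
enrollment = [
    {"student_id": 1, "name": "Li", "course": "Math", "score": 90},
    {"student_id": 1, "name": "Li", "course": "Physics", "score": 85},
    {"student_id": 2, "name": "Wang", "course": "Math", "score": 70},
]

# Decompose: name moves out with the part of the key that determines it.
students = project(enrollment, ["student_id", "name"])
scores = project(enrollment, ["student_id", "course", "score"])

# Joining the two smaller tables reproduces the original rows, so nothing was lost.
rebuilt = natural_join(scores, students)
print(students)
print(rebuilt == enrollment)
```

<p>After the split, each student's name is stored once instead of once per course, which removes the redundancy that caused the update anomalies.</p>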
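<p>A table that satisfies only 2NF can still contain transitive functional dependencies of non-prime attributes on a key. To address this, the table has to be improved further so that it satisfies 3NF.</p>
<p>Third normal form (3NF): on top of 2NF, 3NF eliminates transitive functional dependencies of non-prime attributes on a key. If any non-prime attribute is transitively dependent on a key, the table does not satisfy 3NF. 3NF avoids overly long query paths, which lead to long query times or update anomalies, and so improves query efficiency. In some special cases, however, a relation schema that satisfies 3NF still has insert, update, and delete anomalies. The cause is that prime attributes may be partially or transitively dependent on a key. The remedy is to eliminate, on top of 3NF, the partial and transitive dependencies of prime attributes on keys. The resulting design is in Boyce-Codd normal form (BCNF), in which the determinant of every non-trivial functional dependency is a superkey.</p>
<p><strong>Normal forms are something to learn from and refer to; in an actual design they do not always have to be followed, depending on the situation.</strong></p>
<p>References:</p>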
<ul>
<li><a href="https://www.zhihu.com/question/24696366" target="_blank" rel="external">解释一下关系数据库的第一第二第三范式？刘慰、Lyken的回答</a></li>
</ul>
<p>Further reading:</p>
<ul>
<li><a href="http://www.cnblogs.com/CareySon/archive/2010/02/16/1668803.html" target="_blank" rel="external">数据库范式那些事</a></li>
</ul>
<hr>
<p>Sources:</p>
<ul>
<li><a href="https://zhuanlan.zhihu.com/p/23713529" target="_blank" rel="external">知乎专栏-学习编程：常见面试题整理–数据库篇</a></li>
</ul>
]]></content>

    <summary type="html">

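      I came across a good Zhihu column article answering common database interview questions. I have chewed it over and recorded it here so that I can look it up and extend it later.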

    </summary>

      <category term="Notes" scheme="blog.alexiangli.com/categories/Notes/"/>


      <category term="Database" scheme="blog.alexiangli.com/tags/Database/"/>

      <category term="SQL" scheme="blog.alexiangli.com/tags/SQL/"/>

  </entry>

  <entry>
    <title>R: Data Manipulation with tidyr</title>
    <link href="blog.alexiangli.com/r-tidyr/"/>
    <id>blog.alexiangli.com/r-tidyr/</id>
    <published>2016-11-16T16:00:00.000Z</published>
    <updated>2016-11-19T03:33:16.000Z</updated>

    <content type="html"><![CDATA[<h1 id="宽转长-wide2long">Wide to Long (Wide2Long)</h1>
<p><code>tidyr::gather(data, key, value, ..., na.rm = FALSE, convert = FALSE, factor_key = FALSE)</code></p>
<p>Arguments:</p>
<ul>
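<li><code>data</code> A data frame</li>
<li><code>key</code> Name of the new key column, which will hold the original column names (no quotes needed)</li>
<li><code>value</code> Name of the new value column, which will hold the cell values (no quotes needed)</li>
<li><code>...</code> The columns to gather, given as bare variable names, that is, column names of the original data frame. These names become the values of the key column. They can be specified in several ways:
<ul>
<li><code>x:z</code> selects all variables from x to z (including x and z)</li>
<li><code>-z</code> excludes the variable z</li>
<li><code>x,y,z</code> selects the three variables x, y, and z</li>
<li><code>2:4</code> selects columns 2 through 4</li>
</ul>
</li>
<li><code>na.rm</code> Whether to drop rows whose value is missing</li>
<li><code>convert</code> Automatically converts the data type of the key column; very useful when the gathered column names are really numeric, integer, or logical values</li>
<li><code>factor_key</code> Whether to store the key values as a factor; the default <code>FALSE</code> stores them as a character vector</li>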
</ul>
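<p>To make the reshaping itself concrete, here is a plain-Python sketch of what a wide-to-long gather does, using a list of dicts in place of a data frame. It is only an analogue written for illustration, not tidyr code: the <code>gather</code> function below, its <code>na_rm</code> parameter, and the sample data are all assumptions, and <code>None</code> stands in for <code>NA</code>.</p>

```python
def gather(rows, key, value, columns, na_rm=False):
    """Reshape wide rows to long: one output row per (input row, gathered column)."""
    out = []
    for col in columns:  # here the gathered columns are stacked one after another
        for row in rows:
            if na_rm and row[col] is None:
                continue  # mimic dropping rows whose value is missing
            kept = {k: v for k, v in row.items() if k not in columns}
            kept[key] = col         # the old column name becomes the key
            kept[value] = row[col]  # the old cell becomes the value
            out.append(kept)
    return out

wide = [
    {"id": 1, "x": 10, "y": 20},
    {"id": 2, "x": 30, "y": None},
]
long = gather(wide, "variable", "score", ["x", "y"], na_rm=True)
for row in long:
    print(row)
```

<p>The columns that are not gathered (<code>id</code> here) are repeated on every output row, which is what lets the long table be turned back into the wide one later.</p>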
<h1 id="长转宽-long2wide">Long to Wide (Long2Wide)</h1>
<p><code>tidyr::spread(data, key, value, fill = NA, convert = FALSE, drop = TRUE, sep = NULL)</code></p>
<p>Arguments:</p>
<ul>
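<li><code>key</code> The column whose values become the names of the new columns</li>
<li><code>value</code> The column whose contents are distributed across the new columns</li>
<li><code>fill</code> The value used to fill in missing values</li>
<li><code>convert</code> Data type conversion of the new columns</li>
<li><code>drop</code> Whether to drop factor levels that do not appear in the data. If <code>FALSE</code>, levels with no data are kept and their cells are filled with the value of <code>fill</code>.</li>
<li><code>sep</code> If <code>NULL</code>, the new column names are the values of the key column. If not <code>NULL</code>, the column names are <code>&lt;key_name&gt;&lt;sep&gt;&lt;key_value&gt;</code>, that is, the name of the key column, then the separator, then the value from the key column. For example, if <code>key</code> is <code>A</code>, A has the two levels <code>A1,A2</code>, and <code>sep=&quot;-&quot;</code>, then the column names are <code>A-A1,A-A2</code>.</li>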
</ul>
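<p>The reverse reshaping can be sketched the same way in plain Python. Again this is an illustrative analogue rather than tidyr itself: the <code>spread</code> function, its <code>fill</code> parameter, and the sample rows are assumptions, with a list of dicts standing in for a data frame.</p>

```python
def spread(rows, key, value, fill=None):
    """Reshape long rows to wide: each distinct value of key becomes a column."""
    # Collect the distinct key values in first-seen order; they become column names.
    key_values = []
    for row in rows:
        if row[key] not in key_values:
            key_values.append(row[key])

    # Group rows by their remaining (identifier) columns.
    wide_rows = {}
    for row in rows:
        ident = tuple((k, v) for k, v in row.items() if k not in (key, value))
        target = wide_rows.setdefault(ident, dict(ident))
        target[row[key]] = row[value]

    # Fill the cells for which no (identifier, key) combination existed.
    for target in wide_rows.values():
        for kv in key_values:
            target.setdefault(kv, fill)
    return list(wide_rows.values())

long = [
    {"id": 1, "variable": "x", "score": 10},
    {"id": 1, "variable": "y", "score": 20},
    {"id": 2, "variable": "x", "score": 30},
]
wide = spread(long, "variable", "score", fill=0)
for row in wide:
    print(row)
```

<p>Row 2 has no <code>y</code> entry in the long data, so its <code>y</code> cell takes the <code>fill</code> value, mirroring the role of the <code>fill</code> argument described above.</p>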
<h1 id="分割单列为多列">Splitting One Column into Several</h1>
<p><code>tidyr::separate(data, col, into, sep = &quot;[^[:alnum:]]+&quot;, remove = TRUE, convert = FALSE, extra = &quot;warn&quot;, fill = &quot;warn&quot;, ...)</code></p>
<p>Arguments:</p>
<ul>
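<li><code>col</code> Column name (no quotes needed)</li>
<li><code>into</code> A character vector, for example <code>c(&quot;x&quot;, &quot;y&quot;)</code>; the resulting columns are then named x and y.</li>
<li><code>sep</code> The separator, either a regular expression or numeric positions. When given as numeric positions, <code>sep</code> should have one fewer element than <code>into</code>.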
<ul>
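<li>If it is a string, it is treated as a regular expression. The default matches any run of non-alphanumeric characters.</li>
<li>If it is numeric, it is treated as the position(s) to split at. Positive values count from the first character on the left; negative values count from the first character on the right.</li>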
</ul>
</li>
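<li><code>remove</code> Whether to drop the input column. The default <code>TRUE</code> removes the input column <code>col</code> from the output, leaving only the new <code>into</code> columns in its place.</li>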