Commit e40a081

2 different datasets (mystery data 4 and 7)
1 parent 4941852 commit e40a081

6 files changed: 148 additions, 172 deletions

mystery-data.zip

3.62 KB
Binary file not shown.

mystery-data/data4.csv

Lines changed: 9 additions & 11 deletions
@@ -1,11 +1,9 @@
1-
Group1,Group2
2-
0.7,1.9
3-
-1.6,0.8
4-
-0.2,1.1
5-
-1.2,0.1
6-
-0.1,-0.1
7-
3.4,4.4
8-
3.7,5.5
9-
0.8,1.6
10-
0,4.6
11-
2,3.4
1+
Before,After
2+
85,75
3+
70,50
4+
40,50
5+
65,40
6+
80,20
7+
75,65
8+
55,40
9+
20,25

mystery-data/data6.csv

Lines changed: 100 additions & 100 deletions
@@ -1,101 +1,101 @@
11
Before,After
2-
3.86742471266244,5.34799509821092
3-
5.44302072581142,5.77083934238885
4-
4.12977055055094,6.61885785412577
5-
4.06814827579831,4.46992888351995
6-
4.92875436478702,5.38977954456214
7-
4.79670083448242,6.07263960878491
8-
5.87229736132638,6.91549077926941
9-
5.4937227998083,5.79285908049972
10-
5.10932787141978,6.28349699623494
11-
6.2971607968844,4.99422074069174
12-
5.14654249435327,4.59631393462502
13-
5.2344234049385,5.10671744008003
14-
5.19430439604953,5.65318319218658
15-
5.78029423051111,6.18117747718444
16-
6.52144852377806,6.79406477313316
17-
4.06007991427792,4.11525679841329
18-
5.09435796626848,5.00411321384976
19-
6.92341292193378,5.77679163982642
20-
4.10570033483655,5.01285247307354
21-
7.66574504365385,6.17564541448577
22-
5.35701627037348,7.36365973213449
23-
3.48538110719407,4.87369219415644
24-
5.3787872045258,5.59867364600418
25-
5.09958921081938,5.03073971958723
26-
4.55004770644439,6.35396020800683
27-
3.43488066584604,4.70750516265895
28-
4.51884153513657,6.1711917079173
29-
4.91754378065697,6.67682012499472
30-
5.10195198023253,4.86430777409148
31-
4.16829187781692,6.14408219272787
32-
4.4529196513749,4.43694175963381
33-
3.11437162198838,5.29139274287177
34-
6.21893020204033,5.56745111020949
35-
3.46296439841883,4.33535230630586
36-
6.55716667636759,6.95588096556676
37-
4.79442297398042,5.70617618776457
38-
5.85153302305026,6.19075238380252
39-
4.56985564179787,5.66916570308593
40-
4.21700106105276,5.96285492445076
41-
5.38359385422245,6.22032399536323
42-
5.78740678301614,6.06719878119245
43-
4.30865355311008,6.06472277471777
44-
3.81319891113306,8.59413831534837
45-
5.95598039700588,6.36637729855923
46-
4.89346224745456,5.26049327188695
47-
6.76170722577193,4.65183831191129
48-
5.54678681746598,5.92034123651943
49-
4.48710981464151,5.43969262944199
50-
5.41084510183402,6.1904491137286
51-
3.80267135908683,5.4631941723757
52-
3.8574974691253,6.4093810054597
53-
3.05551581623659,4.74285185008093
54-
4.39851056811042,5.93135485709729
55-
4.7888479019937,5.47661251534808
56-
5.22346488385762,3.05224458517395
57-
6.0225233909491,6.32507192601445
58-
5.4567625348103,5.10207037280747
59-
4.42525655354525,4.44444420815608
60-
4.14851900975177,5.71301675169265
61-
5.26497576684805,7.34727589177775
62-
6.98256881952284,5.79518116950373
63-
6.79142552786418,4.92165995540776
64-
4.30424890153367,5.92651067221646
65-
4.45207440549117,5.05342013952313
66-
5.86924751750422,3.2950465165048
67-
3.5481065702105,6.02448813246303
68-
5.37075784995012,6.11356347844423
69-
2.26094292510698,5.58834584575776
70-
5.76544986346151,5.31443832646173
71-
5.11780945051587,4.33778361770661
72-
7.42946098449296,6.34418368074845
73-
4.21214662418593,5.1944262218357
74-
3.1431956178091,6.33575757671344
75-
3.87282601175461,6.24664989707643
76-
5.89362947195591,5.45299940234452
77-
4.04440873807453,6.27538778075636
78-
4.66128150601909,5.06421511502248
79-
4.70962606102851,4.20971978902807
80-
4.25752616881211,4.24629538885353
81-
4.65207526252791,5.40492570263223
82-
3.97847823665323,5.67616434244428
83-
3.03081913902277,5.32593071888426
84-
5.13640514974908,6.21509248160154
85-
4.72449339207167,4.92995593876198
86-
5.11303391470569,5.13284664102819
87-
5.1430326711442,5.51807404228089
88-
5.16208317014053,7.30377567580694
89-
4.30207014116144,5.06375059947036
90-
5.79888639624305,6.88173873946018
91-
3.95525166767843,5.8958631596413
92-
5.33088719373288,5.79203516042205
93-
5.80211370048751,5.1392873909505
94-
5.36210745277515,4.76092981766756
95-
3.44266410754941,5.89496336371729
96-
5.00323812023262,5.37159272331979
97-
5.17028063699446,6.64767976857032
98-
5.50963670033692,4.84994583059593
99-
5.94246568674972,6.31973694223943
100-
3.92355954362466,5.94901785956371
101-
4.31167351228822,6.08795654522241
2+
4.26966258937941,5.92516286094901
3+
3.77252900710419,4.82457978356335
4+
7.01129265960208,5.37990015789653
5+
3.7376150930703,4.63176070072204
6+
4.81770615280243,7.12129611704166
7+
5.64414253552135,5.19626081148799
8+
5.29826400684533,4.40787133850167
9+
5.84323685564281,5.21579679321744
10+
3.63022059291777,5.65744738049738
11+
5.9404834825618,5.97956748351097
12+
3.38656860392474,4.45010938204416
13+
7.11120629749673,5.67264575642377
14+
4.81831394076906,7.10145860949749
15+
4.59661718556736,4.9678374166008
16+
5.01283216138533,5.14848909792138
17+
3.65580330996474,4.46849617080025
18+
7.19958322395298,5.36175787547516
19+
6.50684605426548,5.65549116213647
20+
5.94232633634117,3.23545092194219
21+
6.75495120001045,5.01313277570429
22+
5.47162470791784,3.96191667228983
23+
3.3396955642073,5.56434216199771
24+
4.8165674033112,4.26624750213829
25+
3.7886654269429,5.78053888784902
26+
4.12750505199339,4.82485375350003
27+
5.18437133297427,5.36482443781866
28+
4.42228019822818,4.20305032136328
29+
3.5043635621442,4.69970344111892
30+
3.96809562688034,5.84293532470908
31+
5.9725889266402,5.70621121521026
32+
4.71146501654787,5.5492178213097
33+
5.53634626872862,5.90425944872174
34+
3.58783607786553,4.1941481494595
35+
3.45998874162922,4.27055087534582
36+
5.49640306749457,5.77549649677766
37+
5.61127973318879,6.39189658587217
38+
5.01204797175893,4.76611957321555
39+
5.36369007893939,4.63953178715955
40+
5.29333598951372,4.98304232842366
41+
6.26050142708007,6.56326577561772
42+
7.00401178902151,4.48963740061468
43+
4.3854449316488,4.18717024333839
44+
5.8917918601956,6.13381612419914
45+
4.09898432189678,7.44969842382997
46+
3.9739759706041,4.85082251693038
47+
3.77832177435661,4.96425900693006
48+
5.61480863425842,5.78516300221936
49+
5.21354979392656,6.4528919024955
50+
5.87195797987612,4.20164068718401
51+
4.33109539059665,6.16978651541144
52+
5.92216582103262,4.33764151638569
53+
5.34898543819138,7.30279006077873
54+
4.52663136157609,5.07220013727927
55+
5.23022047771636,5.42499709843795
56+
4.88786041424353,4.59102815876287
57+
4.96543249949444,4.79614921418964
58+
6.12723932776391,5.89710752710511
59+
5.44234517309018,6.86568672477544
60+
5.96801697255707,6.34447196263256
61+
6.04000497436364,6.3221428964113
62+
4.596422728516,5.27363457431021
63+
3.66234000670527,5.73951481331435
64+
3.89758835343045,6.01908480874724
65+
2.8857803307234,4.83299511575399
66+
4.39140749308814,4.85869097565217
67+
5.31579322033981,3.53612798344628
68+
6.10482124433367,5.22908505488085
69+
3.92999140583584,4.81131621688507
70+
6.93380672404583,5.56257421916021
71+
4.08560800099396,4.04073932465996
72+
5.37851322285364,6.12005644184124
73+
5.67835668690078,4.80426814019064
74+
5.56840043240553,6.33966003107779
75+
4.30738250309661,4.55586047865873
76+
5.66765809521458,6.10579363795745
77+
3.46418753586412,4.91481573064849
78+
3.69897424119634,5.87215147448101
79+
5.33040411728839,5.98904917588096
80+
5.82214377861419,6.40203834925397
81+
5.02966947595126,7.98642258766233
82+
5.11202352171473,4.52942185505885
83+
6.47948655816117,5.51046876065764
84+
4.22815217571145,4.11282385909362
85+
4.77078099030668,6.40170722448669
86+
4.79927458956966,4.62083353790189
87+
6.38142323877427,5.77049008959382
88+
6.20644642425947,6.71623724051249
89+
5.34607669671352,5.27919652859588
90+
5.44776714372478,6.17332828548977
91+
4.64308794758906,5.80842604455848
92+
5.47520943585245,6.26692676068933
93+
5.9985390833877,6.331903154441
94+
4.7773436271923,5.26542834739014
95+
4.785186951215,6.37104084492817
96+
4.14957848224147,4.77031284010497
97+
5.33546276661176,4.55804929534339
98+
5.83335353225774,5.31491763143331
99+
7.14815156164517,5.6288914113136
100+
5.69380538251409,5.15375902816261
101+
5.73125907903491,4.85245301575252

mystery-data/data7.csv

Lines changed: 14 additions & 41 deletions
@@ -1,41 +1,14 @@
1-
type,weightgain
2-
High,73
3-
High,102
4-
High,118
5-
High,104
6-
High,81
7-
High,107
8-
High,100
9-
High,87
10-
High,117
11-
High,111
12-
High,98
13-
High,74
14-
High,56
15-
High,111
16-
High,95
17-
High,88
18-
High,82
19-
High,77
20-
High,86
21-
High,92
22-
Low,90
23-
Low,76
24-
Low,90
25-
Low,64
26-
Low,86
27-
Low,51
28-
Low,72
29-
Low,90
30-
Low,95
31-
Low,78
32-
Low,107
33-
Low,95
34-
Low,97
35-
Low,80
36-
Low,98
37-
Low,74
38-
Low,74
39-
Low,67
40-
Low,89
41-
Low,58
1+
August,November
2+
8.1,11.2
3+
10,16.3
4+
16.5,15.3
5+
13.6,15.6
6+
9.5,10.5
7+
8.3,15.5
8+
18.3,12.7
9+
13.3,11.1
10+
7.9,19.9
11+
8.1,20.4
12+
8.9,14.2
13+
12.6,12.7
14+
13.4,36.8

practical.Rmd

Lines changed: 25 additions & 20 deletions
@@ -15,17 +15,17 @@ opts_chunk$set(tidy=FALSE,dev="png",fig.show="as.is",
1515
```
1616

1717
# Introduction
18-
In this practical, we will use several 'read-life' datasets to demonstrate some of the concepts you have seen in the lectures. We will guide you through how to analyse these datasets in Shiny and the kinds of questions you should be asking yourself when faced with similar data.
18+
In this practical, we will use several 'real-life' datasets to demonstrate some of the concepts you have seen in the lectures. We will guide you through how to analyse these datasets in Shiny and the kinds of questions you should be asking yourself when faced with similar data.
1919

2020
To answer the questions in this practical we will be using apps that we have developed using the [***Shiny***](http://shiny.rstudio.com/gallery/) add-on for the *R* statistical package. **R** is a freely-available open-source software that is popular within academic and commercial communities. The functionality within the software compares favourably with other statistical packages (SAS, SPSS and Stata). The downside is that **R** has a steep learning-curve and requires a basic familiarity with command-line software. To ease the transition we have chosen to present this course using a series of online tools that will allow you to perform statistical analysis without having to worry about learning R. At the same time, the R code required for the analysis will be recorded in the background. You will therefore be able to repeat the analysis at a later date, or pass-on to others. As you gain familiarity with R through other courses, you will see how the code generated by Shiny can be adapted to your own needs.
2121

22-
The datasets you will need for this practical should be [***downloaded and unzipped now***](https://rawgit.com/bioinformatics-core-shared-training/IntroductionToStats/master/CourseData.zip)
22+
The datasets you will need for this practical should be ***downloaded and unzipped now***:- https://rawgit.com/bioinformatics-core-shared-training/IntroductionToStats/master/CourseData.zip
2323

2424

2525
# T-tests practical: Parametric Tests
2626

2727
## The effect of disease on height
28-
A scientist knows that the mean height of females in England is 165cm and wants to know whether her patients with disease X have heights that differ significantly from the population mean - we will use a one-sample t-test to test this. The data are contained in the file **`diseaseX.csv`** and can be analysed online at:-
28+
A scientist knows that the mean height of females in England is ***165cm*** and wants to know whether her patients with disease X have heights that differ significantly from the population mean - we will use a one-sample t-test to test this. The data are contained in the file **`diseaseX.csv`** and can be analysed online at:-
2929

3030
[http://bioinformatics.cruk.cam.ac.uk/stats/OneSampleTest/](http://bioinformatics.cruk.cam.ac.uk/stats/OneSampleTest/)
3131

@@ -41,7 +41,7 @@ cat("Alternative hypothesis: The mean height of female patients with disease X !
4141

4242
To import the file `diseaseX.csv` into ***Shiny*** you will need to select the `Choose File` option from the `Data Input` tab and navigate to where the course data are located on your laptop. The right-hand panel of the `Data Input` tab should update to show the Heights of various individuals in the study.
4343

44-
Also, on the `Data Input` tab you will need to change the value of ***True mean***.
44+
Also, on the `Data Input` tab you will need to change the value of ***Hypothesized mean***.
4545

4646
```{r}
4747
myfile <- "Practical/diseaseX.csv"
@@ -56,7 +56,7 @@ data <- read.csv(myfile, header=header, sep=sep, quote=quote,skip=skip)
5656

5757
b) A histogram and boxplot of the `Height` variable will be automatically generated for you. To view it, click on the ***Data Distribution***. You can toggle whether to overlay a density plot on top of the boxplot, or choose different bin sizes for the histogram.
5858

59-
Do the data look normally distributed? Based on the plots, is the one-sample t –test appropriate?
59+
Do the data look normally distributed? Based on the plots, is the parametric one-sample t –test appropriate?
6060

6161
```{r}
6262
datacol <- 2
@@ -117,8 +117,9 @@ biological process for the two cell types.")
117117

118118
Import the data using ***Choose File*** as before. Make sure that the ***1st column is a factor?*** checkbox is ticked.
119119

120-
b) Histograms and boxplots to compare the two groups will be created for you automatically. Do the data look normally distributed for each cell-type? Is the independent t-test appropriate?
120+
b) Histograms and boxplots to compare the two groups will be created for you automatically. You can also see a basic numerical summary of the data distribution.
121121

122+
Do the data look normally distributed for each cell-type? Is the independent t-test appropriate? What statistics are appropriate to report the location (mean or median) and spread (sd or IQR) of the data?
122123

123124
```{r}
124125
myfile <- "Practical/bp_times.csv"
@@ -566,19 +567,19 @@ t.test(value~variable,data=data,alternative=alternative,paired=paired,var.equal=
566567
```
567568

568569

569-
## Dataset4: Sleep Data `data4.csv`
570+
## Dataset4: Effect of Autism drug `data4.csv`
570571

571-
Data which show the effect of two soporific drugs (increase in hours of sleep compared to control) on 10 patients.
572+
A new chemotherapy treatment is proposed for patients with breast cancer. Investigators are concerned with patient's ability to tolerate the treatment and assess their quality of life both before and after receiving the new chemotherapy treatment. Quality of life (QOL) is measured on an ordinal scale and for analysis purposes, numbers are assigned to each response category as follows: 1=Poor, 2= Fair, 3=Good, 4= Very Good, 5 = Excellent.
572573

573-
*Does the drug have an effect on the amount of sleep?*
574+
*Is there statistically significant improvement in repetitive behavior after 1 week of treatment?*
574575

575576
```{r echo=FALSE,eval=FALSE}
576-
data(sleep)
577-
sleep <- spread(sleep, group,extra)[,-1]
578-
colnames(sleep) <- c("Group1","Group2")
579-
write.csv(sleep, file="mystery-data/data4.csv",quote=FALSE,row.names=FALSE)
577+
###http://sphweb.bumc.bu.edu/otlt/MPH-Modules/BS/BS704_Nonparametric/BS704_Nonparametric5.html
578+
data <- data.frame(Before=c(85,70,40,65,80,75,55,20),After = c(75,50,50,40,20,65,40,25))
579+
580+
write.csv(data, file="mystery-data/data4.csv",quote=FALSE,row.names=FALSE)
581+
580582
581-
t.test(sleep[,1],sleep[,2],paired=TRUE)
582583
```
583584

584585
```{r}
@@ -656,7 +657,7 @@ t.test(value~variable,data=data,alternative=alternative,paired=paired,var.equal=
656657

657658
Drunk driving is one of the main causes of car accidents. Interviews with drunk drivers who were involved in accidents and survived revealed that one of the main problems is that drivers do not realize that they are impaired, thinking “I only had 1-2 drinks … I am OK to drive.”
658659

659-
A sample of 100 drivers was chosen, and their reaction times in an obstacle course were measured before and after drinking two beers. The purpose of this study was to check whether drivers are impaired after drinking two beers
660+
A sample of 100 drivers was chosen, and their reaction times in an obstacle course were measured *before* and *after* drinking two beers. The purpose of this study was to check whether drivers are impaired after drinking two beers
660661

661662
*Does drinking beer alter the reaction time of the driver?*
662663

@@ -698,16 +699,20 @@ t.test(value~variable,data=data,alternative=alternative,paired=paired,var.equal=
698699
699700
```
700701

701-
## Dataset7: Weight gain in Rats `data7.csv`
702+
## Dataset7: Pollution in Trees `data7.csv`
702703

703-
The data arise from an experiment to study the gain in weight of rats fed on four different diets, distinguished by amount of protein (low and high) and by source of protein (beef and cereal).
704+
Laureysens et al. (2004) measured metal content in the wood of 13 poplar clones growing in a polluted area, once in August and once in November. Concentrations of aluminum (in micrograms of Al per gram of wood) are shown below.
704705

705-
*Does the high protein diet increase the weight of the rats?*
706+
*Is there any evidence for an increase in pollution between November and August?*
706707

707708
```{r eval=FALSE,echo=FALSE}
708-
data <- read.csv("http://vincentarelbundock.github.io/Rdatasets/csv/HSAUR/weightgain.csv")[,c(3,4)] %>% arrange(type)
709+
## http://www.biostathandbook.com/wilcoxonsignedrank.html
710+
##"The differences are somewhat skewed; the Wolterson clone, in particular, has a much larger difference than any other clone. To be safe, the authors analyzed the data using a Wilcoxon signed-rank test"
711+
712+
data <- data.frame(August = c(8.1,10,16.5,13.6,9.5,8.3,18.3,13.3,7.9,8.1,8.9,12.6,13.4), November = c(11.2,16.3,15.3,15.6,10.5,15.5,12.7,11.1,19.9,20.4,14.2,12.7,36.8))
713+
709714
write.csv(data, file="mystery-data/data7.csv",quote=FALSE,row.names=FALSE)
710-
boxplot(weightgain ~ type,data)
715+
boxplot(data)
711716
```
712717

713718
```{r}

practical.pdf

2.98 KB
Binary file not shown.
