From c4a4248ad38a05f8b2d48c8e3374b9637faa026e Mon Sep 17 00:00:00 2001
From: Hytham
Date: Thu, 2 Jul 2015 11:53:06 +0300
Subject: [PATCH 1/6] quantile option probs not props -- typo mistake

---
 2_RPROG/R Programming Course Notes.Rmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/2_RPROG/R Programming Course Notes.Rmd b/2_RPROG/R Programming Course Notes.Rmd
index 9a7ba87..7acb671 100644
--- a/2_RPROG/R Programming Course Notes.Rmd
+++ b/2_RPROG/R Programming Course Notes.Rmd
@@ -360,7 +360,7 @@ $\pagebreak$
 * ***examples***
  * `apply(x, 1, sum)` or `apply(x, 1, mean)` = find row sums/means
  * `apply(x, 2, sum)` or `apply(x, 2, mean)` = find column sums/means
- * `apply(x, 1, quantile, props = c(0.25, 0.75))` = find 25% 75% percentile of each row
+ * `apply(x, 1, quantile, probs = c(0.25, 0.75))` = find 25% 75% percentile of each row
  * `a <- array(rnorm(2*2*10), c(2, 2, 10))` = create 10 2x2 matrix
  * `apply(a, c(1, 2), mean)` = returns the means of 10

From 808cca4cd96dc614e44a0a2626f13bf0336e527c Mon Sep 17 00:00:00 2001
From: Kerredai
Date: Thu, 9 Jul 2015 16:19:28 -0700
Subject: [PATCH 2/6] Fixes two small typos

---
 1_DATASCITOOLBOX/Data Scientists Toolbox Course Notes.Rmd | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/1_DATASCITOOLBOX/Data Scientists Toolbox Course Notes.Rmd b/1_DATASCITOOLBOX/Data Scientists Toolbox Course Notes.Rmd
index 8370812..89e5cfb 100644
--- a/1_DATASCITOOLBOX/Data Scientists Toolbox Course Notes.Rmd
+++ b/1_DATASCITOOLBOX/Data Scientists Toolbox Course Notes.Rmd
@@ -102,7 +102,7 @@ $\pagebreak$
 * **Big data** = now possible to collect data cheap, but not necessarily all useful (need the right data)

 ## Experimental Design
-* Formulate you question in advance
+* Formulate your question in advance
 * **Statistical inference** = select subset, run experiment, calculate descriptive statistics, use inferential statistics to determine if results can be applied broadly
 * ***[Inference]*** **Variability** = lower variability + clearer differences = decision
 * ***[Inference]*** **Confounding** = underlying variable might be causing the correlation (sometimes called Spurious correlation)
@@ -118,5 +118,5 @@ $\pagebreak$
 * **Accuracy** = Pr(correct outcome)
 * **Data dredging** = use data to fit hypothesis
 * **Good experiments** = have replication, measure variability, generalize problem, transparent
-* Prediction is not inference, and be ware of data dredging
+* Prediction is not inference, and beware of data dredging

From b79fbe285333d6bc7800736879a002153edfe075 Mon Sep 17 00:00:00 2001
From: Kerredai
Date: Thu, 9 Jul 2015 16:24:13 -0700
Subject: [PATCH 3/6] Clarifies the meaning of ls -a, and adds renaming a file to cp

---
 1_DATASCITOOLBOX/Data Scientists Toolbox Course Notes.Rmd | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/1_DATASCITOOLBOX/Data Scientists Toolbox Course Notes.Rmd b/1_DATASCITOOLBOX/Data Scientists Toolbox Course Notes.Rmd
index 89e5cfb..22e87ba 100644
--- a/1_DATASCITOOLBOX/Data Scientists Toolbox Course Notes.Rmd
+++ b/1_DATASCITOOLBOX/Data Scientists Toolbox Course Notes.Rmd
@@ -20,13 +20,14 @@ $\pagebreak$
 * `pwd` = print working directory (current directory)
 * `clear` = clear screen
 * `ls` = list stuff
- * `-a` = see all (hidden)
+ * `-a` = see all (including hidden files)
  * `-l` = details
 * `cd` = change directory
 * `mkdir` = make directory
 * `touch` = creates an empty file
 * `cp` = copy
  * `cp ` = copy a file to a directory
+ * `cp ` = rename a file
  * `cp -r ` = copy all documents from directory to new Directory
  * `-r` = recursive
 * `rm` = remove

From c69ac7d9d28fc4f9eee215b17c2fd26b0e7f2208 Mon Sep 17 00:00:00 2001
From: ak2703
Date: Sat, 18 Jul 2015 18:51:05 +0530
Subject: [PATCH 4/6] fix a typo

---
 2_RPROG/R Programming Course Notes.Rmd | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/2_RPROG/R Programming Course Notes.Rmd b/2_RPROG/R Programming Course Notes.Rmd
index 9a7ba87..f77adb4 100644
--- a/2_RPROG/R Programming Course Notes.Rmd
+++ b/2_RPROG/R Programming Course Notes.Rmd
@@ -2,13 +2,13 @@
 title: "R Programming Course Notes"
 author: "Xing Su"
 output:
-  pdf_document:
-    toc: yes
-    toc_depth: 3
   html_document:
     highlight: pygments
     theme: spacelab
     toc: yes
+  pdf_document:
+    toc: yes
+    toc_depth: 3
 ---

 $\pagebreak$
@@ -551,7 +551,7 @@ $\pagebreak$
 ### Larger Tables
 * ***Note**: help page for read.table important*
 * need to know how much RAM is required $\rightarrow$ calculating memory requirements
- * `numRow` x `numCol` x 8 bytes/numeric value = size required in bites
+ * `numRow` x `numCol` x 8 bytes/numeric value = size required in bytes
  * double the above results and convert into GB = amount of memory recommended
 * set `comment.char = ""` to save time if there are no comments in the file
 * specifying `colClasses` can make reading data much faster

From e9428be2a8deffaa4b5b0678c73cad8776d45c1d Mon Sep 17 00:00:00 2001
From: Paul Adamson
Date: Sat, 14 Nov 2015 23:05:58 -0500
Subject: [PATCH 5/6] fix missing parenthesis and clean up syntax

---
 3_GETDATA/Getting and Cleaning Data Course Notes.Rmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/3_GETDATA/Getting and Cleaning Data Course Notes.Rmd b/3_GETDATA/Getting and Cleaning Data Course Notes.Rmd
index ad298f3..5e26fe1 100644
--- a/3_GETDATA/Getting and Cleaning Data Course Notes.Rmd
+++ b/3_GETDATA/Getting and Cleaning Data Course Notes.Rmd
@@ -63,7 +63,7 @@ $\pagebreak$
  * ***Relative***: `setwd("./data")`, `setwd("../")` = move up in directory
  * ***Absolute***: `setwd("/User/Name/data")`
 * **Check if file exists and download file**
- * `if(!file.exists("data"){dir.create("data")}`
+ * `if(!file.exists("./data")) {dir.create("./data")}`
 * **Download file**
  * `download.file(url, destfile= "directory/filname.extension", method = "curl")`
  * `method = "curl"` [mac only for https]

From 3c1adcfab630ffeefb8bde038f590c3cba1d6ab7 Mon Sep 17 00:00:00 2001
From: Oleh Khoma
Date: Sat, 9 Jan 2016 00:33:06 +0200
Subject: [PATCH 6/6] Corrected beta1 variance derivation formulas in "Intervals/Tests for Coefficients" section

---
 7_REGMODS/Regression Models Course Notes.Rmd | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/7_REGMODS/Regression Models Course Notes.Rmd b/7_REGMODS/Regression Models Course Notes.Rmd
index 16539ed..299954e 100644
--- a/7_REGMODS/Regression Models Course Notes.Rmd
+++ b/7_REGMODS/Regression Models Course Notes.Rmd
@@ -743,13 +743,14 @@ $\pagebreak$
 ### Intervals/Tests for Coefficients
 * standard errors for coefficients
 $$\begin{aligned}
-Var(\hat \beta_1) & = Var\left(\frac{\sum_{i=1}^n (Y_i - \bar Y)(X_i - \bar X)}{((X_i - \bar X)^2)}\right) \\
-(expanding) & = Var\left(\frac{\sum_{i=1}^n Y_i (X_i - \bar X) - \bar Y \sum_{i=1}^n (X_i - \bar X)}{((X_i - \bar X)^2)}\right) \\
-& Since~ \sum_{i=1}^n X_i - \bar X = 0 \\
-(simplifying) & = \frac{\sum_{i=1}^n Y_i (X_i - \bar X)}{(\sum_{i=1}^n (X_i - \bar X)^2)^2} \Leftarrow \mbox{denominator taken out of } Var\\
+Var(\hat \beta_1) & = Var\left(\frac{\sum_{i=1}^n (Y_i - \bar Y)(X_i - \bar X)}{(\sum_{i=1}^n (X_i - \bar X)^2)^2}\right) \\
+(expanding) & = Var\left(\frac{\sum_{i=1}^n Y_i (X_i - \bar X) - \bar Y \sum_{i=1}^n (X_i - \bar X)}{(\sum_{i=1}^n (X_i - \bar X)^2)^2}\right) \\
+& Since~ \sum_{i=1}^n (X_i - \bar X) = 0 \\
+(simplifying) & = \frac{Var\left(\sum_{i=1}^n Y_i (X_i - \bar X)\right)}{(\sum_{i=1}^n (X_i - \bar X)^2)^2} \Leftarrow \mbox{denominator taken out of } Var\\
+& Since~ Var\left(\sum aY\right) = \sum a^2 Var\left(Y\right) \\
 (Var(Y_i) = \sigma^2) & = \frac{\sigma^2 \sum_{i=1}^n (X_i - \bar X)^2}{(\sum_{i=1}^n (X_i - \bar X)^2)^2} \\
 \sigma_{\hat \beta_1}^2 = Var(\hat \beta_1) &= \frac{\sigma^2 }{ \sum_{i=1}^n (X_i - \bar X)^2 }\\
-\Rightarrow \sigma_{\hat \beta_1} &= \frac{\sigma}{ \sum_{i=1}^n X_i - \bar X} \\
+\Rightarrow \sigma_{\hat \beta_1} &= \frac{\sigma}{ \sqrt {\sum_{i=1}^n (X_i - \bar X)^2}} \\
 \\
 \mbox{by the same derivation} \Rightarrow & \\
 \sigma_{\hat \beta_0}^2 = Var(\hat \beta_0) & = \left(\frac{1}{n} + \frac{\bar X^2}{\sum_{i=1}^n (X_i - \bar X)^2 }\right)\sigma^2 \\