I used R for two of my classes in college: Probability and Statistics and Stochastic Probabilities. During that time, I was in reverence with the way it handled lists. However, I did not use it much after getting done with those classes. I have decided to relearn it.

In this post, I will start off with some simple programs. After ascertaining some comfort with the syntax and semantics of R, I will try some complex ones. Also, since I am in the process of learning myself, if you decide to use some of these codes, just keep in mind that there might be several better ways of doing the same thing.

Vector implementation:

This implementation gets rid of the for loop. It is similar to the code in one dimensional random walk code above.

How a R programmer would have done it (I think)

Result:

In this post, I will start off with some simple programs. After ascertaining some comfort with the syntax and semantics of R, I will try some complex ones. Also, since I am in the process of learning myself, if you decide to use some of these codes, just keep in mind that there might be several better ways of doing the same thing.

### 1. Dice Rolls: Probability of getting a side f number of times in n dice rolls

# DiceRolls.R # Probability of getting a side f number of times in n dice rolls #inputs side <- 4 f <- 2 n <- 5 # Binomial Distribution # using R's function for binomial distribution # where P(X=k) probability <- dbinom(frequency, n, 1/6) probability # Monte Carlo Method rep <- 10000 # Number of repetitions list <- numeric(rep) # Initialize vector for (i in 1:rep) { trial <- sample(1:6, n, replace=TRUE) #1 if number of times a side appears is f success <- if ((sum(trial==side))==f) 1 else 0 list[i] <- success } mean(list)

> probability [1] 0.160751 > mean(list) [1] 0.1647

### 2. One dimentional random walks

# Five random walks in one dimension starting at 0 # creating a color string vector color <- c("red","green","blue","black","darkgoldenrod1") # cleaning the plot; this is necessary because we are using par(new=TRUE) plot.new() for (i in 1:5) { x0<-0 I<-1 T <-1000 x=c(x0,rnorm(T-1)) # T-1 randomly generated normal distributions y=cumsum(x) par(new=TRUE) # so that the previous graphs are not deleted plot(y,type="l",col=color[i], main="Five random walks in one dimension starting at 0", xlab="", ylim=c(-60, 60)) }

### 3. Two dimensional random walk

I stumbled upon a presentation slideshow that discusses how R can be used more efficiently. They perform analysis on several implementations that simulate two dimensional random walk. I started out with what they call "naive implementation". According to them, this is how a person who codes in C, C++ or Java would think of doing it. I totally agree. This exercise is great because it helps understand the positives of R in comparison to other languages that I am comfortable with. Link to the aforementioned presentation slideshow.

Naive implementation:

# naive implementation (loops are slow in R) naive <- function(n){ # numeric creates a double-precision vector of the specified length with each element equal to 0 xpos = ypos = numeric(n) # will be used to randomly generate direction xdir = c(TRUE,FALSE) nextdir = c(1,-1) # xpos and ypos are already 0 (see comment on numeric) for(i in 2:n){ if(sample(xdir,1)){ xpos[i]=xpos[i-1]+sample(nextdir,1) ypos[i]=ypos[i-1] } else{ xpos[i]=xpos[i-1] ypos[i]=ypos[i-1]+sample(nextdir,1) } } plot(xpos,ypos,type="l",col="blue",main="Two Dimensional Random Walk", xlab="x", ylab="y") # starting point points(0,0,pch = 19,col="red") }

> system.time(naive(50000)) user system elapsed 0.84 0.17 1.02

> system.time(naive(100000)) user system elapsed 1.70 0.32 2.03

This implementation gets rid of the for loop. It is similar to the code in one dimensional random walk code above.

# Vectorization Process factorization = function(n){ steps = sample(c(-1,1), n-1, replace = T) xdir=sample(c(TRUE,FALSE), n-1,replace=T) xpos=c(0,cumsum(ifelse(xdir,steps,0))) ypos=c(0,cumsum(ifelse(xdir,0,steps))) plot(xpos,ypos,col="green",type="l",main="Two Dimensional Random Walk", xlab="x", ylab="y") points(0,0,pch = 19,col="red") }

> system.time(vectorization(50000)) user system elapsed 0.05 0.19 0.25

> system.time(vectorization(100000)) user system elapsed 0.09 0.31 0.40

### 4. Birthday Probability

Click here to go to my post on Birthday Probability using Java. Using R, I was really surprised how easy it was.

How a java programmer would have done it in R

# Birthday Probability birthday <- function(numofpeople, numofrooms){ count <- 0 for (i in 1:numofrooms){ bday <- sample(1:365,numofpeople, replace=T) if (any(duplicated(bday))) count<-count+1; } probability = count/numofrooms probability }

How a R programmer would have done it (I think)

# Birthday Probability birthday <- function(numofpeople, numofrooms){ list <- numeric(numofrooms) for (i in 1:numofrooms){ bday <- sample(1:365,numofpeople, replace=T) list[i] <- if (any(duplicated(bday))) 1 else 0 } mean(list) }

Result:

> source("birthday.r") > birthday(23,5000) [1] 0.5044

### 5. Mandelbrot Set Area

Click here to go to my previous blog post on the area of the Mandelbrot Set using Java and Javascript.

monte.Carlo <- 5000 x <- runif(monte.Carlo, -2, 2) y <- runif(monte.Carlo, -2, 2) list <- numeric(monte.Carlo) for (j in 1:monte.Carlo){ list[j] <- if (inmandelbrotset(complex(real = x[j], imaginary = y[j]))) 1 else 0 } area<-mean(list)*16

# function that checks if a point E mandelbrot set inmandelbrotset <- function(c) { dwell.limit <- 2048 z <- 0 for (i in 1:dwell.limit) { z <- z ** 2 + c if (Mod(z) > 2) { return(FALSE) } } return(TRUE) }

### Result:

> system.time(source("mandelbrot.r")) user system elapsed 1.91 0.00 1.90 > area [1] 1.5168