Read R4DS chapter 19.
Complete the course Introduction to Writing Functions in R at DataCamp.
Complete the exercises in R4DS sections 19.2.1, 19.3.1, 19.4.4, 19.5.5.
Functional programming is a vast (and quite complicated) subject. It is quite different from the more familiar concept of object-oriented programming. Although you should know what functional programming is, we won't cover much about the concept other than the basics.
As the name implies, functions are a big part of functional programming. In R, functions are objects and consist of three components: the arguments (or formals if you want to be fancy), a body and the environment. The arguments are exactly what the name describes, though a function need not define any! The body is what the function is supposed to do, and its environment is where the function looks up the values of the names it uses.
With the question below we are going to take a look at how functions can help with readability but also adaptability.
...
If you take a look at the R help for the mean function you can see the following entry under Arguments:
... further arguments passed to or from other methods.
The ... is essentially an "anything and everything" type of argument. If you are used to Python, this is roughly the same as *args and **kwargs combined. Anything can go in there, which is really powerful but also quite worrying. How do we know if the arguments we pass to ... will ever be used? Take the following (trivial) example:
x <- runif(10)
mean(x)
mean(x, something_something = FALSE)
Try to answer the following questions:
- Do you expect the same output, and is it?
- Do functions need arguments?
- What happens if you misspell the actual argument trim as, say, trm?
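To make the forwarding behaviour of ... concrete, here is a minimal sketch. The functions rounded_mean and show_dots are hypothetical helpers invented for this illustration; they are not part of base R or of the exercises. The sketch only shows that named arguments placed in ... travel untouched to the function that receives them, and that list(...) lets you inspect what was passed.

```r
# Hypothetical wrapper (illustration only): rounds the mean and forwards
# any extra arguments, untouched, to mean() through `...`.
rounded_mean <- function(x, digits = 2, ...) {
  round(mean(x, ...), digits = digits)
}

values <- c(1, 2, 3, NA)
rounded_mean(values)                # NA, because mean() sees the NA
rounded_mean(values, na.rm = TRUE)  # 2, na.rm reaches mean() via `...`

# Hypothetical helper: capture the dots in a list to see what arrived.
show_dots <- function(...) names(list(...))
show_dots(a = 1, b = "x")           # "a" "b"
```

Capturing the dots with list(...) is one way to check, inside your own functions, which arguments actually arrived.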
We are going to use this feature to play games. We use it to set up how a game is played and with what probabilities. Implement the function below according to its instructions.
simulate_round <- function(probability_fun, ...){
# Function which specifies a round of a game. It will
# simulate an outcome from a Bernoulli distribution whose
# probability of success is determined by the function
# probability_fun. The arguments of that function are not yet
# determined, which must be accounted for.
}
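As a reminder of the building block involved, a Bernoulli outcome can be drawn in R as a binomial draw with a single trial. The sketch below is not an implementation of simulate_round; the value of p_success is an arbitrary assumption chosen for illustration.

```r
set.seed(42)  # only to make the sketch reproducible

p_success <- 0.7  # illustrative probability, not taken from the exercise

# One Bernoulli draw: a binomial with size = 1 returns 0 or 1.
outcome <- rbinom(n = 1, size = 1, prob = p_success)

# Many draws: the share of successes should be close to p_success.
draws <- rbinom(n = 1e4, size = 1, prob = p_success)
mean(draws)
```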
We are going to simulate a game where two teams play against each other. The game consists of a number of rounds. If a team wins a round, that team gets one point. The win conditions (i.e. when one team wins against the other) are reached at the scores
16, 19, 22, 25, ...
To clarify, the first team to get 16 points wins, given that the other team has fewer than 15 points. If the score is 15-15 then the teams will play an additional 6 rounds. Given that the score was 15-15, the first team to get 19 points wins, given that the other team has fewer than 18. If the score is 18-18 then another 6 rounds are played, and so the match continues.
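The winning scores listed above increase by 3 for every extra set of rounds. A small sketch of that pattern follows; winning_score is a hypothetical helper introduced only for illustration, with k counting how many times the teams have been tied and sent to extra rounds.

```r
# Hypothetical helper: winning score after k ties (k = 0 is regular play).
winning_score <- function(k) 16 + 3 * k

winning_score(0:3)      # 16 19 22 25

# The tied score that triggers the next 6 rounds is one point lower.
winning_score(0:2) - 1  # 15 18 21
```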
Implement the following function which determines if a game is finished or not
win_condition_achieved <- function(scoreA, scoreB) {
# Function which determines if a win condition is achieved.
# It takes the scores of teams A and B and returns a logical
# TRUE/FALSE indicating if the game has ended, i.e. someone has won.
}
Our interest is in the score distribution when a team wins and the probability of any given team winning given a score. To determine these quantities we need a little background on Monte Carlo integration. If you feel like you know how Monte Carlo integration works, feel free to skip it.
A very brief introduction to Monte Carlo integration
If the stochastic variable \(X\) follows a standard uniform distribution, then \(E(f(X)) =\int_0 ^1 f(x)dx\). This can be used to approximate integrals since \[
E(f(X)) \approx \frac{1}{N} \sum_{i=1}^N f(x_i)
\] That is, we simulate a large number of draws \(x_1,…,x_N\) and evaluate the function \(f\) at each of them. By the Law of Large Numbers, \(\sum_{i=1}^N f(x_i)/N\) will converge to the expectation and can thus be viewed as a numerical approximation of the integral. Hence, a probability can be evaluated according to
\[
P(X\leq x) = E(\mathbf{1}(X \leq x)) \approx \frac{1}{N} \sum_{i=1}^N \mathbf{1}(x_i \leq x).
\]
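The two approximations above can be sketched in a few lines of R. The choices of \(f(x) = x^2\), the threshold 0.3 and the number of draws are assumptions made for illustration only; both targets have known exact values, so the quality of the approximation is easy to judge.

```r
set.seed(1)  # only to make the sketch reproducible

n_draws <- 1e5       # N in the formulas above
x <- runif(n_draws)  # x_1, ..., x_N from the standard uniform

# Integral of f(x) = x^2 over [0, 1]; the exact value is 1/3.
f <- function(x) x^2
mean(f(x))

# P(X <= 0.3) as the mean of an indicator; the exact value is 0.3.
mean(x <= 0.3)
```

Increasing n_draws tightens both estimates, at the cost of a longer run time.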
The key takeaway is that to approximate anything involving an integral, we can write it as an expectation and compute it through simulation (Monte Carlo integration). So if the event is “complicated enough” we can always approximate its probability by simulation!
We are going to use our functions to simulate the game. The first thing is to determine how the probability is going to be calculated. Remember that it's always best to start simple.
probability_fun <- function(scoreA, scoreB, ...) {
# code
}
For each simulation iteration we want to keep track of the score of team A, the score of team B and who won. Store these in the matrix results. Use all the functions we have described above to simulate a number of repetitions (simulations) of the game. While no one has won, we need to
1. simulate a round with simulate_round and probability_fun,
2. update the score of the team that won the round,
3. check with win_condition_achieved whether the game is over; if it is, break and save the outcomes, otherwise continue with 1.
nsims <- 100 # change to a larger number if you want, though it might take some time
results <- matrix(nrow=nsims, ncol=3)
for (x in 1:nsims) {
game_won <- FALSE
scoreA <- 0
scoreB <- 0
while (!game_won) {
# code
}
# save outcomes
}
colMeans(results)
Try to answer the following questions:
- What do the values of colMeans(results) represent? What are they trying to approximate?
- What happens if probability_fun returns 0.5 or 0.51 regardless of the score? How does the distribution of the number of rounds played look?
- What happens if probability_fun depends on the score (e.g. it might return scoreA / (scoreA + scoreB))? Could you somehow implement momentum in the probability?