TD2: Générateurs pseudo-aléatoires (RICM4, 2013) ================================================ L'objectif de cette fiche est de présenter les méthodes principales de génération de nombres pseudo-aléatoires, de comprendre leurs limitations, d'apprendre à s'en méfier et de voir comment éventuellement les corriger... Générateurs à base de congruence -------------------------------- ```{r} congruent_random <- function(n = 100, a=4, b=2, m=9, x1=1) { res = c(x1) for(i in 2:n) { res[i] = (a * res[i-1] + b) %% m } res } ``` Let's check whether this seems correct or not: ```{r} congruent_random(10) ``` Now, let's make a few simple graphical checks ```{r fig.height=3} graphical_check <- function(X) { par(mfrow=c(1,2)) hist(X) plot(X[1:length(X)-1],X[2:length(X)]) } graphical_check(congruent_random(100)) ``` Here is the ggplot2 version, just in case: ```{r fig.height=3} library(ggplot2) require(gridExtra) graphical_check_ggplot <- function(X) { par(mfrow=c(1,2)) df <- data.frame(x=X[1:length(X)-1],y=X[2:length(X)]) p1 <- ggplot(df,aes(x=x)) + geom_histogram() + theme_bw() p2 <- ggplot(df,aes(x=x,y=y)) + geom_point() + theme_bw() grid.arrange(p1, p2, ncol=2) } graphical_check_ggplot(congruent_random(100)) ``` OK, obviously, one cannot expect much from a generator whose space state is so small. Let's try with another one: ```{r fig.height=3} graphical_check_ggplot(congruent_random(n=100,a=11,b=1,m=71)) ``` Yet, it you ever computed the correlation of {x_n} and {x_{n+1}} , you could have oversee this relation: ```{r} X <- congruent_random(n=1000,a=11,b=1,m=71) cor(X[1:length(X)-1],X[2:length(X)]) ``` Co-variance would have helped though. ```{r} cov(X[1:length(X)-1],X[2:length(X)]) ``` Fortunately, some generator are better than others: ```{r fig.height=3} graphical_check_ggplot(congruent_random(n=1000,a=24298,b=99991,m=199017)) ``` Anyway, you should always be suspicious with random number generators. After all, although the previous two graphical checks seem comforting, there are probably other biases left... Générateurs à décalage de registres ----------------------------------- ```{r} X=xor_random <- function(n = 100, filter=c(1,0,1,0), seed=c(1,0,1,0)) { res = seed m = length(filter) print(m) for(i in (length(seed)):(n-1)) { # Yes, you need to put parenthesis around n-1 :( res[i+1] = sum(filter*(res[(i-m+1):i]))%%2 # Yes, you need to put parenthesis around i-m+1 :( } res } xor_random() xor_random(seed=c(1,0,0,1)) ``` Yuck. This is definitely not random... Actually, there are two interleaved sequences. The length of the cycle is 8. Let's try something different. ```{r} xor_random(filter=c(1,1,0,0),seed=c(1,1,0,1)) ``` That's better. Now the length of the cycle is 16. ![Well, maybe those numbers are just intrinsically better!](http://imgs.xkcd.com/comics/ayn_random.png)