# chap14 mc

```Monte Carlo Integration
Digital
g Image
g Synthesis
y
Yung-Yu Chuang
with slides by Pat Hanrahan and Torsten Moller
Introduction
• The integral equations generally don’t have
analytic solutions
solutions, so we must turn to numerical
methods.
• Standard
St d d methods
th d lik
like T
Trapezoidal
id l iintegration
t
ti
or Gaussian quadrature are not effective for
hi h di
high-dimensional
i
l and
d di
discontinuous
ti
iintegrals.
t
l
Lo (p, ωo )  Le (p, ωo )
  2 f (p, ωo , ωi )Li (p, ωi ) cos θ i dωi
s
b
• Suppose we want to calculate I  a f ( x)dx , but
can’tt solve it analytically.
can
analytically The approximations
through quadrature rules have the form
n
Iˆ   wi f ( xi )
i 1
which is essentially the weighted sum of
samples
p
of the function at various p
points
Midpoint rule
convergence
Trapezoid rule
convergence
Simpson’s rule
• Similar to trapezoid but using a quadratic
polynomial approximation
convergence
assuming f has a continuous fourth derivative.
Curse of dimensionality and discontinuity
• For an sd
f
function
i f,
f
• If the 1d rule has a convergence rate of O(n-r),
the sd rule would require a much larger number
(ns) of samples to work as well as the 1d one.
Thus, the convergence rate is only O(n-r/s).
g
is O(n
( -1/s) for
• If f is discontinuous,, convergence
sd.
Randomized algorithms
• Las Vegas v.s. Monte Carlo
• Las
L V
Vegas: always
l
gives
i
the
h right
by
using randomness.
• Monte Carlo: gives the right answer on the
average. Results depend on random numbers
used, but statistically likely to be close to the
Monte Carlo integration
• Monte Carlo integration: uses sampling to
estimate the values of integrals
integrals. It only
requires to be able to evaluate the integrand at
arbitrary points,
points making it easy to implement
and applicable to many problems.
• If n samples
l are used,
d it
its converges att th
the rate
t
of O(n-1/2). That is, to cut the error in half, it is
necessary to
t evaluate
l t ffour ti
times as many
samples.
• Images by Monte Carlo methods are often noisy.
Most current methods try to reduce noise.
Monte Carlo methods
– E
Easy tto iimplement
l
t
– Easy to think about (but be careful of statistical bias)
– Robust
R b t when
h used
d with
ith complex
l iintegrands
t
d and
d
domains (shapes, lights, …)
– Efficient for high dimensional integrals
– Noisy
– Slow (many samples needed for convergence)
Basic concepts
• X is a random variable
• Applying
A l i a function
f
i to a random
d
variable
i bl gives
i
another random variable, Y=f(X).
• CDF (cumulative distribution function)
P ( x)  Pr{{ X  x}
• PDF (probability density function): nonnegative,
sum to 1
dP ( x)
p( x) 
dx
• canonical
i l uniform
if
random
d
variable
i bl ξ ((provided
id d
by standard library and easy to transform to
other
th di
distributions)
t ib ti )
Discrete probability distributions
• Discrete events Xi with probability pi
pi  0
n
p
i 1
i
1
pi
• Cumulative PDF (distribution)
j
1
Pj   pi
i 1
U
• Construction of samples:
To randomly select an event,
Select Xi if Pi 1  U  Pi
Uniform random variable
Pi
0
X3
Continuous probability distributions
• PDF p( x )
Uniform
p( x )  0
• CDF P( x )
1
x
P(1)  1
P( x )   p( x )dx
0
P( x )  Pr( X  x )

P (  X   )   p( x ) dx
Pr(
d

 P(  )  P( )
0
0
1
Expected values
• Average value of a function f(x) over some
distribution of values p(x) over its domain D
E p  f ( x)   f ( x) p ( x)dx
D
• Example:
p
cos function over [[0,, π],
], p is uniform

1
0

E p cos( x)   cos x
dx  0
Variance
• Expected deviation from the expected value
• Fundamental
F d
l concept off quantifying
if i the
h error
in Monte Carlo methods

V [ f ( x)]  E  f ( x)  E  f ( x)
2

Properties
E af ( x)  aE  f ( x)


E  f ( X i )    E  f ( X i ) 
 i
 i
V af ( x)  a 2V  f ( x)


V  f ( x)  E  f ( x)   E  f ( x)
2
2
Monte Carlo estimator
• Assume that we want to
evaluate the integral of
f(x) over [a,b]
• Given
Gi
a uniform
if
random
d
variable Xi over [a,b],
M t C
Monte
Carlo
l estimator
ti t
ba N
FN 
f (Xi)

N i 1
says that the expected
value E[FN] of the
estimator FN equals the
integral
General Monte Carlo estimator
• Given a random variable X drawn from an
arbitrary PDF p(x),
p(x) then the estimator is
1
FN 
N
N

i 1
f (Xi)
p( X i )
• Although the converge rate of MC estimator is
( 1/2)), slower than other integral methods, its
O(N
converge rate is independent of the dimension,
making it the only practical method for high
dimensional integral
Convergence of Monte Carlo
• Chebyshev’s inequality: let X be a random
variable with expected value μ and variance σ2.
For any real number k&gt;0,
1
Pr{| X   | k }  2
k
• For example, for k= 2 , it shows that at least
half of the value lie in the interval (   2 ,   2 )
• Let Yi  f ( X i ) / p( X i ) , the MC estimate FN becomes
1
FN 
N
N
Y
i 1
i
Convergence of Monte Carlo
• According to Chebyshev’s inequality,
1
2

V
[
F
]

N  
Pr | FN  E[ FN ] | 
 
   

1
V [ FN ]  V 
N
 1 N  1
Yi   2 V  Yi   2

i 1
 N  i 1  N
N
N
1
V Yi   V Y 

N
i 1
• Plugging into Chebyshev’s inequality,
inequality
1

1  V [Y ]  2 
Pr | FN  I |

 
N    

So,, for a fixed threshold,, the error decreases at
the rate N-1/2.
Properties of estimators
• An estimator FN is called unbiased if for all N
E[ FN ]  Q
That is, the expected value is independent of N.
• Otherwise, the bias of the estimator is defined
as
 [ FN ]  E[ FN ]  Q
• If the bias goes to zero as N increases,
increases the
estimator is called consistent
li  [ FN ]  0
lim
N 
lim E[ FN ]  Q
N 
Example of a biased consistent estimator
• Suppose we are doing antialiasing on a 1d pixel,
to determine the pixel value,
value we need to
1
evaluate I   w( x) f ( x)dx , where w(x) is the
1
0
filter function with  w( x)dx
d 1
0
• A common way to evaluate this is
N
w( X i ) f ( X i )

i 1
FN 
N
i 1 w( X i )
• When N=1, we have
1
 w( X 1 ) f ( X 1 ) 
E[ F1 ]  E 
  E[ f ( X 1 )]  0 f ( x)dx  I
 w( X 1 ) 
Example of a biased consistent estimator
• When N=2, we have
1 1 w( x ) f ( x )  w( x ) f ( x )
1
1
2
2
E[ F2 ]   
dx1dx2  I
0 0
w( x1 )  w( x2 )
• However, when N is very large, the bias
approaches
pp
to zero
1
FN  N

1
N
N
i 1
w( X i ) f ( X i )

N
i 1
w( X i )
1 N
li
lim
i 1 w( X i ) f ( X i )
N  N
lim E[ FN ] 

N 
1 N
lim i 1 w( X i )
N  N
1
 w( x) f ( x)dx  w( x) f ( x)dx  I

 w( x)dx
1
0
1
0
0
Choosing samples
1 N f (Xi)
FN  
N i 1 p ( X i )
• Carefully choosing the PDF from which samples
are drawn is an important technique to reduce
variance. We want the f/p to have a low
variance Hence
variance.
Hence, it is necessary to be able to
draw samples from the chosen PDF.
• How to sample an arbitrary distribution from a
variable of uniform distribution?
•
– Inversion
ve s o
– Rejection
– Transform
Inversion method
• Cumulative probability distribution function
P( x )  Pr( X  x )
• Construction of samples
Solve for X
X=P
P-1(U)
1
U
• Must know:
1. The integral of p(x)
2. The inverse function P-11(x)
0
X
Proof for the inversion method
• Let U be an uniform random variable and its
CDF is Pu(x)=x.
(x) x We will show that Y=P
Y P-1(U) has
the CDF P(x).
Proof for the inversion method
• Let U be an uniform random variable and its
CDF is Pu(x)=x.
(x) x We will show that Y=P
Y P-1(U) has
the CDF P(x).


PrY  x  Pr P 1 (U )  x  PrU  P ( x)  Pu ( P ( x))  P ( x)
because P is monotonic,
P( x1 )  P( x2 ) if x1  x2
Thus, Y’s CDF is exactly P(x).
Inversion method
• Compute CDF P(x)
• Compute P-1(x)
• Obtain ξ
• Compute Xi=P-1(ξ)
Example: power function
It is used in sampling Blinn’s microfacet model.
p( x)  x
n
Example: power function
It is used in sampling Blinn’s microfacet model.
• Assume
A
1
p( x )  ( n  1) x
n
n 1
1
x
1
0 x dx  n  1  n  1
0
n
P( x )  x n 1
X ~ p( x )  X  P 1 (U )  n 1 U
Trick (It only works for sampling power distribution)
Y  max(U1, U 2 ,, U n , U n 1 )
n1
Pr(Y  x )   Pr(U  x )  x n 1
i 1
Example: exponential distribution
p ( x)  ce  ax
useful for rendering participating media.
• Compute CDF P(x)
• Compute P-1(x)
• Obtain
Obt i ξ
• Compute Xi=P-1(ξ)
Example: exponential distribution
p ( x)  ce  ax


0
ce
 ax
useful for rendering participating media.
dx  1
• Compute CDF P(x)
ca
x
P( x)   ae ds  1  e
 as
 ax
0
• Compute P-1(x)
1
P ( x)   ln(1  x)
a
• Obtain
Obt i ξ
1
1
-1
• Compute Xi=P (ξ) X   ln((1   )   ln 
a
a
1
Rejection method
• Sometimes, we can’t integrate into CDF or
invert CDF
1
I   f ( x ) dx
0


dx dy
y  f ( x)
y f ( x )
• Algorithm
Pick U1 and U2
Accept U1 if
U2 &lt; f(U1)
• Wasteful? Efficiency = Area / Area of rectangle
Rejection method
• Rejection method is a dart-throwing method
without performing integration and inversion.
inversion
1. Find q(x) so that p(x)&lt;Mq(x)
2. Dart throwing
a. Choose a pair (X, ξ), where X is sampled from q(x)
b. If (ξ&lt;p(X)/Mq(X)) return X
• Equivalently,
qu vale tly, we p
pick
c a
point (X, ξMq(X)). If
it lies beneath p(X)
(X)
then we are fine.
Why it works
• For each iteration, we generate Xi from q. The
sample is returned if ξ&lt;p(X)/Mq(X),
ξ&lt;p(X)/Mq(X) which
happens with probability p(X)/Mq(X).
• So,
S the
th probability
b bilit tto return
t
x is
i
p( x)
p( x)
q( x)

Mq( x)
M
whose integral is 1/M
• Thus, when a sample is returned (with
probability 1/M),
1/M) Xi is distributed according to
p(x).
Example: sampling a unit disk
void RejectionSampleDisk(float *x, float *y) {
float sx,
, sy;
y;
do {
sx = 1.f -2.f * RandomFloat();
sy = 1.f -2.f * RandomFloat();
} while (sx*sx + sy*sy &gt; 1.f)
*x = sx; *y = sy;
}
π/4～78.5% good samples, gets worse in higher
dimensions, for example, for sphere, π/6
π/6～52.4%
52.4%
Transformation of variables
• Given a random variable X from distribution px(x)
to a random variable Y=y(X),
Y y(X) where y is one-toone to
one, i.e. monotonic. We want to derive the
distribution of Y,
Y py(y).
(y)
• Py ( y ( x))  Pr{Y  y ( x)}  Pr{ X  x}  Px ( x)
Px(x)
• PDF:
dP ( y ( x)) dP ( x)
y
dx
dyy dPy ( y ) dyy
p y ( y)

dx
dy dx

x
x
Py(y)
dx
p x (x)
1
 dyy 
p y ( y )    p x ( x)
 dx 
y
Example
p x ( x)  2 x
Y  sin X
1
2x
2 sin y
p y ( y )  (cos x) p x ( x) 

2
cos x
1 y
1
Transformation method
• A problem to apply the above method is that
we usually have some PDF to sample from
from, not
a given transformation.
• Given
Gi
a source random
d
variable
i bl X with
ith px(x)
( ) and
d
a target distribution py(y), try transform X into
t another
to
th random
d
variable
i bl Y so that
th t Y has
h th
the
distribution py(y).
• We first have to find a transformation y(x) so
that Px(x)=Py(y(x)). Thus,
1
y
y ( x)  P ( Px ( x))
Transformation method
• Let’s prove that the above transform works.
W first
We
fi prove that
h the
h random
d
variable
i bl Z= Px(x)
( )
has a uniform distribution. If so, then Py1 ( Z )
should have distribution Px(x) from the inversion
method.
PrZ  x  PrPx ( X )  x  PrX  Px1 ( x)  Px ( Px1 ( x))  x
Thus Z is uniform and the transformation works.
Thus,
works
• It is an obvious generalization of the inversion
method in which X is uniform and Px(x)=x.
method,
(x)=x
Example
p x ( x)  x
y
p y ( y)  e y
Example
p x ( x)  x
2
x
Px ( x) 
2
y
p y ( y)  e y
Py ( y )  e
y
Py1 ( y )  ln y
2
x
y ( x)  P ( Px ( x))  ln( )  2 ln x  ln 2
2
1
y
Thus, if X has the distribution p x ( x)  x , then the
random variable Y  2 ln X  ln 2 has the distribution
p y ( y)  e y
Multiple dimensions
We often need the other way around,
Spherical coordinates
• The spherical coordinate representation of
i  cos 
directions is x  r sin
y  r sin  sin 
z  r cos 
| J T | r 2 sin 
p (r ,  ,  )  r 2 sin p ( x, y, z )
Spherical coordinates
• Now, look at relation between spherical
directions and a solid angles
d  sin dd
 ,
p ( ,  )dd  p ( )d
• Hence, the density in terms of
p ( ,  )  sin p ( )
Multidimensional sampling
• Separable case: independently sample X from px
and Y from py. p(x,y)=p
p(x y) px(x)py(y)
• Often, this is not possible. Compute the
marginal
i l density
d it ffunction
ti p(x)
( ) first.
fi t
p ( x)   p ( x, y )dy
• Then,
Then compute the conditional density function
p ( x, y )
p ( y | x) 
p ( x)
• Use 1D sampling with p(x) and p(y|x).
Sampling a hemisphere
• Sample a hemisphere uniformly, i.e. p( )  c
1
c
2
1   p ( )

• Sample  first
2
2
1
p ( ) 
2
sin 
p ( ,  ) 
2
sin 
p ( )   p( ,  )d  
d  sin 
2
0
0
• Now sampling 
p ( ,  ) 1
p ( |  ) 

p ( )
2
Sampling a hemisphere
• Now, we use inversion technique in order to
sample the PDF
PDF’ss

P( )   sin  ' d '  1  cos 
0

1

P( |  )  
d ' 
2
2
0
• Inverting these:
  cos 1 1
  2 2
Sampling a hemisphere
• Convert these to Cartesian coordinate
  cos 1
  2 2
1
x  sin  cos   cos(21 ) 1  12
y  sin  sin   sin( 21 ) 1  
2
1
z  cos   1
• Similar
Si il derivation
d i ti ffor a ffull
ll sphere
h
Sampling a disk
WRONG  Equi
Equi-Areal
Areal
RIGHT  Equi
Equi-Areal
Areal
  2 U1
  2 U1
r  U2
r  U2
Sampling a disk
WRONG  Equi
Equi-Areal
Areal
RIGHT  Equi
Equi-Areal
Areal
  2 U1
  2 U1
r  U2
r  U2
Sampling a disk
• Uniform p( x, y )  1

• Sample r first.
• Then,
Then sample  .
• Invert the CDF.
p (r ,  )  rp ( x, y ) 
r

2
p(r ) 
 p ( r ,  ) d  2 r
0
p(r , ) 1
p ( | r ) 

p (r
(r )
2
P(r )  r
r  1
2

P ( | r ) 
2
  2 2
Shirley’s mapping
r  U1

 U2
4 U1
Sampling a triangle
u0
v0
uv 1
(u,1  u)
v
u
A
1
0

1u
0
p(u, v)  2
uv 1
(1  u)
dv du   (1  u) du  
0
2
1
2
1
0
1

2
Sampling a triangle
• Here u and v are not independent! p(u, v)  2
• Conditional probability
p (u )   p (u , v)dv
d
1 u
p (u , v)
p (u | v) 
p (v )
p(u)  2  dv
d  2(1  u)
0
u0
P(u0 )   2(1  u) du  (1  u0 )2
u0  1  U1
0
1
p(v | u) 
(1  u)
vo
vo
P(v0 | u0 )   p(v | u0 ) dv  
0
0
v0  U1U 2
v0
1
dv 
(1  u0 )
(1  u0 )
Cosine weighted hemisphere
p ( )  cos 
1   p ( )d

1 
2
0


c cos  sin dd
2
0

1  c 2  cos  sin d
2
0
c  1
p ( ,  ) 
1

cos  sin 
d  sin dd
Cosine weighted hemisphere
p ( ,  ) 
1

cos  sin 
2
1
0

p ( )  
cos  sin
i d  2 cos  sin
i   sin
i 2
p ( ,  ) 1
p ( |  ) 

2
p ( )
1
1
P ( )   cos 2   1
2
2

P ( |  ) 
 2
2
1
  cos 1 (1  21 )
2
  2 2
Cosine weighted hemisphere
• Malley’s method: uniformly generates points on
the unit disk and then generates directions by
projecting them up to the hemisphere above it.
Vector CosineSampleHemisphere(float u1
u1,float
float u2){
Vector ret;
ConcentricSampleDisk(u1 u2,
ConcentricSampleDisk(u1,
u2 &amp;ret.x,
&amp;ret x &amp;ret.y);
&amp;ret y);
ret.z = sqrtf(max(0.f,1.f - ret.x*ret.x ret.y ret.y));
ret.y*ret.y));
return ret;
}
r

Cosine weighted hemisphere
• Why does Malley’s method work?
r
p(r ,  ) 
• Unit
U i di
disk
k sampling
li

• Map to hemisphere (r ,  )  (sin  ,  )
T
Y  (r ,  )
X  ( ,  )
r  sin
i 
 
p y (T ( x))  J T ( x)
1
 cos 
J T ( x)  
 0
0
  cos 
1
p x ( x)


Cosine weighted hemisphere
T
Y  (r ,  )
X  ( ,  )
r  sin 
 
p y (T ( x))  J T ( x)
1
 cos 
J T ( x)  
 0
0
  cos 
1
p x ( x)
p ( ,  )  J T p (r ,  ) 
cos  sin 

Sampling Phong lobe
p ( )  cos n 
2  / 2
  c cos
p ( )  c cos 
n
n
0 0
 2c
0
 cos
n
d cos   1
cos 1
n 1
c
2
n 1
n
p ( ,  ) 
cos  sin 
2
 sin dd  1
2c
1
n 1
Sampling Phong lobe
n 1 n
cos  sin 
2
2
n 1 n
p ( )  
cos  sin d (n  1) cos n  sin 
  0 2
p ( ,  ) 
P( ' ) 
'
n
(
n

1
)
cos
 sin d

 0
'
 (n  1)  cos n d cos   (n  1)
 0
 1  cos n 1  '
  cos 1

n 1
1

cos 
n 1
n 1
cos '
cos 1
Sampling Phong lobe
n 1
p ( ,  ) 
cos n  sin 
2
n
n 1
 sin 
cos
p ( ,  )
1
2
p ( |  ) 


p ( )
(n  1) cos n  sin  2
'
1
'
P( ' |  )  
d 
2
  0 2
  2 2
Sampling Phong lobe
When n=1, it is actually equivalent to cosine-weighted hemisphere
n  1, ( ,  )  (cos 1 1 ,2 2 )
P( )  1  cos n 1   1  cos 2 
1

1
( ,  )   cos (1  21 ),2 2 
2

1
1
P ( )   cos 2 
2
2
1
1
1
1
2
 cos 2    (1  2 sin  )   sin 2   1  cos 2 
2
2
2
2
```