causal inference - VUB Parallel Computing Laboratory

Learning Causal Models of
Multivariate Systems
and the Value of it for the Performance
Modeling of Computer Programs
Supervisor: Prof. dr. ir. Erik Dirkx
Jan Lemeire
December 19th 2007
Overview
Learning causal models for the performance
analysis of programs executed on various computer
systems.
Intermezzo I: Causal inference.
Practical deployment of the causal learning
algorithms.
Philosophical and theoretical study of causal
inference.
Intermezzo II: Kolmogorov Minimal Sufficient Statistics.
The importance of qualitative properties.
Causal Inference & Performance Analysis
Jan Lemeire
Pag. 2 / 49
What is Parallel Processing?
[Figure: computational work over time on a parallel system: CPUs, each with a memory (M), connected by a network (N).]
Ideally: Speedup = number of processors
Parallel Overhead
[Figure: execution timeline of the processors; Speedup = 2.55]
Overhead = time the processors are not spending
on useful work
= lost processor cycles
Overhead Analysis
\[ \text{Overhead ratio} = \frac{\text{overhead time}}{\text{runtime}} \]
\[ \text{Speedup} = \frac{\text{number of processors}}{1 + \sum \text{overhead ratios}} \]
Impact of overhead on speedup
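The relation between overhead ratios and speedup can be checked numerically. The following is a minimal sketch, assuming each overhead ratio is an overhead time divided by the runtime as on the slide; the function name `speedup` and the sample ratios are illustrative and do not come from the presentation.

```python
def speedup(num_processors, overhead_ratios):
    """Speedup = p / (1 + sum of overhead ratios)."""
    if num_processors < 1:
        raise ValueError("need at least one processor")
    return num_processors / (1.0 + sum(overhead_ratios))


# Hypothetical example: 4 processors and two overhead sources
# (say communication and idle time) with ratios 0.35 and 0.25.
ideal = speedup(4, [])              # no overhead: speedup equals p
actual = speedup(4, [0.35, 0.25])   # 4 / 1.6
print(ideal, actual)
```

Without overhead the function returns the number of processors, which is the ideal case; every additional overhead ratio lowers the result.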
Experimental Parallel Performance
Analysis: Data Acquisition
[Figure: Parallel Program + EPPA instrumentation library → Executable → EPPA Database]
EPDA: Multivariate Analysis
[Figure: EPDA modules working on the Database: Modeling, Visualization, Causal Inference (Causal Model), Specify context, Curve fitting, CPT compression, Analytical Model, Augmented Model, User-defined variables, Derivatives of variables, Outlier identification.]
Intermezzo I: Causal Inference
[Figure: a system under study with variables A, B, C, D and E. Experiments 1 to 4 yield a data table with numeric values for A, B and C, TRUE/FALSE values for D and colour values for E. From this data a causal model over A, B, C, D and E is inferred.]
Causal Inference for Performance
Analysis
[Figure: causal model of the performance of a PROGRAM, with variables array size, element type, element size, memory, #op, #instr, cache misses, Cmem, Cinstr, fclock and Tcomp.]
Utility based on the following properties:
1. Dependency analysis: how variables relate.
2. Markov property.
3. A causal model corresponds to a decomposition.
Execution of program gives cache misses
[Figure: a PROGRAM operating on a data structure; the datatype (integer, float, double, …) determines the data size in bytes.]
Markov Property
datatype ~ cache misses: correlated
With information about the data size: datatype and cache misses become independent
datatype → data size → cache misses
Provides explanations
Differentiate direct from indirect relations
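The Markov property for this chain can be illustrated on simulated data. The sketch below assumes a made-up mechanism in which cache misses depend only on the data size; the byte sizes, the factor 100 and the noise level are hypothetical and are not measurements from the presentation.

```python
import random
from statistics import mean

random.seed(0)

# Hypothetical sizes in bytes; int32 and float share a size on purpose.
SIZE_OF = {"int32": 4, "float": 4, "double": 8}


def run_experiment():
    datatype = random.choice(list(SIZE_OF))
    size = SIZE_OF[datatype]
    # Assumed mechanism: misses depend on the data size only, plus noise.
    misses = 100 * size + random.gauss(0, 10)
    return datatype, size, misses


data = [run_experiment() for _ in range(3000)]


def mean_misses(datatype, size=None):
    rows = [m for d, s, m in data
            if d == datatype and (size is None or s == size)]
    return mean(rows)


# Marginally, datatype and cache misses are related ...
gap_marginal = abs(mean_misses("double") - mean_misses("int32"))
# ... but among experiments with the same data size, datatype adds nothing.
gap_given_size = abs(mean_misses("float", size=4) - mean_misses("int32", size=4))
print(round(gap_marginal), round(gap_given_size, 2))
```

The large marginal gap and the negligible gap within one data size are what separates the indirect relation (datatype to cache misses) from the direct one (data size to cache misses).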
Cannon angle ~ distance
OK, but: cannon angle → distance, or distance → cannon angle ???
What is Causality?
A causal relation denotes a mechanism: a variable is ‘produced’ by its causes.
However… a mechanism is not directly observable.
Bertrand Russell: “Causality is a relic of a bygone age”
Judea Pearl: “Mmmh”
But: we want to learn something about the underlying system (the goal of statistics)
Gunpowder ~ distance
V-structure Property
cannon angle → distance ← gunpowder
Angle is independent of gunpowder,
but the two become dependent when
distance is known
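The v-structure property can be reproduced with a small simulation. This is a sketch under stated assumptions: idealised ballistics in which distance is proportional to gunpowder times sin(2 × angle), with angle and gunpowder drawn independently; the ranges and the distance band are arbitrary choices made for illustration.

```python
import math
import random

random.seed(1)


def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


shots = []
for _ in range(20000):
    angle = random.uniform(5, 45)     # degrees, chosen independently
    powder = random.uniform(1, 2)     # arbitrary units, chosen independently
    # Assumed mechanism: both causes jointly produce the distance.
    distance = powder * math.sin(math.radians(2 * angle))
    shots.append((angle, powder, distance))

r_marginal = pearson([a for a, p, d in shots], [p for a, p, d in shots])

# "Distance is known": keep only shots that landed in a narrow band.
band = [(a, p) for a, p, d in shots if 0.95 <= d <= 1.05]
r_given_distance = pearson([a for a, p in band], [p for a, p in band])
print(round(r_marginal, 3), round(r_given_distance, 3))
```

Over all shots the correlation between angle and gunpowder stays near zero; among shots with nearly the same distance it becomes clearly negative, because a lower angle must have been compensated by more gunpowder.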
Conditional Independencies Make
Causal Inference Possible
[Figure: causal graph A → C ← B → D, C → E ← D, in which the mechanisms mechC, mechD and mechE produce C, D and E.]
Conditional independencies follow from a causal
structure, irrespective of the
mechanisms:
– Markov
– V-structure
Graph is a Description of Independencies
[Figure: example graph over A, B, C, D and E.]
Graphical criterion: d-separation
– Intuitive
Faithfulness property: independencies in graph = independencies in reality
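The d-separation criterion can be turned into a short procedure. The sketch below uses the standard moralised ancestral graph formulation, which is one of several equivalent ways to test d-separation; the edge list is an assumption based on the A to E example graph (A → C ← B → D, C → E ← D).

```python
from itertools import combinations

# Example DAG as child -> parents (structure assumed from the A..E example).
PARENTS = {"A": [], "B": [], "C": ["A", "B"], "D": ["B"], "E": ["C", "D"]}


def ancestors(nodes):
    """Return the given nodes together with all their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(PARENTS[n])
    return seen


def d_separated(x, y, given=()):
    keep = ancestors({x, y, *given})
    # Moralise: link each node to its parents and marry co-parents.
    adj = {n: set() for n in keep}
    for child in keep:
        ps = PARENTS[child]
        for p in ps:
            adj[child].add(p)
            adj[p].add(child)
        for p, q in combinations(ps, 2):
            adj[p].add(q)
            adj[q].add(p)
    # Remove the conditioning set, then test reachability from x to y.
    blocked = set(given)
    seen, stack = set(), [x]
    while stack:
        n = stack.pop()
        if n == y:
            return False
        if n in seen or n in blocked:
            continue
        seen.add(n)
        stack.extend(adj[n] - blocked)
    return True


print(d_separated("A", "B"))          # v-structure at C is not opened
print(d_separated("A", "B", ["C"]))   # conditioning on C opens it
print(d_separated("C", "D", ["B"]))   # common cause B is blocked
```

Under faithfulness, every `True` answer corresponds to an independence that should also show up in the data.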
Causal Structure Learning
In two steps:
1. Undirected graph
2. Orientation
[Figure: panels (a) to (f) showing the successive learning steps on a graph over A, B, C and D.]
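The two steps, undirected graph and then orientation, can be sketched in the style of the PC algorithm. This is a simplified illustration and not the implementation used in the presented work: it takes an independence oracle `independent(x, y, cond)` as an assumption, orients only v-structures, and omits the further orientation propagation rules of the full algorithm. The oracle shown encodes the cannon example.

```python
from itertools import combinations


def pc_sketch(variables, independent):
    """Return (adjacency, arrows) given independent(x, y, cond) -> bool."""
    # Step 1: start fully connected; remove x - y once some set separates them.
    adj = {v: set(variables) - {v} for v in variables}
    sepset = {}
    size = 0
    while any(len(adj[x]) - 1 >= size for x in variables):
        for x in variables:
            for y in sorted(adj[x]):
                others = sorted(adj[x] - {y})
                for cond in combinations(others, size):
                    if independent(x, y, set(cond)):
                        adj[x].discard(y)
                        adj[y].discard(x)
                        sepset[frozenset((x, y))] = set(cond)
                        break
        size += 1
    # Step 2: x - z - y with x, y non-adjacent and z outside their
    # separating set is oriented as the v-structure x -> z <- y.
    arrows = set()
    for z in variables:
        for x, y in combinations(sorted(adj[z]), 2):
            if y not in adj[x] and z not in sepset[frozenset((x, y))]:
                arrows.add((x, z))
                arrows.add((y, z))
    return adj, arrows


def cannon_oracle(x, y, cond):
    # Only independence in the cannon example: angle and gunpowder, marginally.
    return {x, y} == {"angle", "gunpowder"} and not cond


adj, arrows = pc_sketch(["angle", "gunpowder", "distance"], cannon_oracle)
print(sorted(arrows))
```

Edges that end up in `adj` without a matching entry in `arrows` stay undirected, which is how the partially directed result records the parts that remain unknown.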
Result
Partially directed acyclic graph
“We know what parts are unknown.”
Faithfulness assumption:
all independencies follow from the
causal structure
Contribution 1
Experimental Results
(1) Automatic learning of accurate performance models
(2) Model validation
(3) Identification of
unexpected dependencies
(4) Explanations for outliers
Practical Causal Inference
The following limitations had to be overcome:
Non-linear relations: form-free independence test
Mixture of continuous, discrete and categorical data:
general independence test
Deterministic relations: augmented causal model
and extended learning algorithms
Form-Free and General
Dependency Test
Example: scatter plot of Y against X
Pearson: Rxy = 0.083 => no linear dependence between X and Y detected
Kernel density estimation of P(X, Y)
Mutual information: I(X;Y) = 0.90 bits => dependent
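The contrast between a linear and a form-free test can be reproduced on synthetic data. Note the substitution: the slide estimates P(X, Y) by kernel density estimation, whereas this sketch uses a plain equal-width histogram to keep it short. The quadratic relation, the sample size and the number of bins are assumptions for illustration, so the numbers will not match the slide.

```python
import math
import random
from collections import Counter

random.seed(2)

xs = [random.uniform(-1, 1) for _ in range(5000)]
ys = [x * x + random.gauss(0, 0.05) for x in xs]   # assumed non-linear relation


def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def discretise(values, bins):
    lo, hi = min(values), max(values)
    return [min(int((v - lo) / (hi - lo) * bins), bins - 1) for v in values]


def mutual_information(xs, ys, bins=10):
    """Plug-in estimate of I(X;Y) in bits from a 2-D histogram."""
    bx, by = discretise(xs, bins), discretise(ys, bins)
    n = len(xs)
    pxy, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum(c / n * math.log2(c * n / (px[i] * py[j]))
               for (i, j), c in pxy.items())


r = pearson(xs, ys)
mi = mutual_information(xs, ys)
print(round(r, 3), round(mi, 2))
```

Because mutual information is computed from the estimated joint distribution, the same routine also works when one of the variables is discrete or categorical, provided its values are mapped to bin labels.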
Deterministic Relations
[Figure: datatype, data size and cache misses.]
Data size and datatype are information
equivalent with respect to cache misses.
During learning, connect the least complex relation.
Contribution 2a
Complexity Criterion
Correct models are learned under the
Complexity Increase Assumption
X → (mech1) → Y → (mech2) → Z
Complexity( X – Z ) ≥ Complexity( X – Y )
Complexity( X – Z ) ≥ Complexity( Y – Z )
Contribution 2b
Reestablishment of Faithfulness
Information is added to the model
– Basic information equivalences (e.g. X and Y eq. for A)
Consequences are considered
– Information equivalences
– Independence and simplicity
– D-separation extension
[Figure: example graphs over X, Y, Z, A and S illustrating the equivalences.]
Faithful model: represents all independencies
Contribution 2c
Extension of PC Learning Algorithm
Detection of information equivalences
Among information equivalent relations, the
simplest one is chosen
Orientation rules remain the same
Correct models are learned from data
containing deterministic relations.
Inductive Inference
Occam’s Razor
“Among equivalent models
choose the simplest one.”
[Figure: candidate formulas of differing complexity, among them E = m·c²; portrait of William of Ockham.]
BUT: Objective measure of complexity?
Kolmogorov Complexity
Kolmogorov Complexity of a binary string:
the length of the shortest program that
computes the string and halts
Andrey Kolmogorov
PROGRAM
REPEAT 11 TIMES
PRINT "001"
→ Universal Turing Machine →
001001001001001001001001001001001
Shortest Programs
001001001001001001001001001001001
PROGRAM
REPEAT 11 TIMES
PRINT "001"
regularity of repetition
allows compression
011000110101101010111001001101000
PROGRAM
PRINT "011000110101101010111001001101000"
random information
= incompressible
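Kolmogorov complexity itself is not computable, but a general-purpose compressor gives a rough feel for the difference between the two strings. The sketch below uses zlib as a crude stand-in, which is an assumption of this example and not part of the presentation; the strings are made longer than the 33 bits on the slide so that the fixed overhead of the compressed format does not dominate.

```python
import random
import zlib

random.seed(3)

regular = "001" * 1000                                       # repetition
noisy = "".join(random.choice("01") for _ in range(3000))    # pseudo-random bits


def compressed_size(bits):
    """Size in bytes after zlib compression: a proxy, not K(x) itself."""
    return len(zlib.compress(bits.encode("ascii"), 9))


print(len(regular), compressed_size(regular))
print(len(noisy), compressed_size(noisy))
```

The repetitive string shrinks to a few dozen bytes, while the pseudo-random string cannot go much below one bit per symbol, its information content.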
Randomness versus Regularity
Kolmogorov Minimal Sufficient Statistics (KMSS): formal separation of
– Meaningful information: regularities
– Accidental information: randomness
001001001001001001001001001001001: regularity = repetition; accidental = 11 times, 001
011000110101101010111001001101000: only random information (incompressible)
Learning = finding regularities
= maximal compression
Structure of a diamond: regularities
Exact size: random
Contribution 3a
Meaningful Information
of Probability Distributions
Joint Probability Distribution P(A, B, C, D, E) = GRAPH + CPDs
GRAPH: A → C ← B → D, C → E ← D
CPDs: P(A), P(B), P(C|A, B), P(D|B), P(E|C, D)
meaningful information (Theorem 1)
Kolmogorov Minimal Sufficient Statistic if graph and
CPDs are incompressible (Theorem 2)
a graph with random CPDs is faithful (Theorem 4)
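The decomposition of the joint distribution into a graph plus CPDs can be made concrete for binary variables. All probabilities below are hypothetical values chosen for illustration; the sketch only shows that the product of the five CPDs is a valid joint distribution and that it needs fewer numbers than the full table.

```python
from itertools import product

# Hypothetical CPDs for binary variables; each entry is P(var = 1 | parents).
P_A = 0.3
P_B = 0.6
P_C = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.7, (1, 1): 0.9}    # keyed by (A, B)
P_D = {0: 0.2, 1: 0.8}                                         # keyed by B
P_E = {(0, 0): 0.05, (0, 1): 0.4, (1, 0): 0.6, (1, 1): 0.95}   # keyed by (C, D)


def bern(p_one, value):
    return p_one if value == 1 else 1.0 - p_one


def joint(a, b, c, d, e):
    """P(A,B,C,D,E) = P(A) P(B) P(C|A,B) P(D|B) P(E|C,D)."""
    return (bern(P_A, a) * bern(P_B, b) * bern(P_C[a, b], c)
            * bern(P_D[b], d) * bern(P_E[c, d], e))


total = sum(joint(*values) for values in product((0, 1), repeat=5))
params_factorised = 1 + 1 + len(P_C) + len(P_D) + len(P_E)
params_full_table = 2 ** 5 - 1
print(round(total, 6), params_factorised, params_full_table)
```

The graph is what allows the shorter description: a complete graph would require as many parameters as the full table.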
Causal Aspect of Causal Models
= Decomposition
[Figure: the causal graph over A, B, C, D and E decomposed into its mechanisms mechC, mechD and mechE.]
Canonical decomposition: quasi-unique and minimal
decomposition into atomic and independent
components (the CPDs)
Corresponds to reality (mechanisms)
Causal Component Relies on
Reductionism
The world can be studied in parts.
Or, even more:
The world is made up of indivisible parts.
[Figure: complete graph over A, B, C, D and E.]
When the DAG of a Bayesian
network is a complete graph
=> no meaningful information
=> holism
Contribution 3b
Validity of Causal Inference
How OK is the learned causal model?
– Minimal model? (Faithful? Other regularities?) If not: causal model ≠ minimal model
– Do CPD components correspond to physical mechanisms?
Possible outcomes: Unfaithful, Other regularities, Conforms to reality, Wrong decomposition
Counterexamples from literature, numbered 1 to 8 and grouped in the figure as: 3, 6 / 2, 5, 7, 8 / 1 / 4
Well-known Example of Unfaithfulness
‘Normally’: A and D correlate
[Figure: A → B → D (path 1) and A → C → D (path 2).]
A and D become independent
if the influences along paths
1 and 2 cancel each other out
Mechanisms are related
Regularity among them
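Cancelling paths are easy to reproduce with a linear model. The sketch assumes the structure A → B → D and A → C → D with unit coefficients and Gaussian noise, all of which are illustrative choices; only the weight of the second path into D is varied.

```python
import math
import random

random.seed(4)


def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def correlation_a_d(weight_path2, n=20000):
    a_vals, d_vals = [], []
    for _ in range(n):
        a = random.gauss(0, 1)
        b = a + random.gauss(0, 1)                      # path 1: A -> B
        c = a + random.gauss(0, 1)                      # path 2: A -> C
        d = b + weight_path2 * c + random.gauss(0, 1)   # B -> D and C -> D
        a_vals.append(a)
        d_vals.append(d)
    return pearson(a_vals, d_vals)


r_cancel = correlation_a_d(-1.0)    # the two paths cancel exactly
r_generic = correlation_a_d(-0.5)   # any other weight leaves a dependency
print(round(r_cancel, 3), round(r_generic, 3))
```

The independence only appears for the one weight that makes the two mechanisms match, which is the regularity among the mechanisms that the graph alone does not describe.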
Regularities are Qualitative
Properties
Different from quantitative information.
Allow for qualitative reasoning.
Qualitative properties determine behavior.
Communication Schemes on
Network Topologies
[Figure: a communication scheme among 8 processes (1 to 8) and a network topology with 8 nodes (1 to 8).]
Communication time?
Contribution 4a
Generic Performance Model
[Figure: Scheme1, Scheme2, Scheme3 and Topology1, Topology2, Topology3 → model → Tcomm]
Good predictions for combinations of random
schemes and random topologies
Contribution 4b
Combinations of Patterns
Communication Schemes: broadcast, shift
Network Topologies: star, ring
Performance depends on match!
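The effect of a match between scheme and topology can be illustrated with a deliberately crude cost measure. The sketch below counts the total number of hops over shortest paths, which is an assumption of this example and not the performance model from the presented work; it ignores contention and bandwidth entirely.

```python
from collections import deque


def ring(n):
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def star(n):
    graph = {i: {0} for i in range(1, n)}   # node 0 is the hub
    graph[0] = set(range(1, n))
    return graph


def hops(graph, src, dst):
    """Shortest path length between two nodes, by breadth-first search."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return dist[u]
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    raise ValueError("destination unreachable")


def broadcast(n):   # node 0 sends a message to every other node
    return [(0, i) for i in range(1, n)]


def shift(n):       # node i sends a message to node i + 1
    return [(i, (i + 1) % n) for i in range(n)]


def total_hops(graph, scheme):
    return sum(hops(graph, s, d) for s, d in scheme)


for name, topology in (("star", star(8)), ("ring", ring(8))):
    print(name, total_hops(topology, broadcast(8)), total_hops(topology, shift(8)))
```

Under this measure, broadcast is cheapest on the star and shift is cheapest on the ring, so the cost depends on the combination and not on the scheme or the topology alone.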
Qualitative Properties
Faithfulness: “graph should describe all independencies”
KMSS: “model should describe all regularities”
Qualitative information: explicitly describes regularities
Quantitative information: contains no more regularities
Explicitly Mention Qualitative Properties!
[Figure: objects annotated with coordinate pairs such as (2,12), (12,61), (9,41), (5,21) and (19,24); one is labelled Stone, one is marked ??]
Conclusions
Contribution to performance analysis.
Automatic causal analysis.
Useful add-on in combination with other
techniques.
The value of causal inference is underlined.
The importance of regularities or qualitative
properties.
Future Work
Application of the learned performance
models for optimization.
Is the failure of generic performance models
only due to regularities?
Augment models with qualitative properties.
But: how to define, recognize and reason with
regularities?