Learning Causal Models of Multivariate Systems and the Value of it for the Performance Modeling of Computer Programs
Supervisor: Prof. dr. ir. Erik Dirkx
Jan Lemeire
December 19th 2007

Overview
- Learning causal models for the performance analysis of programs executed on various computer systems.
- Intermezzo I: Causal Inference.
- Practical deployment of the causal learning algorithms.
- Philosophical and theoretical study of causal inference.
- Intermezzo II: Kolmogorov Minimal Sufficient Statistics.
- The importance of qualitative properties.

Causal Inference & Performance Analysis Jan Lemeire Pag. 2 / 49

What is Parallel Processing?
[Figure: computational work plotted against time; a parallel system of CPUs, labeled with M and N.]
Ideally: Speedup = number of processors

Parallel Overhead
[Figure: execution timeline of the processors; Speedup = 2.55.]
Overhead = time the processors are not spending on useful work = lost processor cycles

Overhead Analysis
Overhead ratio = overhead time / runtime
Speedup = number of processors / (1 + Σ overhead ratios)
Impact of overhead on speedup

Experimental Parallel Performance Analysis: Data Acquisition
[Figure: EPPA data acquisition: parallel program, EPPA instrumentation library, executable, EPPA database.]
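The relation between overhead and speedup on the Overhead Analysis slide can be checked numerically. The sketch below assumes that each overhead ratio is an overhead time (summed over all processors) divided by the sequential runtime, and that speedup equals the number of processors divided by one plus the sum of the overhead ratios. The function names and the sample numbers are hypothetical and are not taken from EPPA.

```python
def overhead_ratio(overhead_time, sequential_runtime):
    """Overhead time relative to the runtime of the useful work (assumed: sequential runtime)."""
    return overhead_time / sequential_runtime


def speedup(num_processors, overhead_ratios):
    """Speedup = p / (1 + sum of overhead ratios); without overhead this equals p."""
    return num_processors / (1.0 + sum(overhead_ratios))


# Hypothetical measurements: 100 s of useful work, 10 s of communication
# overhead and 5 s of idle time, both summed over 4 processors.
t_seq = 100.0
ratios = [overhead_ratio(10.0, t_seq), overhead_ratio(5.0, t_seq)]
s = speedup(4, ratios)
print(round(s, 2))  # prints 3.48
```

With an empty list of overhead ratios the function returns the ideal speedup, the number of processors.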
EPDA: Multivariate Analysis
[Figure: EPDA modules around the database: Modeling, Visualization, Causal Inference (Causal Model), Specify context, Curve fitting, CPT compression, Analytical Model, Augmented Model, User-defined variables, Derivatives of variables, Outlier identification.]

Intermezzo I: Causal Inference
[Figure: a system under study with variables A, B, C, D and E produces data through experiments; a causal model over A to E is learned from the data.]
[Table: data of experiments 1 to 4 over the variables A, B, C, D and E, for example C = 0.42, 1.93, 0.03, 2.84; D = TRUE, FALSE, TRUE, TRUE; E = blue, green, red, black.]

Causal Inference for Performance Analysis
[Figure: causal performance model of a PROGRAM with the variables array size, element type, element size, memory, #op, #instrop, cache misses, Cmem, Cinstr, fclock and Tcomp; one relation is marked "??".]
Utility based on the following properties:
1. Dependency analysis: how variables relate.
2. Markov property.
3. A causal model corresponds to a decomposition.

Execution of program gives cache misses
[Figure: PROGRAM with a data structure; the datatype (integer, float, double, …) determines the data size in bytes.]

Markov Property
datatype and cache misses: correlated
With information about the data size:
[Figure: graphs over datatype, data size and cache misses.]
- Provides explanations
- Differentiates direct from indirect relations

Cannon
[Figure: cannon with angle ~ distance, annotated "OK, but:" and "or ???".]

What is Causality?
A causal relation denotes a mechanism: a variable is 'produced' by its causes.
However… not directly observable.
Mmmh "Causality is a relic of a bygone age"
[Pictures: Bertrand Russell, Judea Pearl.]
But: we want to learn something about the underlying system (the goal of statistics).

[Figure: cannon with gunpowder ~ distance.]

V-structure Property
[Figure: cannon angle and gunpowder both point to distance.]
Angle is independent from gunpowder, but dependent when distance is known.

Conditional Independencies Make Causal Inference Possible
[Figure: causal structure over A, B, C, D and E with the mechanisms mechC, mechD and mechE.]
From a causal structure follow conditional independencies, irrespective of the mechanisms:
- Markov
- V-structure

Graph is a Description of Independencies
[Figure: graph over A, B, C, D and E.]
Graphical criterion: d-separation
- Intuitive
Faithfulness property: independencies in graph ⇔ independencies in reality

Causal Structure Learning
In two steps:
1. Undirected graph
2. Orientation
[Figure: panels (a) to (f) showing the graph over A, B, C and D during learning.]

Result
Partially directed acyclic graph
"We know what parts are unknown."
Faithfulness assumption: all independencies follow from the causal structure.

Contribution 1: Experimental Results
(1) Automatic learning of accurate performance models
(2) Model validation
(3) Identification of unexpected dependencies
(4) Explanations for outliers
Practical Causal Inference
The following limitations had to be overcome:
- Non-linear relations: form-free independence test
- Mixture of continuous, discrete and categorical data: general independence test
- Deterministic relations: augmented causal model and extended learning algorithms

Form-Free and General Dependency Test
Example:
[Figure: scatter plot of Y against X.]
Pearson: Rxy = 0.083 => X and Y linearly independent
Kernel density estimation of P(X, Y), then mutual information:
I(X;Y) = 0.90 bits => dependent

Deterministic Relations
[Figure: graph over datatype, data size and cache misses.]
Data size and data type are information equivalent with respect to cache misses.
During learning, connect the least complex relation.

Contribution 2a: Complexity Criterion
Correct models are learned under the Complexity Increase Assumption.
X → (mech1) → Y → (mech2) → Z
Complexity(X – Z) ≥ Complexity(X – Y)
Complexity(X – Z) ≥ Complexity(Y – Z)

Contribution 2b: Reestablishment of Faithfulness
[Figure: graphs over X, Y, Z, A and S with information equivalences, for example "X and Y eq. for A".]
- Information is added to the model
- Basic information equivalences
- Consequences are considered
- Information equivalences
- Independence and simplicity
- D-separation extension
Faithful model: represents all independencies

Contribution 2c: Extension of PC Learning Algorithm
- Detection of information equivalences
- Among information equivalent relations, the simplest one is chosen
- Orientation rules remain the same
Correct models are learned from data containing deterministic relations.
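The contrast on the Form-Free and General Dependency Test slide, a near-zero Pearson coefficient next to clearly positive mutual information, can be reproduced on synthetic data. This is a minimal sketch, not the test used in the thesis: it swaps the kernel density estimate for a simple 2D histogram, and the data, the bin count and the function names are all assumptions made for illustration.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation coefficient; only sensitive to linear dependence."""
    return float(np.corrcoef(x, y)[0, 1])


def mutual_information_bits(x, y, bins=8):
    """Histogram estimate of I(X;Y) in bits (stand-in for a kernel density estimate)."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                 # joint distribution P(X, Y)
    px = pxy.sum(axis=1, keepdims=True)       # marginal P(X), shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)       # marginal P(Y), shape (1, bins)
    mask = pxy > 0                            # skip empty cells (0 * log 0 = 0)
    return float(np.sum(pxy[mask] * np.log2(pxy[mask] / (px @ py)[mask])))


rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 5000)
y = x ** 2 + rng.normal(0.0, 0.05, 5000)      # non-linear relation, hypothetical data
z = rng.uniform(-1.0, 1.0, 5000)              # independent of x

r = pearson(x, y)                             # close to 0: no linear dependence
mi_dep = mutual_information_bits(x, y)        # clearly positive: dependent
mi_ind = mutual_information_bits(x, z)        # close to 0: independent
print(f"r = {r:.3f}, I(X;Y) = {mi_dep:.2f} bits, I(X;Z) = {mi_ind:.3f} bits")
```

A histogram estimate is biased upward for small samples, so a practical test would compare the estimate against a threshold or a permutation baseline rather than against zero.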
Inductive Inference
Occam's Razor: "Among equivalent models choose the simplest one." (William of Ockham)
[Figure: portrait of William of Ockham and example physical formulas.]
BUT: Objective measure of complexity?

Kolmogorov Complexity
Kolmogorov Complexity of a binary string: the length of the shortest program that computes the string and halts. (Andrey Kolmogorov)
PROGRAM: REPEAT 11 TIMES PRINT "001"
Universal Turing Machine
Output: 001001001001001001001001001001001

Shortest Programs
001001001001001001001001001001001
PROGRAM: REPEAT 11 TIMES PRINT "001"
regularity of repetition allows compression
011000110101101010111001001101000
PROGRAM: PRINT "011000110101101010111001001101000"
random information = incompressible

Randomness versus Regularity
Kolmogorov Minimal Sufficient Statistics (KMSS): formal separation of
- Meaningful information: regularities
- Accidental information: randomness
001001001001001001001001001001001: repetition of "001", 11 times
011000110101101010111001001101000: only random information (incompressible)

Learning = finding regularities = maximal compression
[Figure: a diamond; its structure and its exact size are labeled as regularities or as random.]
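The difference between a regular and a random string can be illustrated with an off-the-shelf compressor. Kolmogorov complexity itself is not computable, so this sketch uses the zlib-compressed length as a rough upper-bound proxy; that substitution is an assumption of the example and not part of the slides. The strings are longer than the 33-bit examples above because the fixed overhead of the zlib format would dominate on very short inputs.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data: a crude proxy for complexity."""
    return len(zlib.compress(data, 9))


# Regular string: "001" repeated, as in the REPEAT ... PRINT "001" program.
regular = b"001" * 1000

# Pseudo-random string of the same length over the same alphabet {0, 1}.
rng = random.Random(0)
random_bits = bytes(rng.choice(b"01") for _ in range(3000))

size_regular = compressed_size(regular)
size_random = compressed_size(random_bits)
print(len(regular), size_regular, size_random)
```

The repetition compresses to a few dozen bytes, while the random string stays close to its information content of 3000 bits (375 bytes): the compressor finds the regularity in the first string and nothing to exploit in the second.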
Contribution 3a: Meaningful Information of Probability Distributions
Joint Probability Distribution = GRAPH + CPDs
P(A, B, C, D, E) = P(A) P(B) P(C|A, B) P(D|B) P(E|C, D)
[Figure: graph over A, B, C, D and E.]
- meaningful information (Theorem 1)
- Kolmogorov Minimal Sufficient Statistic if graph and CPDs are incompressible (Theorem 2)
- a graph with random CPDs is faithful (Theorem 4)

Causal Aspect of Causal Models = Decomposition
[Figure: graph over A, B, C, D and E decomposed into the mechanisms mechC, mechD and mechE.]
Canonical decomposition: quasi-unique and minimal decomposition into atomic and independent components (the CPDs)
Corresponds to reality (mechanisms)

Causal Component Relies on Reductionism
The world can be studied in parts.
Or, even more: the world is made up of indivisible parts.
[Figure: graph over A, B, C, D and E.]
When the DAG of a Bayesian network is a complete graph: no meaningful information, holism.

Contribution 3b: Validity of Causal Inference
How OK is the learned causal model?
- Minimal model? Faithful? Other regularities?
- Do CPD components correspond to physical mechanisms?
[Figure: counterexamples from literature, numbered 1 to 8, classified as causal model ≠ minimal model (unfaithful, other regularities), conforms to reality, or wrong decomposition; number groups 3, 6 / 2, 5, 7, 8 / 1 / 4.]

Well-known Example of Unfaithfulness
[Figure: graph over A, B, C and D with two paths, 1 and 2, between A and D.]
'Normally': A and D correlate.
A and D become independent if the influences along paths 1 and 2 cancel each other out.
- Mechanisms are related
- Regularity among them
Regularities are Qualitative Properties
- Different from quantitative information.
- Allow for qualitative reasoning.
- Qualitative properties determine behavior.

Communication Schemes on Network Topologies
[Figure: a communication scheme among nodes 1 to 8 and a network topology over nodes 1 to 8.]
Communication time?

Contribution 4a: Generic Performance Model
[Figure: Scheme1, Scheme2, Scheme3 and Topology1, Topology2, Topology3 feed a model that predicts Tcomm.]
Good predictions for combinations of random schemes and random topologies.

Contribution 4b: Combinations of Patterns
Communication schemes: broadcast, shift
Network topologies: star, ring
Performance depends on match!

Qualitative Properties
Faithfulness: "graph should describe all independencies"
KMSS: "model should describe all regularities"
Qualitative information explicitly describes regularities.
Quantitative information contains no more regularities.

Explicitly Mention Qualitative Properties!
[Figure: "Stone" example with lists of coordinate pairs such as (2,12), (5,21), (9,41), (12,61), (19,24) and a "??" mark.]

Conclusions
- Contribution to performance analysis.
- Automatic causal analysis.
- Useful add-on in combination with other techniques.
- The value of causal inference is underlined.
- The importance of regularities or qualitative properties.

Future Work
- Application of the learned performance models for optimization.
- Is the failure of generic performance models only due to regularities?
- Augment models with qualitative properties.
- But: how to define, recognize and reason with regularities?