Multifidelity Modeling Approaches in Simulation-Based Optimization

Anthony A. Giunta (1), Joseph P. Castro Jr. (2), Patricia D. Hough (3), Genetha A. Gray (3), and Michael S. Eldred (4)
Sandia National Laboratories*, Albuquerque, New Mexico and Livermore, California, USA
(1) Validation and Uncertainty Quantification Dept. (NM)
(2) Computational Sciences Dept. (NM)
(3) Computational Sciences and Mathematics Dept. (CA)
(4) Optimization and Uncertainty Estimation Dept. (NM)

8th SIAM Conference on Optimization, May 15-19, 2005, Stockholm, Sweden

*Sandia is a multi-program laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.

Minisymposium on Simulation-Based Design Optimization: Parts I and II

Part I (MS16)
3:00-3:25  Multifidelity Modeling Approaches in Simulation-Based Optimization
           Anthony A. Giunta, Joseph P. Castro, Patricia Hough, Genetha Gray, and Michael S. Eldred, Sandia National Laboratories
3:30-3:55  Optimization of Large Simulation-Based Systems
           Natalia Alexandrov, NASA Langley Research Center

4:00-4:25  Multiobjective Global Optimization Algorithms in Simulation-Based Design
           Emilio Campana and Daniele Peri, INSEAN, Italy
4:30-4:55  Optimization in Reservoir Simulation: Models and Challenges
           Amr El-Bakry and Jeffrey Davidson, ExxonMobil Upstream Research Company

Part II (MS64, May 19th): Cancelled

Outline
- Background
  - What is simulation-based optimization?
  - What is multifidelity optimization?
  - What is space mapping?
- Description of the Multifidelity Optimization Scheme
  - Asynchronous Parallel Pattern Search (APPS)
  - APPS + Space Mapping
  - Space Mapping
- Example Problems
- Summary

What is Simulation-Based Design Optimization?

Generic constrained optimization problem statement:
  minimize f(x), subject to g(x) <= 0, h(x) = 0

For many simulation-based optimization applications, f(), g(), and h() can

have some or all of the following characteristics:
- Black-box computer models with little or no access to source code
- Codes can be failure prone
- Computationally expensive
- Nonlinear
- Nonconvex
- Multiple feasible regions
- Multiple local minima (global optimization)
- Analytic gradients unavailable
- The x variables can be deterministic (real/discrete/mixed) and/or probabilistic

For my applications, I'm fortunate: x is in R^n, where n typically is O(10^1-10^2), and usually f(), g(), and h() are deterministic.

Other engineering applications are not so lucky: x is discrete-valued and/or x is probabilistic and/or n is large and/or f(), g(), h() are stochastic, e.g. transportation planning and logistics, oil well location planning and production planning, shape optimization, etc.

Motivation for a Multifidelity Optimization (MFO) Approach

[Figure: surface plots of the high-fidelity objective fH(x) and the low-fidelity surrogate fL(x) over (x1, x2).]

In many engineering applications, the objective function f(x) is expensive to calculate, and derivatives may be inaccurate if they exist at all. Many of these applications have a high-fidelity ("truth") model and a low-fidelity (surrogate) model that simplifies the high-fidelity model in some way. An MFO approach optimizes an inexpensive, low-fidelity model while making periodic corrections using the expensive, high-fidelity model.

The low-fidelity surrogate, fL, is cheaper and/or smoother than the high-fidelity model fH:
- fL is approximate, but captures the general trends of fH.
- fL is easy to optimize (use local or global optimization).

Approaches to Multifidelity Optimization

The idea of using multiple models during optimization is not new:
- Sequential Linear/Quadratic Programming: use a sequence of Taylor series approximations to approximate the objective and constraint functions; needs one gradient evaluation at each iterate x0.
- Response surface approximations (aka surrogate models, meta-models): use a sequence of general function approximations (polynomials, splines, neural networks, etc.) to approximate the objective and constraint functions. These typically involve some method of data sampling to generate data, and may or may not use gradients.
- Reduced-order models: use orthogonal decomposition approaches to construct a low-fidelity model.
- Multifidelity modeling (aka variable-complexity modeling): use a low-fidelity physics model rather than a function approximation.

The use of multiple models works most reliably when performed within a provably-convergent optimization algorithm:
- AMMO (Alexandrov, et al.): NLP framework for physics-based models
- DAKOTA (Giunta & Eldred): NLP framework for function approximations or physics-based models (or both combined)
- NOMAD (Audet, Dennis, Abramson, et al.): pattern search framework for function approximations
- Many other approaches, both provably-convergent and heuristic

Issues in Multifidelity Optimization

The MFO approaches listed above all require that the same design parameters exist in the high- and low-fidelity models.
- This is not an issue for function approximations.
- It is an issue when using multiple physics-based models: the low-fidelity model usually has fewer design parameters than the high-fidelity model (e.g., a low-fidelity circuit model with fewer components than its high-fidelity counterpart).

The Space Mapping method of Bandler, Madsen, et al., provides an MFO approach that can accommodate an m-parameter low-fidelity model and an n-parameter high-fidelity model (m <= n).

Circuit figures from: Bowden's Hobby Circuits, http://ourworld.compuserve.com/homepages/Bill_Bowden/homepage.htm

What is Space Mapping?

Space mapping is predicated on the low- and high-fidelity trends having the same general shape, but offset in some way. The space map alters the low-fidelity parameters so that the responses of the low- and high-fidelity models agree.

- Linear transformation and shift (for n = m or n != m):
    xL = W*xH + beta,
  where W is an m-by-n transformation matrix and beta is an m-by-1 shift vector. Our initial research has followed this approach.
- Nonlinear rescaling and shift (only for n = m):
    xL = alpha(xH) + beta
  This is simpler, with fewer unknowns, but limited to n = m.

[Figure: 1-D sketch of fL and fH with the offset delta = xL - xH between matching points.]

Example: fix alpha = beta = 1 and use least squares over the available fH data points to find the remaining map parameter so that fH(xH) is matched by fL(xL).

Our Approach: Use Space Mapping within a Pattern Search Algorithm

Pattern search is a non-gradient local optimization method.

APPS - Asynchronous Parallel Pattern Search:
- APPSPACK toolkit, developed by Tammy Kolda, et al.: http://csmr.ca.sandia.gov/projects/apps.html
- Accepts bound constraints; linear & nonlinear constraints are in the works.

APPS provides a provably-convergent optimization framework in which we employ low-fidelity models:
- An Oracle concept was added to APPS. As APPS runs, it periodically queries an Oracle to retrieve candidate optima.
- APPS evaluates each candidate optimum, xH_trial, with the high-fidelity model.
- APPS accepts the candidate if fH(xH_trial) < fH(xH*). (trial = candidate optimum, * = current APPS best point)

[Figure: APPS search tree with parent, child, pruned, and dislocated nodes.]

Space Mapping (SM) within an Oracle:

- High-fidelity data evaluated by APPS is held in a cache.
- SM uses this data to calibrate the low-fidelity model.
- Candidate optima are generated by optimizing the low-fidelity model.

Note #1: the oracle's calculations are trivial compared to a real-world high-fidelity model evaluation.
Note #2: the oracle does not affect the APPS convergence proof.

The APPS/Space Mapping Scheme
- Outer loop: high-fidelity model optimization via APPSPACK produces multiple (xH, fH) pairs.
- Space mapping, via a nonlinear least squares calculation, calibrates the map from the cached pairs.
- Inner loop: low-fidelity model optimization of the mapped fL returns a candidate xH_trial to the outer loop.

Formulation of Simple Test Problems

High-fidelity model:
  fH(x0, x1) = (alpha0*x0 + lambda0)^2 + (alpha1*x1 + lambda1)^2
Low-fidelity model:
  fL(x0, x1) = x0^2 + x1^2

Mapped space (alpha*, beta* calculated via least squares):
  xL = alpha* * xH + beta*  (componentwise)
  f_mapped(x0, x1) = (alpha0*x0 + beta0*)^2 + (alpha1*x1 + beta1*)^2
Ideally alpha* = alpha and beta* = lambda, etc., so that fH = f_mapped.
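To make the outer-loop/inner-loop interplay concrete, here is a minimal Python sketch of a single oracle pass on the Case 1 version of this test problem (alpha = 1, so only the two shift unknowns are calibrated). The slides describe a nonlinear least-squares calibration; because both models here are quadratics, differencing the matching conditions f_low(x_i + beta) = f_high(x_i) between cached points yields an equivalent linear system, which this sketch solves instead. All names are illustrative and are not part of the APPSPACK API.

```python
import numpy as np

# Case 1 test problem from the slides: alpha = 1, so the space map
# x_L = x_H + beta has only the two shift unknowns (beta0, beta1).
def f_high(x):                   # "truth" model, optimum at (-0.5, 0.83)
    return (x[0] + 0.5) ** 2 + (x[1] - 0.83) ** 2

def f_low(x):                    # cheap surrogate, optimum at the origin
    return x[0] ** 2 + x[1] ** 2

def calibrate_shift(xs, fs):
    """Fit beta so that f_low(x_i + beta) matches f_high(x_i) on the cache.
    Differencing the matching conditions between points eliminates the
    |beta|^2 term, leaving a linear least-squares system in beta."""
    xs = np.asarray(xs, dtype=float)
    fs = np.asarray(fs, dtype=float)
    g = fs - np.sum(xs ** 2, axis=1)     # f_high(x_i) - |x_i|^2
    A = 2.0 * (xs[:-1] - xs[-1])         # rows: 2 * (x_i - x_ref)
    b = g[:-1] - g[-1]
    beta, *_ = np.linalg.lstsq(A, b, rcond=None)
    return beta

# High-fidelity points as APPS might have cached them near the start (-2, -2)
cache_x = [(-2.0, -2.0), (-1.5, -2.0), (-2.0, -1.5)]
cache_f = [f_high(np.array(x)) for x in cache_x]

beta = calibrate_shift(cache_x, cache_f)   # recovers (0.5, -0.83)
x_trial = -beta                            # minimizer of the mapped f_low(x + beta)

# Oracle acceptance test: keep the candidate only if the truth model improves
best = min(cache_f)
if f_high(x_trial) < best:
    best = f_high(x_trial)
print(beta, x_trial, best)
```

On this cache the recovered shift is exact, so the candidate x_trial is the true optimum (-0.5, 0.83); in the full scheme, APPS would verify the candidate with one more fH evaluation and continue the pattern search from there.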

We studied the space mapping sensitivities to various inputs:
- number of high-fidelity response values used for the mapping
- scaling of the mapping parameters (size of offset between the low- and high-fidelity models)
- starting point of the APPS optimization

We compare the optimum function value and the number of high-fidelity runs required to reach the optimum. Hypothesis: APPS + Space Mapping should converge better than APPS alone.

Case 1: Comparison of Design Space of High- and Low-Fidelity Polynomial Models with lambda ~ O(1); alpha = 1

High-fidelity design space:
  fH(x0, x1) = (x0 + 0.5)^2 + (x1 - 0.83)^2, optimum at x = (-0.5, 0.83)
Unmapped low-fidelity design space:
  fL(x0, x1) = x0^2 + x1^2

Case 1 Iteration History: APPS+Space Mapping vs. APPS Only
(lambda ~ O(1); alpha = 1; starting point = (-2.0, -2.0))

Mapped low-fidelity model: fL(x0, x1) = x0^2 + x1^2 -> (x0 + beta0)^2 + (x1 + beta1)^2
Target: x* = (-0.5, 0.83), objective value f(x*) = 0.0

Num. of fH Data Points in Space Map | Final (x0, x1) | Objective Value | Num. of fH Evals. | Speed Up*
2                                   | (-0.50, 0.83)  | 3.12e-10        | 13                | 3.31
4                                   | (-0.50, 0.83)  | 3.12e-10        | 17                | 2.52
6                                   | (-0.49, 0.81)  | 4.27e-4         | 27                | 2.52
8                                   | (-0.48, 0.81)  | 1.03e-3         | 27                | 1.59
APPS only                           | (-0.40, 0.80)  | 1.09e-2         | 43                | -

*Note: we assume that fL evaluations are computationally cheap (free) compared to fH calculations.

Case 1 Iteration History: APPS+Space Mapping vs. APPS Only
- Case 1: starting point = (-2, -2), alpha = 1, lambda is O(1), 2 unknowns in the space map.
- In all cases the first Oracle call finds an improved point.
- All Oracle results beyond this do not find a new best point (APPS dominates at this point).

Case 2: Comparison of Design Space of High and Low

Fidelity Polynomial Models with alpha, lambda ~ O(1)

High-fidelity design space:
  fH(x0, x1) = (0.8*x0 + 0.5)^2 + (0.5*x1 - 0.83)^2, optimum at x = (-0.625, 1.660)
Unmapped low-fidelity design space:
  fL(x0, x1) = x0^2 + x1^2

Case 2 Iteration History: APPS+Space Mapping vs. APPS Only
- Case 2: starting point = (-2, 2); alpha and lambda are O(1); 4 unknowns in the space map.
- APPS+Space Mapping converges faster than APPS, but APPS had the lowest optimal value.
- Note that the optimal function values are only ~10^-3 for Case 2.

[Figure: Case 2 iteration histories; high-fidelity evaluation counts 47, 51, 34, 53.]

Case 2:

Best Results with 8 Data Points in Space Map (2 Calls to Oracle)

[Figure: approximate Oracle call locations in fH space. Color contours = space-mapped fL; grey contours = fH; optimum at x = (-0.625, 1.660). Numbered white boxes show approximately where the Oracle was called; the point in red brackets is where APPS is before the Oracle call; the point in green was found by the Oracle. Annotated points include (-0.8, -1.2), (-0.76, 2.0), (-0.61, 1.25), and (-0.56, 1.6).]
  fH(x0, x1) = (0.8*x0 + 0.5)^2 + (0.5*x1 - 0.83)^2

Case 2 Worst Results: 6 Points in Space Map, 5 Oracle Calls

[Figure: color contours = space-mapped fL; grey contours = fH; optimum at x = (-0.625, 1.660). Annotated APPS/Oracle points for the five calls include (-0.8, -1.2), (-0.87, 0.006), (-0.47, 0.39), (-0.47, 1.19), (-0.53, 0.29), (-0.67, 1.60), (-0.64, 1.36), (-0.67, 1.71), (-0.35, -0.013), and (-0.63, 1.32).]

High Velocity Impact Case: Steel Penetrator Striking a Solid Target

A steel penetrator composed of multiple materials hits a solid target.
[Figure: penetrator with velocity vector striking a solid target.]

High-fidelity model = elastic finite element model:
- ~40 minute calculation time
- noisy response data w.r.t. density variations

Low-fidelity model = rigid finite element model:
- ~10 second calculation time
- smooth response data w.r.t. density variations

A computational cost ratio of 1:240.

The low-fidelity model gives the same general response trends as the high-fidelity model. These factors make these models prime candidates for multifidelity optimization.

Acceleration Comparison with Varying Mesh

[Figure: parameter study of maximum acceleration variation with respect to density (log scale); scaled acceleration from 1.00E+05 to 1.00E+09 vs. scaled density from 0 to 4.5; curves: acceleration-fine (rigid) and acceleration-fine (elastic).]

Note: the rigid-body and elastic-body data trends are similar, but the rigid (low-fidelity) model is in error by 2 orders of magnitude.

Optimization Problem: Minimize Acceleration by Varying Density - APPS+Space Mapping vs. APPS

A series of calculations was done to minimize acceleration by changing the penetrator density:
- a simple 1-parameter optimization run with bound constraints
- various initial density starting points, with similar results for all runs:
  - APPS made some progress, then terminated (as expected)
  - APPS/SM did as well or better w.r.t. minimizing the objective function
- Example: for this case, the APPS/Space Mapping scheme took longer to converge, but a better optimum was found.
- APPS/SM provides a limited global search capability to get past the local noise.

Current Work: Space Mapping with n != m

Recall the space mapping approach with a linear transformation and a shift vector (for n = m or n != m):
  xL = W*xH + beta,
where W is an m-by-n transformation matrix and beta is an m-by-1 shift vector: m*n unknowns in W, m unknowns in beta.

Example problem:
- Low-fidelity function: 2-dimensional Rosenbrock function
- High-fidelity function: multidimensional Rosenbrock function,
    f(x) = sum_{k=1}^{n-1} [ 100*(x_{k+1} - x_k^2)^2 + (1 - x_k)^2 ]
- Minimum is f(x) = 0 at x = {1, 1, 1, ..., 1}
- 2-d variant: 4 unknowns in W; 3-d variant: 6 unknowns in W; 4-d variant: 8 unknowns in W

Preliminary Results: Space Mapping for n != m
- m = 2, n = 2, beta not used: W has 4 unknowns
- m = 2, n = 3, beta not used: W has 6 unknowns
- m = 2, n = 4, beta not used: W has 8 unknowns
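As a concrete companion to the parameter counts above, here is a small numpy sketch of the n != m setup for the 3-d variant. The particular W shown is a hypothetical projection map chosen for illustration only, not the least-squares-fitted map from the study; residuals() is the mismatch that a nonlinear least-squares calibration would drive toward zero.

```python
import numpy as np

def rosen(x):
    """n-dimensional Rosenbrock function from the slides:
    f(x) = sum_{k=1}^{n-1} 100*(x_{k+1} - x_k^2)^2 + (1 - x_k)^2."""
    x = np.asarray(x, dtype=float)
    return np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (1.0 - x[:-1]) ** 2)

m, n = 2, 3                     # 2-d low-fidelity model, 3-d high-fidelity model

def mapped_low(x_high, W, beta):
    """Low-fidelity model evaluated through the space map x_L = W x_H + beta."""
    return rosen(W @ x_high + beta)

def residuals(W, beta, samples):
    """Mismatch that a nonlinear least-squares calibration drives to zero."""
    return np.array([mapped_low(x, W, beta) - rosen(x) for x in samples])

# Unknown count quoted on the slides: m*n entries of W plus m entries of beta
n_unknowns = m * n + m          # 6 + 2 = 8 for the 3-d variant with beta used

# A hypothetical (illustrative, not fitted) map: project onto the first two
# coordinates. It sends the high-fidelity optimum {1,1,1} to the low-fidelity
# optimum {1,1}, so the mapped model is exact there.
W = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
beta = np.zeros(m)
x_star = np.ones(n)
print(n_unknowns, mapped_low(x_star, W, beta), rosen(x_star))
```

Packing the entries of W (and beta, when used) into a single parameter vector and minimizing the sum of squared residuals over cached high-fidelity points gives the calibration problem solved in the study.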

- m = 2, n = 3, beta is used: W has 6 unknowns, 8 unknowns total

Summary and Future Work

Summary:
- As expected, the APPS+Space Mapping approach provides a useful multifidelity optimization tool.
- It inherits the provably-convergent APPS framework.
- The space map provides an Oracle to APPS for heuristic local/global search.
- APPS+Space Mapping performs equivalent to, or better than, APPS alone:
  - needs fewer high-fidelity function evaluations to reach convergence, or
  - finds a better optimum, or
  - sometimes both.

Future technical work:
- Continue space mapping studies for n != m.
- Implement APPS+Space Mapping in Sandia's DAKOTA software framework (APPS is already integrated with DAKOTA).

Future applications - apply our multifidelity optimization schemes to more engineering design problems:
- Additional penetrator design and analysis studies
- Groundwater contamination and remediation problems, including well field design & hydraulic capture
- Circuit system design

References and Contact Information

APPSPACK: Software Website

APPSPACK 4.0: http://software.sandia.gov/appspack/version4.0
This website includes the software itself (open-source) and instructions for downloading, installing, and using it. It also has a complete list of references to papers on the software development and convergence analysis.

Pattern Search Optimization: Kolda, Lewis, and Torczon, "Optimization by Direct Search: New Perspectives on Some Classical and Modern Methods," SIAM Review, 45(3):385-482, 2003.

Space Mapping: Bakr, Bandler, Madsen, and Sondergaard, "An Introduction to the Space Mapping Technique," Optimization and Engineering, 2:369-384, 2001.

DAKOTA: Software Website, http://endo.sandia.gov/DAKOTA

Contact Info: Anthony Giunta, [email protected], phone: 1-505-844-4280