Ahmer Nadeem Khan

Exotic Option Pricing with CUDA-Accelerated Monte Carlo for HPC Applications

A high-performance computing project on GPU-based Monte Carlo simulation for exotic options. The project emphasized pseudorandom number generation, pricing accuracy, and large-scale throughput for Asian, barrier, and lookback-style options.

$10^4$ — GPU speed-up factor on large path counts
5M paths — benchmark scale used for CPU vs GPU comparisons
96–98% — agreement with CPU baselines in reproduced tests

Abstract

Many Monte Carlo methods are embarrassingly parallel, since their sample paths are independent, and computationally dense, so they benefit greatly from GPU architectures. A central subproblem is the generation of pseudo-random numbers, in particular samples from the normal distribution, which are a key component of financial simulations.

Several non-standard methods exploit fully programmable GPUs for efficiency, including the Ziggurat method, the Wallace method, and related hybrid generators. Standard methods such as Box–Muller remain robust in the GPU setting, but speed–accuracy trade-offs must still be considered.

This project investigates the transferability of Monte Carlo methods from sequential to parallel execution, and then studies the pricing of exotic options such as Asian, lookback, and barrier contracts as a representative application. The emphasis is on the CUDA platform, with discussion of implementation details, experiments, and numerical results within that framework.

Main contributions

  • Built Monte Carlo pricing routines in C++/CUDA for Asian, lookback, and barrier-style options.
  • Benchmarked GPU (NVIDIA GeForce RTX 4090) and CPU (AMD Ryzen 7 7800X3D, 8 cores) implementations in large-path experiments, comparing against reference results from the NVIDIA literature.
  • Studied the Wallace method in the context of efficiency, variance behavior, and pathwise pricing error.
  • Examined how GPU memory-access patterns and implementation choices affect end-to-end performance.

Conclusions

Monte Carlo methods are well suited to GPUs in high-performance financial applications. The Wallace method and hybrid generators provide reasonable alternatives to traditional PRNGs on GPU-based systems, though contingencies related to architecture and algorithm design must be considered, and speed–accuracy trade-offs remain crucial. The advantages of GPUs were demonstrated through numerical experiments across various methods, problems, and hardware on NVIDIA's CUDA platform, and results from the literature were reproduced and extended. The efficiency and reliability of the relevant methods in CUDA libraries were also discussed and evaluated.

Bibliography

  • Nguyet Nguyen, Linlin Xu, and Giray Ökten. A quasi-Monte Carlo implementation of the Ziggurat method. Monte Carlo Methods and Applications, 24(2):93–99, 2018.
  • NVIDIA Corporation. CUDA C++ Programming Guide, Version 12.9. NVIDIA, 2024. Accessed: 2025-05-01, Introduction section.
  • John D. Owens, Mike Houston, David Luebke, Simon Green, John E. Stone, and James C. Phillips. Chapter 37: Efficient random number generation and application, 2007. Accessed: 2025-03-21.
  • Christoph Riesinger, Tobias Neckel, and Florian Rupp. Non-standard pseudo random number generators revisited for GPUs. Future Generation Computer Systems, 82:482–492, 2018.
  • Andrew Sheppard. CUDA-accelerated Monte Carlo for HPC: A practitioner's guide. Presentation at the SC11 conference, Fountainhead, November 2011.
  • David B. Thomas. Monte Carlo implementations on GPUs. Presentation, Imperial College London, 2009. Accessed: 2025-03-24.
  • Linlin Xu and Giray Ökten. High-performance financial simulation using randomized quasi-Monte Carlo methods. Quantitative Finance, 15(8):1425–1436, 2015.