A scientific computing GUI agent for parallel Monte Carlo in a distributed environment

Mike Hongbo Zhou


Monte Carlo applications, such as Brownian Langevin simulations of biomedical molecules, are often computationally intensive, because reducing the stochastic error requires more samples and the error decreases only as the inverse square root of the number of samples. For the last 15 years, there has been a continuing trend away from mainframe computers and towards distributed PCs. In this new environment (parallel PC pools), the network throughput bottleneck, PC instability, heterogeneity, and security issues must be addressed anew. We developed a web-based graphical user interface (WB-GUI) agent for distributed Monte Carlo computations that significantly reduces the run time of our Brownian Langevin simulations. The GUI agent is built upon the Condor cycle scavenging system and consists of the Scalable Parallel Random Number Generator library (SPRNG), the remote compiler, and the cycle server. The WB-GUI reduces the complexity of using the underlying distributed computer systems: users need only upload their code and download results via the WB-GUI. Parallel random number support (via SPRNG), task preparation for heterogeneous platforms (via the remote compiler), and job migration from a busy or dead computer to an idle one (via Condor) are all integrated seamlessly and are transparent to users. At the same time, the GUI agent acts as a gatekeeper for the PC pool: no user account is needed on pool computers; users are assigned web accounts instead. Jobs run only when a machine is idle, and they never touch the file systems of the machines that execute them. With the aid of this tool, we studied the sensitivity of the results of a Brownian Langevin application to a suite of random number generators. We were able to run the application in parallel on available hardware in a reasonable time.
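
The inverse-square-root error scaling and the need for a separate random stream per task can be illustrated with a minimal Python sketch. It is not the system described here: the function names (`run_task`, `combine`, `spread_of_estimates`) are hypothetical, the integrand is a toy one, and Python's `random.Random` seeded per task is only a stand-in for SPRNG that does not carry SPRNG's stream-independence guarantees.

```python
import math
import random
import statistics


def run_task(seed, n_samples):
    """One independent task: estimate E[x^2] for x ~ N(0, 1); the exact value is 1."""
    # Assumption: a per-task seeded generator stands in for an SPRNG stream.
    rng = random.Random(seed)
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n_samples)) / n_samples


def combine(estimates):
    """Pool equal-sized task results into a mean and its standard error."""
    mean = statistics.fmean(estimates)
    stderr = statistics.stdev(estimates) / math.sqrt(len(estimates))
    return mean, stderr


def spread_of_estimates(n_samples, n_tasks=200, base_seed=0):
    """Standard deviation of per-task estimates at a given sample count."""
    estimates = [run_task(base_seed + k, n_samples) for k in range(n_tasks)]
    return statistics.stdev(estimates)


if __name__ == "__main__":
    # 16x more samples per task should shrink the spread by roughly sqrt(16) = 4.
    small = spread_of_estimates(100, base_seed=0)
    large = spread_of_estimates(1600, base_seed=10_000)
    print(f"spread at N=100:  {small:.4f}")
    print(f"spread at N=1600: {large:.4f}")
    print(f"ratio: {small / large:.2f}")

    mean, stderr = combine([run_task(20_000 + k, 1600) for k in range(50)])
    print(f"pooled estimate: {mean:.4f} +/- {stderr:.4f}")
```

Because the tasks share nothing except the final pooling step, they can be sent to different machines in any order, which is the property that makes this kind of workload suitable for a cycle-scavenging pool.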