SAHARA: A Tool to Utilize Self-Organizing Maps to Generate Compressed Climate Datasets for Hazard Area Analysis

Luca Delle Monache
National Center for Atmospheric Research, Colorado, United States

Keywords: Toxic Materials Hazard Area Analysis, Self-Organizing Maps, Chemical and Biological Defense, Nuclear Defense, Artificial Intelligence

We have developed a software tool (SAHARA) that reduces large climate datasets to more manageable sizes while retaining statistically similar results when the reduced data are used to produce ensembles of potential outcomes. We do this by employing the Self-Organizing Map (SOM) algorithm to analyze large-scale patterns of meteorological fields over spatial domains of interest. The reduced, SOM-derived dataset is then used as input to Second-order Closure Integrated Puff (SCIPUFF) model runs to generate the most likely dispersion patterns of toxic material at a location of interest. SAHARA generates area contours of the probability of exceeding a user-specified hazard value (such as a radiation exposure or chemical concentration). The software is designed to make efficient use of multicore CPUs and, when available, multi-node computing clusters, including cloud-based platforms. Using a 30-node SOM, we have been able to reduce 30 years of Climate Forecast System Reanalysis (CFSR) data for a single month to 150 “typical days”. We find that our SOM-derived climate subset produces statistics that overlap 85-90% with those of the full dataset while using 15% of the computational time required to run the transport and dispersion (T&D) model for the full period.
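To illustrate the reduction step described above, the following is a minimal NumPy sketch of training a 30-node SOM on daily meteorological fields and selecting "typical days" from it. It is not the SAHARA implementation. The function names, the 5 x 6 grid layout, the training schedule, the synthetic input, and in particular the rule of keeping the 5 days closest to each node (which happens to give 30 x 5 = 150 days) are all assumptions made for illustration; the abstract does not state how the 150 days are chosen.

```python
import numpy as np

def train_som(data, n_rows=5, n_cols=6, n_iter=2000, lr0=0.5, sigma0=2.0, seed=0):
    """Train a rectangular SOM on data of shape (n_days, n_features)."""
    rng = np.random.default_rng(seed)
    n_nodes = n_rows * n_cols
    # Grid coordinates of each node, used by the neighborhood function.
    grid = np.array([(r, c) for r in range(n_rows) for c in range(n_cols)], dtype=float)
    # Initialize node weights from randomly chosen days.
    weights = data[rng.choice(len(data), size=n_nodes, replace=False)].astype(float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                   # linearly decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)   # shrinking neighborhood radius
        x = data[rng.integers(len(data))]         # one randomly drawn day
        # Best-matching unit: the node whose weights are closest to x.
        bmu = np.argmin(np.sum((weights - x) ** 2, axis=1))
        # Gaussian neighborhood around the BMU on the node grid.
        d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
        h = np.exp(-d2 / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights

def assign_nodes(data, weights):
    """Return, for each day, the index of its best-matching node."""
    d2 = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def pick_typical_days(data, weights, labels, per_node=5):
    """Assumed selection rule: keep up to per_node days closest to each node."""
    chosen = []
    for k in range(len(weights)):
        members = np.flatnonzero(labels == k)
        if members.size == 0:
            continue  # a node may attract no days
        dist = np.linalg.norm(data[members] - weights[k], axis=1)
        chosen.extend(members[np.argsort(dist)[:per_node]].tolist())
    return sorted(chosen)

# Synthetic stand-in for 30 years x 30 days of flattened meteorological fields.
rng = np.random.default_rng(42)
days = rng.normal(size=(900, 20))

som = train_som(days)
labels = assign_nodes(days, som)
typical = pick_typical_days(days, som, labels)
# Relative frequency of each node, usable as a weight for its typical days.
node_freq = np.bincount(labels, minlength=len(som)) / len(labels)
```

In a workflow like the one described, the selected days would drive the dispersion model runs, and the node frequencies could weight each run when estimating the probability of exceeding a hazard threshold. That weighting scheme is likewise an assumption of this sketch.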