SEER-Medicare Linked Data and Deep Learning: Help Advance Evidence-based Care to Breast Cancer Patients

N.R. Adam and R. Wieder
Phalcon, LLC, New York, United States

Poster stand number: W124

Keywords: Artificial intelligence, deep learning, cancer treatment, adverse events, SEER-Medicare

Fewer than 5% of cancer patients participate in randomized clinical trials (RCTs). Trial participants are younger and healthier than the average patient, so trial outcomes do not generalize well to the broader patient population. Moreover, patients with stage IV disease often receive treatment far beyond the point of efficacy and suffer unnecessary toxicity. Health data analytics is a viable approach to complement RCTs and advance evidence-based cancer care. With advances in artificial intelligence, predicting disease trajectory and patient outcomes from electronic health records (EHRs) has gained significant interest. Deep learning (DL) builds successive representations of the raw data through a non-linear transformation in each layer, which facilitates the learning of hidden patterns in the data. DL approaches can analyze the ever-growing, large-scale EHR data, which are characterized by high dimensionality, heterogeneity, temporal dependency, sparsity, and irregularity. The Surveillance, Epidemiology, and End Results (SEER)–Medicare linked database (S-M) offers rich, well-structured information that reflects disease progression in each patient, making it a valuable resource for health data analytics. We demonstrate the steps to preprocess, merge, customize, clean, and harmonize the S-M data. We show how DL can generate a new level of evidence, level 1.1, derived from real-world data and provide the first survival and adverse-event predictions in several scenarios.
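
As a rough illustration of what merging, cleaning, and harmonizing linked registry and claims records can involve, the sketch below joins a toy registry extract to a toy claims extract and produces a per-patient event timeline. It is a minimal sketch under stated assumptions, not the authors' pipeline: the field names (`patient_id`, `dx_date`, `service_date`, `code`, `stage`), the two date formats, the function names, and the rule that drops claims dated before diagnosis are all hypothetical and do not reflect the actual S-M file layout.

```python
from datetime import datetime

# Hypothetical registry extract: one row per patient (field names are illustrative only).
registry = [
    {"patient_id": "P1", "dx_date": "2010-03-15", "stage": "IV"},
    {"patient_id": "P2", "dx_date": "2011-07-01", "stage": "ii"},
]

# Hypothetical claims extract: many rows per patient, with a different date format.
claims = [
    {"patient_id": "P1", "service_date": "20100401", "code": "A"},
    {"patient_id": "P1", "service_date": "20100401", "code": "A"},  # exact duplicate
    {"patient_id": "P1", "service_date": "20100101", "code": "B"},  # dated before diagnosis
    {"patient_id": "P2", "service_date": "20110815", "code": "C"},
    {"patient_id": "P3", "service_date": "20120101", "code": "D"},  # no registry record
]


def parse_date(text):
    """Harmonize the two assumed date formats into a single date object."""
    for fmt in ("%Y-%m-%d", "%Y%m%d"):
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")


def build_timelines(registry_rows, claim_rows):
    """Merge claims onto registry patients and return cleaned event timelines."""
    patients = {}
    for row in registry_rows:
        patients[row["patient_id"]] = {
            "dx_date": parse_date(row["dx_date"]),
            "stage": row["stage"].upper(),  # harmonize inconsistent casing
            "events": [],
        }

    seen = set()
    for row in claim_rows:
        pid = row["patient_id"]
        if pid not in patients:
            continue  # claim cannot be linked to a registry record
        key = (pid, row["service_date"], row["code"])
        if key in seen:
            continue  # drop exact duplicate claims
        seen.add(key)
        days = (parse_date(row["service_date"]) - patients[pid]["dx_date"]).days
        if days < 0:
            continue  # assumption: keep only events on or after diagnosis
        patients[pid]["events"].append((days, row["code"]))

    for record in patients.values():
        record["events"].sort()  # chronological order for sequence models
    return patients


timelines = build_timelines(registry, claims)
```

Expressing each event as days since diagnosis is one common way to turn irregularly spaced records into a sequence that a DL model can consume; other encodings are equally plausible.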