Prediction Models, Nomograms, and
Staging Validation with the
Surveillance, Epidemiology, and End
Using the Surveillance, Epidemiology, and End Results
(SEER) Program to Create Prediction Models
Many clinical researchers have devised prognostic models using the
demographic and clinicopathologic variables included in SEER.
The large number of patients available from the SEER database
lends robustness and external validity to such models. Likewise, a
relatively high level of internal validity is achievable by controlling for
the demographic and clinicopathologic factors.
Common end points used in predictive models derived from SEER are
overall survival (OS) and cancer-specific survival (CSS). OS is defined as
the time from diagnosis until the patient’s death from any cause. Patients with
no recorded death are typically censored at last follow-up. CSS is
typically derived from SEER by counting any cancer-related death as an
event and censoring patients at last follow-up or at the time of a
non-cancer-related death. The cause of death in SEER is derived from
death certificates. The site listed as the cause of a cancer-related death is
often ignored in computing CSS because death certificates often list a
metastatic site as the cause of death.
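
The end point definitions above can be made concrete with a minimal sketch of how each patient record might be turned into a (time, event) pair for survival analysis. The record layout and field names (`months_followed`, `died`, `cancer_related`) are hypothetical and chosen for illustration only; they are not actual SEER variable names. The sketch also assumes the conventional handling in which a death from a cause other than cancer is censored at the time of death when computing CSS.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PatientRecord:
    # Hypothetical fields for illustration; not actual SEER variable names.
    months_followed: float                 # diagnosis to death or last follow-up
    died: bool                             # True if a death is recorded
    cancer_related: Optional[bool] = None  # meaningful only when died is True


def os_endpoint(p: PatientRecord) -> Tuple[float, int]:
    """Overall survival: any recorded death is an event (1); else censored (0)."""
    return p.months_followed, int(p.died)


def css_endpoint(p: PatientRecord) -> Tuple[float, int]:
    """Cancer-specific survival: only cancer-related deaths are events.

    Any cancer-related death counts, regardless of the site listed on the
    death certificate. Deaths from other causes are treated as censored
    at the time of death (an assumption of this sketch).
    """
    event = int(p.died and bool(p.cancer_related))
    return p.months_followed, event


# Three made-up patients: alive at last follow-up, cancer-related death,
# and death from another cause.
patients = [
    PatientRecord(60.0, False),
    PatientRecord(14.0, True, True),
    PatientRecord(30.0, True, False),
]
os_pairs = [os_endpoint(p) for p in patients]
css_pairs = [css_endpoint(p) for p in patients]
```

In this sketch the third patient contributes an event to OS but is censored at 30 months for CSS, which is the only place the two end points differ.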
A typical approach for using SEER to develop a predictive model can
be illustrated in an analysis of which factors influence CSS in patients
with primary penile squamous carcinoma as reported by Zini et al.
Variables such as age, race, year of diagnosis, type of surgery (excisional
biopsy vs partial penectomy vs radical penectomy), stage (localized vs
regional vs metastatic), and grade were examined in univariate and
multivariate Cox regression analyses. Stage and grade were independently
predictive of survival on univariate analysis. Multivariate analysis
was then conducted with stepwise variable removal, which resulted in a
Curr Probl Cancer 2012;36:200-207.
0147-0272/$36.00 + 0
200 Curr Probl Cancer, July/August 2012