中国全科医学

病例队列设计下乳腺癌患者雌二醇水平及其生存数据的联合建模研究

吴梦娟1,张涛1,高春洁1,赵婷2,王蕾3*   

  1. 830017 新疆维吾尔自治区乌鲁木齐市,新疆医科大学公共卫生学院流行病与卫生统计学教研室;2. 830011 新疆维吾尔自治区乌鲁木齐市,新疆医科大学附属肿瘤医院病案管理科;3. 830017 新疆维吾尔自治区乌鲁木齐市,新疆医科大学医学工程技术学院数学教研室
  • 收稿日期:2023-12-21 修回日期:2024-03-04 接受日期:2024-03-20
  • 通讯作者: 王蕾,教授;E-mail:wlei81@126.com
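  • Funding:
    National Natural Science Foundation of China (12061079); "Tianshan Talent" Young Scientific and Technological Innovation Talent Training Program (2022TSYCCX0108); Natural Science Foundation of Xinjiang (2022D01C287)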

Joint Modeling of Estradiol Levels and Survival Data of Breast Cancer Patients in the Case-Cohort Design

WU Mengjuan1, ZHANG Tao1, GAO Chunjie1, ZHAO Ting2, WANG Lei3*

  1. College of Public Health, Xinjiang Medical University, Urumqi 830017, China; 2. Department of Medical Record Management, the Affiliated Cancer Hospital of Xinjiang Medical University, Urumqi 830011, China; 3. Department of Medical Engineering and Technology, Xinjiang Medical University, Urumqi 830017, China
  • Received: 2023-12-21 Revised: 2024-03-04 Accepted: 2024-03-20
  • Corresponding author: WANG Lei, Professor; E-mail: wlei81@126.com

摘要: 背景 乳腺癌是一种性激素受体依赖的恶性肿瘤,雌二醇(E2)的动态变化在乳腺癌发展过程中起着非常重要的作用;经典病例队列设计完全忽略未选入样本的信息,容易产生估计偏倚。目的 探究乳腺癌患者E2水平动态变化对其生存预后的影响,评估改良病例队列设计的优良性。方法 选取2015—2019年于新疆医科大学附属肿瘤医院经病理学检查确诊为乳腺癌的8226例患者进行随访,以患者确诊时间作为随访时间起点、患者因乳腺癌死亡为结局事件,随访截止日期为2021-12-31。收集患者的人口学特征、免疫组化指标、临床病理特征以及生存状态等,并对患者的血清E2水平进行纵向测量。基于经典病例队列设计,通过纳入病例队列样本外患者的生存数据改良病例队列设计。在经典及改良病例队列设计下采用线性混合效应模型和Cox比例风险模型分别拟合乳腺癌患者的纵向数据(纵向子模型)和生存数据(生存子模型),并建立纵向与时间-事件数据的联合模型;进一步采用马尔可夫链蒙特卡罗算法对联合模型参数进行估计;此外,通过受试者工作特征曲线下面积(AUC)以及预测误差(PE)比较经典及改良病例队列设计下联合模型的区分度与校准度。结果 基于纳入与排除标准,本研究全队列中共纳入895例乳腺癌患者作为研究对象,其中53例患者因乳腺癌死亡。患者中位随访时间约为28个月。从全队列中抽取1/4的患者作为随机子队列,与随机子队列外在随访期间死亡的患者合并作为经典病例队列设计的样本,其中,包含236例患者的生存数据、1062人次E2水平的测量值。此外,在经典病例队列设计的基础上,纳入经典病例队列样本之外在随访期间存活的乳腺癌患者(G4)的生存数据,作为改良病例队列设计的样本(共包含895例患者的生存数据、236例患者1062人次的E2水平测量值,其中认为存在2958人次E2水平测量的纵向缺失值)。经典和改良病例队列设计下的联合模型结果均显示E2水平动态变化是乳腺癌患者预后的影响因素,且lg(E2)纵向每增加1个单位,患者的死亡风险将分别增加23%(HR=1.23,R^=1.015)和8%(HR=1.08,R^=1.020)。此外,改良病例队列设计下的联合模型展现出更好的区分度与校准度(AUC=0.706~0.962,PE=0.0012~0.0108)。结论 乳腺癌患者E2水平纵向升高可能会导致患者生存概率降低。病例队列设计下联合模型能够对纵向与生存数据同时进行分析,且改良病例队列设计优于经典病例队列设计。

关键词: 乳腺癌, 雌二醇, 病例队列设计, 联合模型, 生存数据

Abstract: Background Breast cancer is a hormone receptor-dependent malignant tumor, and dynamic changes in estradiol (E2) play a critical role in its development. The classical case-cohort design completely ignores information from subjects not selected into the sample, which can easily lead to biased estimates. Objective To explore the effect of dynamic changes in E2 levels on survival prognosis in breast cancer patients, and to evaluate the performance of an improved case-cohort design. Methods A total of 8,226 patients diagnosed with breast cancer by pathological examination at the Affiliated Cancer Hospital of Xinjiang Medical University from 2015 to 2019 were followed up. The date of diagnosis was taken as the start of follow-up, death from breast cancer was the outcome event, and follow-up ended on December 31, 2021. Demographic characteristics, immunohistochemical indicators, clinicopathological characteristics, and survival status were collected, and serum E2 levels were measured longitudinally. The improved case-cohort design was obtained from the classical case-cohort design by incorporating the survival data of patients outside the case-cohort sample. Under both designs, a linear mixed-effects model and a Cox proportional hazards model were used to fit the longitudinal data (longitudinal submodel) and the survival data (survival submodel), respectively, and a joint model for longitudinal and time-to-event data was established. The parameters of the joint models were estimated with a Markov chain Monte Carlo algorithm. The area under the receiver operating characteristic curve (AUC) and the prediction error (PE) were used to compare the discrimination and calibration of the joint models under the two designs. Results Based on the inclusion and exclusion criteria, 895 breast cancer patients were included in the full cohort, of whom 53 died of breast cancer. The median follow-up time was approximately 28 months. One quarter of the full cohort was drawn as a random subcohort and combined with the patients outside the subcohort who died during follow-up to form the classical case-cohort sample, which contained survival data from 236 patients and 1,062 measurements of E2 levels. The improved case-cohort sample additionally included the survival data of patients outside the classical case-cohort sample who survived the follow-up period (G4). It contained survival data from 895 patients and 1,062 measurements of E2 levels from 236 patients, with 2,958 E2 measurements treated as longitudinally missing. Under both the classical and the improved case-cohort designs, the joint model identified the dynamic change in E2 levels as a prognostic factor for breast cancer patients: for each one-unit longitudinal increase in lg(E2), the risk of death increased by 23% (HR=1.23, R̂=1.015) and 8% (HR=1.08, R̂=1.020), respectively. The joint model under the improved case-cohort design showed better discrimination and calibration (AUC=0.706-0.962, PE=0.0012-0.0108). Conclusion A longitudinal increase in E2 levels may reduce the survival probability of breast cancer patients. The joint model under the case-cohort design can analyze longitudinal and survival data simultaneously, and the improved case-cohort design outperforms the classical case-cohort design.
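The sampling scheme described in the abstract can be sketched as follows. This is a minimal illustration on simulated data, not the authors' code: the function name `case_cohort_sample`, the seeds, and the randomly assigned death indicators are all assumptions made for the example. Only the cohort size (895), the number of deaths (53), and the sampling fraction (1/4) come from the abstract.

```python
import random


def case_cohort_sample(ids, died, fraction=0.25, seed=2024):
    """Split a full cohort into the groups used by the two designs.

    Hypothetical helper for illustration only.
    Returns (subcohort, classical_sample, survivors_outside).
    """
    rng = random.Random(seed)
    n_sub = round(len(ids) * fraction)
    # Random subcohort, drawn without regard to outcome.
    subcohort = set(rng.sample(ids, n_sub))
    # Cases (deaths) that fall outside the random subcohort.
    cases_outside = {i for i in ids if died[i] and i not in subcohort}
    # Classical case-cohort sample: subcohort plus outside cases.
    classical = subcohort | cases_outside
    # Patients outside the classical sample who survived follow-up
    # (the group the abstract labels G4).
    survivors_outside = set(ids) - classical
    return subcohort, classical, survivors_outside


# Simulated cohort (NOT the study data): 895 patients, 53 deaths.
ids = list(range(895))
death_ids = set(random.Random(1).sample(ids, 53))
died = {i: i in death_ids for i in ids}

subcohort, classical, survivors_outside = case_cohort_sample(ids, died)

# Improved design: survival data for everyone, E2 trajectories only for
# the classical sample (the rest are treated as longitudinally missing).
improved_survival = classical | survivors_outside
improved_longitudinal = classical

print("subcohort size:", len(subcohort))
print("classical sample size:", len(classical))
print("improved design, survival records:", len(improved_survival))
print("improved design, patients with E2 data:", len(improved_longitudinal))
```

In this sketch the improved design differs from the classical one only in which survival records are passed to the survival submodel. The longitudinal submodel sees the same patients under both designs.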

Key words: Breast cancer, Estradiol, Case-cohort design, Joint model, Survival data
