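Title: Statistical Modeling Methods for Multi-Source Data Fusion and Their Applications
Speaker: Kuangnan Fang (方匡南)
Time: May 19, 2023, 14:00-14:50
Venue: Conference Room 644, Comprehensive Building (综合楼)
Speaker bio: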
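Kuangnan Fang (方匡南) is a professor and doctoral supervisor in the Department of Statistics and Data Science, School of Economics, Xiamen University, and a former postdoctoral fellow at Yale University. He is deputy chair of the department, director of Xiamen University's Research Center for Credit Big Data and Intelligent Risk Control, an elected member of the International Statistical Institute, and chief expert of a Major Project of the National Social Science Fund of China. His research focuses on statistical machine learning, statistics for economics and management, and financial technology. He has been selected as a national-level high-level young top-notch talent, a Category A high-level talent of Fujian Province, and a young top-notch talent under Fujian Province's "Special Support Double Hundred Plan" (特支双百计划), among other honors. He also serves as vice president of the National Industrial Statistics Teaching and Research Association (全国工业统计教学研究会), executive council member of the China Commercial Statistical Society (中国商业统计学会), and editorial board member of the journals 《统计研究》 (Statistical Research) and 《数理统计与管理》 (Journal of Applied Statistics and Management). He has published more than 100 papers in leading domestic and international journals and has authored six academic monographs and textbooks. He has received more than ten research awards at the provincial and ministerial level or above, and several of his research findings have received written comments from officials at the provincial and ministerial level or above. He has led more than ten government-funded projects, including one Major Project of the National Social Science Fund, four projects of the National Natural Science Foundation of China, and projects funded by the Ministry of Education (humanities and social sciences) and the National Bureau of Statistics (major project), and has undertaken more than 30 industry-sponsored projects for enterprises such as Huawei and 华星光电 (China Star Optoelectronics Technology).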
Abstract:
In diverse fields ranging from finance to omics, it is increasingly common for data to be distributed across multiple individual sources (referred to as "clients" in some studies). Integrating raw data, although powerful, is often not feasible, for example, when privacy protection is a concern. Distributed learning techniques have been developed to integrate summary statistics as opposed to raw data. Many existing distributed learning studies make the stringent assumption that all clients share the same model. To accommodate data heterogeneity, some federated learning methods allow for client-specific models. In this article, we consider the scenario in which clients form clusters: those in the same cluster have the same model, and different clusters have different models. Accounting for this clustering structure can lead to a better understanding of the "interconnections" among clients and reduce the number of parameters. To this end, we develop a novel penalization approach. Specifically, group penalization is imposed for regularized estimation and selection of important variables, and fusion penalization is imposed to automatically cluster clients. An effective ADMM algorithm is developed, and the estimation, selection, and clustering consistency properties are established under mild conditions. Simulation and data analysis further demonstrate the practical utility and superiority of the proposed approach.
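For readers unfamiliar with this class of methods, the following is a generic sketch of what a doubly penalized objective of the kind described above can look like. It is an illustration under assumed notation, not the speaker's exact formulation: the number of clients $M$, the number of covariates $p$, the client-level losses $L_m$, the penalty functions $\rho_1$ and $\rho_2$, and the tuning parameters $\lambda_1$ and $\lambda_2$ are all assumptions introduced here.

```latex
% Illustrative only: all symbols are assumed notation, not taken from the talk.
% \beta_m \in \mathbb{R}^p       : coefficient vector of client m, m = 1, ..., M
% \beta_{(j)} = (\beta_{1j}, ..., \beta_{Mj})^\top : coefficients of variable j across clients
\min_{\beta_1, \dots, \beta_M}\;
  \sum_{m=1}^{M} L_m(\beta_m)
  \;+\; \sum_{j=1}^{p} \rho_1\!\left( \lVert \beta_{(j)} \rVert_2 ;\, \lambda_1 \right)
  \;+\; \sum_{1 \le m < m' \le M} \rho_2\!\left( \lVert \beta_m - \beta_{m'} \rVert_2 ;\, \lambda_2 \right)
```

In this sketch, the second term is a group penalty that can shrink a variable's coefficients to zero in all clients at once (variable selection), and the third term is a fusion penalty that can shrink the difference between two clients' coefficient vectors to exactly zero, so that clients with identical estimates are read off as one cluster. ADMM schemes for objectives of this shape commonly introduce auxiliary variables for the pairwise differences $\beta_m - \beta_{m'}$; whether the talk's algorithm does so is not stated in the abstract.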