Heterogeneity-aware and communication-efficient distributed statistical inference

Duan, Rui; Ning, Yang; Chen, Yong

doi:10.1093/biomet/asab007

Summary

In multicentre research, individual-level data are often protected against sharing across sites. To overcome the barrier of data sharing, many distributed algorithms, which only require sharing aggregated information, have been developed. The existing distributed algorithms usually assume the data are homogeneously distributed across sites. This assumption ignores the important fact that the data collected at different sites may come from various subpopulations and environments, which can lead to heterogeneity in the distribution of the data. Ignoring the heterogeneity may lead to erroneous statistical inference. We propose distributed algorithms which account for the heterogeneous distributions by allowing site-specific nuisance parameters. The proposed methods extend the surrogate likelihood approach (Wang et al. 2017; Jordan et al. 2018) to the heterogeneous setting by applying a novel density ratio tilting method to the efficient score function. The proposed algorithms maintain the same communication cost as existing communication-efficient algorithms. We establish a nonasymptotic risk bound for the proposed distributed estimator and its limiting distribution in the two-index asymptotic setting, which allows both sample size per site and the number of sites to go to infinity. In addition, we show that the asymptotic variance of the estimator attains the Cramér–Rao lower bound when the number of sites is smaller in rate than the sample size at each site. Finally, we use simulation studies and a real data application to demonstrate the validity and feasibility of the proposed methods.
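
For orientation, the homogeneous surrogate likelihood of Jordan et al. (2018), which the summary says is being extended, can be sketched as below. The notation is assumed for illustration and is not taken from this summary: K sites, a local average negative loglikelihood L_j at site j, and an initial estimate \bar{\theta}. The heterogeneous version with site-specific nuisance parameters and density ratio tilting is not reproduced here.

```latex
% Homogeneous surrogate loss (Jordan et al. 2018); notation assumed for illustration.
% K sites; L_j is the average negative loglikelihood computed from the data at site j.
\[
  L_N(\theta) = \frac{1}{K} \sum_{j=1}^{K} L_j(\theta),
  \qquad
  \tilde{L}(\theta) = L_1(\theta)
    - \bigl\langle \nabla L_1(\bar{\theta}) - \nabla L_N(\bar{\theta}),\, \theta \bigr\rangle .
\]
% Only the K gradient vectors \nabla L_j(\bar{\theta}) cross site boundaries.
```

In this form the lead site minimizes \tilde{L} using its own individual-level data, while every other site contributes a single gradient vector evaluated at \bar{\theta}. That gradient is the kind of aggregated information the summary refers to.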

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)


