
1. Edge cross-validation for network model selection

We thank the authors for their new contribution to network modelling. Data reuse, encompassing methods such as bootstrapping and cross-validation, is an area that to date has largely resisted obvious and rapid development in the network context. One of the major reasons is that mimicking the original sampling mechanisms is challenging if not impossible. To avoid deleting edges and destroying some of the network structure, the resampling strategy proposed in Li et al. (2020) based on splitting node pairs rather than nodes is therefore insightful and effective. Matrix completion is the key technique involved, with its use here providing a new perspective for network analysis.

The proposed edge cross-validation procedure operates effectively on an adjacency matrix |$A=(A_{ij})_{n\times n}$| instead of on the original network, as described in the following algorithm.

 
Algorithm 1.

The general edge cross-validation procedure.

Step 1. Choose a loss function |$L$| and select the rank |$\hat{K}$| for matrix completion.

Step 2. For |$m=1$| to |$m=N$|:

|$\qquad$|(a) Assign each node pair |$(i,j)$| to the learning set |$\Omega$| independently with a prespecified probability |$p$|.

|$\qquad$|(b) Obtain |$\hat{A}$| from |$(A, \Omega)$| by a low-rank matrix completion algorithm with rank |$\hat{K}$|.

|$\qquad$|(c) For each candidate model |$q=1,\ldots,Q$|, fit the model to |$\hat{A}$| and evaluate its loss |$L_q^{(m)}$| by comparing the resulting parameter estimates with the held-out entries |$\{A_{ij}:(i,j) \not\in \Omega\}$|.

Step 3. Let |$L_q=N^{-1}\sum_{m=1}^N L_q^{(m)}$| and select the candidate model |$\hat{q}=\arg\min_{1\leq q\leq Q}L_q$|.
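To make the listed steps concrete, the following is a minimal sketch of the procedure in Python, assuming a truncated singular value decomposition as the rank-|$\hat{K}$| completion step in 2(b) and a squared-error loss over the held-out pairs; the function names complete_low_rank and ecv_select, and the convention that each candidate model is a function returning an |$n\times n$| matrix of fitted values, are illustrative choices rather than the authors' implementation.

```python
# A sketch only: truncated-SVD completion and squared-error loss stand in
# for the completion algorithm and the loss L of Algorithm 1.
import numpy as np


def complete_low_rank(A, omega, K):
    """Fill in the unobserved pairs by a rank-K truncated SVD of the
    observed part of A, rescaled by the observed proportion."""
    A_obs = np.where(omega, A, 0.0)
    p_hat = omega.mean()                      # proportion of observed pairs
    U, s, Vt = np.linalg.svd(A_obs / p_hat, full_matrices=False)
    return (U[:, :K] * s[:K]) @ Vt[:K, :]


def ecv_select(A, candidate_models, K_hat, p=0.9, N=3, seed=None):
    """Return the index of the candidate with the smallest average
    held-out loss over N random splits of the node pairs."""
    rng = np.random.default_rng(seed)
    losses = np.zeros(len(candidate_models))
    for _ in range(N):
        # Step 2(a): each node pair enters the learning set with probability p.
        omega = rng.random(A.shape) < p
        # Step 2(b): low-rank completion from the learning pairs only.
        A_hat = complete_low_rank(A, omega, K_hat)
        # Step 2(c): fit each candidate to A_hat, score it on the held-out pairs.
        held_out = ~omega
        for q, fit in enumerate(candidate_models):
            P_hat = fit(A_hat)                # n x n matrix of fitted values
            losses[q] += np.mean((A[held_out] - P_hat[held_out]) ** 2)
    # Step 3: average the losses over the N splits and take the minimiser.
    return int(np.argmin(losses / N))


# Toy usage: an Erdos-Renyi fit against a rank-2 truncated-SVD fit.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = (rng.random((100, 100)) < 0.1).astype(float)
    candidates = [
        lambda Ah: np.full_like(Ah, Ah.mean()),
        lambda Ah: complete_low_rank(Ah, np.ones_like(Ah, dtype=bool), 2),
    ]
    print(ecv_select(A, candidates, K_hat=2))
```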

The authors proposed using |$p=0.9$| and replicating the validation |$N=3$| times, so only about 30% of the node pairs are ever held out for validation: under independent splits, each pair is held out at least once with probability |$1-0.9^3\approx 0.27$|. Borrowing from the noisy network setting of Chang et al. (2018), the proposal below will use all of the edges for validation. It is in the spirit of jittered bootstrapping, or resampling via jittering, where jittering means that a small amount of noise is added to every single data point; see, for example, Hennig (2007, § 3.3). Interestingly, the low-rank assumption does not appear to be necessary in this approach.
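To convey the flavour of this alternative, here is a minimal sketch assuming additive Gaussian jitter on every entry of |$A$| and a squared-error loss; the noise level sigma, the number of replications N and the name jitter_select are hypothetical choices made for illustration, not the construction of Chang et al. (2018) nor the proposal developed later in the discussion.

```python
# A sketch only: Gaussian jitter and squared-error loss are illustrative
# assumptions, not the exact construction of Chang et al. (2018).
import numpy as np


def jitter_select(A, candidate_models, sigma=0.1, N=20, seed=None):
    """Fit each candidate model to N jittered copies of A and score it
    against every entry of the original matrix, so that all edges are
    used for validation."""
    rng = np.random.default_rng(seed)
    losses = np.zeros(len(candidate_models))
    for _ in range(N):
        # Add a small amount of noise to every single data point.
        A_jit = A + sigma * rng.standard_normal(A.shape)
        for q, fit in enumerate(candidate_models):
            P_hat = fit(A_jit)
            losses[q] += np.mean((A - P_hat) ** 2)   # all node pairs used
    return int(np.argmin(losses / N))
```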
