Abstract

Motivation

High-throughput sequencing technologies generate huge amounts of data, permitting the quantification of microbiome compositions. The obtained data are essentially sparse compositional data vectors, namely vectors of bacterial gene proportions which compose the microbiome. Consequently, the need for statistical and computational methods that account for the special nature of microbiome data has increased. A critical aspect in microbiome research is to identify microbes associated with a clinical outcome. Another crucial aspect of high-dimensional data is the detection of outlying observations, whose presence seriously affects prediction accuracy.

Results

In this article, we connect robustness and sparsity in the context of variable selection in regression with compositional covariates and a continuous response. The compositional character of the covariates is taken into account by a linear log-contrast model, and elastic-net regularization achieves sparsity in the regression coefficient estimates. Robustness is obtained by performing trimming in the objective function of the estimator. A reweighting step increases the efficiency of the estimator, and it also allows for diagnostics in terms of outlier identification. The numerical performance of the proposed method is evaluated via simulation studies, and its usefulness is illustrated by an application to a microbiome study with the aim of predicting caffeine intake based on the human gut microbiome composition.

Availability and implementation

The R-package ‘RobZS’ can be downloaded at https://github.com/giannamonti/RobZS.

Supplementary information

Supplementary data are available at Bioinformatics online.

1 Introduction

Human microbiome studies that make use of high-throughput technologies are a valuable source of information for human health. The human microbiome, constituted by all microorganisms in and on the human body, is associated with health and has an impact on the risk of disease. A microbiome dataset, derived from 16S rRNA sequencing, consists of the abundances of microbial operational taxonomic units (OTUs), or bacterial taxa. Such count data are usually normalized by the total abundance of a sample to relative abundances, to account for differences in sequencing depth. In that case, the values of each observation sum to 1 or 100%, if reported as proportions or percentages, and many statistical methods, such as regression on the normalized data, are no longer applicable because of the resulting data singularity.

In the literature, microbiome data have been considered and successfully treated as compositional data (CODA) (Li, 2015; Gloor et al., 2016; Quinn et al., 2018). CODA convey relative information in the sense that the (log-)ratios of the values between the variables are of major interest for the analysis (Filzmoser et al., 2018). This leads to many appealing properties, one of them being scale invariance: the ratios between the original or between the relative abundances are identical, and thus the statistical analysis does not depend on how the data are normalized, or on whether they are normalized at all. The lack of scale invariance is also a major argument against applying standard regression models to the original data, even if the data have not been normalized to a constant sum.

In many studies, the microbiome composition is used as a covariate in regression models to analyze its association with a clinical outcome. This composition has special features: it consists of many variables and thus forms high-dimensional data. Many abundances are reported as zero, and thus we can speak of sparse data. Typically, only a few variables from the composition are important to model the outcome, and most of them are irrelevant for the model. Thus, an appropriate variable selection is desirable and necessary. A further challenge is posed by data outliers, i.e. observations which deviate from the majority of the samples. Ideally, outlying observations should receive smaller weight in the regression problem in order to reduce their influence on the estimated regression coefficients. Robust regression methods provide such an appropriate weighting scheme (Maronna et al., 2006).

In this article, we propose a unified framework capable of integrating robust techniques into the variable selection and coefficient estimation problem in high-dimensional regression with compositional covariates, leading to parsimonious inferential solutions and models which are easier to interpret.

Several methods to perform regression with compositional explanatory variables have been presented in the literature: for a mixture experiment, Aitchison and Bacon-Shone (1984) introduced a CODA regression model based on linear log-contrasts, namely linear combinations of log-ratios between compositional parts.

In the high-dimensional setting, Lin et al. (2014) considered variable selection and estimation for the log-contrast model. They proposed an $\ell_1$ regularization method for the linear log-contrast regression model with a linear constraint on the coefficients. Their constrained Lasso, also known as ZeroSum regression, incorporates the compositional nature of the data into the model and works well in the high-dimensional setting where the number of available regressors $p$ is much larger than the number of observations $n$. Shi et al. (2016) extended the linear log-contrast regression model by imposing a set of multiple linear constraints on the coefficients in order to achieve subcompositional coherence of the results obtained at the different taxonomic ranks to which the taxa belong. Altenbuchinger et al. (2017) imposed an elastic-net penalty on the ZeroSum regression and developed a coordinate descent algorithm for the estimation. This constrained regularized regression method has been applied to compositional covariates, and more generally to reference-point-insensitive analyses of biological measurements such as the human microbiome.

Bates and Tibshirani (2019) adapted the Lasso for CODA, using the log-ratios of all variable pairs of the components as predictors. They proposed a two-step fitting procedure that combines a convex filtering step with a second non-convex pruning step, yielding highly sparse solutions to cope with the very large dimensionality of the predictor space.

Some penalized robust estimation methods have recently been proposed in the literature. These include an MM-estimator with a ridge penalty (Maronna, 2011), a sparse least trimmed squares (LTS) regression estimator with a Lasso penalty (Alfons et al., 2013) and with an elastic-net penalty (Kurnaz et al., 2018), a regularized S-estimator with an elastic-net penalty (Freue et al., 2019) and bridge MM-estimators (Smucler and Yohai, 2017), among others.

In this article, we propose a robust version of the penalized ZeroSum regression. Robustness is achieved by trimming large residuals, motivated by the fact that outliers in the data affect the inferential results, and also small misspecifications of the underlying parametric model can lead to poor prediction accuracy (Huber and Ronchetti, 2009; Maronna et al., 2006).

The outline of the paper is as follows. Section 2 reviews the regression models with compositional covariates, and presents the Robust ZeroSum regression estimator. Simulation experiments are conducted in Section 3 to evaluate the numerical performance of the proposed method. Section 4 presents an application to gut microbiome data, and the final Section 5 concludes.

2 Regression models for CODA

2.1 Linear log-contrast model

In the seminal work of Aitchison and Bacon-Shone (1984), a regression model for CODA was introduced, which is known as the linear log-contrast model. It is related to the design of experiments with mixtures, called simplex designs. Consider a matrix of compositional covariates $X = [x_{ij}]_{1\le i\le n;\, 1\le j\le p}$, w.l.o.g. expressed with constant sum 1. Thus, each row lies in the unit simplex $S^p = \{(x_1,\dots,x_p) : x_j > 0 \text{ and } \sum_{j=1}^p x_j = 1\}$. The log-transformed values of $X$ are collected in the matrix $Z = [z_{ij} = \log(x_{ij})]_{1\le i\le n;\, 1\le j\le p} \in \mathbb{R}^{n\times p}$. A log-contrast is defined, in a symmetric form, as a linear combination of the columns of $Z$ with coefficients $\beta = (\beta_1,\dots,\beta_p)^T$, thus $Z\beta$, with the constraint $\sum_{j=1}^p \beta_j = 0$ (Lin et al., 2014). The linear log-contrast model considers a response with values $y = (y_1,\dots,y_n)^T$, and the corresponding regression model with the log-contrast as covariates is
$$y = Z\beta + \varepsilon, \quad \text{s.t.} \quad \sum_{j=1}^p \beta_j = 0,$$
(1)
where $\varepsilon$ is the error component, usually assumed normally distributed around zero with constant variance $\sigma^2$. The parameters are usually estimated by the least-squares (LS) method, taking the constraint on the parameters into account. Note that the formulation (1) does not include an intercept in the model, as it can be omitted by centering all the predictor variables and the response.
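To make the construction concrete, the following minimal R sketch builds the log-contrast design of model (1) from a toy count table; the object names (counts, X, Z, y) are illustrative and not part of the RobZS package.

  # Minimal sketch (base R): from counts to the log-contrast design of model (1).
  set.seed(1)
  counts <- matrix(rpois(20 * 5, lambda = 50), nrow = 20, ncol = 5)  # toy count table, assumed free of zeros
  X <- counts / rowSums(counts)                  # closure: each row lies in the unit simplex
  Z <- log(X)                                    # log-transformed compositions
  y <- rnorm(20)                                 # some continuous response
  Zc <- scale(Z, center = TRUE, scale = FALSE)   # centering predictors and response
  yc <- y - mean(y)                              # allows fitting model (1) without an intercept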
In the high-dimensional setting, when the sample size $n$ is smaller than the number of predictors $p$ and the ordinary LS method is not applicable, Lin et al. (2014) proposed a variable selection and estimation procedure for a sparse log-contrast model,
$$\hat\beta_{\text{sparse}} = \operatorname*{arg\,min}_{\beta\in\mathbb{R}^p} \left( \frac{1}{n}\|y - Z\beta\|_2^2 + \lambda\|\beta\|_1 \right), \quad \text{s.t.} \quad \sum_{j=1}^p \beta_j = 0,$$
(2)
where $\lambda > 0$ is the regularization parameter, which calibrates the sparseness, and $\|\cdot\|_2$ and $\|\cdot\|_1$ denote the $\ell_2$ and $\ell_1$ norm, respectively. Depending on the choice of $\lambda$, several or many of the components of $\beta$ are zero, and thus this sparse regression coefficient vector corresponds to a variable selection in the model. The authors introduced a coordinate descent method of multipliers to estimate the model parameters. By virtue of the zero-sum constraint, the proposed estimator fulfills desirable compositional properties such as scale invariance, i.e. the regression coefficients are independent of an arbitrary scaling of the basis counts from which a composition is obtained, permutation invariance and selection invariance. The selection invariance property asserts that the estimator is unchanged if one knew in advance which components would be estimated as zero and applied the procedure only to the components associated with non-zero coefficients (Lin et al., 2014).
Altenbuchinger et al. (2017) combined the variable selection problem and estimation for model (1) with the elastic-net regularization (Zou and Hastie, 2005),
$$\hat\beta_{\text{ZeroSum}} = \operatorname*{arg\,min}_{\beta\in\mathbb{R}^p} \left( \frac{1}{n}\|y - Z\beta\|_2^2 + \lambda P_\alpha(\beta) \right), \quad \text{s.t.} \quad \sum_{j=1}^p \beta_j = 0,$$
(3)
where $P_\alpha(\beta) = \alpha\|\beta\|_1 + \frac{1-\alpha}{2}\|\beta\|_2^2$ is the elastic-net penalty and $\alpha \in [0,1]$ is a tuning parameter which balances the $\ell_1$ and $\ell_2$ penalties. Model (3) is known as the ZeroSum elastic-net estimator, to emphasize the constraint on the regression coefficients in conjunction with the elastic-net regularization. To fit model (3), a coordinate descent-based algorithm (Friedman et al., 2007) was implemented. Setting $\alpha = 1$, model (3) is the Lasso model, and for $\alpha = 0$ it is Ridge regression (Tibshirani, 1994).
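As a small illustration, the sketch below evaluates the elastic-net penalty $P_\alpha$ and the ZeroSum objective in (3) for a candidate coefficient vector satisfying the zero-sum constraint; it is not the coordinate descent fitting algorithm itself, and all object names are illustrative.

  # Minimal sketch (base R): evaluate the ZeroSum elastic-net objective (3).
  en_penalty <- function(beta, alpha) {
    alpha * sum(abs(beta)) + (1 - alpha) / 2 * sum(beta^2)
  }
  zerosum_objective <- function(beta, y, Z, lambda, alpha) {
    stopifnot(abs(sum(beta)) < 1e-8)               # zero-sum constraint on the coefficients
    mean((y - Z %*% beta)^2) + lambda * en_penalty(beta, alpha)
  }
  set.seed(1)
  Z <- matrix(rnorm(20 * 5), 20, 5)                # stand-in for log-transformed compositions
  y <- rnorm(20)
  zerosum_objective(c(1, -0.5, -0.5, 0, 0), y, Z, lambda = 0.1, alpha = 0.5)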

2.2 Robust and sparse regression models for CODA

It is well known that the ordinary LS estimator for linear regression is very sensitive to the presence of outliers in the space spanned by the dependent variable, namely vertical outliers, and in the space spanned by the regressors, namely leverage points. To overcome this issue, several robust alternatives have been proposed in the literature; among others, we focus here on Rousseeuw's LTS estimator (Rousseeuw, 1984).

Consider a regression of the response $y$ on the design matrix $X \in \mathbb{R}^{n\times p}$, i.e. $y = X\beta + \varepsilon$, where $\beta = (\beta_1,\dots,\beta_p)^T \in \mathbb{R}^p$. For every $\beta \in \mathbb{R}^p$ we denote the corresponding residuals by $r_i(\beta) = y_i - x_i^T\beta$ $(i = 1,\dots,n)$, where $x_i = (x_{i1},\dots,x_{ip})^T$. Denoting the ordered squared residuals by $r_{(1)}^2 \le \dots \le r_{(n)}^2$, the LTS estimator is obtained as
$$\hat\beta_{\text{LTS}} = \operatorname*{arg\,min}_{\beta\in\mathbb{R}^p} \sum_{i=1}^h r_{(i)}^2(\beta),$$
(4)
where $h$ may lie between $n/2$ and $n$. The specific choice of $h$ depends on the desired properties of the resulting estimator: a smaller value leads to more robustness, but to less efficiency, and vice versa. Minimizing (4) is equivalent to finding the subset of size $h$ with the smallest LS objective function. As the number of observations and covariates increases, the search for the LTS estimate in (4) becomes computationally more and more expensive, and thus the fast-LTS algorithm (Rousseeuw and Van Driessen, 2006) was proposed. The basic idea of this algorithm is the concentration step (C-step), in which the most promising subsets of size $h$ are used to find a local optimum. These C-steps can be repeated a specified number of times, or iterated until convergence.
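The following minimal R sketch illustrates one concentration step for plain (unpenalized) LTS: fit LS on the current subset, compute squared residuals for all observations, and keep the h smallest. It is a simplified illustration with toy data, not the fast-LTS implementation.

  # Minimal sketch (base R) of the C-step for plain LTS.
  set.seed(2)
  n <- 50; p <- 3; h <- 38
  X <- cbind(1, matrix(rnorm(n * (p - 1)), n, p - 1))
  y <- as.vector(X %*% c(1, 2, -1) + rnorm(n))
  c_step <- function(H, X, y, h) {
    fit  <- lm.fit(X[H, , drop = FALSE], y[H])            # LS fit on the current subset
    res2 <- as.vector((y - X %*% fit$coefficients)^2)     # squared residuals for all n observations
    sort(order(res2)[seq_len(h)])                         # indices of the h smallest residuals
  }
  H <- sample(n, h)                                        # a random starting subset
  for (k in 1:20) H <- c_step(H, X, y, h)                  # iterate C-steps (until convergence in practice)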
Alfons et al. (2013) proposed an extension of the fast-LTS algorithm for sparse data by adding an $\ell_1$ penalty on the LTS coefficient estimates, leading to the sparse LTS estimator
$$\hat\beta_{\text{SLTS}} = \operatorname*{arg\,min}_{\beta\in\mathbb{R}^p} \left( \sum_{i=1}^h r_{(i)}^2(\beta) + h\lambda\|\beta\|_1 \right),$$
(5)
for $h \le n$ and tuning parameter $\lambda \ge 0$. Sparse LTS regression is equivalent to detecting the subset of $h$ observations whose Lasso fit leads to the smallest penalized sum of squared residuals. The sparse LTS estimator can be interpreted as a trimmed version of the Lasso. Due to the $\ell_1$ penalty, some of the estimated regression coefficients are exactly zero, thus performing variable selection. Since potential outliers are trimmed in the objective function, the sparse LTS estimator is robust against vertical outliers and leverage points.
Kurnaz et al. (2018) extended the work of Alfons et al. (2013) by substituting the $\ell_1$ penalization in (5) with an elastic-net (EN) penalty. They proposed the trimmed (EN)LTS estimator, defined by
$$\hat\beta_{\text{(EN)LTS}} = \operatorname*{arg\,min}_{\beta\in\mathbb{R}^p} \; \min_{H \subseteq \{1,\dots,n\}:\, |H| = h} \left( \sum_{i\in H} (y_i - x_i^T\beta)^2 + h\lambda P_\alpha(\beta) \right),$$
(6)
where $P_\alpha(\beta)$ is the elastic-net penalty as in (3), $H$ is an outlier-free subset of the set of all indices $\{1,2,\dots,n\}$, and $|H|$ denotes the cardinality of the set $H$. They used an analogue of the iterative fast-LTS algorithm along with a 'warm start' strategy to obtain the optimal choice of the tuning parameters $\alpha$ and $\lambda$ in (6).

Our study extends the trimmed (EN)LTS estimator to a constrained parameter space, to convey the zero-sum constraint typical for compositional covariates. We call our estimator the Robust ZeroSum (RobZS) estimator. The interesting aspect of RobZS is the combination of the zero-sum constraint of the regression coefficients, with the elastic-net regularization in a robust way.

The algorithm to find the solution of RobZS is detailed in Section 2.3. The selection of the tuning parameters α and λ will be discussed in Supplementary Section S1 of Supplementary Material, and an extensive simulation study, reported in Section 3, demonstrates the robustness of the estimator in presence of data outliers.

2.3 Algorithm

A data preprocessing step is required: the response variable and the compositional covariates are robustly centered by the median.

Let $R(H,\beta)$ denote the objective function of the RobZS regression, for a fixed combination of the tuning parameters $\alpha$ and $\lambda$, based on a subsample of observations from the index set $H \subseteq \{1,\dots,n\}$ with $|H| = h \le n$,
$$R(H,\beta) = \sum_{i\in H} (y_i - z_i^T\beta)^2 + h\lambda P_\alpha(\beta), \quad \text{s.t.} \quad \sum_{j=1}^p \beta_j = 0,$$
(7)
where $P_\alpha(\beta)$ is the elastic-net penalty as in (3). For each subsample given by the set $H$ we can obtain $\hat\beta_H$ as
$$\hat\beta_H = \operatorname*{arg\,min}_{\beta\in\mathbb{R}^p} R(H,\beta), \quad \text{s.t.} \quad \sum_{j=1}^p \beta_j = 0.$$
Let $H_{\text{opt}}$ be the optimal subset, $H_{\text{opt}} = \operatorname*{arg\,min}_{H \subseteq \{1,\dots,n\}:\, |H| = h} R(H, \hat\beta_H)$, that is, the subset of $h \le n$ observations which leads to the smallest penalized residual sum of squares while preserving the zero-sum constraint. The optimal solution is then
$$\hat\beta_{\text{opt}} = \operatorname*{arg\,min}_{\beta\in\mathbb{R}^p} R(H_{\text{opt}}, \beta).$$
(8)
The optimal subset $H_{\text{opt}}$ is obtained using an analogue of the fast-LTS algorithm, based on iterated C-steps on diverse initial subsets. The C-step at iteration $\kappa$ consists of computing the elastic-net solution which preserves the zero-sum constraint, based on the current subset $H_\kappa$ with $|H_\kappa| = h$, and constructing the next subset $H_{\kappa+1}$ from the observations corresponding to the $h$ smallest squared residuals. Let $H_\kappa$ denote a certain subsample derived at iteration $\kappa$ and let $\hat\beta_{H_\kappa}$ be the coefficients of the corresponding ZeroSum fit, see model (3). After computing the squared residuals $r_{i,\kappa}^2 = (y_i - z_i^T\hat\beta_{H_\kappa})^2$, for $i = 1,\dots,n$, the subsample $H_{\kappa+1}$ for iteration $\kappa+1$ is defined as the set of indices corresponding to the $h$ smallest squared residuals at the previous iteration $\kappa$. Let $\hat\beta_{H_{\kappa+1}}$ denote the coefficients of the ZeroSum fit based on the subset $H_{\kappa+1}$. It is straightforward to derive that
$$R(H_{\kappa+1}, \hat\beta_{H_{\kappa+1}}) \le R(H_{\kappa+1}, \hat\beta_{H_\kappa}) \le R(H_\kappa, \hat\beta_{H_\kappa}).$$

We can see that the C-steps result in a decrease of the objective function, and that the algorithm converges to a local optimum in a finite number of steps. In order to increase the chance of approximating the global optimum, a large number of random initial subsets $H_0$ of size $h$ should be used as starting points for the sequences of C-steps. Each initial subset $H_0$ is obtained through a search with elemental subsets of size 3.

For a fixed combination of the tuning parameters λ0 and α[0,1], the implemented algorithm, which is similar to the fast-LTS, is as follows:

  1. Draw $s = 500$ random initial elemental subsamples $H_{\text{sel}}$ of size 3 and let $\hat\beta_{H_{\text{sel}}}$ be the corresponding estimated coefficients.

  2. For all $s$ subsets, compute the squared residuals for all $n$ observations, $r_{i,s}^2 = (y_i - z_i^T\hat\beta_{H_{\text{sel}}})^2$ for $i = 1,\dots,n$, and consider the indices of the $h$ smallest of them, $\{r_{(1),s}^2,\dots,r_{(h),s}^2\}$, as starting points to compute only two C-steps.

  3. Retain only the $s_1 = 10$ subsets of size $h$ with the smallest objective function (7) and for each subsample perform C-steps until convergence. The resulting best subset corresponds to the one with the smallest value of the objective function.

The choice of the parameters for the algorithm has been discussed in the literature (Alfons et al., 2013; Kurnaz et al., 2018; Rousseeuw and Van Driessen, 2006). For example, a large number $s$ increases the likelihood of approximating the global optimum, and a small number $s_1$ decreases the computation time.
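As a schematic illustration of the C-step loop in steps 2 and 3, the sketch below iterates subsets until they stabilize. The helper fit_zerosum_en() is only a crude stand-in (a ridge-type fit whose coefficients are projected onto the zero-sum hyperplane) so that the sketch runs; it is not the constrained elastic-net solver used by RobZS, and all names are illustrative.

  # Schematic sketch of iterated C-steps; fit_zerosum_en() is a crude placeholder.
  fit_zerosum_en <- function(Z, y, lambda, alpha) {
    p <- ncol(Z)
    beta <- solve(crossprod(Z) + lambda * diag(p), crossprod(Z, y))  # ridge-type fit (placeholder)
    as.vector(beta - mean(beta))                                     # project onto sum(beta) = 0
  }
  robzs_c_steps <- function(H0, Z, y, h, lambda, alpha, max_iter = 100) {
    H <- sort(H0)
    for (k in seq_len(max_iter)) {
      beta  <- fit_zerosum_en(Z[H, , drop = FALSE], y[H], lambda, alpha)
      res2  <- as.vector((y - Z %*% beta)^2)       # squared residuals on all n observations
      H_new <- sort(order(res2)[seq_len(h)])       # next subset: the h smallest residuals
      if (identical(H_new, H)) break               # subset unchanged: local optimum reached
      H <- H_new
    }
    list(subset = H, beta = beta)
  }
  set.seed(3)
  Z <- matrix(rnorm(60 * 10), 60, 10); y <- rnorm(60)
  out <- robzs_c_steps(sample(60, 45), Z, y, h = 45, lambda = 0.5, alpha = 1)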

To reduce the computational cost of this 3-step sequential algorithm, which ideally should be computed for each possible combination of the tuning parameters, we considered a ‘warm-start’ strategy (Friedman et al., 2010). The idea is that for a particular combination of α and λ, the resulting best h-size subset from step 3 might also be an appropriate subset for a combination in the neighborhood of this α and/or λ, and thus step 1 can be omitted.

To select the optimal combination $(\alpha_{\text{opt}}, \lambda_{\text{opt}})$ of the tuning parameters $\alpha \in [0,1]$ and $\lambda \in [\epsilon\cdot\lambda_{\text{Max}}, \lambda_{\text{Max}}]$, with small $\epsilon > 0$, leading to the optimal subset $H_{\text{opt}}$, a repeated K-fold CV procedure (Hastie et al., 2001) is applied on these best h-size subsets over the two-dimensional grid of tuning parameters. (Details are reported in Supplementary Section S1 of Supplementary Material.)

Furthermore, we apply a reweighting step, which downweights outliers detected by the solution $\hat\beta_{\text{opt}}$, to increase the efficiency of the proposed estimator. We consider as outliers those observations with standardized residuals larger than a certain quantile of the standard normal distribution. Since the RobZS estimator is biased due to regularization, it is necessary to center the residuals. Denote by $r_i^s$ the standardized residuals, where the residual scale is derived from the $h$ observations in the final subset. Then the binary weights are defined by
$$w_i = \begin{cases} 1 & \text{if } |r_i^s| \le \Phi^{-1}(1-\delta) \\ 0 & \text{if } |r_i^s| > \Phi^{-1}(1-\delta) \end{cases}, \qquad i = 1,\dots,n,$$
(9)
where Φ is the cumulative distribution function of the standard normal distribution. The typical choice for δ is 0.0125, so that 2.5% of the observations are expected to be flagged as outliers in the normal model.
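A minimal sketch of the reweighting step (9) in R is given below; res and scale_h are illustrative stand-ins for the centered raw residuals and the residual scale estimated from the h-subset.

  # Minimal sketch of the binary weights in (9).
  set.seed(4)
  res     <- rnorm(100); res[1:5] <- res[1:5] + 6       # a few artificial vertical outliers
  scale_h <- sd(res[6:100])                             # stand-in for the scale from the h-subset
  delta   <- 0.0125
  r_std   <- (res - median(res)) / scale_h              # centered and standardized residuals
  w       <- as.integer(abs(r_std) <= qnorm(1 - delta)) # weight 0 flags an outlier
  table(w)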
Finally, the RobZS estimator is defined as
$$\hat\beta_{\text{RobZS}} = \operatorname*{arg\,min}_{\beta\in\mathbb{R}^p} \left( \sum_{i=1}^n w_i (y_i - z_i^T\beta)^2 + n_w \tilde\lambda P_{\alpha_{\text{opt}}}(\beta) \right), \quad \text{s.t.} \quad \sum_{j=1}^p \beta_j = 0,$$
(10)
where $n_w = \sum_{i=1}^n w_i$ is the sum of the weights, $\alpha_{\text{opt}}$ is the optimal parameter obtained from the optimal subset $H_{\text{opt}}$, and the tuning parameter $\tilde\lambda$ is obtained by a 5-fold cross-validation (CV) procedure. Note that it is necessary to update the parameter $\lambda$ because typically $n_w > h$, as $h$ was based on an initial conservative guess of the amount of outliers in the data, and thus the penalty may act in a somewhat different way than for (8).

The algorithm to compute the RobZS estimator is implemented in the software environment R. The source code files are hosted in the GitHub repository of the first author (https://github.com/giannamonti/RobZS).

Estimation of an intercept. Simulation experiments have shown that data preprocessing by robustly centering the response and the covariates by the median, and by applying the model without intercept does not yield the best results. An improvement is possible by additionally centering and scaling (with arithmetic means and empirical standard deviations) the input data for the ZeroSum elastic-net estimator. This has been done in all steps of the previously outlined algorithm where this estimator is involved. An exception is the repeated K-fold CV procedure to determine the tuning parameters, where additional centering and scaling of folds would lead to biased results.

Given the optimal RobZS solution $\hat\beta_{\text{RobZS}}$ on the robustly centered data, we can recover the estimate of the intercept $\hat\beta_0$ by simply computing $\hat\beta_0 = \bar y - \sum_{j=1}^p \bar x_j \hat\beta_j^{\text{RobZS}}$, where $\bar y$ and $\{\bar x_j\}_{j=1}^p$ are the medians of the original response and covariates. This intercept is added to the intercept which results from classically centering and scaling the response and the explanatory variables to compute the final RobZS estimator in (10).
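A minimal sketch of this intercept recovery is given below; y, X and beta_robzs are illustrative placeholders for the original (uncentered) data and the RobZS slope estimates obtained on the robustly centered data.

  # Minimal sketch: recover the intercept from the medians of the original data.
  set.seed(5)
  X <- matrix(rexp(30 * 4), 30, 4); X <- X / rowSums(X)   # toy compositional covariates
  y <- rnorm(30)
  beta_robzs <- c(0.4, -0.4, 0.2, -0.2)                   # respects the zero-sum constraint
  beta0 <- median(y) - sum(apply(X, 2, median) * beta_robzs)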

Debiasing strategies. RobZS suffers from a bias due to the double penalization resulting from the elastic-net penalty. To overcome this shortcoming we suggest three debiasing strategies: a rescaled RobZS solution, following the approach of Zou and Hastie (2005), a relaxed RobZS (Meinshausen, 2007) and a hybrid RobZS. Let $\hat\beta_{\text{RobZS}}$ be the RobZS estimate of $\beta \in \mathbb{R}^p$. Given the pair of estimated tuning parameters $(\alpha_{\text{opt}}, \tilde\lambda)$, the rescaled RobZS solution is defined as
$$\hat\beta_{\text{RobZS}}^{\text{(rescaled)}} = \left( 1 + \frac{\tilde\lambda}{2}\,(1 - \alpha_{\text{opt}}) \right) \hat\beta_{\text{RobZS}}.$$
(11)

This simple rescaling mitigates the effect of shrinkage and leads to an estimator with less bias, at the price of more variance. A valid alternative to rescaling is the relaxed RobZS, which consists of a two-step procedure. First, RobZS is applied to identify the set of non-zero coefficients, say $\mathcal{A}_{\alpha_{\text{opt}},\tilde\lambda}$; then RobZS is performed again on the active predictors selected in the first step, $Z_{\mathcal{A}_{\alpha_{\text{opt}},\tilde\lambda}}$, fixing $\alpha = \alpha_{\text{opt}}$. The active set of predictors $\mathcal{A}_{\alpha_{\text{opt}},\tilde\lambda}$ presumably does not include 'noise' variables and collects variables that are effective competitors for being part of the model, thus the shrinkage in the second step is less marked. We also considered a hybrid RobZS solution, a two-step procedure where, in the first stage, RobZS is applied to perform variable selection, and in the second stage, RobZS with a Lasso penalty ($\alpha = 1$) is performed again on the predictors selected in the first stage to reduce the excessive number of false positives (see Tibshirani, 2011, Peter Bühlmann's comments). In all cases the intercept should be re-estimated. The effect of these debiasing strategies is reported in Supplementary Section S2.2 of Supplementary Material. All other results refer to the direct RobZS solution.
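As a small illustration of the first strategy, the one-line computation below follows Equation (11) as reconstructed above; beta_robzs, lambda_tilde and alpha_opt are illustrative stand-ins for the fitted quantities.

  # Minimal sketch of the rescaled RobZS solution (11).
  beta_robzs    <- c(0.4, -0.4, 0.2, -0.2)
  lambda_tilde  <- 0.1
  alpha_opt     <- 0.5
  beta_rescaled <- (1 + lambda_tilde / 2 * (1 - alpha_opt)) * beta_robzs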

3 Simulation

The aim of this section is to compare the performance of the RobZS estimator with competing estimators by means of a Monte Carlo study. We make a comparison with the Lasso (the regular least absolute shrinkage and selection operator) (Tibshirani, 1994), the ZeroSum (ZS) estimator (Altenbuchinger et al., 2017), the sparse LTS estimator (SLTS) of Alfons et al. (2013) and the robust (EN)LTS estimator (Kurnaz et al., 2018), denoted by RobL in the following. We also provide a comparison with the algorithm of Bates and Tibshirani (2019), here abbreviated by 'ZS (B&T)'. For their log-ratio Lasso estimator they propose a fast approximate algorithm, which is used here for comparison. Note that this algorithm does not return an optimized value of the tuning parameter $\lambda$, and thus we cannot report loss values. In order to compare with the Lasso solution, we have set the parameter $\alpha$ equal to 1 for the methods involving elastic-net penalties.

3.1 Sampling schemes

We generated the covariate data, corresponding to the relative bacterial abundances in a microbiome analysis, following Lin et al. (2014). We first generated an $n\times p$ data matrix $W = [w_{ij}]_{1\le i\le n;\, 1\le j\le p}$ from a multivariate normal distribution $N_p(\theta, \Sigma)$, and then obtained the design matrix $X = [x_{ij}]_{1\le i\le n;\, 1\le j\le p}$ by the transformation
$$x_{ij} = \exp(w_{ij}) \Big/ \sum_{k=1}^p \exp(w_{ik}),$$
so that each row is a random sample from a logistic normal distribution (Aitchison and Shen, 1980). The correlation structure of the predictors is defined by $\Sigma = [\Sigma_{ij}]_{1\le i,j\le p}$ with $\Sigma_{ij} = \rho^{|i-j|}$, where $\rho = 0.2$ or $0.5$, to consider different levels of correlation. To reflect the fact that the components of a composition in metagenomic data often differ by orders of magnitude, the components of $\theta = (\theta_1,\dots,\theta_p)^T$ are defined as $\theta_j = \log(0.5p)$ for $j = 1,\dots,5$, and $\theta_j = 0$ otherwise.

The two robust estimators are calculated taking the subset size $h = \lfloor 3(n+1)/4 \rfloor$ for an easy comparison. This means that $n/4$ is an initial guess of the maximal proportion of outliers in the data. For each replication, we choose the optimal tuning parameter $\lambda_{\text{opt}}$ as described in Supplementary Section S1 of Supplementary Material, with a repeated 5-fold CV procedure and a suitable sequence of 41 values between $\epsilon\cdot\lambda_{\text{Max}}$ (with small $\epsilon > 0$) and $\lambda_{\text{Max}}$, where $\lambda_{\text{Max}}$ is chosen in order to achieve full sparsity in the coefficient vector.

The values of the response were generated according to model (1), with coefficient vector $\beta = (\beta_j)_{1\le j\le p}$ with $\beta_1 = 1$, $\beta_2 = -0.8$, $\beta_3 = 0.6$, $\beta_6 = -1.5$, $\beta_7 = -0.5$, $\beta_8 = 1.2$ and $\beta_j = 0$ for $j \in \{1,\dots,p\}\setminus\{1,2,3,6,7,8\}$ (so that the coefficients sum to zero), and $\sigma = 0.5$, so that three of the six non-zero coefficients were among the five major components and the rest were among the minor components.
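A minimal R sketch of this data-generating process is given below (MASS provides mvrnorm); the parameter values follow the text, and all object names are illustrative.

  # Minimal sketch of the simulation design: logistic-normal compositions and response.
  library(MASS)
  set.seed(6)
  n <- 50; p <- 30; rho <- 0.2; sigma <- 0.5
  Sigma <- rho^abs(outer(1:p, 1:p, "-"))                 # Sigma_ij = rho^|i-j|
  theta <- c(rep(log(0.5 * p), 5), rep(0, p - 5))        # five 'major' components
  W <- mvrnorm(n, mu = theta, Sigma = Sigma)
  X <- exp(W) / rowSums(exp(W))                          # rows follow a logistic-normal distribution
  Z <- log(X)
  beta <- numeric(p)
  beta[c(1, 2, 3, 6, 7, 8)] <- c(1, -0.8, 0.6, -1.5, -0.5, 1.2)   # non-zero coefficients, summing to zero
  y <- as.vector(Z %*% beta + rnorm(n, sd = sigma))      # response according to model (1)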

Different sample size/dimension combinations $(n, p) = (50, 30)$, $(100, 200)$ and $(100, 1000)$ are considered, i.e. a low-dimensional setting ($n > p$), a moderately high-dimensional setting ($n < p$) and a high-dimensional setting ($p \gg n$), and the simulations are repeated 100 times for each setting.

For each of the three simulation settings we applied the following contamination schemes:

  • Scenario A. (Clean) No contamination.

  • Scenario B. (Vert) Vertical outliers: we add to the first $\gamma\%$ (with $\gamma = 10$ or 20) of the observations of the response variable a random error term coming from a normal distribution $N(10, 1)$.

  • Scenario C. (Both) Outliers in both the response and the predictors: this is a more extreme situation in which we consider vertical outliers but also leverage points. Vertical outliers are generated by adding to the first $\gamma\%$ (with $\gamma = 10$ or 20) of the observations of the response variable a random error term coming from a normal distribution $N(20, 1)$. To get leverage points we replace the first $\gamma\%$ (with $\gamma = 10$ or 20) of the observations of the block of informative variables by values coming from a $p$-dimensional logistic-normal distribution with mean vector $(50,\dots,50)^T$ and a correlation equal to 0.9 for each pair of variable components.

We do not consider a scenario with exclusively leverage points, as the contaminated design matrix $X$ is constructed to have row sums of 1; consequently, the effect of leverage points is by construction always limited.
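For illustration, the sketch below applies the vertical-outlier contamination of scenario (B) to a clean response vector; y is an illustrative stand-in for a response generated as in the data-generation sketch above.

  # Minimal sketch of scenario (B): shift the first gamma% of the responses by N(10, 1).
  set.seed(7)
  n <- 50
  y <- rnorm(n)                                  # stand-in for a clean response
  gamma <- 0.10
  n_out <- ceiling(gamma * n)
  y[seq_len(n_out)] <- y[seq_len(n_out)] + rnorm(n_out, mean = 10, sd = 1)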

We present here the simulation results for γ = 10%. The conclusions that can be drawn for γ = 20% follow the same tendency, and the related simulation results are reported in Supplementary Section S2.1 of Supplementary Material for the sake of completeness.

3.2 Performance measures

To evaluate the prediction performance of the proposed sparse method in comparison to the other models we use the prediction error (PE) and a trimmed prediction error. For this purpose, two independent test samples, a clean and a contaminated one, of size n for each contamination scheme were generated in each simulation run. The prediction error is computed as
$$\text{PE} = \frac{1}{n}\|y - Z\hat\beta\|_2^2,$$
(12)
where $y$ and $Z$ denote the response vector and the design matrix of the test set data, respectively, and $\hat\beta$ is the parameter estimate derived from the training data. The trimmed prediction error is the trimmed version of the measure defined in (12). In the simulations we used a trimming level of 0.1.
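The two accuracy measures can be computed as in the short sketch below; trimming is implemented here as averaging the smallest 90% of the squared test residuals, which is one natural reading of the trimmed PE, and all object names are illustrative.

  # Minimal sketch of the prediction error (12) and a 10% trimmed version.
  pe <- function(y_test, Z_test, beta_hat) {
    mean((y_test - Z_test %*% beta_hat)^2)
  }
  trimmed_pe <- function(y_test, Z_test, beta_hat, trim = 0.1) {
    sq_res <- sort(as.vector((y_test - Z_test %*% beta_hat)^2))
    mean(sq_res[seq_len(floor((1 - trim) * length(sq_res)))])   # drop the largest 10%
  }
  set.seed(8)
  Z_test <- matrix(rnorm(40 * 6), 40, 6); y_test <- rnorm(40); beta_hat <- rnorm(6)
  pe(y_test, Z_test, beta_hat); trimmed_pe(y_test, Z_test, beta_hat)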
Concerning sparsity, the estimated models are evaluated by the number of false positives (FP) and the number of false negatives (FN), defined as
$$\text{FP}(\hat\beta) = \left|\left\{ j \in \{1,\dots,p\} : \hat\beta_j \neq 0 \wedge \beta_j = 0 \right\}\right|, \qquad \text{FN}(\hat\beta) = \left|\left\{ j \in \{1,\dots,p\} : \hat\beta_j = 0 \wedge \beta_j \neq 0 \right\}\right|,$$
(13)
where positives and negatives refer to non-zero and zero coefficients, respectively.
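The counts in (13) translate directly into R, as in the small sketch below with illustrative true and estimated coefficient vectors.

  # Minimal sketch of the FP/FN counts in (13).
  beta_true <- c(1, -0.8, 0.6, 0, 0, -1.5, -0.5, 1.2, rep(0, 22))
  beta_hat  <- replace(beta_true, c(2, 10), c(0, 0.3))   # one missed variable, one spurious one
  fp <- sum(beta_hat != 0 & beta_true == 0)              # false positives
  fn <- sum(beta_hat == 0 & beta_true != 0)              # false negatives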

The estimation accuracy is assessed by the $\ell_q$ losses $\|\hat\beta - \beta\|_q$, with $q = 1, 2$ and $\infty$. The lower the values of these criteria, the better the models perform.

3.3 Simulation results

Tables 1-3 report averages ('mean') and standard deviations ('SD') of the performance measures defined in the previous section over all 100 simulation runs, for each method and for the different contamination schemes. The prediction error is computed on the clean test set, while the trimmed prediction error refers to a test set generated according to the same structure as the training set. Table 4 reports the comparison of the selection performance, FP and FN. The results presented refer to a parameter configuration with $\rho = 0.2$, a contamination level of 10%, and to the low- (Table 1), moderate- (Table 2) and high-dimensional data configurations (Table 3).

Table 1.

Means and standard deviations of various performance measures among different methods, based on 100 simulations

Scenario  Method      PE                PE (10%)          Loss ℓ1           Loss ℓ2           Loss ℓ∞
                      Mean     SD       Mean     SD       Mean     SD       Mean     SD       Mean     SD
(A)       Lasso       0.415    0.104    0.300    0.080    1.282    0.445    0.177    0.101    0.227    0.074
          ZS          0.393    0.107    0.283    0.080    1.182    0.427    0.145    0.078    0.200    0.059
          RobL        0.615    0.342    0.443    0.247    1.869    0.784    0.427    0.420    0.338    0.142
          RobZS       1.031    0.487    0.739    0.364    3.115    1.077    0.841    0.555    0.449    0.171
          SLTS        1.423    0.775    1.033    0.582    3.019    1.107    1.327    0.800    0.640    0.205
          ZS (B&T)    0.476    0.237    0.347    0.187    -        -        -        -        -        -
(B)       Lasso       5.212    2.192    5.850    2.433    5.725    1.987    4.190    1.765    1.143    0.254
          ZS          5.129    2.219    5.716    2.405    5.555    1.805    3.936    1.720    1.084    0.258
          RobL        1.156    0.954    1.318    1.104    2.623    1.429    1.020    1.095    0.506    0.276
          RobZS       0.760    0.327    0.861    0.380    2.383    0.813    0.525    0.290    0.377    0.114
          SLTS        0.932    0.475    1.057    0.551    2.257    0.827    0.759    0.476    0.497    0.153
          ZS (B&T)    5.699    2.416    6.175    2.453    -        -        -        -        -        -
(C)       Lasso       17.911   5.668    14.600   3.586    13.183   2.824    15.463   5.921    2.021    0.511
          ZS          17.115   6.136    14.300   3.889    12.818   2.833    14.435   5.325    1.941    0.392
          RobL        0.822    0.604    0.927    0.684    2.304    1.158    0.650    0.629    0.414    0.177
          RobZS       0.762    0.403    0.860    0.458    2.416    1.012    0.541    0.417    0.367    0.129
          SLTS        0.931    0.489    1.050    0.564    2.294    0.839    0.749    0.487    0.490    0.174
          ZS (B&T)    20.836   6.259    17.015   4.070    -        -        -        -        -        -

Note: Parameter configuration: (n, p) = (50, 30), ρ = 0.2. The best values (of 'mean') among the different methods are presented in bold.

PE, prediction error.

Table 2.

Means and standard deviations of various performance measures among different methods, based on 100 simulations

Scenario  Method      PE                PE (10%)          Loss ℓ1           Loss ℓ2           Loss ℓ∞
                      Mean     SD       Mean     SD       Mean     SD       Mean     SD       Mean     SD
(A)       Lasso       0.390    0.073    0.275    0.054    1.602    0.518    0.164    0.066    0.207    0.059
          ZS          0.380    0.072    0.268    0.053    1.561    0.601    0.147    0.064    0.185    0.051
          RobL        0.772    0.804    0.548    0.577    2.474    1.474    0.581    0.856    0.350    0.236
          RobZS       1.179    0.483    0.836    0.353    3.582    0.875    1.031    0.517    0.521    0.146
          SLTS        1.367    0.813    0.974    0.601    3.377    1.262    1.280    0.798    0.603    0.191
          ZS (B&T)    0.377    0.176    0.267    0.129    -        -        -        -        -        -
(B)       Lasso       4.549    1.637    4.999    1.660    6.468    2.662    3.609    1.197    1.000    0.195
          ZS          4.366    1.366    4.827    1.417    6.202    1.952    3.443    1.017    0.975    0.210
          RobL        1.514    0.957    1.680    1.044    3.700    1.671    1.395    1.017    0.600    0.253
          RobZS       0.771    0.383    0.865    0.437    2.709    0.958    0.575    0.405    0.388    0.135
          SLTS        0.790    0.398    0.885    0.451    2.380    0.853    0.609    0.416    0.417    0.130
          ZS (B&T)    4.736    1.863    5.191    1.907    -        -        -        -        -        -
(C)       Lasso       5.163    1.182    3.834    0.837    10.414   1.780    10.477   1.271    1.674    0.172
          ZS          8.821    1.727    6.519    1.168    10.935   1.865    7.405    1.165    1.339    0.197
          RobL        0.953    0.691    1.073    0.788    2.978    1.395    0.806    0.779    0.461    0.213
          RobZS       0.672    0.318    0.752    0.357    2.526    0.936    0.480    0.377    0.346    0.116
          SLTS        0.733    0.345    0.822    0.384    2.304    0.887    0.580    0.376    0.406    0.131
          ZS (B&T)    12.928   3.297    9.247    2.243    -        -        -        -        -        -

Note: Parameter configuration: (n, p) = (100, 200), ρ = 0.2. The best values (of 'mean') among the different methods are presented in bold.

PE, prediction error.

Table 3.

Means and standard deviations of various performance measures among different methods, based on 100 simulations

Scenario  Method      PE                PE (10%)          Loss ℓ1           Loss ℓ2           Loss ℓ∞
                      Mean     SD       Mean     SD       Mean     SD       Mean     SD       Mean     SD
(A)       Lasso       0.500    0.117    0.352    0.084    2.280    0.618    0.293    0.111    0.284    0.073
          ZS          0.480    0.103    0.339    0.074    2.150    0.603    0.259    0.101    0.261    0.067
          RobL        3.354    1.890    2.368    1.343    5.633    1.845    3.356    1.894    0.934    0.369
          RobZS       1.919    1.289    1.362    0.916    4.462    1.606    1.859    1.409    0.669    0.291
          SLTS        2.730    1.088    1.928    0.771    5.573    1.274    2.712    1.032    0.873    0.192
          ZS (B&T)    0.470    0.358    0.334    0.262    -        -        -        -        -        -
(B)       Lasso       5.481    1.190    5.946    1.172    7.537    2.912    4.709    1.068    1.133    0.185
          ZS          5.444    1.171    5.902    1.177    7.575    2.830    4.647    1.108    1.128    0.209
          RobL        3.157    1.157    3.544    1.333    6.037    1.371    3.287    1.240    0.946    0.233
          RobZS       1.513    1.013    1.701    1.143    3.999    1.408    1.434    1.156    0.591    0.252
          SLTS        1.966    0.998    2.214    1.141    4.875    1.486    1.950    1.079    0.726    0.223
          ZS (B&T)    5.827    2.178    6.337    2.287    -        -        -        -        -        -
(C)       Lasso       2.996    0.722    2.324    0.580    8.105    1.175    6.224    0.643    1.258    0.120
          ZS          6.356    1.105    4.655    0.900    9.787    1.461    5.477    0.862    1.135    0.166
          RobL        2.843    1.527    3.162    1.712    5.552    1.594    2.911    1.438    0.892    0.260
          RobZS       1.342    0.819    1.490    0.914    3.986    1.326    1.282    0.981    0.556    0.222
          SLTS        1.818    0.877    2.026    0.960    4.604    1.386    1.833    0.910    0.707    0.197
          ZS (B&T)    10.260   2.300    7.197    1.632    -        -        -        -        -        -

Note: Parameter configuration: (n, p) = (100, 1000), ρ = 0.2. The best values (of 'mean') among the different methods are presented in bold.

PE, prediction error.

Table 4.

Comparison of selective performance among different methods, scenarios and parameter configurations

(n, p) = (50, 30), FP
  Method      Scenario (A)        Scenario (B)        Scenario (C)
              Mean      SD        Mean      SD        Mean      SD
  Lasso       10.63     4.037     6.07      5.109     12.91     2.771
  ZS          10.37     4.012     6.19      4.896     12.35     2.672
  RobL        10.81     4.282     9.28      5.121     10.71     4.728
  RobZS       17.37     4.894     14.61     4.878     14.95     4.723
  SLTS        7.63      2.581     7.38      2.662     8.24      2.523
  ZS (B&T)    1.46      1.167     1.42      1.505     4.13      1.942

(n, p) = (50, 30), FN
  Method      Scenario (A)        Scenario (B)        Scenario (C)
              Mean      SD        Mean      SD        Mean      SD
  Lasso       0.00      0.000     2.84      1.680     2.20      1.326
  ZS          0.00      0.000     2.53      1.678     2.05      1.132
  RobL        0.06      0.278     0.54      1.105     0.15      0.458
  RobZS       0.10      0.414     0.04      0.197     0.02      0.141
  SLTS        0.53      0.674     0.25      0.479     0.16      0.420
  ZS (B&T)    0.05      0.219     3.47      1.226     3.66      0.987

(n, p) = (100, 200), FP
  Method      Scenario (A)        Scenario (B)        Scenario (C)
              Mean      SD        Mean      SD        Mean      SD
  Lasso       29.54     12.851    16.16     13.925    23.89     11.573
  ZS          29.60     13.999    16.19     12.054    30.37     8.684
  RobL        31.27     13.111    23.27     12.984    27.34     13.192
  RobZS       35.39     10.549    31.30     11.079    32.26     12.554
  SLTS        22.10     6.505     20.87     5.417     20.23     5.683
  ZS (B&T)    0.88      0.087     1.48      1.867     4.88      1.677

(n, p) = (100, 200), FN
  Method      Scenario (A)        Scenario (B)        Scenario (C)
              Mean      SD        Mean      SD        Mean      SD
  Lasso       0.00      0.000     2.67      1.303     0.51      0.577
  ZS          0.00      0.000     2.45      1.250     1.80      1.025
  RobL        0.05      0.219     0.65      0.957     0.12      0.433
  RobZS       0.13      0.418     0.05      0.261     0.04      0.243
  SLTS        0.37      0.525     0.14      0.472     0.09      0.321
  ZS (B&T)    0.01      0.100     3.66      1.066     3.51      1.068

(n, p) = (100, 1000), FP
  Method      Scenario (A)        Scenario (B)        Scenario (C)
              Mean      SD        Mean      SD        Mean      SD
  Lasso       47.28     18.593    21.28     21.592    28.29     14.163
  ZS          45.02     17.448    22.11     19.950    41.77     11.610
  RobL        37.65     20.594    38.09     23.701    34.99     19.350
  RobZS       42.10     17.210    39.91     16.076    43.74     14.912
  SLTS        40.52     6.850     40.49     7.770     36.40     8.452
  ZS (B&T)    1.11      1.230     1.77      1.958     4.78      1.962

(n, p) = (100, 1000), FN
  Method      Scenario (A)        Scenario (B)        Scenario (C)
              Mean      SD        Mean      SD        Mean      SD
  Lasso       0.00      0.000     3.91      1.065     0.91      0.818
  ZS          0.00      0.000     3.52      1.275     2.08      0.849
  RobL        2.33      1.798     2.28      1.371     1.90      1.453
  RobZS       0.90      1.193     0.59      1.045     0.42      0.768
  SLTS        1.76      1.026     0.94      0.993     1.04      0.777
  ZS (B&T)    0.05      0.261     4.11      1.024     3.91      0.911

FP, number of false positives; FN, number of false negatives. The best values (of 'mean') among the different methods are presented in bold.

In Scenario (A), with no outliers (Clean), ZS and ZS (B&T) show the best performance in terms of the mean prediction error. Of course, the Lasso estimator (and its robust version), as well as the SLTS estimator, only perform variable selection, but they do not fulfill the condition that the sum of the estimated regression coefficients should be zero, and thus miss the desirable properties of CODA analysis mentioned in Section 2; these results are reported here only for benchmarking purposes. The algorithm of Bates and Tibshirani (2019), ZS (B&T), slightly improves on the ZS prediction error results. A big difference is its excellent performance for the false positives (FP), but a (much) poorer performance for the false negatives (FN), see Table 4; the latter might be more important in applications.

All robust methods lose efficiency, which is reflected by a somewhat higher prediction error, and the gap to the non-robust estimators increases with the dimension. However, this gap is smaller for the mean 10% trimmed PE, which means that although no outliers have been generated, there are test set observations which clearly deviate from the data majority. All methods perform well in terms of the average number of false positives and false negatives. In high dimension, the robust methods produce a higher number of FN. The estimation accuracy in terms of the $\ell_q$ losses is quite comparable for all methods, but the values increase for the robust methods with increasing dimensionality.

The second scenario (B), with outliers in the response, or vertical outliers (Vert), shows quite different results: the Lasso and ZeroSum estimators are strongly influenced by the outliers. The prediction errors increase dramatically, and the same is true for the $\ell_q$ losses. The reason for this can be seen in the high number of FN (recall that 6 non-zero coefficients have been generated). The robust estimators achieve similar results as in the case of non-contaminated data. RobZS shows an excellent behavior, and it is the clear winner especially in the high-dimensional situation. Since the non-trimmed and the trimmed prediction errors are very similar for the robust estimators, they are able to correctly identify the model and thus the generated outliers. The variable selection performance of the proposed estimator is comparable to that of SLTS, but it tends to select fewer FN at the cost of slightly more FP. We note that the FN have a substantially stronger negative effect on the prediction error than the FP, as important variables are incorrectly ignored.

In the third scenario (C), with outliers in both the response and the predictors (Both), the RobZS estimator shows the best performance in terms of prediction error, especially in the high-dimensional setting. As observed before, RobZS leads to the smallest FN at the cost of a higher FP.

An interesting observation is that, although SLTS shows comparable performance to the RobZS estimator in several settings, it performs poorly in terms of the FN for the uncontaminated setting, particularly in lower dimension.

Overall, we observe a general decrease in prediction accuracy for the Lasso and the ZeroSum estimators in the presence of vertical outliers as well as with both vertical outliers and bad leverage points, underlining the need for robust methods. Moreover, in the contaminated scenarios, the standard deviations of the RobZS estimator for the various performance measures are among the smallest, suggesting stability of the estimation and of the prediction performance in all considered settings. These simulation results also emphasize that the RobZS estimator has an excellent prediction performance in contaminated scenarios.

Supplementary Section S2.1 of Supplementary Material reports the analogous results for $\rho = 0.5$ and 10% contamination, as well as results for a contamination level of 20%. These results follow the same tendency, and thus a change in the correlation structure of the explanatory variables or an increase of the amount of outliers has no essential impact on the overall conclusions. An exception is that for 20% contamination, SLTS is very competitive with RobZS in scenario (B) in lower dimension, but this advantage disappears in the high-dimensional setting.

3.4 Simulations with increasing proportion of zeros in the covariates

We compare the predictive accuracy of the ZS and RobZS estimators as a function of the proportion of zeros in the training and test data, because this setting is relevant in various real data applications. We first generate the matrix of count data from which, after normalization and logarithmic transformation, we compute the response vector $y$ according to the linear model. Then we replace a fixed proportion of the existing counts by 0, chosen uniformly at random, and subsequently the zeros are replaced by the value 0.5 before converting the data to compositional form, to allow for the logarithmic transformation. We use contamination setting (B) and increase the zero proportion in the covariates from 0.1 to 0.8 in steps of 0.1. Note that we only contaminated the training sample and not the test sample. Figure 1 shows the resulting prediction error averaged over 100 replications for each fixed proportion of zeros (solid lines). The dashed lines are the means plus/minus two times the standard errors from the replications. The red lines are for ZS, the blue lines for RobZS.

Fig. 1.

Prediction performance of the ZS (red) and RobZS (blue) estimators in scenario (B) by increasing the proportions of zeros in training and test data from 0.1 to 0.8 in steps of 0.1. Parameter configuration: (a) n=50,p=30,ρ=0.2, (b) n=100,p=200,ρ=0.2, (c) n=100,p=1000,ρ=0.2. Shown are means (solid lines) plus/minus two standard errors averaged over 100 replications for each fixed proportions of zeros

As expected, the performance of both estimators deteriorates approximately linearly as the proportion of zero counts in the covariates increases. However, RobZS shows the best overall behavior even when the proportion of zero counts in the covariates is very high, as ZS is strongly affected by the outliers.

3.5 Simulations with increasing outlier proportion

The simulations in this section investigate the behavior of the estimators ZS and RobZS for increasing levels of contamination. We use contamination setting (C) and increase the outlier proportion from zero to 0.5 in steps of 0.02. In each step, 50 replications are carried out, and the means plus/minus two standard errors of the results are presented in Figure 2. The red lines are for ZS, the blue lines for RobZS. The simulations are conducted for the parameters n = 50, p = 30 and $\rho = 0.2$. The results reveal that ZS deteriorates as the outlier proportion increases. In particular, FN quickly increases to a value of about 2, and thus 2 out of the 6 active variables are (on average) not identified. RobZS shows stable performance up to about 25% of contamination. This is explained by the trimming proportion of the procedure, which we set to 25% in all experiments. The evaluation with a 10% trimmed prediction error (upper right plot) is clearly not appropriate in a setting with high proportions of outliers. It is interesting that FN is very stable (up to about 20% contamination), and that FP decreases for an outlier proportion of up to 0.25. This means that the estimated regression parameters get sparser with higher contamination, and true noise variables are more accurately identified.

Fig. 2.

Performance of the ZS (red) and RobZS (blue) estimators by increasing the contamination level, using scenario (C). Here, n =50, p =30, ρ=0.2; shown are means (solid lines) plus/minus two standard errors derived from 50 simulation replications at each step

Supplementary Section S2.3 of Supplementary Material also contains simulation studies which investigate the effect of varying sparsity. A general conclusion from these results is that less sparsity of the model reduces the advantage of the robust method. In other words, RobZS performs much better than ZS in the presence of outliers and if the true underlying model is very sparse.

Finally, a further simulation study is presented in Supplementary Section S2.4 of Supplementary Material which focuses on the use of the elastic-net penalty (both Ridge and Lasso). The comparison of ZS and RobZS reveals that RobZS tends to select on average a much smaller value for the tuning parameter α than ZS, which comes closer to a ridge penalty, and thus contains more variables in the model. Consequently, FP is generally higher for RobZS compared to ZS, but FN is considerably lower, even in the uncontaminated case. As expected, the prediction error is much smaller for the robust method in a contaminated scenario.

4 An application to human gut microbiome data

We applied the proposed RobZS model to a cross-sectional study of the association between diet and gut microbiome composition (Wu et al., 2011). In this study, fecal samples from 98 healthy individuals were collected, and the microbiome dataset was produced by high-throughput sequencing of 16S rRNA, yielding 6674 OTUs, i.e. normalized counts of clustered sequences representing bacterial types. We aim to predict caffeine intake, the continuous outcome of interest, based on the OTU abundances (Jaquet et al., 2009; Xiao et al., 2018). The microbiome dataset had previously been preprocessed by Xiao et al. (2018) by removing rare OTUs with prevalence below 10%. Due to the high proportion of zero counts, we further retained only OTUs that appeared in at least 25 samples, resulting in a matrix of dimension n × p = 98 × 240. Additionally, we applied a quantile transformation to the caffeine intake to fulfill the underlying assumption of normality, as done in Xiao et al. (2018). Zero counts were replaced by the maximum rounding error of 0.5 to allow for the logarithmic transformation, which is common practice when analyzing microbiome data (Aitchison, 1986, Section 11.5). Note that more sophisticated zero-replacement methods exist in CODA analysis (Lubbe et al., 2021), but since this is not the focus of this paper, and because the proportion of zeros is quite high (49%), we stick to this simple replacement strategy.
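A minimal R sketch of these preprocessing steps is given below, assuming a count matrix otu (samples in rows) and a response vector caffeine as placeholder objects; the normal-score version of the quantile transformation shown here is one common implementation and not necessarily the exact one used by Xiao et al. (2018).

## Sketch of the preprocessing steps (otu and caffeine are placeholders).
keep <- colSums(otu > 0) >= 25               # retain OTUs present in at least 25 samples
otu  <- otu[, keep]                          # 98 x 240 count matrix
otu[otu == 0] <- 0.5                         # replace zero counts by the rounding error 0.5
X <- log(sweep(otu, 1, rowSums(otu), "/"))   # compositions, then log-transformed

## normal-score (quantile) transformation of the response
y <- qnorm(rank(caffeine) / (length(caffeine) + 1))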

For a fair investigation of the prediction performance of the compared sparse estimators, a 5-fold CV procedure was repeated 50 times, resulting in 250 fitted models for each sparse regression method; repeating the CV is a common strategy in machine learning to reduce the variability of the estimated mean model performance. Within each training set, parameter selection follows the scheme described in the simulation section. Prediction error and trimmed prediction error were used to assess the prediction accuracy of the different methods. Note that instead of a trimmed prediction error, one could also use other robust error measures if the outlier proportion is unknown, such as the robust τ scale estimator of Maronna and Zamar (2002).
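Schematically, the repeated CV evaluation can be written as in the following R sketch, where fit_method() is a placeholder for any of the compared estimators and trimming the 10% largest squared residuals gives the trimmed prediction error; this is an illustrative sketch, not code from the RobZS package.

## Sketch of repeated 5-fold CV with plain and 10% trimmed prediction errors.
trimmed_pe <- function(res, trim = 0.1) {
  r2 <- sort(res^2)
  mean(r2[seq_len(floor((1 - trim) * length(r2)))])  # drop the largest 10% squared residuals
}

pe <- replicate(50, {                                # 50 CV replications
  folds <- sample(rep(1:5, length.out = nrow(X)))    # random 5-fold partition
  sapply(1:5, function(k) {
    fit <- fit_method(X[folds != k, ], y[folds != k])        # placeholder fitting call
    res <- y[folds == k] - predict(fit, X[folds == k, ])     # residuals on the left-out fold
    c(PE = mean(res^2), tPE = trimmed_pe(res))
  })
})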

Figure 3 shows the boxplots of the CV PEs (left) and the 10% trimmed CV PEs (right) over all replications for the estimators Lasso, ZeroSum, RobL, RobZS and SLTS. The estimators RobZS and SLTS show somewhat smaller prediction errors, and SLTS yields the smallest 10% trimmed PEs, at the cost of a larger variability. This suggests that outliers indeed have an effect on the model estimation.

Fig. 3.

Analysis of gut microbiome data. Boxplots of CV PEs (left) and 10% trimmed CV PEs (right) over all replications for Lasso, ZeroSum, RobL, RobZS and SLTS

In order to investigate the impact of potential outliers in more detail, we apply an outlier analysis to the scaled residuals. For each model fit within the CV scheme, the scale of the residuals on the CV training data can be estimated. This is done with the classical standard deviation, but for the robust fits we only include residuals from observations with weight 1 in the reweighting step; see Equation (9). Thus, outliers according to this weighting scheme do not affect the estimation of the residual scale. The residuals from the left-out folds are then divided by this scale estimate, and the CV PEs now include only the observations whose scaled residuals lie within the interval [−2.5, 2.5]. The results are shown in the boxplots of the left panel of Figure 4, and in Figure 5. Figure 5 shows, for each model and over all CV replications, the mean of the scaled residuals for each observation. The residual scale was estimated from the model fit, and the scaled residuals are computed from the CV predictions. Since there are 50 CV replications, we can show the averages over 50 scaled residuals for each observation and for each estimator. The sorting of the observations on the horizontal axis is according to the RobZS mean. With the cutoff values ±2.5, shown as dashed lines, we see that SLTS identifies a huge number of outliers compared to the other estimators: more than 74% of the predicted values are flagged as anomalous, and their range is extremely large, suggesting that this model is inadequate for the data at hand. Consequently, the smaller CV PE without outliers for SLTS in the boxplot of Figure 4 (left) is not comparable to the others, as it is based on only few observations. RobZS, in contrast, identifies a more plausible number of outliers in the predictions, namely less than 30%.
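The residual-scaling step of this outlier screening can be summarized by the following R sketch, in which all object names are placeholders: res_train and w_train denote training residuals and reweighting weights of one CV fit, and res_test the residuals of the corresponding left-out fold.

## Sketch of the residual scaling and outlier screening for one CV fit.
sigma_hat  <- sd(res_train[w_train == 1])   # scale from observations with weight 1 only
res_scaled <- res_test / sigma_hat          # scaled residuals of the left-out fold
keep       <- abs(res_scaled) <= 2.5        # retain observations within [-2.5, 2.5]
pe_clean   <- mean(res_test[keep]^2)        # CV PE without the flagged outliers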

Fig. 4.

Analysis of gut microbiome data. Left panel: Boxplot of CV PEs over all replications by Lasso, ZeroSum, RobL and RobZS. Only the observations whose corresponding scaled residuals are within the interval [−2.5, 2.5] were considered. Right panel: Fitted versus measured values of the transformed caffeine intake. The green points correspond to observations detected as outliers by the RobZS estimator

Fig. 5.

Analysis of gut microbiome data. Mean of the scaled CV prediction residuals for each observation. The sorting of the observations on the x-axis is according to the mean for RobZS

We can see that the RobZS estimator achieves the best performance, outperforming the other methods by a large margin. This is due to the robustness and precision of the RobZS estimator, which allow for reliable outlier diagnostics for the predictions. This can also be seen in the right panel of Figure 4, where RobZS was applied to the complete dataset. The plot shows the fitted versus the measured response, and the color corresponds to the weights from Equation (9). The green points mark observations detected as outliers by the RobZS estimator, namely observations with binary weight wi = 0. Based on Figure 4 (right) we can also investigate which observations are potential (prediction) outliers.

The 250 models for each method, resulting from the described CV procedure, can be further analyzed with respect to variable selection. Figure 6 shows on the vertical axis the proportion of models containing at least the number of zeros in the regression parameter estimates indicated on the horizontal axis (241 variables in total). A substantial proportion of models leads to a fully sparse solution; for RobL we obtain full or almost full sparsity in about 60% of the models. In contrast, SLTS yields much less sparsity, and its models are also very similar in terms of sparsity. RobZS seems to be best tunable: it leads to high sparsity, while its proportion of fully sparse models is the lowest.
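The sparsity curve of Figure 6 can be computed, for one method, as in the following R sketch, where coef_list is a placeholder list containing the 250 estimated coefficient vectors of that method.

## Sketch of the sparsity curve: proportion of models with at least k zero coefficients.
n_zero  <- sapply(coef_list, function(b) sum(b == 0))
ks      <- 0:length(coef_list[[1]])
prop_ge <- sapply(ks, function(k) mean(n_zero >= k))
plot(ks, prop_ge, type = "s",
     xlab = "number of zero coefficients", ylab = "proportion of models")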

Fig. 6.

Analysis of gut microbiome data. Proportion of models (out of all 250 = 50 × 5) containing at least the number of zeros shown on the horizontal axis, over all CV replications, for Lasso, ZeroSum, RobL, RobZS and SLTS

Further investigations of the application results are reported in Supplementary Section S3 of the Supplementary Material.

5 Conclusions

We have proposed the robust ZeroSum (RobZS) regression estimator as a trimmed version of the ZeroSum (ZS) estimator for high-dimensional settings with compositional covariates. This model can be applied in microbiome analysis to identify bacterial taxa associated with a continuous response. As in Lasso or elastic-net regression, the estimated regression coefficient vector is typically sparse. Additionally, however, the non-zero coefficients sum up to zero, a constraint that is appropriate for linear log-contrast models as used in the context of CODA analysis. In other words, the estimator performs variable selection among compositional explanatory variables in an appropriate way and allows for an interpretation of the selected compositional parts.

The estimation procedure of the RobZS estimator is based on an analogue of the fast-LTS algorithm from robust regression. For the computation, a robust elastic-net regression procedure has been adapted and implemented. The simulation studies reveal that the RobZS estimator performs similarly to the non-robust ZS estimator if there are no outliers, but in case of contamination (vertical outliers, leverage points) the robust version leads to a clear advantage in terms of prediction error, precision of the estimated regression coefficients and ability to correctly identify the relevant variables (partly at the cost of a slightly increased false positive rate). Also in comparison with other robust estimators such as sparse LTS (Alfons et al., 2013) or elastic-net LTS (Kurnaz et al., 2018), which do not incorporate the compositional aspect of the data, RobZS is superior according to the evaluation measures in almost all settings. Simulations with zeros in the data, with varying sparsity and with varying outlier proportions have further underlined the excellent performance of RobZS. The application to microbiome data has demonstrated that RobZS is capable of balancing the sparsity of the solution with good prediction accuracy. A further benefit is that outliers in the training data can be identified, and outlyingness can also be indicated for new data, i.e. values of the explanatory variables or the response which do not match the training data.

In future work, this model will be extended to the generalized linear model framework for high-dimensional compositional covariates.

Acknowledgements

The authors wish to thank the Editor, the Associate Editor and the reviewers for their valuable comments and suggestions. They gratefully acknowledge the DEMS Data Science Lab for supporting this work by providing computational resources.

Funding

This research was supported by local funds from University of Milano-Bicocca [FAR 2018 to G.S.M.].

Conflict of Interest: none declared.

References

Aitchison J. (1986) The Statistical Analysis of Compositional Data. Chapman & Hall, London.

Aitchison J., Bacon-Shone J. (1984) Log contrast models for experiments with mixtures. Biometrika, 71, 323–330.

Aitchison J., Shen S.M. (1980) Logistic-normal distributions: some properties and uses. Biometrika, 67, 261–272.

Alfons A. et al. (2013) Sparse least trimmed squares regression for analyzing high-dimensional large data sets. Ann. Appl. Stat., 7, 226–248.

Altenbuchinger M. et al. (2017) Reference point insensitive molecular data analysis. Bioinformatics, 33, 219–226.

Bates S., Tibshirani R. (2019) Log-ratio lasso: scalable, sparse estimation for log-ratio models. Biometrics, 75, 613–624.

Filzmoser P. et al. (2018) Applied Compositional Data Analysis. With Worked Examples in R. Springer Series in Statistics, Springer, Cham, Switzerland.

Freue G.V.C. et al. (2019) Robust elastic net estimators for variable selection and identification of proteomic biomarkers. Ann. Appl. Stat., 13, 2065–2090.

Friedman J. et al. (2007) Pathwise coordinate optimization. Ann. Appl. Stat., 1, 302–332.

Friedman J. et al. (2010) Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw., 33, 1–22.

Gloor G.B. et al. (2016) It’s all relative: analyzing microbiome data as compositions. Ann. Epidemiol., 26, 322–329.

Hastie T. et al. (2001) The Elements of Statistical Learning. Springer Series in Statistics, Springer New York Inc., Berlin.

Huber P., Ronchetti E. (2009) Robust Statistics, 2nd edn. Wiley, New York.

Jaquet M. et al. (2009) Impact of coffee consumption on the gut microbiota: a human volunteer study. Int. J. Food Microbiol., 130, 117–121.

Kurnaz F.S. et al. (2018) Robust and sparse estimation methods for high-dimensional linear and logistic regression. Chemometr. Intell. Lab., 172, 211–222.

Li H. (2015) Microbiome, metagenomics, and high-dimensional compositional data analysis. Annu. Rev. Stat. Appl., 2, 73–94.

Lin W. et al. (2014) Variable selection in regression with compositional covariates. Biometrika, 101, 785–797.

Lubbe S. et al. (2021) Comparison of zero replacement strategies for compositional data with large numbers of zeros. Chemometr. Intell. Lab., 210, 104248.

Maronna R., Zamar R. (2002) Robust estimates of location and dispersion for high-dimensional datasets. Technometrics, 44, 307–317.

Maronna R. et al. (2006) Robust Statistics. John Wiley & Sons, Ltd., Hoboken, NJ.

Maronna R.A. (2011) Robust ridge regression for high-dimensional data. Technometrics, 53, 44–53.

Meinshausen N. (2007) Relaxed lasso. Comput. Stat. Data Anal., 52, 374–393.

Quinn T.P. et al. (2018) Understanding sequencing data as compositions: an outlook and review. Bioinformatics, 34, 2870–2878.

Rousseeuw P.J. (1984) Least median of squares regression. J. Am. Stat. Assoc., 79, 871–880.

Rousseeuw P.J., Van Driessen K. (2006) Computing LTS regression for large data sets. Data Min. Knowl. Disc., 12, 29–45.

Shi P. et al. (2016) Regression analysis for microbiome compositional data. Ann. Appl. Stat., 10, 1019–1040.

Smucler E., Yohai V.J. (2017) Robust and sparse estimators for linear regression models. Comput. Stat. Data Anal., 111, 116–130.

Tibshirani R. (1994) Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B Stat. Methodol., 58, 267–288.

Tibshirani R. (2011) Regression shrinkage and selection via the lasso: a retrospective. J. R. Stat. Soc. Ser. B Stat. Methodol., 73, 273–282.

Wu G.D. et al. (2011) Linking long-term dietary patterns with gut microbial enterotypes. Science, 334, 105–108.

Xiao J. et al. (2018) A phylogeny-regularized sparse regression model for predictive modeling of microbial community data. Front. Microbiol., 9, 3112.

Zou H., Hastie T. (2005) Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B Stat. Methodol., 67, 301–320.
