TY - JOUR
T1 - Global-scale gap filling of satellite soil moisture products
T2 - Methods and validation
AU - Zhang, Chunlin
AU - Zeng, Jiangyuan
AU - Shi, Pengfei
AU - Ma, Hongliang
AU - Letu, Husi
AU - Zhang, Xiang
AU - Wang, Panshan
AU - Bi, Haiyun
AU - Rong, Jiaming
N1 - Publisher Copyright: © 2025 Elsevier B.V.
PY - 2025
Y1 - 2025
N2 - The utility of satellite soil moisture products is often limited by their missing values, and thus it is crucial to develop gap-filling methods to obtain soil moisture datasets with high precision and full spatiotemporal coverage. Previous studies often used a single gap-filling method in specific regions without analyzing the factors affecting the gap-filling accuracy. To narrow this research gap, this study first compared the correlation of SMAP soil moisture products with five spatially seamless model-based soil moisture datasets globally. Then, based on ERA5 data from 2016 to 2019 (the optimal dataset in that comparison), the performance of four machine learning methods in filling the SMAP missing values was compared. The best-performing method, random forest (RF), was compared with five traditional bias-correction methods. Subsequently, twelve auxiliary datasets were incorporated into the RF to improve the accuracy of the gap-filled SMAP data, which were validated against ground measurements from 1071 sites worldwide. Finally, the environmental factors affecting the filling accuracy of SMAP data were analyzed on a global scale. The results indicate: 1) RF generally performs the best among the four machine learning approaches. When only the ERA5 dataset is used as model input, RF achieves higher accuracy than the five bias-correction methods during the training phase, but its skill degrades noticeably in the validation phase. The performance of RF improves significantly after adding auxiliary data; 2) against globally distributed in situ data, the gap-filled products show improved skill over the original SMAP data, with a smaller ubRMSE of 0.049 m³ m⁻³ (vs. 0.060 m³ m⁻³), demonstrating that the RF method with auxiliary data can effectively fill the missing values of SMAP data; 3) the gap-filling accuracy is mainly affected by vegetation cover, soil moisture conditions, and land cover heterogeneity. Specifically, the filling accuracy is lower under denser vegetation cover, wetter soil, and greater land cover heterogeneity.
AB - The utility of satellite soil moisture products is often limited by their missing values, and thus it is crucial to develop gap-filling methods to obtain soil moisture datasets with high precision and full spatiotemporal coverage. Previous studies often used a single gap-filling method in specific regions without analyzing the factors affecting the gap-filling accuracy. To narrow this research gap, this study first compared the correlation of SMAP soil moisture products with five spatially seamless model-based soil moisture datasets globally. Then, based on ERA5 data from 2016 to 2019 (the optimal dataset in that comparison), the performance of four machine learning methods in filling the SMAP missing values was compared. The best-performing method, random forest (RF), was compared with five traditional bias-correction methods. Subsequently, twelve auxiliary datasets were incorporated into the RF to improve the accuracy of the gap-filled SMAP data, which were validated against ground measurements from 1071 sites worldwide. Finally, the environmental factors affecting the filling accuracy of SMAP data were analyzed on a global scale. The results indicate: 1) RF generally performs the best among the four machine learning approaches. When only the ERA5 dataset is used as model input, RF achieves higher accuracy than the five bias-correction methods during the training phase, but its skill degrades noticeably in the validation phase. The performance of RF improves significantly after adding auxiliary data; 2) against globally distributed in situ data, the gap-filled products show improved skill over the original SMAP data, with a smaller ubRMSE of 0.049 m³ m⁻³ (vs. 0.060 m³ m⁻³), demonstrating that the RF method with auxiliary data can effectively fill the missing values of SMAP data; 3) the gap-filling accuracy is mainly affected by vegetation cover, soil moisture conditions, and land cover heterogeneity. Specifically, the filling accuracy is lower under denser vegetation cover, wetter soil, and greater land cover heterogeneity.
KW - Auxiliary data
KW - Gap filling
KW - Global scale
KW - SMAP
KW - Soil moisture
U2 - 10.1016/j.jhydrol.2025.132762
DO - 10.1016/j.jhydrol.2025.132762
M3 - Journal article
AN - SCOPUS:85216275705
SN - 0022-1694
VL - 653
JO - Journal of Hydrology
JF - Journal of Hydrology
M1 - 132762
ER -