Quantitative Geomorphological Research


Evaluating the accuracy of machine learning algorithms in landslide hazard zoning of the Samian watershed

Article type: Research article

Authors
1 Professor of Geomorphology, Department of Physical Geography, Faculty of Social Sciences, University of Mohaghegh Ardabili, Iran
2 Ph.D. Student of Climatology, Department of Physical Geography, Faculty of Social Sciences, University of Mohaghegh Ardabili, Ardabil, Iran
10.22034/gmpj.2025.551301.1583
Abstract
Landslide hazard zoning is a key tool in risk management and sustainable natural resource planning. This study evaluated the accuracy of machine learning algorithms for landslide hazard zoning in the Samian watershed, working in the Google Earth Engine environment. First, a set of variables affecting slope stability was compiled and fuzzified in a GIS environment: topographic factors (elevation, slope, aspect, and curvature), geology (lithology), distance from rivers, roads, and faults, and climatic and environmental parameters (rainfall, soil moisture, the NDVI vegetation index, and land use). To train and evaluate the models, 110 sample points covering landslide and non-landslide areas were used, with 70 percent allocated to training and 30 percent to testing. Three algorithms, random forest, support vector machine, and k-nearest neighbors, were used to produce landslide probability maps. The resulting maps were compared with observed data, and the area under the ROC curve (AUC) was calculated for each model. The results showed that slope, with a contribution of 16.10 percent, is the most important variable in landslide occurrence. Spatially, the central areas of the basin trending east showed the lowest landslide hazard and the peripheral areas the highest. The concentration of hazard in these peripheral areas is mainly due to high elevation and steep slopes, high soil moisture, abundant rainfall, the overlap of fault lines with roads and rivers, and the spread of agricultural and barren land on the slopes of Sabalan. According to the zoning results, 53, 40, and 45 percent of the total area fell into the high and very high hazard classes for the SVM, RF, and KNN models, respectively. The AUC values for these models were 0.906, 0.864, and 0.848, respectively, indicating the higher accuracy of the SVM model in separating landslide-prone from stable areas.
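The 70/30 division of the 110 sample points can be illustrated with a minimal sketch. The abstract does not say how the points were partitioned, so the simple shuffled split, the fixed seed, the `split_points` helper, and the point IDs below are all assumptions made for illustration; only the counts (110 points, 55 per class, 70 percent for training) come from the study.

```python
import random


def split_points(points, train_percent=70, seed=0):
    """Shuffle labelled points and split them into training and test subsets.

    Hypothetical helper: the study does not describe its splitting procedure.
    """
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding in the split size.
    n_train = len(shuffled) * train_percent // 100
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder inventory: 55 landslide (label 1) and 55 non-landslide (label 0) point IDs.
points = [(i, 1) for i in range(55)] + [(i, 0) for i in range(55, 110)]

train, test = split_points(points)
print(len(train), len(test))
```

With 110 points this yields 77 training and 33 test points. A stratified split that keeps the class balance equal in both subsets would be a reasonable alternative.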
Keywords

Article title (English)

Evaluating the accuracy of machine learning algorithms in landslide hazard zoning in the Samian watershed

Authors (English)

Sayyad Asghari Saraskanroud 1
Mahdi Frotan 2
1 Professor of Geomorphology, Department of Physical Geography, Faculty of Social Sciences, University of Mohaghegh Ardabili, Ardabil, Iran
2 Ph.D. Student of Climatology, Department of Physical Geography, Faculty of Social Sciences, University of Mohaghegh Ardabili, Ardabil, Iran
Abstract (English)

Introduction

Landslides are major natural hazards in mountainous areas. By abruptly reshaping the land surface, they cause extensive human, financial, and environmental losses. They destroy vegetation, agricultural land, and infrastructure, and they intensify erosion, which leads to further damage. Between 1995 and 2014, more than 3,876 landslides occurred worldwide, causing thousands of injuries and more than 163,000 deaths. Identifying landslide-prone areas and zoning landslide susceptibility are effective ways to reduce damage and manage risk. Statistical methods and machine learning algorithms such as random forest and support vector machine can predict landslide occurrence with high accuracy. This study zones landslide hazard in the Samian watershed using machine learning algorithms and compares their performance, in order to provide an accurate and reliable model for planning and for reducing potential damage.



Methodology

This study mapped landslide hazard in the Samian watershed using machine learning algorithms in the Google Earth Engine environment. First, topographic data (elevation, slope, aspect, and curvature), geology (rock type), distance from rivers, roads, and faults, and climatic and environmental variables (rainfall, soil moisture, NDVI, and land use) were collected and processed in a GIS environment. Field data comprised 55 landslide points and 55 non-landslide points, which were used to train and test the models. The layers were standardized by fuzzification so that the relative influence of each factor on slope instability is expressed numerically. Three machine learning algorithms, random forest (RF), support vector machine (SVM), and k-nearest neighbors (KNN), were implemented to generate landslide probability maps. The model outputs were then converted into hazard zoning maps with five classes, from very low to very high. Model accuracy was evaluated using the AUC and by comparison with observed landslide data, to determine the most accurate algorithm for the Samian basin.
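The AUC used for model evaluation can be read as the probability that a randomly chosen landslide point receives a higher predicted probability than a randomly chosen non-landslide point. The sketch below computes it directly from that definition. The `roc_auc` function name and the labels and scores are made up for illustration and are not the study's data or code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    sample has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Illustrative test labels (1 = landslide, 0 = non-landslide) and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
print(roc_auc(labels, scores))
```

In this toy example, 8 of the 9 landslide/non-landslide pairs are ranked correctly, so the AUC is 8/9. A value of 0.5 corresponds to random ranking and 1.0 to perfect separation.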



Results and Discussion

The distribution map of landslide and non-landslide points in the Samian watershed showed that, of the 110 field-collected points, half were actual landslides. These were concentrated mainly in the western half of the basin, especially on the slopes of the Sabalan heights and in Nir county. This spatial distribution confirms the direct role of slope and high altitude in landslide occurrence. The variable importance analysis showed that slope, with a share of 16.10%, has the greatest influence on landslide occurrence, followed by altitude and rock type, which underlines the importance of morphometric features in controlling slope stability. Spatially, the steep, high areas in the northwest of the basin (Sabalan heights), with slopes between 15 and 68 degrees, and the southeastern part of the basin (Hire) have the highest landslide potential because of the concentration of faults, high soil moisture, and abundant rainfall. In the southern part of the basin, especially in the Kuzeh Topraghi area, the north-facing slopes are more humid because they are shaded from the sun, and long frost periods with repeated freeze-thaw cycles reduce the cohesion of slope materials and increase instability. In the southwest of the basin (Nir) and along the Balikhlochay and Imamchay rivers, human activities such as road excavation and land use change have weakened the natural balance of the slopes and increased the risk of instability. In the northeast of the basin (Namin), despite dense vegetation, steep slopes and soil saturation during the rainy season have caused shallow landslides. In contrast, the central areas of the basin trending east (from Ardabil city to Abibiglou) carry the lowest risk. Their gentle slopes, lower altitude, and lower precipitation make them more stable, and they fall in the low to very low risk classes in all models.



Conclusion

A comparison of the three machine learning algorithms (RF, SVM, and KNN) showed that all three can predict landslide-prone areas, but they differ in sensitivity and accuracy. The AUC values for SVM, RF, and KNN were 0.906, 0.864, and 0.848, respectively, indicating that SVM is the most accurate at distinguishing landslide-prone from stable areas. The zoning results showed that the combined area of the high and very high risk classes for RF, SVM, and KNN was 2222.1, 1680.79, and 1890.39 square kilometers, respectively, equivalent to 53, 40, and 45 percent of the total area. These values indicate that the RF algorithm is more conservative and assigns more of the basin to the high-risk classes, while SVM gives a more balanced zoning with higher accuracy.
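The reported areas and percentages can be cross-checked with simple arithmetic: dividing each high-risk area by its share gives the basin area that share implies. This abstract does not state the basin's total area, so the sketch below back-calculates it rather than assuming a value. The `implied_total_area` helper is illustrative, and because the percentages are rounded to whole numbers, the three implied totals agree only approximately.

```python
# High + very high hazard area (km^2) and rounded share (%) as reported above.
reported = {
    "RF": (2222.1, 53),
    "SVM": (1680.79, 40),
    "KNN": (1890.39, 45),
}


def implied_total_area(area_km2, share_percent):
    """Basin area implied by a class area and its percentage of the whole."""
    return area_km2 * 100.0 / share_percent


for model, (area, share) in reported.items():
    print(model, round(implied_total_area(area, share), 1))
```

All three models imply a total basin area of roughly 4,200 square kilometers, so the reported areas and percentages are mutually consistent to within rounding.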

Keywords (English)

Landslide
Random Forest
SVM
KNN
Samian Watershed

Articles in press, accepted
Available online since 25 Aban 1404