Estimating the location of induced seismic sources using seismic station records and artificial intelligence

Parsa, Hooman; Radan, Mohammad Yaser

doi:10.30499/ijg.2025.524794.1699

Article type: Research article

Authors

Hooman Parsa ¹

Mohammad Yaser Radan ²

¹ M.Sc. graduate, Department of Civil Engineering, Faculty of Engineering, K. N. Toosi University of Technology, Tehran, Iran

² Assistant Professor, Passive Defense University Complex, Malek Ashtar University of Technology, Tehran, Iran

Abstract

Induced seismicity caused by human activities such as the extraction and injection of subsurface fluids can jeopardize the integrity of critical infrastructure. This study estimates the regional location of the seismic source epicenter within a multistage framework, without requiring comprehensive geological data or a complete seismic network. First, the seismic signals were pre-processed using the ratio of the short-term average to the long-term average of the signal to correct abrupt baseline jumps. Next, the cross-correlation of the waveforms was computed at four time lags (0.5, 0.1, 0.05, and 0.01 s), and recursive feature elimination was used to extract the optimal feature subset. The resulting features were fed to an ensemble classifier based on probability averaging that combines a support vector machine with the extreme gradient boosting algorithm. To examine the effect of class imbalance, the model was trained both without and with synthetic oversampling. Without oversampling, reducing the time step to 0.1 s raised the classification accuracy for the epicentral region from 0.73 to 0.90 and gave more stable results. At finer steps, the model's sensitivity to noise fluctuations caused accuracy to saturate and the standard deviation to rise slightly; the 0.1 s step was therefore identified as a favorable balance. With oversampling, the stability of the estimates improved overall and the standard deviation fell considerably, but a slight drop in accuracy was observed at steps larger than 0.01 s, which is attributed to the quality of the synthetic samples. The best performance was obtained by combining the 0.01 s step with oversampling, which improved the model's classification accuracy by 5.7 percent. The results show that the proposed framework maintains its accuracy and stability and is suitable for operational deployment; under computational constraints, however, the 0.1 s step without oversampling still offers the best trade-off between cost and classification accuracy.
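The pre-processing and feature steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the sampling rate, window lengths, synthetic trace, and the function names `sta_lta` and `xcorr_at_lag` are all assumptions made for illustration, and the baseline-correction step itself is not shown because the abstract does not specify it.

```python
import numpy as np

def sta_lta(x, n_sta, n_lta):
    """Classic STA/LTA on signal energy, with trailing windows ending at each sample.

    Assumes n_sta <= n_lta <= len(x). Samples before the first full
    long-term window are left at 0.
    """
    energy = np.asarray(x, dtype=float) ** 2
    csum = np.concatenate(([0.0], np.cumsum(energy)))
    ratio = np.zeros(len(energy))
    end = np.arange(n_lta, len(energy) + 1)      # exclusive window ends
    sta = (csum[end] - csum[end - n_sta]) / n_sta
    lta = (csum[end] - csum[end - n_lta]) / n_lta
    ratio[end - 1] = sta / np.maximum(lta, 1e-12)
    return ratio

def xcorr_at_lag(a, b, lag):
    """Normalized correlation between a[t] and b[t + lag], lag >= 0 in samples."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if lag > 0:
        a, b = a[:-lag], b[lag:]
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

# Hypothetical data: a 100 Hz trace with a burst, and a copy delayed by 0.1 s.
fs = 100.0                                        # assumed sampling rate (Hz)
rng = np.random.default_rng(0)
a = rng.normal(0.0, 0.01, 1000)
a[600:650] += rng.normal(0.0, 1.0, 50)
b = np.roll(a, 10)                                # second "station", 10-sample delay

ratio = sta_lta(a, n_sta=20, n_lta=200)           # peaks shortly after the onset
features = {lag_s: xcorr_at_lag(a, b, int(round(lag_s * fs)))
            for lag_s in (0.5, 0.1, 0.05, 0.01)}  # the four lags from the abstract
```

In this toy setup the correlation is largest at the 0.1 s lag because that is the delay built into the second trace; with real station pairs each lag would contribute one candidate feature per pair before feature elimination.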

Keywords

Seismic source location estimation, cross-correlation, gradient boosting algorithm, support vector machine

Subjects

Seismology

Article title (English)

Estimating the location of induced seismic sources using seismic station records and artificial intelligence

Authors (English)

Hooman Parsa ¹

Mohammad Yaser Radan ²

¹ M.Sc. in Civil Engineering, Faculty of Engineering, K. N. Toosi University of Technology, Tehran, Iran

² Assistant Professor, Faculty of Passive Defense, Malek Ashtar University of Technology, Tehran, Iran

Abstract (English)

Induced seismic events triggered by human activities such as subsurface fluid extraction and injection can jeopardize the integrity of critical infrastructure. The multistage framework proposed here obviates the need for exhaustive geological models and dense seismic arrays, yet accurately and reliably estimates the regional epicenter location. To derive region-based labels for the supervised classifiers, K-means clustering was first applied to the latitude–longitude coordinates of all recorded events; the resulting cluster assignments were adopted as class labels, providing an objective, data-driven regional segmentation for subsequent training.
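As a rough illustration of how cluster assignments can serve as region labels, the sketch below runs a plain Lloyd-style k-means on latitude and longitude pairs. The coordinates, the number of clusters, and the function name `kmeans_labels` are hypothetical, and treating degrees as Euclidean coordinates is a simplifying assumption that is only reasonable over a small area.

```python
import numpy as np

def kmeans_labels(coords, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means on (lat, lon) pairs; returns labels and centroids."""
    rng = np.random.default_rng(seed)
    coords = np.asarray(coords, dtype=float)
    # Initialize with k distinct events chosen at random.
    centroids = coords[rng.choice(len(coords), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every event to every centroid.
        d2 = ((coords[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster is empty.
        new_centroids = np.array([
            coords[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical epicenters forming two well-separated groups.
events = np.array([[35.70, 51.40], [35.72, 51.38], [35.69, 51.42],
                   [36.30, 52.10], [36.32, 52.12], [36.28, 52.08]])
labels, centroids = kmeans_labels(events, k=2)
# `labels` can now be used as class targets for a supervised classifier.
```

The cluster indices themselves are arbitrary; only the grouping matters, so any downstream evaluation should compare region membership rather than the raw label values.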
   In the initial processing stage, three-component seismic recordings were pre-processed by applying the short-term average to long-term average ratio (STA/LTA) to identify and correct abrupt baseline offsets. The cleaned records were then paired to form cross-correlation matrices at four lags (0.5, 0.1, 0.05 and 0.01 s), capturing relative information across multiple temporal scales. Recursive feature elimination with cross-validation (RFECV) extracted the most informative subset of correlation coefficients, substantially reducing dimensionality while preserving discriminative power. These feature vectors drove a probabilistic-averaging (soft-voting) ensemble that couples a support vector machine (SVM) with an extreme gradient boosting (XGBoost) classifier, combining the margin-maximizing strength of SVM with the nonlinear learning capacity of boosted decision trees.
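Soft voting itself is a small operation: the class-probability matrices of the base models are averaged and the class with the highest mean probability is chosen. The sketch below shows only that averaging step; the probability values are invented stand-ins for the outputs of an SVM and an XGBoost model, and `soft_vote` is a hypothetical helper name.

```python
import numpy as np

def soft_vote(*proba_matrices):
    """Average per-class probabilities across models and pick the top class."""
    avg = np.mean(np.stack(proba_matrices), axis=0)
    return avg.argmax(axis=1), avg

# Invented probabilities for 3 events over 3 regions (each row sums to 1).
p_svm = np.array([[0.60, 0.30, 0.10],
                  [0.20, 0.50, 0.30],
                  [0.40, 0.35, 0.25]])
p_xgb = np.array([[0.50, 0.40, 0.10],
                  [0.10, 0.30, 0.60],
                  [0.20, 0.60, 0.20]])

labels, avg = soft_vote(p_svm, p_xgb)
```

For the second and third events the two models disagree, and the averaged probabilities decide in favor of whichever model is more confident, which is the behavior that distinguishes soft voting from a majority vote over hard labels.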
   Model development was conducted twice to quantify the influence of class imbalance: first on the raw, imbalanced data and then on data balanced with the Synthetic Minority Over-sampling Technique (SMOTE). Without SMOTE, decreasing the correlation-window step from 0.5 s to 0.1 s improved classification accuracy for epicentral region assignment from 0.73 to 0.90 while markedly shrinking the standard deviation of epicentral errors, indicating greater solution stability. Moving to still finer steps (0.05 s and 0.01 s) made the model increasingly sensitive to high-frequency noise, saturating accuracy gains and slightly inflating variance; the 0.1 s step therefore emerged as an optimal trade-off between resolution and robustness.
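The core idea of SMOTE is to create synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbors. The following is a simplified NumPy sketch of that idea, not the reference algorithm or the configuration used in the study; the function name `smote_like`, the neighbor count, and the toy feature vectors are assumptions.

```python
import numpy as np

def smote_like(X_min, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between minority neighbors.

    Requires at least two minority samples.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    k = min(k, n - 1)
    # Pairwise squared distances; exclude each point from its own neighbor list.
    d2 = ((X_min[:, None, :] - X_min[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)
    neighbours = np.argsort(d2, axis=1)[:, :k]
    # Pick a base sample, one of its k neighbors, and a random position between them.
    base = rng.integers(0, n, size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[picked] - X_min[base])

# Toy minority-class feature vectors (hypothetical).
X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = smote_like(X_min, n_new=6, k=2)
```

Because every synthetic point lies on a segment between two real minority samples, the method cannot add information outside the observed range, which is one reason synthetic samples may be less representative than real ones.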
   With SMOTE, overall stability improved further and error dispersion contracted, yet a modest drop in accuracy appeared at steps coarser than 0.01 s, attributable to the limited representativeness of some synthetic samples. The best performance arose from pairing SMOTE with the 0.01 s step, achieving a classification accuracy of 0.93 in epicentral region assignment, an absolute gain of 5.7% over the non-SMOTE result.
   These findings demonstrate that the proposed workflow can deliver accurate, repeatable epicentral estimates in data-limited environments, supporting real-time decision-making without the need for comprehensive subsurface models. Furthermore, where computational resources are constrained, the 0.1 s configuration without SMOTE remains a well-balanced option that combines high classification accuracy with modest processing cost.