1. Introduction
2. Database Overview
2.1 Lithological characteristics and mechanical properties
2.2 Cutting tool geometry and cutting parameters
2.3 Pick acting forces and their correlation
3. Predictive Modeling Approach
4. Results and Discussion
4.1 Multivariate linear regression analysis
4.2 Random Forest model performance
4.3 Feature importance analysis
5. Conclusions
1. Introduction
Rock cutting represents a fundamental process across mining and civil engineering applications, such as tunnel boring, roadheading, and mineral extraction. The effectiveness and efficiency of these operations depend critically on understanding and predicting pick acting forces. These forces significantly influence tool wear, energy consumption, and overall excavation efficiency. The accurate prediction of these forces necessitates a comprehensive understanding of the complex interactions between rock properties, cutting tool parameters, and operational conditions.
The evolution of methodological approaches for estimating pick acting forces spans several decades, encompassing theoretical, empirical, numerical, and machine learning (ML) methods. The theoretical foundation was established by Evans (1961) through pioneering work on coal ploughing mechanics, focusing on stress distribution and failure mechanisms. This theoretical framework was further developed by Nishimatsu (1972), who adapted Merchant’s formula from metal cutting to create a validated pick acting force formula. Evans (1984) later contributed a theoretical model for point-attack pick acting force prediction, which continues to influence contemporary research. Goktan (1990) expanded this understanding by investigating the application of Evans’ cutting theory to high-strength rocks, revealing the dominance of compressive failure over tensile failure at low rake angles. Copur (1999) also conducted an extensive theoretical and experimental analysis of rock cutting with drag bits, developing reliable performance prediction models for roadheaders. More recent theoretical advances include the models by Gao et al. (2013) and Li et al. (2018), which are based on elastic fracture mechanics for the peak acting force of conical picks and highlight the importance of tool-rock interaction and stress distribution.
Empirical investigations have significantly contributed to establishing practical relationships between measurable parameters and pick acting forces. Roxborough and Phillips (1975) conducted factorial experiments examining thrust and rolling forces in relation to cutter geometry, while Snowdon et al. (1982) analyzed tool forces and specific energy across various rock types. Altindag (2003) introduced a novel brittleness index correlating with specific energy, and Balci et al. (2004) developed empirical models for roadheader performance prediction. Subsequent research by Goktan and Yilmaz (2005), Bilgin et al. (2006, 2008), Chang et al. (2006), and Gertsch et al. (2007) further explored relationships between rock properties, cutting parameters, and performance metrics.
Additional contributions from Tiryaki and Dikmen (2006), Yilmaz et al. (2007), Gehring (2009), Jeong et al. (2016, 2020a, 2020b, 2023), Jeong and Jeon (2018), Geng et al. (2016), Wang et al. (2018), and Pan et al. (2018, 2019) have enhanced our understanding of rock cuttability, pick acting force prediction, and specific energy relationships. Recent experimental studies have significantly advanced the field, with Zeng et al. (2021) identifying optimal cutting parameters for conical pick performance, and Yang et al. (2024) validating small-scale rock cutting tests against full-scale performance while developing predictive models for pick acting forces estimation. Additionally, Zheng et al. (2025) enhanced empirical knowledge by creating a real-time monitoring system to measure cutter forces in tunnel boring machines.
The advent of numerical modeling has enabled more precise simulation of rock cutting processes. Researchers including Rojek et al. (2011), Moon and Oh (2012), and Huang et al. (2013) have employed discrete element methods to simulate tool-rock interactions. Recent advances by Zhang et al. (2020), Jeong et al. (2020b), Lu et al. (2019), Wicaksana et al. (2021), Li et al. (2021), Saksala (2023), and Xue et al. (2025) have further refined numerical approaches, incorporating various modeling techniques and considering dynamic loading conditions.
The emergence of artificial intelligence has introduced powerful machine learning (ML) tools for force estimation. Tiryaki et al. (2010) conducted comprehensive comparisons between traditional statistical methods and artificial neural networks (ANNs) for predicting mean pick acting forces. Their study employed stepwise multiple linear regression as a baseline and developed a multi-layer perceptron neural network architecture, demonstrating that ANNs achieved superior predictive capabilities when utilizing rock properties and cutting geometry parameters as input features. Geng et al. (2012) expanded the algorithmic comparison by implementing neural networks, support vector regression (SVR) with different kernel functions, and k-nearest neighbors (KNN) algorithms for cutter force prediction. Their results showed that machine learning approaches consistently outperformed traditional empirical models, with SVR using a radial basis function kernel achieving the highest accuracy. Avunduk et al. (2014) further validated the effectiveness of ANNs in predicting instantaneous cutting rates of roadheaders, utilizing a backpropagation algorithm with optimized network architecture.
Recent innovations include Hu et al.’s (2021) application of SVR for predicting pick acting forces on CCS disc cutters, achieving improved accuracy through systematic outlier removal. Grafe (2022) developed novel algorithms for rock type identification during cutting operations using high-frequency force measurements, establishing a robust classification system adaptable to varying cutting parameters. Additional advances include Fathipour-Azar (2023) developing a hybrid model combining extreme gradient boosting (XGBoost) with evolutionary optimization techniques, utilizing both genetic algorithms (GA) and particle swarm optimization (PSO) to fine-tune hyperparameters for mean pick acting forces prediction. Their approach demonstrated that evolutionary optimization significantly improved the base XGBoost model’s performance. Zhou et al. (2023) introduced an innovative approach using random forest models optimized by the salp swarm algorithm (SSA), incorporating feature importance analysis to identify key predictors among tensile strength, compressive strength, and various cutting parameters. Their SSA-optimized random forest showed superior performance compared to traditional models and basic ensemble methods.
The latest developments have focused on advanced ensemble techniques and real-time applications. Morshedlou et al. (2024) implemented sophisticated ensemble learning methods, combining multiple decision trees through stacking techniques and employing meta-learners to optimize the final predictions. Liu et al. (2024) advanced the field by developing a Bayesian optimization framework for boosting tree models, systematically exploring the hyperparameter space to achieve optimal model configurations. Their approach demonstrated superior performance over traditional machine learning models and standard optimization techniques.
Despite significant advancements in pick acting force prediction through various machine learning techniques, the field continues to face challenges, primarily due to the complex interactions between tool geometry, rock properties, and cutting conditions, which are further complicated by the variability in datasets and rock origins. This study focuses on analyzing and predicting mean normal force (FNm) and mean cutting force (FCm) using a database compiled from multiple experimental sources, encompassing soft to hard rock types from different geological locations, different cutting conditions, and diverse tool geometries and operational parameters. The research methodology employs a two-stage approach: first establishing baseline correlations through multivariate linear regression (MLR), then implementing an optimized random forest model with systematic hyperparameter tuning through randomized search cross-validation for enhanced prediction accuracy.
2. Database Overview
The dataset utilized in this study comprises 195 experimental samples compiled from multiple sources, including the works of Bilgin et al. (2006), Copur (1999), and additional contributions from Jeong et al. (2018, 2020a, 2020b, 2023). The dataset encompasses a wide range of rock types, cutting tool geometry, and operational parameters, providing a comprehensive foundation for analyzing the relationship between pick acting forces and various input parameters.
2.1 Lithological characteristics and mechanical properties
The dataset includes a diverse array of lithological units exhibiting significant variation in their mechanical properties. The rock types include chromite-bearing rocks stratified by chromium content (high, medium, and low variants), ultramafic and metamorphic samples (specifically harzburgite and serpentinite), and a broad spectrum of sedimentary and evaporitic deposits including trona, anhydrite, celestite, gypsum, sandstone, siltstone, limestone (including specific Indiana limestone samples), and dolomite. The inclusion of volcanic tuff, copper ore, and concrete extends the analysis to both natural and engineered materials commonly encountered in excavation projects.
The mechanical behavior of these materials is described by their uniaxial compressive strength (UCS) and Brazilian tensile strength (BTS) in the collected dataset. The UCS values span a considerable range, from 6 MPa in weaker formations to 174 MPa in high-strength samples. The BTS measurements vary similarly, ranging from 0.2 MPa to 11.6 MPa.
2.2 Cutting tool geometry and cutting parameters
The dataset incorporates a range of cutting tool geometrical characteristics and cutting parameters that significantly impact excavation forces. Fig. 1 illustrates the fundamental geometrical features of the cutting tool, including the tip diameter (dt) and tip angle (𝜙), along with critical operational parameters such as cutting depth (d), cutter spacing (s), attack angle (𝛾), and skew angle (𝜀). The directional components of the pick acting forces generated during the rock-tool interaction process are also depicted. The key parameters and their respective ranges within the dataset are detailed below:
∙Tip diameter (dt): The diameter of the cutting tool’s tip, which influences the contact area between the tool and the rock. The dataset includes tip diameters ranging from 7.94 mm to 22 mm.
∙Tip angle (𝜙): The angle of the cutting tool’s tip, which affects cutting efficiency and force distribution. The dataset contains tip angles of 70°, 75°, and 80°.
∙Cutting depth (d): The penetration depth per cutting cycle, which ranges from 1 mm to 15 mm.
∙Cutting spacing (s): The distance between adjacent cutting tools, influencing rock fragmentation and efficiency. The spacing values range from 5 mm to 38 mm.
∙Spacing-to-depth ratio (s/d): A crucial parameter for optimizing cutting efficiency, with values ranging from 1.5 to 7.
∙Attack angle (𝛾): The inclination of the cutting tool relative to the rock surface. The dataset includes attack angles of 35°, 45°, 50°, and 55°.
∙Skew angle (𝜀): The lateral inclination of the cutting tool, affecting force distribution. Values range from -20° to 20° in 5° increments (-20°, -15°, -10°, -5°, 0°, 5°, 10°, 15°, and 20°).
∙Cutting condition: The dataset includes both unrelieved and relieved cutting conditions. Unrelieved cutting refers to a single cutter acting on the rock, whereas relieved cutting involves multiple cutters with overlapping cutting paths, potentially reducing the overall cutting forces.

Fig. 1.
Schematic representation of cutting tool geometry and operational parameters, including tip diameter, tip angle, cutting depth, attack angle, skew angle, and the directional components of acting forces (FN and FC) (Wang et al., 2018).
2.3 Pick acting forces and their correlation
The dataset includes measured values for the normal force (FN) and cutting force (FC) under various cutting conditions, which are critical for evaluating energy consumption and tool wear. These forces are influenced by the mechanical properties of the rock, the cutting tool geometry, and the operational parameters. In the dataset utilized in this study, these forces are reported as the mean normal force (FNm) and mean cutting force (FCm). A statistical summary of the dataset, including parameter ranges and average values, is provided in Table 1.
Table 1.
Statistical summary of dataset parameters, including cutting tool geometrical characteristics, operational parameters, and measured cutting forces
Fig. 2 presents the correlation matrix of the dataset parameters, illustrating the relationships between mechanical properties, cutting tool geometry, cutting parameters, and both cutting forces. As observed, FNm exhibits a significant correlation with UCS (R = 0.83), underscoring the dependence of cutting forces on rock strength properties. Additionally, Fig. 3 illustrates the exponential relationship between FNm and UCS. The fitted exponential function has a high coefficient of determination (R2 = 0.88), indicating that UCS serves as a key determinant of FNm.
Furthermore, Fig. 4 depicts the relationship between FNm and FCm for the entire collected dataset. Two distinct trends are apparent: the first follows a linear relationship, expressed with an R2 value of 0.94. This trend is associated with a specific dataset source and particular rock types, namely limestone and dolomite, as derived from the Copur (1999) dataset. The second trend, which represents the primary dataset, exhibits an exponential and/or linear relationship, characterized by R2 = 0.71, encompassing the majority of the collected dataset. However, in this predictive study, the objective is to establish correlations based on the complete dataset rather than analyzing subsets separately.
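The correlation coefficient and exponential trend discussed above can be illustrated with a minimal sketch. The UCS and FNm values below are hypothetical and are not taken from the collected dataset, and the fit is obtained by linear least squares on ln(FNm), which is an assumption: the fitting procedure behind Fig. 3 is not stated, and a nonlinear least-squares fit would give somewhat different coefficients.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fit_exponential(x, y):
    """Fit y = a * exp(b * x) by least squares on ln(y); requires all y > 0."""
    ln_y = [math.log(v) for v in y]
    n = len(x)
    mx, mly = sum(x) / n, sum(ln_y) / n
    b = (sum((xi - mx) * (li - mly) for xi, li in zip(x, ln_y))
         / sum((xi - mx) ** 2 for xi in x))
    a = math.exp(mly - b * mx)
    return a, b

# Hypothetical UCS (MPa) and FNm (kN) values, NOT taken from the paper's dataset.
ucs = [10.0, 30.0, 60.0, 90.0, 120.0, 170.0]
fnm = [1.2, 2.1, 4.0, 7.5, 13.0, 30.0]

r = pearson_r(ucs, fnm)
a, b = fit_exponential(ucs, fnm)
print(f"Pearson R = {r:.2f}; fitted FNm ~ {a:.2f} * exp({b:.4f} * UCS)")
```

Because the fit is performed in log space, it weights relative rather than absolute errors, which tends to favor the low-force samples.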
3. Predictive Modeling Approach
This study first employed a multivariate linear regression (MLR) model. The objective of this initial analysis was to identify correlations and derive equations for FNm and FCm based on the input features outlined in the previous section.
Subsequently, the primary focus of this research was to develop an optimized Random Forest (RF) model to predict both pick acting forces across the dataset. RF, an ensemble learning method, constructs multiple decision trees using randomly selected subsamples of the data. The aggregation of predictions from these trees enhances model robustness, reducing variance and mitigating overfitting through averaging, which makes the method particularly suitable for handling noisy data.
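As an illustration of this baseline step, the following sketch fits a linear model by ordinary least squares with NumPy. The data are synthetic, the four feature columns and the coefficient values are arbitrary placeholders, and the use of `numpy.linalg.lstsq` is an assumption; the software used for the MLR analysis is not stated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the input features (e.g. UCS, BTS, d, s); NOT the paper's data.
n_samples, n_features = 195, 4
X = rng.uniform(0.0, 1.0, size=(n_samples, n_features))
true_coef = np.array([3.0, -1.0, 0.5, 2.0])  # arbitrary illustrative coefficients
y = 1.5 + X @ true_coef + rng.normal(0.0, 0.05, size=n_samples)

# Prepend a column of ones so the first fitted coefficient is the intercept.
A = np.column_stack([np.ones(n_samples), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

intercept, slopes = coef[0], coef[1:]
y_hat = A @ coef  # fitted values of the target
print("intercept:", round(float(intercept), 3))
print("slopes:", np.round(slopes, 3))
```

The fitted intercept and slopes play the role of the constants in a linear prediction equation of the form y = c0 + c1·x1 + ... + ck·xk.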
The RF model was implemented using the scikit-learn library to establish relationships between input variables and target outputs for regression tasks. Model optimization was conducted through a randomized hyperparameter search (RandomizedSearchCV) to refine key parameters, as detailed in Table 2. The optimized parameters included the number of trees (n_estimators ranging from 50 to 1000), maximum depth (max_depth ranging from 2 to 10), minimum samples required for node splitting (min_samples_split from 2 to 10), and the minimum number of samples per leaf node (min_samples_leaf from 1 to 4). These parameters were systematically tuned to balance model performance and computational efficiency.
Table 2.
Overview of hyperparameters and their optimized ranges used in the Random Forest model for predicting rock cutting forces
The selection of the RF model over other ML approaches was driven by several factors. RF has been widely recognized for its ability to handle high-dimensional data and nonlinear relationships. Compared to models such as SVR, XGBoost, and KNN, used in previous studies (Geng et al., 2012; Hu et al., 2021; Fathipour-Azar, 2023; Zhou et al., 2023), RF exhibited superior performance in preliminary tests and demonstrated robustness across different input conditions. These advantages make RF a suitable choice for predicting cutting forces in rock excavation.
The dataset was randomly partitioned, with 80% allocated for training and 20% for testing. To mitigate overfitting and ensure robust model performance, a 5-fold cross-validation strategy was employed on the training dataset during hyperparameter tuning. This approach iteratively trains the model on four subsets and validates on the fifth, cycling through all five folds, which enhances model generalizability and reduces the risk of overfitting. The negative mean-squared error was used as the scoring metric to directly optimize the regression predictive capability. This iterative process of model training and evaluation across various hyperparameter configurations facilitated the identification of the optimal parameter settings.
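The tuning workflow described above can be sketched with scikit-learn as follows. This is a minimal illustration on synthetic data, not the authors' script: the feature matrix, the target function, the random seeds, and the number of sampled configurations (`n_iter=5`) are all assumptions, while the 80/20 split, the 5-fold cross-validation, the scoring metric, and the hyperparameter ranges follow the text.

```python
import numpy as np
from scipy.stats import randint
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV, train_test_split

rng = np.random.default_rng(42)

# Synthetic stand-in for the 195-sample database; NOT the paper's data.
X = rng.uniform(0.0, 1.0, size=(195, 5))
y = 10.0 * np.exp(1.5 * X[:, 0]) + 2.0 * X[:, 1] + rng.normal(0.0, 0.5, size=195)

# Random 80/20 train/test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Search ranges quoted in the text; randint's upper bound is exclusive.
param_distributions = {
    "n_estimators": randint(50, 1001),
    "max_depth": randint(2, 11),
    "min_samples_split": randint(2, 11),
    "min_samples_leaf": randint(1, 5),
}

search = RandomizedSearchCV(
    RandomForestRegressor(random_state=42),
    param_distributions=param_distributions,
    n_iter=5,  # assumed; the number of sampled configurations is not stated
    cv=5,      # 5-fold cross-validation on the training set only
    scoring="neg_mean_squared_error",
    random_state=42,
)
search.fit(X_train, y_train)

best_rf = search.best_estimator_  # refit on the full training set by default
print("best params:", search.best_params_)
print("test R2:", round(best_rf.score(X_test, y_test), 3))
```

Keeping the test set out of the search means the reported test metrics are computed on samples that never influenced the choice of hyperparameters.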
Model performance assessment utilized two statistical metrics: the root-mean-square error (RMSE) and the coefficient of determination (R2). These metrics evaluated prediction accuracy, with RMSE quantifying average prediction deviation and R2 measuring the proportion of variance explained by the model. The mathematical formulations for these metrics are expressed as Equations (1) and (2):
\[ \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \tag{1} \]
\[ R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2} \tag{2} \]
where $n$ denotes the number of samples, $(y_i - \bar{y})$ represents subtracting the average y value ($\bar{y}$) from the individual value ($y_i$) for each data point, and $(y_i - \hat{y}_i)$ is the difference between the actual and predicted target values for each data point.
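The two metrics can be computed directly from their standard definitions, as in the short sketch below. The measured and predicted forces are hypothetical values chosen for easy hand-checking, and the function names are illustrative.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical measured vs. predicted forces in kN (illustrative values only).
measured = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

print("RMSE:", rmse(measured, predicted))
print("R2:", r_squared(measured, predicted))
```

For these values every residual is 0.5 kN in magnitude, so the RMSE is 0.5 kN; the residual sum of squares is 1 and the total sum of squares is 20, giving R2 = 0.95.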
4. Results and Discussion
4.1 Multivariate linear regression analysis
The results of the MLR analysis for predicting FNm are presented in Fig. 5a. The model exhibits a good correlation between estimated and measured FNm values, with an R2 value of 0.743. The RMSE of 5.236 kN indicates a moderate prediction error. Similarly, the MLR results for FCm are shown in Fig. 5b, where the R2 value of 0.674 suggests a slightly weaker correlation compared to FNm. The derived equations for predicting FNm and FCm using MLR are presented in Equations (3) and (4). While this approach provides a fundamental understanding of the relationships between input parameters and cutting forces, its limitations arise from the inability to capture the complex nonlinear interactions inherent in rock cutting processes.
4.2 Random Forest model performance
The optimized RF model demonstrated significantly improved predictive accuracy for both FNm and FCm. As depicted in Fig. 6a, the RF model achieved an R2 value of 0.993 for the training dataset and 0.983 for the test dataset when predicting FNm. The corresponding RMSE values of 0.918 kN (training) and 1.170 kN (testing) highlight the model’s strong predictive capabilities, with minimal overfitting as evidenced by the closely matched training and testing performances.
For FCm prediction, illustrated in Fig. 6b, the RF model attained an R2 value of 0.972 for the training set and 0.908 for the testing set. The RMSE values of 0.660 kN (training) and 1.005 kN (testing) confirm the model’s robustness in predicting cutting force, with consistent performance across datasets. These results indicate that RF significantly outperforms MLR in predicting both FNm and FCm.
Furthermore, comparative analysis reveals that FNm predictions exhibit higher accuracy than FCm predictions under the same input conditions. This observation is consistent across both MLR and RF models. The reason for this difference lies in the complexity of force interactions in rock cutting. FNm is predominantly controlled by UCS, a single material property that remains relatively stable, while FCm is influenced by a more complex interaction of multiple parameters, including cutting tool geometry and operational settings. Consequently, the variability in FCm predictions is higher, leading to slightly lower accuracy compared to FNm. This is in line with findings from Geng et al. (2012) and Hu et al. (2021), which also reported lower prediction accuracy for FCm compared to FNm across multiple ML models.
4.3 Feature importance analysis
To further interpret the RF model’s predictions, Fig. 7 presents the relative importance of input features in predicting FNm and FCm. Additionally, Table 3 provides a ranked summary of relative importance for both forces. As shown in Fig. 7a, approximately 80% of the total feature importance for FNm prediction is attributed to UCS, emphasizing its dominant role in determining FNm. Other parameters, such as BTS, cutting spacing, attack angle, and cutting depth, contribute minimally to FNm predictions. This finding largely aligns with Hu et al. (2021), which also identified UCS as the dominant parameter for FNm prediction. However, some contrasting results exist in the literature, such as Geng et al. (2012), who found cutter ring tip width and cutting depth to be more influential than UCS.
Conversely, Fig. 7b illustrates that multiple features significantly influence FCm predictions. While UCS remains the most influential parameter (accounting for nearly 40% of total importance), other features, including BTS, cutting spacing, attack angle, and cutting depth, also exhibit substantial contributions to the model’s predictive power. This aligns with findings from several studies, including Tiryaki et al. (2010), Hu et al. (2021), and Zhou et al. (2023), which similarly identified UCS as the primary predictor for FCm using various ML methods. However, there are some notable differences between the current findings and the existing literature. For instance, Fathipour-Azar (2023) identified cutting depth as the most influential parameter for FCm, whereas the current analysis ranked it fifth in importance. These discrepancies may stem from differences in dataset composition, particularly the range of cutting depths and rock types considered.
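A ranking of this kind can be obtained from a fitted scikit-learn random forest through its `feature_importances_` attribute, as sketched below. The data are synthetic, the feature labels are illustrative, and the use of scikit-learn's default impurity-based importance is an assumption, since the text does not state which importance measure underlies Fig. 7 and Table 3.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
feature_names = ["UCS", "BTS", "s", "attack_angle", "d"]  # illustrative labels

# Synthetic data in which the first column dominates the target; NOT the paper's data.
X = rng.uniform(0.0, 1.0, size=(195, len(feature_names)))
y = 20.0 * X[:, 0] + 2.0 * X[:, 4] + rng.normal(0.0, 0.5, size=195)

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances are normalized to sum to 1 across features.
importances = rf.feature_importances_
ranking = sorted(zip(feature_names, importances), key=lambda p: p[1], reverse=True)
for rank, (name, value) in enumerate(ranking, start=1):
    print(f"{rank}. {name}: {value:.3f}")
```

Impurity-based importances can be biased toward features with many distinct values, so permutation importance on held-out data is a common cross-check.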
Table 3.
Relative importance rankings and corresponding values of input parameters for (a) mean normal force (FNm) and (b) mean cutting force (FCm) predictions using the RF model
This observation aligns with the theoretical understanding of rock cutting mechanics, where normal force is primarily influenced by the compressive strength of rock while cutting force depends on a more complex interaction of parameters including cutting geometry and tool orientation. The empirically derived equations further support this finding, with UCS showing a stronger coefficient (0.342) in the FNm equation compared to the FCm equation (0.061), while parameters such as attack angle and BTS demonstrated more significant influences on FCm. These findings suggest that optimizing rock cutting parameters requires different approaches for managing normal and cutting forces, with UCS being the primary consideration for normal force control, while a more holistic consideration of multiple parameters is necessary for effective cutting force management.
5. Conclusions
This research presents an investigation into predicting rock cutting forces through an advanced machine learning technique, utilizing a comprehensive dataset collected from multiple sources and encompassing diverse rock types.
The initial MLR analysis established foundational correlations, while the subsequently developed and optimized RF model demonstrated significantly improved predictive accuracy for both mean normal force (FNm) and mean cutting force (FCm).
The optimized RF model exhibited exceptional predictive capabilities, achieving R2 values exceeding 0.97 for training datasets and maintaining high accuracy (above 0.90) in testing scenarios. The remarkably low RMSE values further highlight the model’s robustness and minimal overfitting. Moreover, the minimal disparity between training and testing performance indicates robust generalizability across different geological contexts.
Feature importance analysis revealed distinctive characteristics in predicting mean normal and cutting forces. The FNm prediction was predominantly influenced by UCS, whereas FCm prediction demonstrated a more complex interaction of multiple parameters, including BTS, cutting spacing, attack angle, and cutting depth. This observation substantiates the existing theoretical framework, which posits that cutting force emerges from an intricate interaction of rock properties, cutting geometry, and operational factors.
Several limitations and corresponding future research directions emerge from this study. While the dataset encompasses diverse rock types, it primarily relies on laboratory-scale tests that may not fully capture field conditions and their complexities. Future work should focus on extending the analysis to field-scale operations and incorporating data from various geological locations to enhance the model’s applicability across different operational environments. Additionally, the current analysis did not account for environmental factors (temperature, moisture content, rock mass discontinuities) and other operational parameters (tool wear, cutting speed variations) that could significantly influence cutting performance. These limitations could be addressed through expanded experimental programs incorporating these parameters.









