Automatic Fracture Detection in CT Scan Images of Rocks Using Modified Faster R-CNN Deep-Learning Algorithm with Rotated Bounding Box

Pham Chuyen; Li Zhuang; Sun Yeom; Hyu-Soung Shin

doi:10.7474/TUS.2021.31.5.374


Original Article

Tunnel and Underground Space. 31 October 2021. 374-384
https://doi.org/10.7474/TUS.2021.31.5.374

Automatic Fracture Detection in CT Scan Images of Rocks Using Modified Faster R-CNN Deep-Learning Algorithm with Rotated Bounding Box

Automatic Fracture Detection in Rock CT Images Using a Modified Faster R-CNN Deep-Learning Algorithm with a Rotated Bounding Box Function

Pham Chuyen¹

Li Zhuang²

Sun Yeom³

Hyu-Soung Shin⁴*

추엔 팜¹

장 리²

염 선³

신 휴성⁴*

¹Ph.D. Student, Dept. of Geo-Space Engineering, University of Science and Technology (UST)

²Senior researcher/Associate Professor, Dept. of Future & Smart Construction Research, Korea Institute of Civil Engineering and Building Technology (KICT)

³Technician, Dept. of Future & Smart Construction Research, Korea Institute of Civil Engineering and Building Technology (KICT)

⁴Senior Research Fellow/Professor, Dept. of Future & Smart Construction Research, Korea Institute of Civil Engineering and Building Technology (KICT)

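¹Ph.D. Student, Dept. of Geo-Space Engineering, University of Science and Technology (UST)

²Senior Researcher and Associate Professor, Dept. of Future & Smart Construction Research, Korea Institute of Civil Engineering and Building Technology (KICT)

³Technician, Dept. of Future & Smart Construction Research, Korea Institute of Civil Engineering and Building Technology (KICT)

⁴Senior Research Fellow and Professor, Dept. of Future & Smart Construction Research, Korea Institute of Civil Engineering and Building Technology (KICT)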

*Corresponding Author

ABSTRACT

In this study, we propose a new approach for automatic fracture detection in CT scan images of rock specimens. The approach is built on the two-stage object detection deep-learning algorithm Faster R-CNN, with the major modification of using rotated bounding boxes. The rotated bounding box is intended to play a key role in future work on overcoming several inherent difficulties of fracture segmentation related to the heterogeneity of the background that is not of interest (i.e., minerals) and the variation in fracture size and shape. Compared with the commonly used axis-aligned bounding box, the rotated bounding box adapts better to the elongated shape of a fracture and thereby minimizes the ratio of background within the box. An additional benefit is that the rotated bounding box provides information on the orientation and length of a fracture without a further segmentation and measurement step. To validate the applicability of the proposed approach, we train and test it on a number of CT image sets of fractured granite specimens with highly heterogeneous backgrounds, as well as other rocks such as sandstone and shale. The results demonstrate that the approach achieves encouraging fracture detection performance, with a mean average precision (mAP) of up to 0.89, and outperforms the conventional approach in terms of the background-to-object ratio within the bounding box.

Keywords

Fracture detection

Computed tomography

Deep learning

Faster R-CNN

Rotated bounding box

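This paper proposes a new deep-learning technique for automatically detecting fractures in CT images of rock specimens. The technique is based on Faster R-CNN, a two-stage deep-learning object detection algorithm, modified by introducing the concept of a rotatable bounding box. The rotated bounding box plays a key role in overcoming difficulties inherent to deep-learning object detection that arise from the heterogeneity of the background outside the fracture region of interest and from the size and shape of the fractures. Compared with the commonly used bounding box aligned with the horizontal image axis, the rotated bounding box fits the elongated shape of fractures very well; that is, it minimizes the proportion of non-fracture background inside the bounding box, which adversely affects detection. A further benefit is that, because the inferred bounding box rotates to follow the detected fracture, information on fracture orientation and length can be obtained directly. To verify the applicability of the proposed technique, a large number of CT images of granite specimens, which appear highly heterogeneous in the images and in which fractures were artificially induced, were used for training, and inference performance was tested. The technique was also applied to sandstone and shale specimens under the same conditions. In conclusion, the proposed technique achieved good inference performance, with a mean average precision (mAP) of about 0.89 for fracture detection, and, compared with the conventional technique, markedly reduced the proportion of background relative to fracture within the inferred bounding boxes.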

Keywords

Fracture detection

Computed tomography

Deep learning algorithm

Faster R-CNN

Rotated bounding box

MAIN

1. INTRODUCTION
2. METHODOLOGY
2.1 Rotated bounding box Faster R-CNN
2.2 Experiment
3. FRACTURE DETECTION RESULT
4. DISCUSSION
5. CONCLUSION

1. INTRODUCTION

Fractures are common structural features in rocks. Their characteristics and distribution can significantly affect the deformability and permeability of a rock mass. In many recent rock engineering applications, such as CO₂ storage, geothermal production and nuclear waste disposal, knowledge of fracture geometry is critical for understanding the initiation and development of the damage zone in response to those operations. As a non-destructive testing method, computed tomography (CT) scanning has recently emerged as an effective tool for research on geological materials (Ketcham and Carlson, 2001; Andrä et al., 2013). Many published studies demonstrate the use of CT scanning to evaluate the internal structure of rock samples, including pores, minerals and fractures (Wennberg et al., 2009; Kyle and Ketcham, 2015; Schmitt et al., 2016).

The fundamental step in obtaining geometric information from CT scan images is the segmentation of the feature of interest, which in this study is the fracture. In this step, fractures are detected and delineated either manually or automatically. The manual approach is time-consuming and prone to user bias, whereas automatic approaches still fall short of expectations. Automatic thresholding methods such as Otsu’s or Yen’s method (Yen, 1970; Otsu, 1979) are extensively used for CT scan image segmentation; they rely on identifying a single grayscale value that separates fractures from the background. However, these methods suffer severely from the complexity of the background, which contains non-fracture objects with grayscale values similar to those of fractures. To partly address this problem, ridge filters such as the Hessian or Frangi filter are commonly adopted to enhance the contrast between fracture and background (Voorn et al., 2013). The major drawback of these filters is that they amplify not only the contrast of fractures but also that of mineral boundaries in the background. Thus, after the thresholding step, additional effort is always required to remove noise caused by unintended objects.

Over the past few years, there has been a rapid rise in the application of deep learning to image segmentation. Various algorithms have been investigated for detecting and segmenting fractures and cracks in images, including U-Net and Mask Region-based Convolutional Neural Networks (Mask R-CNN) (Ronneberger et al., 2015; He et al., 2017). Compared with traditional approaches, deep learning algorithms are not only more effective but also give better results. U-Net is the most commonly used algorithm for segmentation; it is simple and yields reasonable results. However, the main challenges facing this algorithm are the complexity of the background and the vagueness of fractures, which may lead to both missed and mistaken detections. Although Mask R-CNN is considered the most effective way to obtain instance segmentation, much recent research on CT scan image segmentation indicates that Mask R-CNN works well at finding bounding boxes for objects of interest but segments them less well than U-Net. We therefore conclude that detecting and segmenting fractures in CT scan images remains a very challenging task in practice, with plenty of room for improvement.

In this study, we aim to develop a new approach to detect fractures in CT scan images automatically and accurately. The proposed approach is based on the Faster R-CNN algorithm (Ren et al., 2015), with a major modification for predicting rotated bounding boxes. Rotated bounding boxes for object detection have been introduced in the past few years, primarily for scene text detection and for the detection of ships, aircraft and other vehicles in aerial images. This type of bounding box has an advantage over the axis-aligned bounding box because it provides a better spatial constraint, such that the object is unique and fully contained in the box. The idea underlying our approach is to exploit this advantage to minimize the ratio of background within the box and consequently provide better prior spatial information to guide the segmentation process.

2. METHODOLOGY

2.1 Rotated bounding box Faster R-CNN

Faster R-CNN, a well-known object detection algorithm in the family of region-based algorithms, was first published in 2015 (Ren et al., 2015). Although several other models have since been proposed with the aim of further improving detection performance, such as You Only Look Once (YOLO) (Redmon et al., 2016) and the Single Shot MultiBox Detector (SSD) (Liu et al., 2016), Faster R-CNN remains a leading algorithm in terms of detection accuracy. Despite the complexity of its architecture, Faster R-CNN can be divided into two main components, the region proposal network (RPN) and R-CNN, corresponding to the two stages of the detection framework (Fig. 1). In addition to these two components, Faster R-CNN uses a deep CNN, known as the backbone network, to extract feature maps from the input image. Following common practice, we use ResNet-101 in combination with a Feature Pyramid Network (FPN) to obtain multi-scale feature maps (Lin et al., 2017). Loosely speaking, RPN and R-CNN perform similar functions in somewhat different ways: (1) predicting bounding boxes where a fracture is likely to exist and (2) refining the coordinates of those boxes to better match the shape of the fracture. The difference is that RPN uses the whole feature maps extracted from the input image for bounding box prediction, whereas R-CNN takes only a specific number of partial regions of the feature maps corresponding to the proposal boxes (i.e., the RPN output) and uses them for the final bounding box prediction. In the following, we discuss the modifications required within each component for rotated bounding box prediction.

https://static.apub.kr/journalsite/sites/ksrm/2021-031-05/N0120310504/images/ksrm_31_05_04_F1.jpg

Fig. 1. Rotated Faster R-CNN architecture

Before going into the details of Faster R-CNN, it is useful to define the rotated bounding box and the anchor box. As mentioned earlier, fractures are detected in the form of oriented rectangular boxes. For this representation, in addition to the four conventional parameters, i.e., the center point coordinates (xc, yc), width (w) and height (h), an angle parameter (θ) is introduced to account for box orientation. This angle is measured between the positive vertical axis and the longer side of the bounding box, so it ranges from -90° to 90°. Note that the width of the bounding box is always less than its height. In line with the rotated bounding box, anchor boxes are a set of predefined rotated bounding boxes with given scales, aspect ratios and angles. They are uniformly distributed over the image and serve as references for generating proposal boxes in the RPN stage.
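To make the five-parameter representation concrete, the following minimal sketch converts (xc, yc, w, h, θ) into the four corner points of the box. It assumes a y-up coordinate system and measures θ from the positive vertical axis toward the positive x axis, which is only one possible reading of the convention described above; the function name `rotated_box_corners` is hypothetical and not taken from the original implementation.

```python
import math

def rotated_box_corners(xc, yc, w, h, theta_deg):
    """Return the four corners of a rotated box in counter-clockwise order.

    Assumption: theta_deg is the angle from the positive vertical (y) axis
    to the long side (h), measured toward the positive x axis.
    """
    t = math.radians(theta_deg)
    # Unit vector along the long side (height) and its perpendicular (width).
    ux, uy = math.sin(t), math.cos(t)
    vx, vy = math.cos(t), -math.sin(t)
    hh, hw = h / 2.0, w / 2.0
    return [
        (xc + hh * ux + hw * vx, yc + hh * uy + hw * vy),
        (xc + hh * ux - hw * vx, yc + hh * uy - hw * vy),
        (xc - hh * ux - hw * vx, yc - hh * uy - hw * vy),
        (xc - hh * ux + hw * vx, yc - hh * uy + hw * vy),
    ]

# A thin, vertical box: 2 units wide and 10 units long, centered at the origin.
corners = rotated_box_corners(0.0, 0.0, 2.0, 10.0, 0.0)
```

In this representation the fracture length and orientation can be read directly from h and θ, without computing the corners at all.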

In short, the RPN learns to identify anchor boxes that may contain a fracture and to produce regression coefficients that convert those anchor boxes into better proposal bounding boxes. For that purpose, the RPN is designed as a convolution layer followed by two sibling fully connected branches, the classifier and the regressor. During training, based on their overlap with the ground truth boxes and predefined thresholds, the anchor boxes are classified into three groups labeled positive, negative and neutral, corresponding to foreground (i.e., fracture-containing) boxes, background boxes and ambiguous boxes. The classifier branch then learns to predict the anchor box label using a softmax activation function. The output of this branch is an objectness score, which reflects how confident the network is that a proposal bounding box contains a fracture. At the same time, one or more positive anchor boxes are assigned to each ground truth box. The offsets between the assigned anchor boxes and the ground truth box are calculated with equations (1)-(5) below. The first four equations give the scale-invariant translation of the center coordinates and the log-space shift of width and height. The fifth equation gives the rotation angle offset in radians. The regressor branch learns to predict these offset values. Afterward, the predicted offsets are used to transform an anchor box into the final predicted proposal via the inverses of equations (1)-(5).

(1)

t_{x} = (x - x_{a}) / w_{a}

(2)

t_{y} = (y - y_{a}) / h_{a}

(3)

t_{w} = \log (w / w_{a})

(4)

t_{h} = \log (h / h_{a})

(5)

t_{θ} = (θ - θ_{a}) \times \frac{π}{180°}

where x, y, w, h and θ are the x- and y-center coordinates, width, height and orientation of the ground truth box, and x_{a}, y_{a}, w_{a}, h_{a} and θ_{a} are the corresponding parameters of the anchor box.
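As an illustration of equations (1)-(5) and their inverses, the sketch below encodes a box relative to an anchor and then decodes it again. The function names `encode` and `decode` and the numeric values are hypothetical; only the arithmetic follows the equations.

```python
import math

def encode(box, anchor):
    """Offsets of a box relative to an anchor, following Eqs. (1)-(5).

    Both boxes are tuples (x, y, w, h, theta_deg).
    """
    x, y, w, h, theta = box
    xa, ya, wa, ha, theta_a = anchor
    return (
        (x - xa) / wa,                        # Eq. (1)
        (y - ya) / ha,                        # Eq. (2)
        math.log(w / wa),                     # Eq. (3)
        math.log(h / ha),                     # Eq. (4)
        (theta - theta_a) * math.pi / 180.0,  # Eq. (5), in radians
    )

def decode(offsets, anchor):
    """Inverse of encode: turn predicted offsets back into a box."""
    tx, ty, tw, th, tt = offsets
    xa, ya, wa, ha, theta_a = anchor
    return (
        tx * wa + xa,
        ty * ha + ya,
        wa * math.exp(tw),
        ha * math.exp(th),
        tt * 180.0 / math.pi + theta_a,
    )

# Illustrative values only.
anchor = (100.0, 100.0, 16.0, 64.0, 30.0)
gt_box = (104.0, 92.0, 20.0, 80.0, 45.0)
offsets = encode(gt_box, anchor)
recovered = decode(offsets, anchor)
```

Dividing the center shift by the anchor size and taking the logarithm of the size ratio keeps the regression targets on a similar scale for small and large anchors.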

On inspecting the results of the first training run, we found a problem with using IoU (Intersection-over-Union, an index that measures the overlap between two boxes) for anchor box assignment. Because the conventional IoU does not consider the difference in angle between two boxes, an anchor box whose angle differs greatly from that of the ground truth box can still be assigned as a positive anchor if its overlap with the ground truth box exceeds the assignment threshold. To solve this issue, we multiplied the conventional IoU by a weight that depends on the angle and used this new IoU for anchor assignment. This decays the IoU with respect to the angle difference using a Gaussian penalty function, as in the following equations:

(6)

IoU_{angle\_weighted} = weight \times IoU_{normal}

(7)

weight = 1 - e^{-α \times (1 - \frac{Δθ}{90})^{2}}

where Δθ is the acute angle between the two oriented bounding boxes, ranging from 0° to 90°, and α is an empirical constant (equal to 0.1 in this study).
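The angle weighting can be sketched as follows. The exponent of Eq. (7) is read here as α(1 − Δθ/90)², which is an interpretation of the printed formula rather than a confirmed form; under this reading the weight falls monotonically to zero as Δθ approaches 90°. The function names are hypothetical, and the plain IoU is assumed to be computed elsewhere.

```python
import math

def acute_angle_diff(theta1_deg, theta2_deg):
    """Acute angle (0 to 90 degrees) between two box orientations."""
    d = abs(theta1_deg - theta2_deg) % 180.0
    return 180.0 - d if d > 90.0 else d

def angle_weight(delta_deg, alpha=0.1):
    """Weight of Eq. (7); the form of the exponent is an assumption."""
    return 1.0 - math.exp(-alpha * (1.0 - delta_deg / 90.0) ** 2)

def angle_weighted_iou(iou, theta1_deg, theta2_deg, alpha=0.1):
    """Eq. (6): scale a plain IoU value by the angle weight."""
    delta = acute_angle_diff(theta1_deg, theta2_deg)
    return angle_weight(delta, alpha) * iou

# Two anchors with the same plain IoU but different orientations
# relative to a ground truth box oriented at 30 degrees.
aligned = angle_weighted_iou(0.6, 30.0, 35.0)
misaligned = angle_weighted_iou(0.6, 30.0, -50.0)
```

With equal plain IoU, the anchor whose orientation is closer to the ground truth box receives the larger weighted IoU, which is the behavior the anchor assignment relies on.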

As mentioned earlier, a ground truth box can be matched with one or more anchor boxes. As a result, the prediction may be composed of overlapping boxes. To remove redundant boxes, the proposal boxes are passed through a filtering algorithm known as rotated non-maximum suppression (NMS). NMS keeps or removes boxes according to their objectness scores and IoU, such that a bounding box is removed if it overlaps considerably (i.e., IoU > 0.7) with a bounding box that has a higher objectness score.
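A minimal pure-Python sketch of rotated NMS is given below. The rotated IoU is computed here by Sutherland-Hodgman polygon clipping, which is one common way to intersect two convex boxes and is an assumption, not necessarily the method used in this study. The box format (xc, yc, w, h, θ in degrees), the angle convention and all function names are likewise hypothetical.

```python
import math

def corners(box):
    """Counter-clockwise corners of a box (xc, yc, w, h, theta_deg)."""
    xc, yc, w, h, theta = box
    t = math.radians(theta)
    ux, uy = math.sin(t) * h / 2, math.cos(t) * h / 2   # half long side
    vx, vy = math.cos(t) * w / 2, -math.sin(t) * w / 2  # half short side
    return [(xc + ux + vx, yc + uy + vy), (xc + ux - vx, yc + uy - vy),
            (xc - ux - vx, yc - uy - vy), (xc - ux + vx, yc - uy + vy)]

def side(a, b, p):
    """Non-negative if p lies on or to the left of the directed edge a -> b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def clip(subject, clipper):
    """Sutherland-Hodgman clipping of a polygon by a convex CCW polygon."""
    out = subject
    for i in range(len(clipper)):
        a, b = clipper[i], clipper[(i + 1) % len(clipper)]
        inp, out = out, []
        for j in range(len(inp)):
            p, q = inp[j - 1], inp[j]
            sp, sq = side(a, b, p), side(a, b, q)
            if (sp >= 0) != (sq >= 0):  # the edge p -> q crosses the clip line
                t = sp / (sp - sq)
                out.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
            if sq >= 0:
                out.append(q)
    return out

def area(poly):
    """Shoelace area of a polygon given as a list of (x, y) points."""
    if len(poly) < 3:
        return 0.0
    s = sum(poly[i - 1][0] * poly[i][1] - poly[i][0] * poly[i - 1][1]
            for i in range(len(poly)))
    return abs(s) / 2.0

def rotated_iou(box1, box2):
    inter = area(clip(corners(box1), corners(box2)))
    union = box1[2] * box1[3] + box2[2] * box2[3] - inter
    return inter / union if union > 0 else 0.0

def rotated_nms(boxes, scores, iou_thresh=0.7):
    """Greedy NMS: visit boxes by descending score, drop heavy overlaps."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(rotated_iou(boxes[i], boxes[k]) <= iou_thresh for k in keep):
            keep.append(i)
    return keep

# Illustrative proposals: the second nearly duplicates the first,
# the third crosses the first at a right angle.
boxes = [(50, 50, 10, 60, 30), (50, 51, 10, 60, 30), (50, 50, 10, 60, -60)]
scores = [0.9, 0.8, 0.7]
kept = rotated_nms(boxes, scores)
```

In this example the near-duplicate second box is suppressed, while the perpendicular box survives because two thin boxes crossing at a right angle share only a small area.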

Although the RPN has a bounding box regression branch, it gives only coarse bounding boxes that may contain a fracture. The primary reason is that the RPN has to use high-variance anchor boxes to predict the potential boxes (Vu et al., 2019). Thus, a second stage, R-CNN, is needed to further refine the bounding boxes. In this stage, the proposal boxes serve as reference boxes and as the basis for cropping the feature maps that form the input to this stage. Since the proposal boxes differ in size, the cropped feature maps cannot be fed directly into R-CNN. ROI align is therefore carried out to extract fixed-size feature maps. The remaining operations within this stage are similar to those of the RPN described above. The output of R-CNN is a list of bounding boxes with their objectness scores, which is again filtered using NMS to obtain the final list of predicted bounding boxes.

Faster R-CNN is optimized with a multi-task loss function, which combines classification loss and box regression loss. For each region of interest, the loss function is calculated as follows:

(8)

Total\_loss = RPN_{class\_loss} + RPN_{box\_loss} + RCNN_{class\_loss} + RCNN_{box\_loss}

where class_loss and box_loss represent the classification loss and the bounding box regression loss in each stage (i.e., RPN and R-CNN), respectively. The bounding box regression loss for all rotated bounding box parameters is the smooth L1 loss function defined in Girshick (2015). In short, the smooth L1 loss function, a special case of the Huber loss function, combines the squared error and the absolute error: the squared term is used if the absolute error falls below 1, and the absolute term otherwise.
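The piecewise definition of the smooth L1 loss and the sum in Eq. (8) can be written out as a short sketch. The function names and the example numbers are hypothetical; the loss formula itself follows the standard definition of smooth L1 (0.5·d² for |d| < 1, |d| − 0.5 otherwise).

```python
def smooth_l1(pred, target):
    """Smooth L1 loss summed over the box offsets (tx, ty, tw, th, t_theta)."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        # Quadratic near zero, linear for large errors.
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total

def total_loss(rpn_class, rpn_box, rcnn_class, rcnn_box):
    """Eq. (8): unweighted sum of the four loss terms."""
    return rpn_class + rpn_box + rcnn_class + rcnn_box

# Illustrative offsets for one rotated box.
predicted = (0.20, -0.10, 0.05, 0.30, 0.10)
target = (0.25, -0.125, 0.22, 0.22, 0.26)
box_loss = smooth_l1(predicted, target)
```

The linear branch limits the influence of large regression errors, so a few badly matched anchors do not dominate the gradient.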

2.2 Experiment

To assess the performance of the proposed method, we conducted experiments on a CT scan image dataset acquired by scanning artificially fractured rock samples (granite, sandstone and shale). Each scan provides a stack of 1024 image slices of 1024 × 1024 pixels. CT scan images are visualized in grayscale with values ranging from 0 to 255, which depict the X-ray attenuation coefficient of the materials making up the rock. Fractures appear in the images as dark, arbitrarily oriented curved line structures whose thickness varies along their length. As an initial step, the grayscale images are preprocessed with a non-local means filter and contrast limited adaptive histogram equalization to remove noise and enhance the contrast between phases.

In total, 680 images were randomly selected from these image stacks. The images were first cropped to remove the unwanted black background. Afterward, ‘roLabelImg’, a graphical image annotation tool, was used to annotate fractures with rotated bounding boxes. In accordance with the elongated shape of fractures, we annotated each fracture with head-to-tail overlapping boxes to avoid including an excessively large area of background in the bounding boxes. We then randomly divided the dataset into training, test and validation sets with a ratio of 7:2:1. All images were resized to a lower resolution of 640 × 640 to give the model inputs of the same size and to speed up training and evaluation. In addition, data augmentation methods, including rotation, flipping, brightness and contrast shifts and sharpening, were applied to the training set to obtain more diverse CT scan images. This strategy is important for preventing overfitting and improving the generalization performance of the model.
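The 7:2:1 split of the 680 images can be reproduced with a few lines. The file names, the random seed and the function name `split_dataset` are hypothetical; only the ratio and the image count come from the text.

```python
import random

def split_dataset(items, ratios=(7, 2, 1), seed=0):
    """Shuffle items and split them into train, test and validation lists."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed for a repeatable split
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_test = len(items) * ratios[1] // total
    train = items[:n_train]
    test = items[n_train:n_train + n_test]
    val = items[n_train + n_test:]
    return train, test, val

# Placeholder identifiers standing in for the 680 selected CT slices.
image_ids = [f"slice_{i:04d}" for i in range(680)]
train, test, val = split_dataset(image_ids)
```

With 680 images this gives 476 training, 136 test and 68 validation images.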

The algorithms presented in this study were implemented in Python using the TensorFlow library (Abadi et al., 2015). We carried out the experiments on a 4-core Intel Core i7-6700K @ 4.00 GHz processor with an Nvidia GeForce GTX 1080 Ti GPU with 11 GB of memory. Since no pre-trained weights were available for CT scan images, training was performed from scratch, with each layer’s weights initialized using the method described in He et al. (2015). The network was trained for 100 epochs using the Adam optimizer with a constant learning rate of 0.0001 and a batch size of 2.

Finally, the mean average precision (mAP) at an IoU threshold of 0.5 was used for quantitative evaluation of the proposed method (Lin et al., 2014). Since each fracture was annotated with a group of consecutive boxes instead of one large box, in many cases a group of predicted bounding boxes does not match the ground truth because the individual box sizes and the number of boxes differ, even though the boxes cover the same detection area. A new box matching rule is therefore needed to avoid underestimating the performance of the new algorithm. In this study, besides using IoU as a criterion to decide whether a predicted box is a true positive or a false positive, we also considered the orientation of the predicted bounding box relative to the ground truth bounding boxes and the number of ground truth bounding boxes that it overlaps. Accordingly, a predicted bounding box is defined as a true positive in two cases: (1) the predicted bounding box has an IoU higher than 0.5; or (2) the predicted bounding box overlaps more than two ground truth bounding boxes, and the difference in orientation angle between the predicted bounding box and those ground truth boxes is less than 10 degrees.
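The two-case matching rule can be sketched as a small predicate. The input format (a list of IoU and angle pairs for the ground truth boxes that a prediction overlaps) and the function names are hypothetical, and "more than two" is read literally as at least three boxes, which is an assumption about the intended rule.

```python
def acute_angle_diff(a_deg, b_deg):
    """Acute angle (0 to 90 degrees) between two orientations."""
    d = abs(a_deg - b_deg) % 180.0
    return 180.0 - d if d > 90.0 else d

def is_true_positive(pred_angle, overlaps, iou_thresh=0.5,
                     min_gt=3, max_angle_diff=10.0):
    """Decide whether a predicted box counts as a true positive.

    overlaps: list of (iou, gt_angle_deg), one entry per ground truth box
    that the prediction overlaps (iou > 0).
    """
    # Case 1: ordinary IoU criterion against a single ground truth box.
    if any(iou > iou_thresh for iou, _ in overlaps):
        return True
    # Case 2: the prediction spans several consecutive ground truth boxes
    # whose orientations all agree with it.
    if len(overlaps) >= min_gt:
        return all(acute_angle_diff(pred_angle, g) < max_angle_diff
                   for _, g in overlaps)
    return False

# One long predicted box covering three short annotated boxes.
long_box_ok = is_true_positive(30.0, [(0.30, 28.0), (0.20, 33.0), (0.25, 31.0)])
```

The second case accepts a single long prediction that covers a chain of short annotation boxes, even though its IoU with each individual box stays below 0.5.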

3. FRACTURE DETECTION RESULT

To ensure that the model was properly trained, we first examine the training curves. Overall, both the training and validation loss curves decrease as training progresses, which suggests that the model performs well on unseen data (Fig. 2(a)). The training and validation losses start to converge after roughly 90 epochs, so further training is unlikely to improve the performance of the model. The model is also evaluated at the end of each epoch by calculating its mAP on the test dataset, which is not seen in training. Fig. 2(b) shows mAP versus epoch: as the number of epochs increases, the mAP on the test dataset also increases. The model achieves its best mAP of approximately 89% on the test dataset. We also compare the rotated and horizontal bounding box approaches. Both methods were trained and tested on the same dataset. Several qualitative results are shown in Fig. 3. In general, the two methods perform similarly in terms of mAP. In training and inference speed, the axis-aligned bounding box approach is faster than the rotated bounding box approach, for two reasons: it uses far fewer anchor boxes, and IoU estimation for rotated bounding boxes is far more complicated and requires more computing time than for axis-aligned bounding boxes. However, the rotated bounding box approach detects fractures more tightly, including less background area, which is an important factor for accurate segmentation.

https://static.apub.kr/journalsite/sites/ksrm/2021-031-05/N0120310504/images/ksrm_31_05_04_F2.jpg

Fig. 2. Training results

https://static.apub.kr/journalsite/sites/ksrm/2021-031-05/N0120310504/images/ksrm_31_05_04_F3.jpg

Fig. 3. Comparison of inference results on the test image set employing rotated (left) and horizontal (right) bounding boxes in each subfigure

4. DISCUSSION

Overall, we have demonstrated that the newly developed algorithm produces reasonable fracture detection results regardless of the heterogeneity of the background in CT scan images and the variation in fracture size and pattern. Moreover, our approach using rotated bounding boxes has two essential advantages over the conventional approach using axis-aligned bounding boxes. First, the rotated bounding box effectively excludes background from the bounding box, providing a more robust basis for fracture segmentation. Second, information on fracture length and orientation can be extracted from the rotated bounding box without a further segmentation step. To some extent, this approach is also applicable to the detection of cracks or fractures in buildings, roads, tunnels and other structures.
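The first advantage can be illustrated with simple geometry. The sketch below idealizes a fracture segment as a thin rectangle that is fitted exactly by a rotated box, and computes how much of the smallest enclosing axis-aligned box is additional background. This idealization, the function names and the dimensions are assumptions made for illustration, not measurements from this study.

```python
import math

def enclosing_aabb_area(w, h, theta_deg):
    """Area of the smallest axis-aligned box around a rotated w x h box."""
    t = math.radians(theta_deg)
    c, s = abs(math.cos(t)), abs(math.sin(t))
    # Horizontal extent times vertical extent of the rotated rectangle.
    return (w * c + h * s) * (w * s + h * c)

def extra_background_fraction(w, h, theta_deg):
    """Share of the axis-aligned box lying outside the rotated box."""
    return 1.0 - (w * h) / enclosing_aabb_area(w, h, theta_deg)

# A 10 x 100 pixel fracture segment at several orientations.
fractions = {theta: extra_background_fraction(10.0, 100.0, theta)
             for theta in (0, 15, 30, 45)}
```

For a vertical segment the two box types coincide, whereas at 45° more than 80% of the axis-aligned box is background that a rotated box excludes.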

However, several limitations remain in the proposed algorithm. One limiting case is where CT images are highly degraded by artifacts, or where the fracture is so thin that it falls below the resolution of the CT scan image. Examining the detection results on the test dataset, we noticed that most false negatives (i.e., missed detections) are related to the poor quality of these particular CT images. In such cases, the CT images are badly deteriorated by artifacts, including beam hardening, which causes dark streaks overlapping the fracture area, and the partial volume effect, which blurs the fracture. However, previous research demonstrates that fluid flow in fractured porous media is mainly governed by large-aperture and less tortuous fractures rather than by minor fractures, which have a smaller aperture and are highly tortuous (Kling et al., 2016; Crandall et al., 2017). Thus, missing minor fractures can be tolerated without significant consequences for the characterization of fluid flow through fractures. Another limitation of the algorithm is its training and inference speed. While the original implementation of Faster R-CNN with axis-aligned bounding boxes uses only 9 anchor boxes (i.e., 3 scales × 3 aspect ratios), the number of anchor boxes in the new approach is much higher because of the additional angle parameter. Therefore, the effects of anchor box selection should be analyzed to obtain an acceptable trade-off between speed and accuracy.
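The growth in the number of anchor boxes can be made explicit by enumerating the combinations. The scales, aspect ratios and angle set below are hypothetical values chosen for illustration; the study does not list the ones actually used.

```python
import math
from itertools import product

scales = (32, 64, 128)                 # assumed anchor sizes in pixels
aspect_ratios = (2, 4, 8)              # assumed height-to-width ratios
angles = (-90, -60, -30, 0, 30, 60)    # assumed orientations in degrees

def make_anchor(scale, ratio, angle=0.0):
    """Anchor (w, h, theta) with area scale**2 and h / w equal to ratio."""
    w = scale / math.sqrt(ratio)
    h = scale * math.sqrt(ratio)
    return (w, h, angle)

# Anchors generated at every location of the feature map.
axis_aligned = [make_anchor(s, r) for s, r in product(scales, aspect_ratios)]
rotated = [make_anchor(s, r, a)
           for s, r, a in product(scales, aspect_ratios, angles)]
```

With these assumed settings the count per location rises from 9 to 54, and each additional angle adds another 9 anchors that must be scored and matched.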

Our future efforts will be directed at exploring ways to segment the fracture within the bounding boxes. Ultimately, a user interface that integrates the detection algorithm seamlessly with a segmentation algorithm will be developed to obtain the final segmentation of fractures.

5. CONCLUSION

We present a new algorithm for fracture detection in CT scan images, based on a modification of Faster R-CNN. Because of the complexity of the background and the tortuosity of fractures, we adopted the rotated bounding box to improve how well the bounding box fits the pattern of a fracture. The experimental results show that the algorithm can accurately detect fractures and considerably reduce the ratio of background in the bounding box. Although the proposed method increases computational complexity somewhat because of the deep learning architecture, it is expected to make fracture segmentation more accurate.

Acknowledgements

This research was supported by the research project “Development of environmental simulator and advanced construction technologies over TRL6 in extreme conditions” funded by KICT, and the R&D project “Development of construction structure and long-term performance monitoring” (No. 20193210100050) funded by the Korea Institute of Energy Technology Evaluation and Planning.

References

Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G. S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Levenberg, J., Mané, D., Monga, R., Moore, S., Murray, D., Olah, C., Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Vanhoucke, V., Vasudevan, V., Viégas, F., Vinyals, O., Warden, P., Wattenberg, M., Wicke, M., Yu, Y., & Zheng, X. (2015), “TensorFlow: Large-scale machine learning on heterogeneous systems”, Software available from http://tensorflow.org/.

Andrä, H., Combaret, N., Dvorkin, J., Glatt, E., Han, J., Kabel, M., Keehm, Y., Krzikalla, F., Lee, M., and Madonna, C. (2013), “Digital rock physics benchmarks-Part I: Imaging and segmentation”, Comput. Geosci. 50, pp. 25-32. 10.1016/j.cageo.2012.09.005

Crandall, D., Moore, J., Gill, M., Stadelman, M. (2017), “CT scanning and flow measurements of shale fractures after multiple shearing events”, Int. J. Rock Mech. Min. Sci. 100, pp. 177-187. 10.1016/j.ijrmms.2017.10.016

Girshick, R. (2015), Fast R-CNN. In: ICCV. 10.1109/ICCV.2015.169

He, K., Gkioxari, G., Dollár, P., Girshick, R. B. (2017), “Mask R-CNN”, CoRR abs/1703.06870 (2017). arXiv:1703.06870 http://arxiv.org/abs/1703.06870 10.1109/ICCV.2017.322

He, K., Zhang, X., Ren, S., and Sun, J. (2015), “Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification”, In: Proceedings of the IEEE international conference on computer vision, IEEE, NY, USA, pp. 1026-1034. 10.1109/ICCV.2015.123

Ketcham, R. A., and Carlson, W. D. (2001), “Acquisition, optimization and interpretation of X-ray computed tomographic imagery: applications to the geosciences”, Comput. Geosci. 27, pp. 381-400. 10.1016/S0098-3004(00)00116-3

Kling, T., Huo, D., Schwarz, J.O., Enzmann, F., Benson, S., Blum, P. (2016), “Simulating stress-dependent fluid flow in a fractured core sample using real-time X-ray CT data”, Solid Earth, 7, pp. 1109-1124. 10.5194/se-7-1109-2016

Kyle, J. R., and Ketcham, R. A. (2015), “Application of high resolution X-ray computed tomography to mineral deposit origin, evaluation, and processing”, Ore Geol. Rev. 65, pp. 821-839. 10.1016/j.oregeorev.2014.09.034

Lin, T., Dollár, P., Girshick, R.B., He, K., Hariharan, B., Belongie, S. J. (2017), “Feature pyramid networks for object detection”, CVPR. 10.1109/CVPR.2017.106 PMC5744014

Lin, T., Maire, M., Belongie, S. J., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C. L. (2014), “Microsoft COCO: common objects in context”, in Computer Vision - ECCV 2014 - 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part V, ser. Lecture Notes in Computer Science, vol. 8693. Springer, pp. 740-755. 10.1007/978-3-319-10602-1_48

Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu C-Y. (2016), “SSD: Single Shot MultiBox Detector”, ArXiv151202325 Cs. 2016; 9905:21-37. 10.1007/978-3-319-46448-0_2

Otsu, N. (1979), “A threshold selection method from gray-level histograms”, IEEE Trans. Sys. Man. Cyber. 9 (1), pp. 62-66. 10.1109/TSMC.1979.4310076

Redmon, J., Divvala, S., Girshick, R., Farhadi, A. (2016), “You only look once: Unified, real-time object detection”, in IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 10.1109/CVPR.2016.91

Ren, S., He, K., Girshick, R., Sun, J. (2015), “Faster R-CNN: Towards real-time object detection with region proposal networks”, in: NIPS.

Ronneberger, O., Fischer, P., Brox, T. (2015), “U-Net: Convolutional Networks for Biomedical Image Segmentation”, MICCAI, Springer, LNCS, 9351, pp. 234-241. 10.1007/978-3-319-24574-4_28

Schmitt, M., Halisch, M., Müller, C., and Fernandes, C. P. (2016), “Classification and quantification of pore shapes in sandstone reservoir rocks with 3-D X-ray microcomputed tomography”, Solid Earth. 7, pp. 285-300. 10.5194/se-7-285-2016

Voorn, M., Exner, U., and Rath, A. (2013), “Multiscale Hessian fracture filtering for the enhancement and segmentation of narrow fractures in 3D image data”, Comput. Geosci. 57, pp. 44-53. 10.1016/j.cageo.2013.03.006

Vu, T. X., Jang, H., Pham, T. X., and Yoo, C. D. (2019), “Cascade RPN: Delving into high-quality region proposal network with adaptive convolution”, in Proc. NIPS, pp. 1-11.

Wennberg, O. P., Rennan, L., and Basquet, R. (2009), “Computed tomography scan imaging of natural open fractures in a porous rock; geometry and fluid flow”, Geophys. Prospect. 57, pp. 239-249. 10.1111/j.1365-2478.2009.00784.x

Yen, J. Y. (1970), “An algorithm for finding shortest routes from all source nodes to a given destination in general networks”, Quarterly of Applied Mathematics. 27 (4), pp. 526-530. 10.1090/qam/253822

Tunnel and Underground Space. ISSN: 1225-1275 (Print), 2287-1748 (Online). Korean Society for Rock Mechanics and Rock Engineering
