An intelligent drilling accident diagnosis method considering the influence of data imbalance

2021, Vol. 43, No. 4
TAN Tianyi (谭天一), ZHANG Hui (张辉), MA Danni (马丹妮), LU Zongyu (路宗羽), WU Yi (吴怡), JIAO Jin’gang (焦金刚)
College of Petroleum Engineering, China University of Petroleum (Beijing), Beijing 102249, China; PetroChina Xinjiang Oilfield Company, Karamay 834002, Xinjiang, China; CNOOC Research Institute Company Limited, Beijing 100028, China
Accurate recognition of drilling accidents is essential to smooth drilling operations. Existing machine-learning-based diagnosis methods do not account for the imbalance in drilling data, so they may misclassify drilling accidents as normal operating conditions. This paper establishes a drilling accident diagnosis method, based on a decision tree classification model, that considers the influence of data imbalance. Raw data are collected from mud logging data, engineering anomaly records and other field data; drilling parameters such as weight on bit (WOB), hook load and flow rate are extracted; and a sample set is constructed from their fluctuation values. Misclassification cost is introduced to correct for the influence of data imbalance, and a decision tree model whose classification objective is the minimum expected misclassification cost replaces the model whose objective is maximum accuracy. The new model was applied to diagnose pipe sticking in a shale-gas horizontal well. The results show that, once data imbalance is taken into account, the model recognizes sticking samples missed by the traditional method and reduces the expected cost by 85%. The proposed treatment of data imbalance is not limited to decision tree models; it can be extended to other machine learning methods to help recognize drilling accidents.
drilling; intelligent recognition; drilling accident; data imbalance; decision tree; misclassification cost
10.13639/j.odpt.2021.04.006
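
The abstract contrasts a decision tree that maximizes accuracy with one that minimizes the expected misclassification cost. The following is a minimal sketch of that contrast at a single leaf node, not the authors' implementation. The cost matrix, the 20:1 cost ratio, the leaf sample counts and all function names are assumptions made for illustration; none of them come from the paper.

```python
# Illustrative sketch only: the cost values and leaf counts are assumed,
# not taken from the paper.

NORMAL, STICKING = 0, 1

# COST[true_class][predicted_class]; hypothetical values.
# A missed sticking event is assumed to cost 20x as much as a false alarm.
COST = [
    [0.0, 1.0],   # true NORMAL:   correct prediction, false alarm
    [20.0, 0.0],  # true STICKING: missed event, correct prediction
]


def class_probabilities(counts):
    """Convert per-class sample counts at a leaf into class probabilities."""
    total = sum(counts)
    if total == 0:
        raise ValueError("leaf has no samples")
    return [c / total for c in counts]


def expected_cost(probs, predicted, cost=COST):
    """Expected misclassification cost of labelling the leaf as `predicted`."""
    return sum(p * cost[true][predicted] for true, p in enumerate(probs))


def max_accuracy_label(probs):
    """Conventional rule: label the leaf with its majority class."""
    return max(range(len(probs)), key=lambda k: probs[k])


def min_cost_label(probs, cost=COST):
    """Cost-sensitive rule: label the leaf with the lowest expected cost."""
    return min(range(len(probs)), key=lambda k: expected_cost(probs, k, cost))


# Hypothetical imbalanced leaf: 90 normal samples and 10 sticking samples.
leaf_probs = class_probabilities([90, 10])
accuracy_choice = max_accuracy_label(leaf_probs)
cost_choice = min_cost_label(leaf_probs)
```

Under these assumed numbers, labelling the leaf as normal has an expected cost of 0.1 × 20 = 2.0, while labelling it as sticking costs 0.9 × 1 = 0.9. The majority rule therefore labels the leaf as normal and misses the sticking samples, whereas the cost-sensitive rule labels it as sticking. The same comparison applies to any classifier that outputs class probabilities, which is consistent with the abstract's remark that the approach is not tied to decision trees.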