XGSleeve: Harnessing the power of Hidden Markov Models in shale oil production

In recent decades, the Canadian energy sector has witnessed a substantial annual increase of 40,000 barrels per day in shale resource oil production, according to Canada’s Energy Regulator. Shale resources encompass unconventional oil and gas reserves found deep underground within rock formations. Unfortunately, the extraction processes associated with shale resources often suffer from inefficiencies, resulting in significant material waste that directly impacts the environment. To address environmental issues, the government has placed increased focus on reducing emissions and improving the efficiency of the oil and gas industry, aiming to mitigate its impact on climate change.

A completion technology that is gaining popularity, the coiled tubing-enabled fracturing sliding sleeve (CTFSS), has shown potential to improve the efficiency of shale resource extraction. This approach uses coiled tubing to carry a switch tool that opens or closes the sliding sleeve, which improves working efficiency.

However, the process for opening or closing sliding sleeves is complicated, and errors can lead to incomplete operations and increased costs. For example, sleeves sometimes require multiple attempts to open, or they mislead operators into proceeding to the next step, such as starting fracking prematurely. The result is a less efficient process and a significant waste of material, which can cost between $10,000 and $100,000 USD per sleeve.

When this uncertainty arises during the completion phase, repeated attempts mean that roughly twice the usual processing time for each stage can be spent recycling the tool. Currently, the only way to capture downhole events during fracking operations is to deploy a camera inside the well, which is expensive.

Amii and Kobold Completions Inc. partnered to develop a new method for identifying sleeve incidents using machine learning and Kobold’s proprietary technology. The approach, detailed in a recent paper published by the team, is a pioneering alternative that combines downhole data (GuideHawk©) with surface vibration signals captured by the Echo© system, and applies machine learning techniques to identify sleeve incidents. The Echo© and GuideHawk© devices are straightforward to install and integrate, making them adaptable to various job settings. Both tools use familiar oil field sensors that are also employed by analogous devices. Compared to existing methods, this approach provides a cost-effective, in-situ solution.

XGSleeve Framework

In this project, we propose two frameworks: one for GuideHawk© data (a multivariate time series) and one for Echo© data (a univariate time series). For Echo© data, we use Hidden Markov Model (HMM) clustering to extract features related to the stages of fracking operations. Because the raw Echo© data consists of only a single shock measurement, these features improve model training by capturing the relationship between stages in events.

GuideHawk© model

The proposed framework for analyzing GuideHawk© data aims to train an XGBoost model to identify sleeve-opening incidents. The process begins by collecting data from both GuideHawk© and Echo© once fracking operations are complete.

GuideHawk© data (coil and annular pressure, strain, shock, temperature, and torque) and Echo© surface vibration (shock) data are gathered and transferred to an edge device. This edge device compiles the collected data into a structured JSON file, ordered chronologically by timestamp. The JSON file is then uploaded to Cosmos DB, which serves as a repository for data from fracking jobs conducted across Canada.
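
The compilation step on the edge device could look roughly like the sketch below. It is only an illustration: the record layout, the `timestamp` and `source` field names, and the `compile_job_json` helper are assumptions, not the team's actual schema, and the upload to Cosmos DB is left out.

```python
import json
from datetime import datetime


def compile_job_json(guidehawk_records, echo_records):
    """Merge GuideHawk and Echo records into one JSON string in timestamp order.

    Each record is assumed (hypothetically) to be a dict with an ISO 8601
    "timestamp" string plus its sensor readings.
    """
    merged = [dict(r, source="guidehawk") for r in guidehawk_records]
    merged += [dict(r, source="echo") for r in echo_records]
    # Parse the timestamps so ordering is by time rather than by text.
    merged.sort(key=lambda r: datetime.fromisoformat(r["timestamp"]))
    return json.dumps({"records": merged}, indent=2)
```

Sorting on parsed timestamps rather than raw strings keeps the order correct even if the two devices format their timestamps slightly differently.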

Within the cloud environment, GuideHawk© data undergoes extraction and subsequent feature engineering procedures. Temporal and statistical features are extracted to create an enriched dataset. The XGBoost model, initially trained using GuideHawk© data, then takes on the role of generating labels for new fracking jobs. These generated labels are integrated with their respective timestamps. The final dataset incorporates both these appended labels and the corresponding Echo© data.
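
As a rough illustration of what temporal and statistical feature extraction can mean for a sensor column, the sketch below adds rolling statistics, lagged values, and a first difference using pandas. The window length, the lag choices, and the `add_window_features` name are assumptions made for the example; the article does not list the exact features used.

```python
import pandas as pd


def add_window_features(df, column, window=30, lags=(1, 5)):
    """Return a copy of df with rolling statistics and lags for one column."""
    out = df.copy()
    roll = out[column].rolling(window=window, min_periods=1)
    # Statistical features over the trailing window.
    out[f"{column}_mean"] = roll.mean()
    out[f"{column}_std"] = roll.std().fillna(0.0)  # std of one sample is undefined
    out[f"{column}_max"] = roll.max()
    # Temporal features: earlier values and the change since the last sample.
    for lag in lags:
        out[f"{column}_lag{lag}"] = out[column].shift(lag)
    out[f"{column}_diff"] = out[column].diff()
    return out
```

Calling `add_window_features(df, "shock")` once per sensor column would produce the kind of enriched dataset described above.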

Concluding this process, the XGSleeve model is trained using this enhanced dataset. The aim is to augment the predictive capabilities for upcoming fracking jobs, further enhancing the understanding and forecasting of sleeve-opening incidents.

XGSleeve model for Echo© data

Once the XGSleeve model has been trained, it is deployed in the field to give operators real-time support in making informed decisions. The process begins with the Echo© system continuously recording shock values at the wellhead; this signal is then downsampled to one sample per second. The downsampled data is transmitted to an edge device, where the ML model makes its predictions.
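
Downsampling to one sample per second can be sketched with pandas as follows. The article does not say how the samples within each second are aggregated, so keeping the largest absolute shock is an assumption made here, as is the `downsample_to_1hz` name.

```python
import pandas as pd


def downsample_to_1hz(shock: pd.Series) -> pd.Series:
    """Reduce a high-rate shock series (DatetimeIndex) to one value per second."""
    # Assumption: keep the largest absolute shock in each one-second bin,
    # so that short spikes are not averaged away.
    return shock.abs().resample(pd.Timedelta(seconds=1)).max()
```

A mean or median per second would be equally easy to swap in if spikes are less important than the overall level.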

The XGSleeve model runs on the edge device. It first uses the pre-trained HMM to derive cluster probabilities for each timestamp. Temporal and statistical features are then extracted and, together with the cluster probabilities, serve as inputs to the XGBoost model. Finally, the XGBoost model assesses the current timestamp using these features and produces a probability for each type of incident (sleeve opening and closing).
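
The inference flow on the edge device can be sketched as below. This is a simplified illustration rather than the deployed code: `hmm` and `clf` stand for any objects exposing a `predict_proba` method (as hmmlearn's `GaussianHMM` and xgboost's `XGBClassifier` both do), and the window length and the four summary statistics are assumptions.

```python
import numpy as np


def score_latest(shock_1hz, hmm, clf, window=30):
    """Return the classifier's class probabilities for the most recent timestamp."""
    x = np.asarray(shock_1hz, dtype=float).reshape(-1, 1)
    # Step 1: per-timestamp cluster probabilities from the pre-trained HMM.
    cluster_probs = hmm.predict_proba(x)  # shape (T, n_clusters)
    # Step 2: simple statistical features over the trailing window (assumed set).
    recent = x[-window:, 0]
    stats = np.array([recent.mean(), recent.std(), recent.max(), recent.min()])
    # Step 3: one feature row describing the current timestamp.
    row = np.concatenate([cluster_probs[-1], stats]).reshape(1, -1)
    # Step 4: the classifier turns the row into incident probabilities.
    return clf.predict_proba(row)[0]
```

The classifier must have been trained on feature rows built in the same order, otherwise the columns will not line up at inference time.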

These probabilities are then relayed to the operator's monitoring device, facilitating real-time visualization of the ongoing situation. Empowered with these insights, the operator gains the ability to make well-informed decisions, whether to reattempt the opening procedure or proceed to the subsequent steps of the operation. The integration of the XGSleeve model into field operations significantly bolsters efficiency and safety, offering crucial assistance for swift and accurate decision-making during critical fracking procedures.

Unlocking the Power of Hidden Markov Models

Fracking operations in this project proceed in distinct steps, such as releasing a downhole tool, moving to the next sleeve, opening the sleeve, and the actual fracking. HMM clustering lets us extract more comprehensive information than rolling-window features or lag values alone. In particular, it gives the probability that each timestep belongs to each cluster, which indicates which fracking step a given data point is likely to be in. This extra information helps capture how the fracking steps follow one another.

First, we train the HMM on the training set using the hmmlearn library. We chose a hidden Markov model with Gaussian emissions because our data, especially the Echo© data, is approximately normally distributed. Training uses the Expectation-Maximization (EM) algorithm. We start from randomly initialized model parameters (the initial state probabilities, the transition probabilities, and the mean and variance of each state), and during training the model adjusts these parameters to fit the observed data. We train this Hidden Markov Model for 100 iterations.

To find the best number of clusters, we use the elbow method, which led us to select 5 clusters. We then trained an HMM with 5 clusters and used the resulting per-cluster probabilities as an additional feature set alongside the temporal and statistical features. Adding this clustering information as input to the XGBoost model produced a notable improvement in the F1 score.
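
The elbow is often read off a plot by eye, but it can also be located programmatically. The sketch below uses one common heuristic, picking the point farthest from the straight line joining the two ends of the curve. It is not necessarily how the team chose 5, and `elbow_point` is a hypothetical helper; the scores would come from fitting one HMM per candidate cluster count and recording, for example, its log-likelihood.

```python
import numpy as np


def elbow_point(ks, scores):
    """Return the k whose (k, score) point lies farthest from the end-to-end line."""
    ks = np.asarray(ks, dtype=float)
    scores = np.asarray(scores, dtype=float)
    # Normalise both axes so the curve runs from (0, 0) to (1, 1);
    # this also handles curves that decrease rather than increase.
    x = (ks - ks[0]) / (ks[-1] - ks[0])
    y = (scores - scores[0]) / (scores[-1] - scores[0])
    # Distance from each point to the diagonal y = x.
    distance = np.abs(y - x) / np.sqrt(2)
    return int(ks[np.argmax(distance)])
```

The heuristic assumes the first and last scores differ and that the curve bends in one direction; noisy curves are still best checked visually.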

In Figure 1, green highlights the timesteps that belong to each cluster, while purple highlights the labels. Clusters 3 and 4 may appear identical because of the low resolution, but they interchangeably represent long-term peaks in the signal. Cluster 5 signifies downtime and stage changes from one well to another. Clusters 1 and 2 represent instances when the tool is releasing pressure to transition to the next well. Clusters 3 and 4 closely match the labels highlighted in purple.

Figure 1. Clusters for specific Echo© signals. Green highlights the clusters.

New horizons

The ability of XGSleeve to harness the power of Hidden Markov Models (HMM) for comprehending stage sequences and using this information to detect sleeve incidents effectively marks a noteworthy advancement in the realm of well completion processes. The underlying concepts driving XGSleeve's robust clustering algorithms hold the potential to pave the path towards addressing novel challenges in time series classification and handling complex real-world sensor data, especially in scenarios where processes occur sequentially.
