This challenge aims to build a model that can detect large salt deposits from images of seismic scans. Detecting salt is important because in most cases large accumulations of oil and gas are also found underneath these salt beds. Salt is also much easier to detect than oil and gas, so a model trained to detect these salt beds can be very profitable for oil and gas companies because it makes oil discovery faster.
Clone the repo and install the packages with:
pip install -r requirements.txt
At the time of writing I was using a custom build of the popular
Segmentation-Model-Pytorch library, so you have to install that as well:
$ wget https://github.com/jjmachan/segmentation_models.pytorch/releases/download/0.1.3.3/segmentation_models_pytorch-0.1.3-py3-none-any.whl -q
$ pip install segmentation_models_pytorch-0.1.3-py3-none-any.whl -qq
The dataset can be downloaded from the competition page. There is an EDA notebook in nbs/EDA.ipynb to help you view and understand the data.
I’ve built the PyTorch loaders and some utility functions to get started with the
dataset quickly. Those can be found in the
I’ve used the Albumentations library for augmenting the dataset, but the augmentations can easily be switched out. Check the blog for more details about the performance boost augmentation gave.
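For segmentation, any geometric augmentation has to be applied to the image and its mask together, otherwise the labels no longer line up with the pixels. The snippet below is a minimal sketch of that idea using plain NumPy rather than the Albumentations API; the function name `random_hflip` and the toy arrays are hypothetical and are not taken from this repo.

```python
import numpy as np


def random_hflip(image, mask, rng, p=0.5):
    """Flip image and mask left-right together with probability p.

    Hypothetical helper for illustration; expects 2D (H, W) arrays.
    """
    if rng.random() < p:
        # Reverse the width axis of both arrays so they stay aligned
        return image[:, ::-1].copy(), mask[:, ::-1].copy()
    return image, mask


# Toy data standing in for a seismic image and its salt mask
rng = np.random.default_rng(0)
image = np.arange(12, dtype=np.float32).reshape(3, 4)
mask = (image > 5).astype(np.uint8)

# p=1.0 forces the flip so the effect is easy to inspect
aug_image, aug_mask = random_hflip(image, mask, rng, p=1.0)
```

A library such as Albumentations handles this pairing for you when the image and mask are passed to the same transform call, which is one reason it is convenient for segmentation work.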
trainer.py is a script that will run the trainer and save the trained
model into the
saved_models/ directory. You can modify hyperparameters in the
script directly to try stuff out.
In addition, I’ve added 2 notebooks that I used for building this repo.
nbs/Baseline.ipynb - the code to create a baseline model that gives fairly good accuracy. It’s a useful starting point if you want to get started on your own.
nbs/TGS Training.ipynb - similar to
trainer.py, but meant for rapid development if you prefer notebooks for that, like me.
Training is fairly fast and takes at most 1 hour.
To run inference with the trained model, simply call the
infer.py script or
use the functions provided. This will generate the submission file in the required
format. To run the evaluation, head over to the Kaggle website and submit to
I’ve implemented my own version of Test Time Augmentation (TTA) to give an additional boost to the model’s performance.
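The general idea behind TTA can be sketched as follows: predict on the original image and on a flipped copy, undo the flip on the second prediction, and average the two. This is only an illustration of the technique under simple assumptions (a single horizontal flip, 2D arrays), not the implementation used in this repo; `predict_with_tta` and `dummy_predict` are hypothetical names.

```python
import numpy as np


def predict_with_tta(predict_fn, image):
    """Average predictions over the original and the horizontally flipped image."""
    pred = predict_fn(image)
    flipped_pred = predict_fn(image[:, ::-1])
    # Undo the flip so both predictions are aligned with the original image
    return (pred + flipped_pred[:, ::-1]) / 2.0


def dummy_predict(image):
    """Hypothetical stand-in for a model: output depends on column position."""
    weights = np.linspace(0.0, 1.0, image.shape[1])
    return image * weights


image = np.ones((2, 4), dtype=np.float64)
tta_pred = predict_with_tta(dummy_predict, image)
```

In practice `predict_fn` would wrap the trained segmentation model, and the averaged probability map would be thresholded to obtain the final mask.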