Training machine learning models can be time-consuming, taking anywhere from several hours to days, especially for dense deep neural networks. I recently faced this issue while running a feed-forward net with grid search for a classification problem. Since my laptop lacks a GPU, training took over 8 hours, and the heavy memory and CPU consumption rendered my laptop unusable for the entire duration.

This experience, along with several others, motivated me to look for a way to train models remotely without tying up my laptop's time and resources. After some research I stumbled upon FloydHub, a cloud-based solution for training machine learning models.

FloydHub offers the following advantages:

  • Easy-to-use and intuitive UI
  • Great documentation
  • Interactive environment through Jupyter notebooks on the cloud
  • Seamless integration with Python
  • Pre-configured GPUs, with libraries like Keras and TensorFlow pre-installed

Here I will walk through a binary classification problem on the FloydHub platform: predicting whether a bank customer will churn, based on the customer's characteristics. We will train a feed-forward neural net with backpropagation and stochastic gradient descent using the Keras library. I will demonstrate two methods: one with a Jupyter notebook and another using remote execution of a Python script. The dataset for this exercise can be downloaded from here:

https://www.kaggle.com/hj5992/bank-churn-modelling
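
As a preview of the preprocessing done in the notebook, here is a minimal sketch of loading the dataset with pandas. The file name and column names are assumptions based on the Kaggle dataset and may differ from the actual notebook:

import pandas as pd

# Assumed file name from the Kaggle dataset; adjust to match your download
dataset = pd.read_csv('Churn_Modelling.csv')

# Assumed columns: drop the identifier columns and use the churn
# flag ('Exited') as the binary target
X = dataset.drop(columns=['RowNumber', 'CustomerId', 'Surname', 'Exited'])
y = dataset['Exited']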

Let’s go over both methods one by one.

Part 1: Training an ANN Using a Jupyter Notebook on FloydHub

Prerequisites:

An account on FloydHub.

Step 1: Install the floyd-cli for Python

The first step is to install the floyd-cli for Python, which can be done with the following command:

pip install -U floyd-cli

Step 2: Log in to the FloydHub account

Once we have the floyd-cli installed, we need to log in to our FloydHub account from the command line so that the CLI can run our jobs. This can be done as follows:

floyd login
Login with your FloydHub username and password to run jobs.
Username [aryan]: aryan
Password: 
Login Successful as aryan

Step 3: Create a FloydHub project on the cloud

Now we need to create a FloydHub project on the FloydHub portal. We can do so by hovering over the + sign in the top right corner.

[Screenshot: creating a FloydHub project]

Once we click on Create Project, we need to add the project details:

[Screenshot: entering the project details]

Now the project is ready and we can see the following project page:

[Screenshot: the project page]

Step 4: Initialize the FloydHub project locally

As highlighted in the previous image, we need to issue the following command at the command prompt to initialize the project locally:

floyd init aryancodify/bank-churn

Before issuing the command, we should change to the working directory where we want the project to be initialized. Let’s issue the command and look at the output:

[Screenshot: output of floyd init]

Step 5: Sync the Jupyter notebook and dataset from local to FloydHub and run it on the cloud

Once the project is initialized locally, we can sync our Jupyter notebook and dataset to the FloydHub project on the cloud and run the notebook interactively using the GPU. We need to issue the following command for this:

floyd run --gpu --mode jupyter

Once we run the command we will see the following output:

[Screenshot: output of floyd run with GPU]

We had the following files, which will be synced to the cloud:

[Screenshot: local files to be synced]

This can take some time, as the FloydHub runtime performs the following tasks in the background:

  • Syncs your local code to FloydHub’s servers
  • Provisions a GPU instance on the cloud (if you want CPU, drop the --gpu flag)
  • Sets up a deep learning environment with TensorFlow and Keras installed
  • Starts a Jupyter server on the cloud and opens the URL in your browser

Once this command has executed, a new window will open in the browser with a view of the Jupyter notebook:

[Screenshot: Jupyter notebook running on FloydHub]

Here we can see the notebook that we synced from the local file system to FloydHub. We can now open it and start interacting with it.

Step 6: Checking the job status and metrics

We can check the status of our job by going to the jobs page of the web dashboard. This will list all our jobs:

[Screenshot: job status on the web dashboard]

We can click on a particular project to check on its jobs.

For example, clicking on the bank-churn project shows the following dashboard:

[Screenshot: job metrics for the bank-churn project]

Here we can see that our GPU memory utilization was almost 95.2%, i.e., around 11 GB.
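
If you want to cross-check this from inside the notebook itself, the nvidia-smi tool reports live GPU memory usage and utilization; in a Jupyter cell the ! prefix runs it as a shell command:

# Jupyter cell: the '!' prefix runs a shell command on the instance
!nvidia-smi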

Looking at the notebook:

We use the Keras library to build a feed-forward net to perform our binary classification. The project and notebook can be accessed here:

https://www.floydhub.com/aryancodify/projects/bank-churn/1/files/ChurnRateClassification.ipynb

[Screenshot: running the notebook on FloydHub]

We can see that the model was trained for 100 epochs and took roughly 1-2 minutes to train.

Since building the network is out of scope for this post, I will not discuss the details here; however, appropriate Markdown cells and comments have been added to the notebook for better understanding.

After training, the model achieves an accuracy of around 86% on the test dataset, which is pretty good.
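
For reference, a feed-forward net of the kind described above can be sketched in Keras roughly as follows. The layer sizes, optimizer, and batch size are illustrative assumptions rather than the notebook's exact settings, and X_train/y_train are assumed to come from a train/test split of the features loaded earlier:

from keras.models import Sequential
from keras.layers import Dense

# Minimal feed-forward binary classifier; 11 inputs assumes the
# encoded customer features, and the hidden sizes are illustrative
model = Sequential()
model.add(Dense(units=6, activation='relu', input_dim=11))
model.add(Dense(units=6, activation='relu'))
model.add(Dense(units=1, activation='sigmoid'))  # churn probability

# Adam is a stochastic-gradient-descent-style optimizer; binary
# cross-entropy matches the churn / no-churn target
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# 100 epochs, as in the run shown above; the batch size is an assumption
model.fit(X_train, y_train, batch_size=10, epochs=100)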

To check whether Keras is running on the GPU, we can use the following commands in the Jupyter notebook:

from keras import backend as K
# Lists the GPUs visible to the TensorFlow backend; an empty list
# means Keras is running on the CPU
K.tensorflow_backend._get_available_gpus()
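
Alternatively, with the TensorFlow 1.x backend the same check can be done through TensorFlow directly; an empty string means no GPU was found:

import tensorflow as tf

# Returns the name of a GPU device (e.g. '/device:GPU:0'), or '' if none
tf.test.gpu_device_name()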

We will look at achieving the same result using remote execution of a Python script in part 2 of this tutorial.