The only prerequisite for installing MOMIC locally is to have Docker and docker-compose already installed. Docker is a platform used to develop, deploy, and run applications in containers. Follow the installation instructions on each project's website (docker and docker-compose).
The minimum recommended RAM is 56GB. More may be needed with large datasets, especially for some steps of an RNASeq analysis or the imputation of GWAS data. Disk space depends on the size of your data; for reference, the data volume of momic.us.es currently occupies 260GB. MOMIC has been tested on Ubuntu, CentOS, Windows and macOS servers.
First, clone MOMIC_server into a local directory with git clone https://github.com/laumadmar/MOMIC_server.git
and inspect its contents:
docker-compose.yml: YAML file defining services, networks and volumes
Dockerfile: text document that contains the instructions to build the service. Two variants of this file are provided, named Dockerfile.steps and Dockerfile.image. The former contains the original instructions to build MOMIC server from scratch and can be fully customised. The latter uses an image with MOMIC server and the notebooks already installed. The default Dockerfile is a copy of Dockerfile.image. Rename Dockerfile.steps to Dockerfile if you want full control over the installation process.
jupyterhub_config.py: configuration file for JupyterHub
README.md: readme file with instructions on how to deploy the multiomics pipeline
software: directory containing third-party software
If required, modify the following parameters in docker-compose.yml and/or the JupyterHub configuration file:
ports:
  - "8000:8000"
volumes:
  - ./jupyterhub_config.py:/home/jupyterconfig/jupyterhub_config.py
  - ./../datavolume:/mnt/data
  #- ./../jupyterhomevolume:/home
Container port 8000 is exposed on your local machine at port 8000, specified as “host port:container port”. Change the host port if it is already in use on your local machine.
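Before editing the mapping, it can help to check whether the host port is actually taken. The following is a minimal sketch, not part of MOMIC: the helper names port_is_free and first_free_port are hypothetical, and the check only probes the loopback address, so it is an approximation of what Docker will see when it binds the port.

```python
import socket


def port_is_free(port, host="127.0.0.1", timeout=1.0):
    """Return True if no service accepts connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 only when something is listening on the port
        return sock.connect_ex((host, port)) != 0


def first_free_port(start=8000, end=8100):
    """Return the first port in [start, end) that looks free, or None."""
    for port in range(start, end):
        if port_is_free(port):
            return port
    return None
```

If port_is_free(8000) returns False, choose another host port and keep the container port unchanged, for example "8080:8000".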
Volumes are a mechanism for persisting data generated by and used by Docker containers. Three volumes are suggested here for:
- the JupyterHub config file, so there is no need to rebuild if you change this file
- the directory holding the data to be analysed
- the Jupyter home directory; the home directory of each user contains the collection of notebooks that comprise the multi-omics pipeline
The syntax for defining volumes consists of two fields separated by a colon (:). The first field is the source of the mount, a path on the host machine (relative paths are resolved from the directory containing docker-compose.yml), and the second field is the path where the file or directory is mounted in the container. Note that the Jupyter home directory volume is commented out: the directory on the host machine is empty the first time the container is created, so mounting it over /home hides the momic home directory. Uncomment it if you want to back up this data using docker volumes.
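To see which host directory a volume entry points to, the two fields can be split and the source resolved against the project directory. This is an illustrative sketch only: parse_volume is a hypothetical helper, the project directory /srv/MOMIC_server is an assumed example, and it handles POSIX-style bind mounts, not named volumes or Windows drive paths.

```python
import posixpath


def parse_volume(spec, project_dir):
    """Split 'source:target' and resolve the source against project_dir.

    An optional third field (an access mode such as 'ro') is accepted
    and ignored.
    """
    parts = spec.split(":")
    if len(parts) not in (2, 3):
        raise ValueError(f"expected 'source:target', got {spec!r}")
    source, target = parts[0], parts[1]
    # An absolute source is kept as-is; a relative one is joined to project_dir
    host_path = posixpath.normpath(posixpath.join(project_dir, source))
    return host_path, target
```

With the assumed project directory, parse_volume("./../datavolume:/mnt/data", "/srv/MOMIC_server") returns ("/srv/datavolume", "/mnt/data"), that is, the data directory sits next to the cloned repository rather than inside it.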
To have the same permissions on the data directory on the host machine and on the directory mounted as a volume in the docker container, it is advisable to create users in the container with the same name and uid as on the host machine. You can create these users in the Dockerfile or, once the container is up and running, by opening a shell in it and creating a Linux user as usual.
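One way to keep names and uids aligned is to read them on the host and generate the matching command for the container. The sketch below is an assumption about how this could be done, not part of MOMIC: useradd_command and current_user are hypothetical helpers, and the generated line relies on the standard Linux useradd tool being available in the container.

```python
import os


def useradd_command(name, uid):
    """Build a useradd call that creates `name` with a fixed numeric uid."""
    return f"useradd --create-home --uid {uid} {name}"


def current_user():
    """Return (name, uid) of the user running this script (Unix only)."""
    import pwd  # imported here because the module only exists on Unix

    uid = os.getuid()
    return pwd.getpwuid(uid).pw_name, uid
```

Running print("RUN " + useradd_command(*current_user())) on the host prints a line that can be pasted into the Dockerfile; without the RUN prefix, the same command can be executed in a shell inside the running container.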
The default parameters you can change if needed are: c.JupyterHub.bind_url = 'http://:8000/jupyter'
which sets the protocol, IP, port and base URL on which the proxy will bind, and c.Spawner.default_url = '/lab'
which loads the JupyterLab view by default. Comment out this second line to launch the Classic Notebook instead. Note that you can switch from one to the other later on.
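The parts of bind_url can be inspected with the standard library, which also shows how the address typed into the browser follows from it. This is a sketch under the assumption that the configured value is 'http://:8000/jupyter' (empty host, meaning all interfaces); describe_bind_url and browser_url are hypothetical helpers, not JupyterHub functions.

```python
from urllib.parse import urlsplit


def describe_bind_url(bind_url):
    """Break a bind_url into the pieces named in the JupyterHub config."""
    parts = urlsplit(bind_url)
    return {
        "protocol": parts.scheme,
        "ip": parts.hostname or "",  # empty string: listen on all interfaces
        "port": parts.port,
        "base_url": parts.path or "/",
    }


def browser_url(bind_url, host="localhost", host_port=None):
    """Return the address to open, optionally with a remapped host port."""
    parts = urlsplit(bind_url)
    port = host_port if host_port is not None else parts.port
    return f"{parts.scheme}://{host}:{port}{parts.path}"
```

For instance, if the host port in docker-compose.yml were changed to 8080, browser_url("http://:8000/jupyter", host_port=8080) would give http://localhost:8080/jupyter.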
Follow these steps to build and run the multiomics pipeline locally with Docker Compose. Step 5 is not needed if you use the default Dockerfile. Note that running docker commands requires sudo privileges or membership of the docker group; read more on the Docker website.
1. From your project directory, start the server by running docker-compose up. Compose builds an image from the instructions in the Dockerfile and starts the services defined.
2. Enter http://localhost:8000/jupyter in a browser to see the application running. Modify this URL accordingly if you have changed the port in docker-compose.yml or the base URL in jupyterhub_config.py.
3. Once the server is up, press CTRL+C to stop the console output. This also stops the container.
4. Run docker-compose start to start the service again in the background.
5. Note the RUN directive in the Dockerfile that installs the R libraries. It takes a very long time to execute, so it is commented out. As an alternative, install the libraries after the build by accessing the container from a terminal and executing nohup Rscript /tmp/install_specific_libraries.R &
There are two bash scripts in this repo for quickly accessing the container from a terminal and checking the logs.
The access script contains: docker exec -it momic_server_web_1 bash
The logs script contains: docker logs momic_server_web_1
Replace “momic_server_web_1” with the name of your container, which you can get from sudo docker-compose ps
If you modify the Dockerfile at some point, you need to build the image again, as follows:
docker-compose stop
docker-compose build
docker-compose up
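The same three commands can be chained from a script so that a failed build stops the sequence. This is only a sketch: rebuild is a hypothetical helper, and it swaps in docker-compose up -d (detached mode) for the plain up shown above so that the script returns instead of following the console output.

```python
import subprocess

# Commands run in order; "up -d" starts the service in the background
REBUILD_STEPS = [
    ["docker-compose", "stop"],
    ["docker-compose", "build"],
    ["docker-compose", "up", "-d"],
]


def rebuild(project_dir=".", run=subprocess.run):
    """Run the rebuild steps in project_dir, aborting on the first failure."""
    for command in REBUILD_STEPS:
        # check=True makes subprocess.run raise if the command exits non-zero
        run(command, cwd=project_dir, check=True)
    return len(REBUILD_STEPS)
```

Calling rebuild() from the directory that contains docker-compose.yml runs the full sequence; the run parameter exists only so the function can be exercised without Docker.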
Logging in the first time you fire up the container can take a bit longer. A message is shown saying: “Your server is starting up. You will be redirected automatically when it’s ready for you.” Refresh after a while if the home page does not come up.
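Instead of refreshing by hand, the hub address can be polled until it answers. The following is a minimal sketch with hypothetical helper names (http_probe, wait_until_ready); it only reports that the proxy responds over HTTP, not that a user server has finished starting.

```python
import time
import urllib.error
import urllib.request


def http_probe(url, timeout=5.0):
    """Return True if the URL answers with any HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server is up, even if it returned an error status
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timeout, DNS failure, ...


def wait_until_ready(url, probe=http_probe, attempts=30, delay=2.0):
    """Poll url until probe succeeds; return the attempt number or None."""
    for attempt in range(1, attempts + 1):
        if probe(url):
            return attempt
        time.sleep(delay)
    return None
```

With the default settings, wait_until_ready("http://localhost:8000/jupyter") retries for about a minute and returns None if the hub never answers.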
A few useful docker commands:
- docker-compose stop: stop the running container
- docker-compose ps: check the status
- docker inspect --format='{{.LogPath}}' momic_server_web_1: get the path to the log file
- docker logs momic_server_web_1: print the log to the console
- docker exec -it momic_server_web_1 bash: open a shell in the running container
Running more than one MOMIC server locally
If you need to set up more than one MOMIC instance and you are using the default Dockerfile, you will likely see an error like this: ERROR: Service 'web' failed to build: cannot mount volume over existing file, file exists /var/lib/docker/overlay2/0e2e83c65d96fb02e994608a4eb7abceb7de5368d6043ec55278f572e94129da/merged/opt/jupyterhub/jupyterhub_config.py
In this case, use this other image instead: docker pull laumadmarq/momic:built_image
which is the image as originally built, not the one created with docker commit.
If you are installing MOMIC server from scratch (using Dockerfile.steps), you need to get the collection of Jupyter notebooks from a git repository.
Log into MOMIC JupyterHub and go to the git tab in the left menu. Click on the ‘Clone a Repository’ button and provide the URL https://github.com/laumadmar/MOMIC_notebooks.git
As an alternative to the git extension, clone the repository from a terminal, either the Jupyter terminal or a shell inside the container: cd into your Jupyter home directory and type git clone https://github.com/laumadmar/MOMIC_notebooks.git
You now have in your home directory a copy of all the notebooks necessary to carry out the analysis presented in MOMIC.
Having MOMIC installed locally allows you to install new tools and libraries and fully customise this bioinformatics suite.
To install new software and R or Python libraries, open a shell in the container using the access script and proceed as you normally would. Get the required files either with wget or by copying them from your local computer to the container with the docker cp command.
Developing new analysis pipelines is as simple as creating new Jupyter notebooks and/or extra scripts in whichever language is needed. Note that a Python notebook can also run R code via the rpy2 library, which is loaded with %load_ext rpy2.ipython