Installing Transrate



Written by: Richard Smith-Unna
Last updated: 2013-10-19 12:53:00 +0000

NOTE... Transrate is now much easier to install! Just follow the instructions here.

Transrate is a program for analysing the quality of transcriptome assemblies. It's designed for anyone who is doing de-novo assembly of transcriptomes from RNA-Seq data. Recently I've had several requests from non-expert users for help installing transrate. This set of instructions is aimed at helping users new to the Linux/Unix environment to get up and running.

Transrate is written in the Ruby programming language. This makes it fairly easy to install. It also depends on some external software, which can be more complicated for new users to install. Here we'll go through the whole process step-by-step.

Installing Ruby

The easiest way to get Ruby installed is to use the RVM (Ruby Version Manager).

First, open up your Terminal app. Then, paste in the commands below and hit enter.

Note: commands in boxes like the one below can be copy-pasted into the terminal to run them. Lines starting with a # are comments, and will do nothing when run.

# download and run the RVM installation code
\curl -L https://get.rvm.io | bash -s stable --ruby

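That was easy! When the installer finishes, it prints a line telling you how to load RVM in the current terminal. Opening a new terminal window usually has the same effect, and after that the ruby and gem commands should be available.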

Installing Transrate

Programs written in Ruby can be packaged up in an easy-to-install bundle called a gem. There's a great website, RubyGems, that lists the available gems. To install a gem, you just run:

gem install gemname

Transrate is a gem, so to install it we run:

gem install transrate

This will install transrate itself along with all the other Ruby gems it depends on.

External dependencies

Three programs, USEARCH, Bowtie 2 and eXpress, are required by transrate.

A note about the PATH

In order to install these programs, you need to put the executable program files in a directory on your computer, and then tell the computer where to find them. You do this by adding the location of the files to your PATH.

PATH is an environment variable that simply lists the places you have programs installed. When you run the Terminal, it loads the PATH and then any time you run a command, it searches all the directories listed in the PATH for the program whose name you have typed.

You can check the current PATH by running:

echo $PATH

You'll get an output like this:

/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin

That's a list of directory locations separated by colons. So if you run a command, say ls, your shell will first look in /usr/bin for an executable file called ls, then if it doesn't find one it will look in /bin, and so on down the list until it either finds the file or runs out of places to look.
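
If you'd like to see this search order for yourself, here's a small sketch using only standard shell tools. It prints the PATH one directory per line, in the order the shell searches them, and then asks the shell which file it would actually run for ls. The exact directories you see will depend on your system.

```shell
# print each PATH entry on its own line, in search order
echo "$PATH" | tr ':' '\n'
# ask the shell which file it would run for the command 'ls'
command -v ls
```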

To add to the PATH, you just replace the current setting with a new one that includes the old one, plus the location you want to add:

# add a new path to the existing one
export PATH=$PATH:/a/new/path

There's more useful information about the PATH here and here.

Creating an installation directory

It's useful to have one place where you install new programs. That way, you only have to add a directory to the PATH once, and you always know where to put new things.

Let's create an ~/apps directory where we can download program files, and an ~/apps/bin directory where we'll put links to the executables.

mkdir ~/apps
cd ~/apps
mkdir bin

Now let's tell the command line where to find our new directory. We do this by making a command to add the location to the path, and then putting that command in a file that gets run each time the terminal opens:

echo "export PATH=\$PATH:~/apps/bin" >> ~/.bashrc
source ~/.bashrc

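By default, the file .bashrc in your home directory contains a list of commands that get executed whenever you start a new Bash shell (for example, by opening a new terminal window). On a Mac, the Terminal may read a different file such as ~/.bash_profile, so if the change doesn't seem to stick, try adding the same line there.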

Now if we place a file in ~/apps/bin and make it executable, we can run it by name from any location.
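
Here's a minimal sketch of that idea that you can try safely. It uses a throwaway directory as a stand-in for ~/apps/bin, and a tiny made-up program called hello-apps (not a real tool), so it won't touch anything you've just set up.

```shell
# make a throwaway directory to stand in for ~/apps/bin
demo_bin=$(mktemp -d)
# create a tiny program called 'hello-apps' (a made-up name for this demo)
printf '#!/bin/sh\necho "hello from hello-apps"\n' > "$demo_bin/hello-apps"
# make the file executable
chmod +x "$demo_bin/hello-apps"
# add the directory to the PATH for this terminal session only
export PATH="$PATH:$demo_bin"
# run the program by name, from wherever we happen to be
hello-apps
```

Because the export here isn't written to .bashrc, the change disappears when you close the terminal.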

USEARCH

USEARCH is similar to BLAST, but much, much faster and more versatile. We use it to align the assembled contigs with proteins from a reference species.

Go to the USEARCH website and register to download the latest version of USEARCH for your operating system. You'll receive an email that contains a link. Run the following commands to install USEARCH, making sure to put the download link you were sent in place of the one I've got here:

# change to the apps directory
cd ~/apps
# download the remote file to a local file called 'usearch'
# (the quotes stop the shell from trying to interpret the ? in the link)
curl -o usearch "http://drive5.com/cgi-bin/upload3.py?license=201310190445061132"
# make the 'usearch' file executable
chmod +x usearch
# create a symbolic link to the file in the bin directory
cd ~/apps/bin
ln -s ~/apps/usearch .

Now, you should be able to run usearch in the terminal and see some output like this:

usearch v7.0.1001_i86linux32, 4.0Gb RAM (32.9Gb total), 8 cores
(C) Copyright 2013 Robert C. Edgar, all rights reserved.
http://drive5.com

Licensed to: rds45@cam.ac.uk
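
If the symbolic link step seems mysterious, this small sketch shows what a link does. It uses throwaway directories and a made-up program called tool as stand-ins for ~/apps, ~/apps/bin and usearch, so these names are assumptions for the demo only.

```shell
# throwaway stand-ins for ~/apps and ~/apps/bin
demo_apps=$(mktemp -d)
mkdir "$demo_apps/bin"
# a stand-in program file called 'tool'
printf '#!/bin/sh\necho "I am the real file"\n' > "$demo_apps/tool"
chmod +x "$demo_apps/tool"
# link it into bin, the same way we linked usearch
ln -s "$demo_apps/tool" "$demo_apps/bin/tool"
# show where the link points
readlink "$demo_apps/bin/tool"
# running the link runs the real file
"$demo_apps/bin/tool"
```

The link is just a pointer, so the real file stays where you downloaded it and only the pointer lives in bin.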

Bowtie 2

Bowtie 2 is used to align reads to the assembled transcriptome. We'll follow a similar procedure to the one we used for USEARCH.

Go to the download site and locate the latest version for your operating system. Right-click the link and choose 'copy link address'. Now run the following commands in the terminal, substituting the link you just copied for the one I've used if necessary:

# change to the apps directory
cd ~/apps
# download the bowtie2 zip to bowtie2.zip
curl -L -o bowtie2.zip http://sourceforge.net/projects/bowtie-bio/files/bowtie2/2.1.0/bowtie2-2.1.0-linux-x86_64.zip/download
# extract
unzip bowtie2.zip
# symlink all the necessary executables to bin
cd ~/apps/bin
ln -s ~/apps/bowtie2-2.1.0/bowtie2* .

Now, you should be able to run bowtie2 --version in the terminal and see some output like:

/home/rds45/apps/bowtie2-2.1.0/bowtie2-align version 2.1.0
64-bit
Built on do-dmxp-mac.win.ad.jhu.edu
Tue Feb 26 13:34:02 EST 2013
Compiler: gcc version 4.1.2 20080704 (Red Hat 4.1.2-54)
Options: -O3 -m64 -msse2 -funroll-loops -g3
Sizeof {int, long, long long, void*, size_t, off_t}: {4, 8, 8, 8, 8, 8}

eXpress

eXpress is used to estimate how much support each contig has based on the read alignments.

Again, the procedure is similar...

Go to the eXpress website and hover over the 'Download' menu item. Right-click on the link for your operating system in the drop-down menu, and choose 'Copy Link Address'.

Now run the following in the terminal, substituting in your copied link and filename if necessary:

# change to apps directory
cd ~/apps
# download the tarred, gzipped file
curl -O http://bio.math.berkeley.edu/eXpress/downloads/express-1.4.1/express-1.4.1-linux_x86_64.tgz
# extract the contents
tar xvf express-1.4.1-linux_x86_64.tgz
# change to the extracted directory
cd express-1.4.1-linux_x86_64
# make the file executable
chmod +x express
# symlink to bin
cd ~/apps/bin
ln -s ~/apps/express*/express .

Now, you should be able to run express -h and see output that starts something like this (your version number may differ):

express v1.4.0
-----------------------------
File Usage:  express [options] <target_seqs.fa> <hits.(sam/bam)>
Piped Usage: bowtie [options] -S <index> <reads.fq> | express [options] <target_seqs.fa>

Congratulations! You've installed all the dependencies.
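
If you want to double-check, here's a quick sketch that asks the shell whether it can find each of the three programs on the PATH. The check_tool function is just a helper defined here for convenience, not part of transrate.

```shell
# report whether a program can be found on the PATH
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found at $(command -v "$1")"
  else
    echo "$1: MISSING from PATH"
  fi
}

# check each of the external dependencies in turn
for tool in usearch bowtie2 express; do
  check_tool "$tool"
done
```

If any of them is reported as missing, check that its link exists in ~/apps/bin and that ~/apps/bin appears in the output of echo $PATH.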

Running transrate

Finally, you can run transrate! To get help in the terminal, run transrate --help. You should see:

Transrate v0.0.1a by Richard Smith <rds45@cam.ac.uk>

DESCRIPTION:
Analyse a de-novo transcriptome
assembly using three kinds of metrics:

1. contig-based
2. read-mapping
3. reference-based

Please make sure USEARCH and bowtie2 are both installed
and in the PATH.

Bug reports and feature requests at:
http://github.com/blahah/transrate

USAGE:
transrate <options>

OPTIONS:
    --assembly, -a <s>:   assembly file in FASTA format
   --reference, -r <s>:   reference proteome file in FASTA format
        --left, -l <s>:   left reads file in FASTQ format
       --right, -i <s>:   right reads file in FASTQ format
  --insertsize, -n <i>:   mean insert size (default: 200)
    --insertsd, -s <i>:   insert size standard deviation (default: 50)
     --threads, -t <i>:   number of threads to use (default: 8)
         --version, -v:   Print version and exit
            --help, -h:   Show this message
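
As a rough sketch of how these options fit together, a run with paired-end reads might look something like the one below. The file names are made up for illustration, so replace them with your own, and treat the thread count as an assumption too. The options themselves are taken from the help output above.

```shell
# hypothetical input files: replace these with your own
assembly=assembly.fa
reference=reference_proteins.fa
left=reads_1.fq
right=reads_2.fq
# number of threads to use (pick what suits your machine)
threads=4

if ! command -v transrate >/dev/null 2>&1; then
  # transrate isn't installed or isn't on the PATH
  echo "transrate not found on the PATH"
elif [ ! -f "$assembly" ]; then
  # the placeholder file names haven't been changed yet
  echo "edit the file names above to point at your own data"
else
  transrate --assembly "$assembly" \
            --reference "$reference" \
            --left "$left" \
            --right "$right" \
            --threads "$threads"
fi
```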

If you need any further help, please post to the help group, or leave a comment below.

