What is a Terminal?

The terminal is a text-based interface to your computer. Instead of clicking icons and dragging files, you type commands that tell the computer exactly what to do.

For bioinformatics, the terminal is not just another way to use your computer - it is the way to do computational biology.

Why Text Commands?

You might wonder why bioinformaticians use text commands instead of graphical interfaces. The answer is simple: scale and reproducibility.

A single command can process thousands of files in seconds. The same task with a graphical interface would take hours or be completely impossible.

Consider these real scenarios:

Scenario 1: Quality Control on 96 Samples

With a graphical tool, you would:

  1. Open the application
  2. Click File → Open
  3. Select sample 1
  4. Click Analyze
  5. Click Export Results
  6. Repeat 95 more times

With the terminal:

for file in samples/*.fastq.gz; do fastqc "$file" -o qc_reports/; done
Output
Analysis complete for sample_01.fastq.gz
Analysis complete for sample_02.fastq.gz
...
Analysis complete for sample_96.fastq.gz

96 samples processed in 8 minutes

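Process all 96 samples with a single command. The loop runs FastQC on each file in turn, FastQC names the output files automatically, and the batch completes while you get coffee. The qc_reports/ directory must exist before you start, because FastQC does not create it.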
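The loop pattern can be tried without FastQC installed. The sketch below is only an illustration: the `demo` directory and the three placeholder file names are made up, and `echo` stands in for the real `fastqc` call.

```shell
# Create a throwaway directory with three empty placeholder "samples".
demo=$(mktemp -d)
mkdir -p "$demo/samples" "$demo/qc_reports"
touch "$demo/samples/sample_01.fastq.gz" \
      "$demo/samples/sample_02.fastq.gz" \
      "$demo/samples/sample_03.fastq.gz"

# Same loop shape as the FastQC example; echo is a stand-in for fastqc.
processed=0
for file in "$demo"/samples/*.fastq.gz; do
    echo "Would run: fastqc $file -o $demo/qc_reports/"
    processed=$((processed + 1))
done
echo "$processed files visited"
```

The quotes around `"$file"` keep file names containing spaces in one piece. The loop visits files one after another, so the total run time grows with the number of samples.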

Scenario 2: Finding Specific Sequences Across Multiple Genomes

You need to find all genes containing a specific protein domain across 20 bacterial genomes. Each genome file is 5 GB.

Graphical approach: Open each file (if your text editor can even handle 5 GB), search manually, copy results. This would take days and likely crash your computer.

Terminal approach:

grep -h 'Zinc_finger_domain' genomes/*.gff | cut -f9 | sort -u > zinc_finger_genes.txt
Output
Found 2,847 unique genes with Zinc finger domains across 20 genomes

Search 100 GB of annotation data in 12 seconds. The command searches all files for matching lines, keeps only column 9 (the attributes column, which carries the gene IDs), removes duplicates, and saves the results.
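To see what each stage of that pipeline does, here is a small sketch on two tiny GFF-like files. The records, gene names, and the `demo` directory are invented for illustration; real annotation files will look different.

```shell
# Build two tiny tab-separated, GFF-like files (made-up records).
demo=$(mktemp -d)
printf 'chr1\tsrc\tgene\t1\t90\t.\t+\t.\tID=geneA;Note=Zinc_finger_domain\n'   > "$demo/a.gff"
printf 'chr1\tsrc\tgene\t100\t200\t.\t-\t.\tID=geneB;Note=Kinase_domain\n'    >> "$demo/a.gff"
printf 'chr2\tsrc\tgene\t5\t80\t.\t+\t.\tID=geneA;Note=Zinc_finger_domain\n'   > "$demo/b.gff"
printf 'chr2\tsrc\tgene\t300\t450\t.\t+\t.\tID=geneC;Note=Zinc_finger_domain\n' >> "$demo/b.gff"

# grep -h : print matching lines from all files, without file names
# cut -f9 : keep only the ninth tab-separated column
# sort -u : sort and drop duplicate lines
grep -h 'Zinc_finger_domain' "$demo"/*.gff | cut -f9 | sort -u > "$demo/zinc_finger_genes.txt"

cat "$demo/zinc_finger_genes.txt"
unique_count=$(wc -l < "$demo/zinc_finger_genes.txt" | tr -d ' ')
echo "$unique_count unique entries"
```

Three lines match the pattern, but geneA appears in both files with an identical attributes column, so `sort -u` leaves two entries.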

Terminal vs Shell vs Command Line

You will hear these terms used interchangeably, but they have specific meanings:

Terminology

Terminal: The window where you type commands. Also called a terminal emulator.

Shell: The program that interprets your commands. Common shells include bash, zsh, and fish.

Command Line: The text interface where you type commands. Sometimes called the CLI (Command Line Interface).

Think of it like this:

  • The terminal is the window (the envelope)
  • The shell is the interpreter (the translator)
  • The command line is where you type (the message)
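If you are curious which shell is behind your terminal window, the following sketch is one way to check. It assumes the `ps` command is available and falls back to the word "unknown" if it is not.

```shell
# The login shell configured for your account
# (this may differ from the shell that is running right now).
echo "Login shell: ${SHELL:-unknown}"

# The name of the shell process interpreting these lines.
current_shell=$(ps -p $$ -o comm= 2>/dev/null || echo unknown)
echo "Current shell: $current_shell"
```

On many Linux systems both lines report bash; recent macOS versions typically report zsh.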

Anatomy of a Terminal Window

When you open a terminal, you see something like this:

user@hostname:~/projects/rnaseq$ 

This is the command prompt. It shows your username, computer name, and current directory. The $ symbol means you're ready to type a command.

Let's break down what you see:

  • user - Your username on the system
  • @hostname - The name of the computer you're using
  • ~/projects/rnaseq - Your current directory (~ is shorthand for your home directory)
  • $ - The prompt symbol (ready for your command)

On HPC clusters or remote servers, the hostname tells you which machine you're connected to. This is important when managing jobs across multiple nodes.
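The pieces of the prompt can also be looked up with ordinary commands. This sketch rebuilds a prompt-like line by hand; the variable names are made up for illustration, and a real prompt usually shortens the home directory to ~ and may show a shorter hostname.

```shell
# Collect the same three pieces of information the prompt displays.
user_name=$(id -un)      # username (the same value whoami prints)
host_name=$(uname -n)    # machine name (what hostname prints on most systems)
current_dir=$(pwd)       # full path of the current directory

# Assemble them in the user@hostname:directory$ layout.
prompt_line="${user_name}@${host_name}:${current_dir}\$"
echo "$prompt_line"
```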

Your First Commands

Let's start with the most basic commands. These work on any Unix-like system, including Linux and macOS.

whoami - Who Am I?

whoami
Output
scotthandley

Shows your username. Useful to confirm which account you're using, especially on shared systems.

hostname - What Computer Am I Using?

hostname
Output
bio-cluster-node12.university.edu

Shows the name of the computer. On HPC systems, this tells you which compute node you're on.

date - What Time Is It?

date
Output
Thu Nov 20 14:23:45 PST 2025

Shows the current date and time. Useful for timestamping analyses in your notes.

echo - Display Text

echo "Starting genome alignment at $(date)"
Output
Starting genome alignment at Thu Nov 20 14:23:45 PST 2025

Prints text to the terminal. The $(date) part is command substitution: the shell runs date first and inserts its output into the text. Useful for adding timestamps to log files.
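As a minimal sketch of timestamped logging, the lines below append two messages to a log file. The file name analysis.log and the temporary directory are assumptions for the example; the date format string produces a sortable year-month-day timestamp.

```shell
# Hypothetical log file inside a temporary directory.
log_file="$(mktemp -d)/analysis.log"

# $(...) runs the inner command and pastes its output into the string.
# >> appends to the file instead of overwriting it.
echo "Starting genome alignment at $(date '+%Y-%m-%d %H:%M:%S')" >> "$log_file"
echo "Finished genome alignment at $(date '+%Y-%m-%d %H:%M:%S')" >> "$log_file"

cat "$log_file"
```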

Running Bioinformatics Programs

Most bioinformatics tools are command-line programs. They follow a common pattern:

program_name [options] input_file

Basic structure of a bioinformatics command. The program name comes first, followed by options (flags that modify behavior), then the input file(s).
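The same program, options, input pattern applies to everyday Unix tools, which makes them a convenient place to practice. The sketch below uses a made-up four-line file and two standard commands.

```shell
# Create a small input file (made-up contents) to run commands on.
input_file="$(mktemp -d)/reads.txt"
printf 'ACGT\nTTGA\nGGCC\nATAT\n' > "$input_file"

# program: head   option: -n 2 (first two lines only)   input: the file
first_two=$(head -n 2 "$input_file")
echo "$first_two"

# program: wc   option: -l (count lines)   input: the file
wc_output=$(wc -l "$input_file")
echo "$wc_output"
```

Bioinformatics tools follow the same shape, only with their own option names.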

Here are real examples:

Common Bioinformatics Commands

fastqc sample.fastq.gz
Output
Analysis complete for sample.fastq.gz
HTML report: sample_fastqc.html

Why Bioinformatics Requires the Terminal

Five fundamental reasons why graphical interfaces cannot replace the terminal for genomics:

  1. File Sizes: Genomic files are often too large to open in regular programs. A human genome BAM file is 100+ GB. Command-line tools read these files as a stream, a piece at a time, without loading them entirely into memory.

  2. Batch Processing: Process hundreds or thousands of samples with the same commands. You write the command once, it runs on all files.

  3. Reproducibility: Commands can be saved as scripts and shared. Anyone can run the exact same analysis on their computer or cluster.

  4. Remote Computing: Most powerful computers for bioinformatics are remote servers or HPC clusters. You access them through the terminal.

  5. Tool Availability: Most bioinformatics software is command-line only. BLAST, STAR, Bowtie, GATK, SAMtools - all require the terminal.

Excel cannot hold more than 1,048,576 rows in a worksheet. A single FASTQ file from Illumina sequencing can contain 200+ million reads, at four lines per read. The terminal handles this easily.
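Counting reads is a good example of streaming. The sketch below builds a tiny compressed FASTQ file with two made-up reads and counts them without ever writing the decompressed text to disk. It assumes each record occupies exactly four lines, which is the usual layout for Illumina output.

```shell
# Write a tiny two-read FASTQ file (made-up reads) and compress it.
demo=$(mktemp -d)
printf '@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nIIII\n' | gzip -c > "$demo/sample.fastq.gz"

# Decompress to a stream and count lines as they pass through;
# only a small buffer is held in memory at any time.
line_count=$(gzip -dc "$demo/sample.fastq.gz" | wc -l | tr -d ' ')

# Each FASTQ record occupies four lines.
read_count=$((line_count / 4))
echo "$read_count reads"
```

The same two commands work unchanged on a file with hundreds of millions of reads; they simply take longer.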

Common Terminal Applications

Depending on your operating system, you use different terminal applications:

macOS: Terminal.app (built-in) or iTerm2 (popular alternative)

Linux: GNOME Terminal, Konsole, or xterm (varies by distribution)

Windows: Windows Subsystem for Linux (WSL2), Git Bash, or PuTTY for remote connections

For Windows users, WSL2 provides a full Linux environment (Ubuntu by default). This is the recommended setup for bioinformatics on Windows.

Remote Terminals and SSH

Most bioinformatics computing happens on remote servers or HPC clusters. You connect to these systems using SSH (Secure Shell):

ssh username@bio-cluster.university.edu
Output
Welcome to Bio-HPC Cluster
Last login: Thu Nov 20 14:15:23 2025 from 10.0.1.45

username@bio-cluster:~$ 

Connect to a remote server using SSH. You'll be prompted for your password, unless key-based login has been set up. After logging in, any commands you type run on the remote server, not your local computer.

This is how you submit jobs to HPC clusters, analyze large datasets on powerful machines, and access institutional computing resources.

Terminal Best Practices

As you start using the terminal, keep these principles in mind:

  1. Tab completion is your friend: Press Tab to autocomplete file names and commands
  2. Up arrow recalls previous commands: Don't retype - press ↑ to cycle through command history
  3. Copy commands exactly: UNIX is case-sensitive and space-sensitive
  4. Read error messages: They usually tell you exactly what went wrong
  5. Don't run commands you don't understand: Especially commands involving rm (delete)
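One habit that supports the last point is previewing what a pattern matches before deleting anything. A minimal sketch, run inside a throwaway directory with made-up file names:

```shell
# Work inside a throwaway directory so nothing real is at risk.
demo=$(mktemp -d)
touch "$demo/keep.fastq" "$demo/old_1.tmp" "$demo/old_2.tmp"

# Step 1: preview which files the pattern matches.
ls "$demo"/*.tmp

# Step 2: only after checking the list, delete with the same pattern.
rm "$demo"/*.tmp

# The file that did not match the pattern is still there.
remaining=$(ls "$demo")
echo "$remaining"
```

Running `rm -i` instead of `rm` asks for confirmation before each file is removed, which is slower but harder to get wrong.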

What You Cannot Do in the Terminal

The terminal is powerful, but not magic. Here's what it cannot do:

  • View images or plots directly (you generate them, then open in another program)
  • Edit documents with formatting (no bold, italics, colors in plain text)
  • Browse the web (though you can download files)
  • Play videos or audio

For bioinformatics, these limitations don't matter. You use the terminal to process data, then use other tools to visualize results.

Practice Environment

The evomics-learn platform provides a safe environment to practice terminal commands. You cannot break anything, and you get instant feedback on whether your commands are correct.

Next Steps

Now that you understand what the terminal is and why bioinformatics requires it, the next section covers how to navigate the file system and find your way around.

You'll learn:

  • How directories are organized
  • How to move between directories
  • How to list files and see what's on your system
  • How to understand file paths

Further Reading