What is a Terminal?
The terminal is a text-based interface to your computer. Instead of clicking icons and dragging files, you type commands that tell the computer exactly what to do.
For bioinformatics, the terminal is not just another way to use your computer - it is the way to do computational biology.
Why Text Commands?
You might wonder why bioinformaticians use text commands instead of graphical interfaces. The answer is simple: scale and reproducibility.
A single command can process thousands of files in seconds. The same task with a graphical interface would take hours or be completely impossible.
Consider these real scenarios:
Scenario 1: Quality Control on 96 Samples
With a graphical tool, you would:
- Open the application
- Click File → Open
- Select sample 1
- Click Analyze
- Click Export Results
- Repeat 95 more times
With the terminal:
for file in samples/*.fastq.gz; do fastqc "$file" -o qc_reports/; done
Analysis complete for sample_01.fastq.gz
Analysis complete for sample_02.fastq.gz
...
Analysis complete for sample_96.fastq.gz
96 samples processed in 8 minutes
Process all 96 samples with a single command. The loop runs FastQC on each file in turn, the output files are named automatically, and the whole batch completes while you get coffee.
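You can try the loop pattern itself without FastQC installed. The sketch below is a stand-in built on assumptions: run_qc is a hypothetical placeholder function that only writes a small text file where a real pipeline would call fastqc, and the three sample files are empty dummies created in a temporary directory.

```shell
#!/usr/bin/env bash
# Sketch of the batch-processing loop using dummy data.
# run_qc is a hypothetical stand-in for a real tool such as fastqc.

workdir=$(mktemp -d)    # scratch directory, safe to delete afterwards
mkdir -p "$workdir/samples" "$workdir/qc_reports"

# Create three empty placeholder "samples"
for n in 01 02 03; do
    : > "$workdir/samples/sample_${n}.fastq.gz"
done

run_qc() {
    # $1 = input file, $2 = output directory
    local name
    name=$(basename "$1" .fastq.gz)
    echo "QC placeholder for $name" > "$2/${name}_report.txt"
}

processed=0
for file in "$workdir"/samples/*.fastq.gz; do
    run_qc "$file" "$workdir/qc_reports"    # quotes keep unusual file names intact
    processed=$((processed + 1))
done

echo "Processed $processed samples"
```

With real data, swapping the body of run_qc for fastqc "$1" -o "$2" gives the loop from the scenario. The files are still handled one after another, not in parallel.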
Scenario 2: Finding Specific Sequences Across Multiple Genomes
You need to find all genes containing a specific protein domain across 20 bacterial genomes. Each genome file is 5 GB.
Graphical approach: Open each file (if your text editor can even handle 5 GB), search manually, copy results. This would take days and likely crash your computer.
Terminal approach:
grep -h 'Zinc_finger_domain' genomes/*.gff | cut -f9 | sort -u > zinc_finger_genes.txt
Found 2,847 unique genes with Zinc finger domains across 20 genomes
Search 100 GB of annotation data in 12 seconds. The command searches all files, extracts the attributes column (column 9, which holds the gene IDs), removes duplicates, and saves the results.
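To see what each stage of that pipeline does, it helps to run it on a few lines you can read by eye. The sketch below uses two tiny, made-up GFF files; the gene names, coordinates, and Note values are invented for illustration only.

```shell
#!/usr/bin/env bash
# Sketch: the grep | cut | sort pipeline on two tiny, made-up GFF files.

workdir=$(mktemp -d)
mkdir -p "$workdir/genomes"

# GFF columns are tab-separated; column 9 holds the attributes.
printf 'chr1\tdemo\tgene\t100\t900\t.\t+\t.\tID=geneA;Note=Zinc_finger_domain\n' >  "$workdir/genomes/genome1.gff"
printf 'chr1\tdemo\tgene\t950\t1800\t.\t-\t.\tID=geneB;Note=Kinase_domain\n'     >> "$workdir/genomes/genome1.gff"
printf 'chr2\tdemo\tgene\t10\t500\t.\t+\t.\tID=geneC;Note=Zinc_finger_domain\n'  >  "$workdir/genomes/genome2.gff"
# Deliberate duplicate line, so that sort -u has something to remove
printf 'chr2\tdemo\tgene\t10\t500\t.\t+\t.\tID=geneC;Note=Zinc_finger_domain\n'  >> "$workdir/genomes/genome2.gff"

# grep -h drops the file-name prefix so cut sees clean columns;
# cut -f9 keeps only the attributes column; sort -u removes duplicates.
grep -h 'Zinc_finger_domain' "$workdir"/genomes/*.gff \
    | cut -f9 \
    | sort -u > "$workdir/zinc_finger_genes.txt"

cat "$workdir/zinc_finger_genes.txt"
```

The result file holds two lines, one for geneA and one for geneC. Note that cut -f9 returns the whole attributes field, so a further step would be needed if you wanted the bare gene ID on its own.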
Terminal vs Shell vs Command Line
You will hear these terms used interchangeably, but they have specific meanings:
Terminal: The window where you type commands. Also called a terminal emulator.
Shell: The program that interprets your commands. Common shells include bash, zsh, and fish.
Command Line: The text interface where you type commands. Sometimes called the CLI (Command Line Interface).
Think of it like this:
- The terminal is the window (the envelope)
- The shell is the interpreter (the translator)
- The command line is where you type (the message)
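To check which shell sits behind your own terminal window, a quick sketch like the following can help. It assumes the SHELL environment variable is set, which is typical on Linux and macOS but not guaranteed in every environment.

```shell
#!/usr/bin/env bash
# Sketch: ask the system which shell is configured and what is running now.

# SHELL normally holds the path of your login shell; it may be unset
# in minimal environments, hence the fallback text.
login_shell="${SHELL:-unknown}"
echo "Login shell: $login_shell"

# $0 is the name the current shell or script was started with.
echo "Currently running: $0"
```

The two values can differ, for example when your login shell is zsh but you start bash by hand inside the same terminal window.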
Anatomy of a Terminal Window
When you open a terminal, you see something like this:
user@hostname:~/projects/rnaseq$
This is the command prompt. It shows your username, computer name, and current directory. The $ symbol means you're ready to type a command.
Let's break down what you see:
- user - Your username on the system
- @hostname - The name of the computer you're using
- ~/projects/rnaseq - Your current directory (~ is shorthand for your home directory)
- $ - The prompt symbol (ready for your command)
On HPC clusters or remote servers, the hostname tells you which machine you're connected to. This is important when managing jobs across multiple nodes.
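Each piece of the prompt can also be retrieved with a command. The sketch below rebuilds a prompt-like string by hand. It is only an approximation of what your shell does: it assumes HOME is set, and it uses id -un and uname -n, which normally print the same values as whoami and hostname.

```shell
#!/usr/bin/env bash
# Sketch: rebuild a "user@hostname:directory$ " prompt string by hand.

user_name=$(id -un)      # normally the same value whoami prints
host_name=$(uname -n)    # normally the same value hostname prints

# Abbreviate the home directory to ~, as the prompt does
case "$PWD" in
    "$HOME"*) short_dir="~${PWD#"$HOME"}" ;;
    *)        short_dir="$PWD" ;;
esac

prompt_text="${user_name}@${host_name}:${short_dir}\$ "
echo "$prompt_text"
```

Real shells build the prompt from a configurable template, so what you see on a given system may contain more or fewer pieces than this.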
Your First Commands
Let's start with the most basic commands. These work on any UNIX system.
whoami - Who Am I?
whoami
scotthandley
Shows your username. Useful to confirm which account you're using, especially on shared systems.
hostname - What Computer Am I Using?
hostname
bio-cluster-node12.university.edu
Shows the name of the computer. On HPC systems, this tells you which compute node you're on.
date - What Time Is It?
date
Thu Nov 20 14:23:45 PST 2025
Shows the current date and time. Useful for timestamping analyses in your notes.
echo - Display Text
echo "Starting genome alignment at $(date)"
Starting genome alignment at Thu Nov 20 14:23:45 PST 2025
Prints text to the terminal. The shell runs the command inside $(date) and substitutes its output, here the current date and time. Useful for adding timestamps to log files.
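A minimal sketch of the log-file use mentioned above: the >> operator appends each message to a file instead of printing it. The file name analysis.log is a hypothetical choice, and the comment in the middle marks where a real alignment command would go.

```shell
#!/usr/bin/env bash
# Sketch: append timestamped messages to a log file.
# analysis.log is a hypothetical file name inside a temporary directory.

log_file="$(mktemp -d)/analysis.log"

echo "Starting genome alignment at $(date)" >> "$log_file"
# ... the real alignment command would run here ...
echo "Finished genome alignment at $(date)" >> "$log_file"

cat "$log_file"
```

Because >> appends and never overwrites, running the same script again adds new lines below the old ones, which keeps a running history of your analyses.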
Running Bioinformatics Programs
Most bioinformatics tools are command-line programs. They follow a common pattern:
program_name [options] input_file
Basic structure of a bioinformatics command. The program name comes first, followed by options (flags that modify behavior), then the input file(s).
Here are real examples:
Common Bioinformatics Commands
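As a sketch of the program, options, input pattern, the block below uses only standard UNIX tools so that it runs without installing anything. The FASTA file and its three short sequences are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: "program [options] input_file" with standard tools on a made-up FASTA file.

workdir=$(mktemp -d)
fasta="$workdir/proteins.fasta"    # hypothetical file name
printf '>seq1\nMKTAYIAK\n>seq2\nMVLSPADK\n>seq3\nMGSSHHHH\n' > "$fasta"

# program = head, option = -n 2 (first two lines), input = the FASTA file
first_record=$(head -n 2 "$fasta")

# program = grep, option = -c (count matching lines), input = the FASTA file
sequence_count=$(grep -c '^>' "$fasta")

# program = tail, option = -n 1 (last line only), input = the FASTA file
last_line=$(tail -n 1 "$fasta")

echo "$first_record"
echo "Sequences: $sequence_count"
echo "Last line: $last_line"
```

Dedicated tools follow the same shape. For instance, samtools view -c aligned.bam counts the records in a BAM file, where aligned.bam is a placeholder name.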
Why Bioinformatics Requires the Terminal
Five fundamental reasons why graphical interfaces cannot replace the terminal for genomics:
- File Sizes: Genomic files are often too large to open in regular programs. A human genome BAM file can exceed 100 GB. Command-line tools can stream through these files without loading them entirely into memory.
- Batch Processing: Process hundreds or thousands of samples with the same commands. You write the command once, and it runs on all files.
- Reproducibility: Commands can be saved as scripts and shared. Anyone can run the exact same analysis on their computer or cluster.
- Remote Computing: The most powerful computers for bioinformatics are remote servers or HPC clusters. You access them through the terminal.
- Tool Availability: Most bioinformatics software is command-line only. BLAST, STAR, Bowtie, GATK, and SAMtools are all run from the terminal.
An Excel worksheet is limited to 1,048,576 rows. A single FASTQ file from Illumina sequencing can contain 200+ million reads, and each read takes up four lines. Command-line tools handle this easily because they read the file as a stream.
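To make the streaming idea concrete, here is a small sketch that counts the reads in a gzip-compressed FASTQ file without ever decompressing it to disk. The two-read file is made up for illustration, and the division by four assumes the common layout in which every record spans exactly four lines.

```shell
#!/usr/bin/env bash
# Sketch: count reads in a gzip-compressed FASTQ file by streaming it.
# The two-read file below is made up for illustration.

workdir=$(mktemp -d)
fastq_gz="$workdir/sample.fastq.gz"

printf '@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nIIII\n' | gzip -c > "$fastq_gz"

# gzip -dc decompresses to standard output; wc -l counts lines as they
# stream past, so the file is never held in memory as a whole.
line_count=$(gzip -dc "$fastq_gz" | wc -l)

# Assumes four lines per record (header, sequence, separator, qualities)
read_count=$((line_count / 4))

echo "Reads: $read_count"
```

The same two-stage pipeline works unchanged on a file with hundreds of millions of reads. Only the running time grows, not the memory use.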
Common Terminal Applications
Depending on your operating system, you use different terminal applications:
macOS: Terminal.app (built-in) or iTerm2 (popular alternative)
Linux: GNOME Terminal, Konsole, or xterm (varies by distribution)
Windows: Windows Subsystem for Linux (WSL2), Git Bash, or PuTTY for remote connections
For Windows users, WSL2 (Windows Subsystem for Linux) provides a full Linux environment, with Ubuntu as the default distribution. This is the recommended setup for bioinformatics on Windows.
Remote Terminals and SSH
Most bioinformatics computing happens on remote servers or HPC clusters. You connect to these systems using SSH (Secure Shell):
ssh username@bio-cluster.university.edu
Welcome to Bio-HPC Cluster
Last login: Thu Nov 20 14:15:23 2025 from 10.0.1.45
username@bio-cluster:~$
Connect to a remote server using SSH. You'll be prompted for your password.
Once connected, everything you type runs on the remote server, not your local computer. This is how you submit jobs to HPC clusters, analyze large datasets on powerful machines, and access institutional computing resources.
Terminal Best Practices
As you start using the terminal, keep these principles in mind:
- Tab completion is your friend: Press Tab to autocomplete file names and commands
- Up arrow recalls previous commands: Don't retype - press ↑ to cycle through command history
- Copy commands exactly: UNIX is case-sensitive and space-sensitive
- Read error messages: They usually tell you exactly what went wrong
- Don't run commands you don't understand: Especially commands involving rm (delete)
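One cautious habit for the last point is to preview what a pattern matches before handing the same pattern to rm. The sketch below does this inside a throwaway temporary directory. The file names are made up, and nothing outside that directory is touched.

```shell
#!/usr/bin/env bash
# Sketch: preview what a pattern matches before deleting anything.
# All files live in a throwaway temporary directory.

workdir=$(mktemp -d)
cd "$workdir" || exit 1
: > results.txt
: > scratch1.tmp
: > scratch2.tmp

# Step 1: list what the pattern matches. Nothing is deleted yet.
ls -- *.tmp

# Step 2: only after checking the list, delete with the same pattern.
rm -- *.tmp

# results.txt is still there
ls
```

For extra safety, rm -i asks for confirmation before each file is removed.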
What You Cannot Do in the Terminal
The terminal is powerful, but not magic. Here's what it cannot do:
- View images or plots directly (you generate them, then open in another program)
- Edit documents with formatting (no bold, italics, colors in plain text)
- Browse the web (though you can download files)
- Play videos or audio
For bioinformatics, these limitations don't matter. You use the terminal to process data, then use other tools to visualize results.
Practice Environment
Practice terminal commands in an interactive browser environment
The evomics-learn platform provides a safe environment to practice terminal commands. You cannot break anything, and you get instant feedback on whether your commands are correct.
Next Steps
Now that you understand what the terminal is and why bioinformatics requires it, the next section covers how to navigate the file system and find your way around.
You'll learn:
- How directories are organized
- How to move between directories
- How to list files and see what's on your system
- How to understand file paths