Evomics Docs
UNIX for Biologists/Functions and Debugging

Functions and Debugging

Functions organize code into reusable blocks. Debugging techniques help find and fix errors quickly. Together, they transform scripts from fragile one-offs into robust, maintainable tools.

Functions are like mini-scripts within your script. Write once, use many times. If you're copying and pasting code, you probably need a function.
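For instance, instead of pasting the same commands three times for three samples, a sketch like this wraps them once (the sample names and the "trimming" step are placeholders):

```shell
#!/bin/bash

# One reusable block instead of three pasted copies.
# Sample names and the trimming step are placeholders.
trim_and_report() {
    local sample=$1
    echo "Trimming $sample -> ${sample}_trimmed.fastq"
}

trim_and_report "Sample_01"
trim_and_report "Sample_02"
trim_and_report "Sample_03"
```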

Creating Functions

Functions encapsulate related commands under a single name:

Simple Function

 1  #!/bin/bash
 2
 3  # Define function
 4  greet() {
 5      echo "Hello from function!"
 6  }
 7
 8  # Call function
 9  greet

Format Details
Line 4 — Function definition: name() { commands }
Line 9 — Function call: just use the name

Input:
./function_demo.sh
Output:
Hello from function!

Functions run when called. Define before calling.
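A minimal sketch of why order matters: bash reads a script top to bottom, so a call placed above the definition fails.

```shell
#!/bin/bash

# greet          # calling here would fail: "greet: command not found"

greet() {
    echo "Hello from function!"
}

greet            # works: bash has already seen the definition
```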

Functions with Arguments

Functions access arguments like scripts do ($1, $2, etc.):

Function with Arguments

 1  #!/bin/bash
 2
 3  count_reads() {
 4      local fastq=$1
 5
 6      if [ ! -f "$fastq" ]; then
 7          echo "ERROR: File not found: $fastq"
 8          return 1
 9      fi
10
11      local lines=$(wc -l < "$fastq")
12      local reads=$((lines / 4))
13      echo "$fastq: $reads reads"
14  }
15
16  # Call function with argument
17  count_reads "Sample_01.fastq"
18  count_reads "Sample_02.fastq"

Format Details
Line 4 — Get argument: $1 is the first argument to the function
Line 4 — local: variable only exists inside the function
Line 8 — return: exit the function with a status code
Line 17 — Call: pass arguments like any command

Input:
./count_reads_func.sh
Output:
Sample_01.fastq: 1234567 reads
Sample_02.fastq: 2345678 reads

Reuse function for multiple files.
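The two explicit calls generalize naturally to a glob loop; a sketch using the same count_reads function (file names are whatever the glob matches):

```shell
#!/bin/bash

count_reads() {
    local fastq=$1
    if [ ! -f "$fastq" ]; then
        echo "ERROR: File not found: $fastq"
        return 1
    fi
    local lines
    lines=$(wc -l < "$fastq")
    echo "$fastq: $((lines / 4)) reads"
}

# One call per file, driven by the glob instead of copy-paste
for fastq in *.fastq; do
    [ -f "$fastq" ] || continue   # skip the literal pattern if nothing matches
    count_reads "$fastq"
done
```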

Local vs Global Variables

Variable Scope

 1  #!/bin/bash
 2
 3  # Global variable
 4  output_dir="results"
 5
 6  process_sample() {
 7      # Local variables (only in function)
 8      local sample=$1
 9      local temp_file="temp.txt"
10
11      echo "Processing $sample"
12      echo "Output to: $output_dir"   # Can access global
13  }
14
15  process_sample "Sample_01"
16
17  # sample and temp_file don't exist here
18  echo "Global output_dir: $output_dir"

Format Details
Line 4 — Global: available everywhere
Line 8 — Local: only exists in this function
Line 12 — Access global: functions can read globals
Line 17 — Locals gone: local variables vanish after the function returns
Always Use 'local'

Always declare function variables with local. This prevents accidentally overwriting global variables of the same name, and it makes functions safer and more predictable.
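A short demonstration of what goes wrong without local:

```shell
#!/bin/bash

sample="Sample_01"     # a global we care about

bad_func() {
    sample="scratch"   # no 'local': silently overwrites the global
}

good_func() {
    local sample="scratch"   # 'local': the global is untouched
}

bad_func
echo "$sample"         # prints: scratch  (global clobbered!)

sample="Sample_01"
good_func
echo "$sample"         # prints: Sample_01
```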

Return Values

Functions return exit codes (0-255), not arbitrary values. To return data, use echo:

Returning Values

 1  #!/bin/bash
 2
 3  # Return data via echo
 4  get_read_count() {
 5      local fastq=$1
 6      local lines=$(wc -l < "$fastq")
 7      local reads=$((lines / 4))
 8      echo "$reads"   # Output the value
 9  }
10
11  # Capture returned value
12  count=$(get_read_count "sample.fastq")
13  echo "Total reads: $count"
14
15  # Return status code
16  validate_file() {
17      local file=$1
18      if [ -f "$file" ]; then
19          return 0   # Success
20      else
21          return 1   # Failure
22      fi
23  }
24
25  # Use return code in conditional
26  if validate_file "data.txt"; then
27      echo "File valid"
28  else
29      echo "File missing"
30  fi

Format Details
Line 8 — echo value: write the data to stdout
Line 12 — Capture output: save the echoed value with $(...)
Line 19 — return code: 0 = success, non-zero = failure
Line 26 — Test code: use the function directly in an if statement

Practical Bioinformatics Functions

Check File Format

Validate FASTQ Format

 1  #!/bin/bash
 2
 3  is_fastq() {
 4      local file=$1
 5
 6      # Check file exists and is not empty
 7      if [ ! -s "$file" ]; then
 8          echo "ERROR: File missing or empty: $file"
 9          return 1
10      fi
11
12      # Check first character is @
13      local first_char=$(head -c 1 "$file")
14      if [ "$first_char" != "@" ]; then
15          echo "ERROR: Not FASTQ format (must start with @)"
16          return 1
17      fi
18
19      # Check line count is a multiple of 4
20      local lines=$(wc -l < "$file")
21      if [ $((lines % 4)) -ne 0 ]; then
22          echo "ERROR: Line count not a multiple of 4"
23          return 1
24      fi
25
26      return 0
27  }
28
29  # Use validation
30  for file in *.fastq; do
31      if is_fastq "$file"; then
32          echo "✓ $file is valid FASTQ"
33      else
34          echo "✗ $file failed validation"
35      fi
36  done

Format Details
Line 7 — Check exists: -s tests that the file exists and has content
Line 13 — Format check: FASTQ records start with @
Line 20 — Line check: total must be a multiple of 4
Line 31 — Use validation: test each file in turn

Calculate Sequence Statistics

FASTA Statistics Function

 1  #!/bin/bash
 2
 3  fasta_stats() {
 4      local fasta=$1
 5
 6      if [ ! -f "$fasta" ]; then
 7          echo "ERROR: File not found: $fasta" >&2
 8          return 1
 9      fi
10
11      # Count sequences
12      local seq_count=$(grep -c "^>" "$fasta")
13
14      # Calculate total length
15      local total_length=$(grep -v "^>" "$fasta" | \
16          tr -d '\n' | wc -c)
17
18      # Calculate average
19      local avg_length=0
20      if [ "$seq_count" -gt 0 ]; then
21          avg_length=$((total_length / seq_count))
22      fi
23
24      # Output results
25      echo "File: $fasta"
26      echo "Sequences: $seq_count"
27      echo "Total length: $total_length bp"
28      echo "Average length: $avg_length bp"
29      echo ""
30  }
31
32  # Analyze all FASTA files
33  for fasta in *.fasta; do
34      [ -f "$fasta" ] || continue
35      fasta_stats "$fasta"
36  done

Format Details
Line 7 — Error to stderr: >&2 sends the message to stderr, not stdout
Line 12 — Count seqs: count the > header lines
Line 15 — Total length: remove headers and newlines, then count characters
Line 20 — Check divide: avoid division by zero
Line 34 — Skip non-files: handles the case where the glob matches nothing

Error Handling Patterns

Die Function

A standard pattern for fatal errors:

Die Function for Errors

 1  #!/bin/bash
 2
 3  # Print error and exit
 4  die() {
 5      echo "ERROR: $*" >&2
 6      exit 1
 7  }
 8
 9  # Usage
10  [ -f "required.txt" ] || die "Missing required file"
11  [ $# -eq 2 ] || die "Usage: $0 <input> <output>"
12
13  echo "All checks passed, continuing..."

Format Details
Line 5 — Error message: $* is all the arguments joined together
Line 5 — To stderr: >&2 routes error messages to stderr
Line 6 — Exit fatal: stop the script immediately
Line 10 — Use die: || runs the right-hand side if the first command fails

Warn Function

For non-fatal warnings:

Warn Function

 1  #!/bin/bash
 2
 3  warn() {
 4      echo "WARNING: $*" >&2
 5  }
 6
 7  process_file() {
 8      local file=$1
 9
10      if [ ! -f "$file" ]; then
11          warn "File not found: $file, skipping"
12          return 1
13      fi
14
15      # Process file...
16  }
17
18  # Continue even if some files missing
19  for file in *.txt; do
20      process_file "$file" || continue
21  done

Format Details
Line 4 — Warning: like die, but does not exit
Line 11 — Warn and skip: report the problem but keep going
Line 20 — Continue on error: || continue skips to the next iteration

Debugging Techniques

Add Debug Prints

Debug Mode

 1  #!/bin/bash
 2
 3  DEBUG=${DEBUG:-0}   # Set DEBUG=1 in the environment to enable debug output
 4
 5  debug() {
 6      if [ "$DEBUG" = "1" ]; then
 7          echo "DEBUG: $*" >&2
 8      fi
 9  }
10
11  process_sample() {
12      local sample=$1
13      debug "Entering process_sample with: $sample"
14
15      # Processing...
16      debug "Finished processing $sample"
17  }
18
19  # Run with: DEBUG=1 ./script.sh
20  process_sample "Sample_01"

Format Details
Line 3 — Debug flag: defaults to 0; ${DEBUG:-0} lets the environment override it
Line 6 — Conditional print: only prints if DEBUG=1
Line 13 — Debug points: track the flow of execution
Line 19 — Enable debug: set DEBUG=1 when running

Trace Execution

Input:
bash -x script.sh
Output:
+ sample=Sample_01
+ echo 'Processing Sample_01'
Processing Sample_01
+ count_reads Sample_01.fastq

-x prints each command before running it, so you see exactly what executes.

Add tracing to specific sections:

Selective Tracing

 1  #!/bin/bash
 2
 3  # Normal execution
 4  echo "Starting analysis"
 5
 6  # Enable tracing for this section
 7  set -x
 8  complicated_command
 9  another_command
10  set +x
11
12  # Back to normal
13  echo "Analysis complete"

Format Details
Line 7 — Enable trace: set -x turns tracing on
Line 10 — Disable trace: set +x turns tracing off

Check Syntax Without Running

Input:
bash -n script.sh

-n checks syntax without executing anything, catching syntax errors before you run the script.

Input:
bash -n broken_script.sh
Output:
broken_script.sh: line 15: syntax error near unexpected token `fi'
broken_script.sh: line 15: `fi'

Syntax errors reported immediately.

Error Handling with set Flags

Robust Script Template

 1  #!/bin/bash
 2  # Robust script with comprehensive error handling
 3
 4  # Exit on error
 5  set -e
 6
 7  # Exit on undefined variable
 8  set -u
 9
10  # Catch errors in pipes
11  set -o pipefail
12
13  # Optional: trace execution
14  # set -x
15
16  # Cleanup on exit
17  cleanup() {
18      echo "Cleaning up temporary files..."
19      rm -f temp_*
20  }
21  trap cleanup EXIT
22
23  # Your script here
24  echo "Starting robust pipeline..."

Format Details
Line 5 — set -e: exit if any command fails
Line 8 — set -u: exit on use of an undefined variable
Line 11 — pipefail: catch failures anywhere in a pipe chain
Line 17 — cleanup: function to run on exit
Line 21 — trap: run cleanup when the script exits

set -e has limitations: it is suspended in some contexts, such as inside a function called as an if condition or on the left-hand side of && or ||. Combine it with explicit error checking for critical operations.
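A small sketch of the limitation: when a function is called as an if condition, set -e is suspended inside it, so an early failure does not stop the function.

```shell
#!/bin/bash
set -e

step() {
    false               # this fails...
    echo "still ran"    # ...but execution continues: set -e is
}                       # suspended while 'step' is an if condition

if step; then
    echo "condition looked successful"
fi

echo "script reached the end despite the failure"
```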

Trap for Cleanup

Always clean up temporary files, even if script fails:

Cleanup with Trap

 1  #!/bin/bash
 2
 3  # Create temp directory
 4  temp_dir=$(mktemp -d)
 5
 6  # Ensure cleanup happens
 7  cleanup() {
 8      echo "Cleaning up $temp_dir"
 9      rm -rf "$temp_dir"
10  }
11  trap cleanup EXIT
12
13  # Use temp directory safely
14  echo "Working in $temp_dir"
15  # ... do work ...
16
17  # cleanup runs automatically when script exits

Format Details
Line 4 — Make temp: mktemp -d creates a secure temporary directory
Line 7 — Cleanup func: removes the temporary directory
Line 11 — trap EXIT: run cleanup no matter how the script ends
Line 17 — Automatic: cleanup is called on normal or error exit
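To see the guarantee in action, a sketch that runs a failing script in a child bash: the child aborts partway through, yet its EXIT trap still fires.

```shell
#!/bin/bash

# The child script dies at 'false' because of set -e,
# but its EXIT trap still runs before it exits.
output=$(bash -c '
    set -e
    trap "echo cleaned up" EXIT
    echo "working..."
    false
    echo "never reached"
')

echo "$output"   # prints: working...  then: cleaned up
```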

Complete Example: Robust Pipeline

Production-Ready Analysis Script

 1  #!/bin/bash
 2  # Complete analysis pipeline with error handling
 3
 4  set -e -u -o pipefail
 5
 6  # Functions
 7  die() {
 8      echo "ERROR: $*" >&2
 9      exit 1
10  }
11
12  log() {
13      echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*"
14  }
15
16  validate_fastq() {
17      local file=$1
18      [ -f "$file" ] || die "FASTQ not found: $file"
19      [ -s "$file" ] || die "FASTQ is empty: $file"
20
21      local lines=$(wc -l < "$file")
22      [ $((lines % 4)) -eq 0 ] || die "Invalid FASTQ: $file"
23  }
24
25  cleanup() {
26      log "Cleaning up temporary files"
27      rm -rf "$temp_dir"
28  }
29
30  # Setup
31  [ $# -eq 1 ] || die "Usage: $0 <sample.fastq>"
32
33  sample_fastq=$1
34  temp_dir=$(mktemp -d)
35  trap cleanup EXIT
36
37  log "Starting analysis of $sample_fastq"
38
39  # Validate input
40  validate_fastq "$sample_fastq"
41  log "Input validation passed"
42
43  # Analysis
44  log "Counting reads"
45  lines=$(wc -l < "$sample_fastq")
46  reads=$((lines / 4))
47  log "Found $reads reads"
48
49  # Generate report
50  report="analysis_report.txt"
51  {
52      echo "Analysis Report"
53      echo "==============="
54      echo "Sample: $sample_fastq"
55      echo "Reads: $reads"
56      echo "Date: $(date)"
57  } > "$report"
58
59  log "Report written to $report"
60  log "Analysis complete"

Format Details
Line 4 — Safety: exit on errors, undefined variables, and pipe failures
Line 7 — die(): fatal error handling
Line 12 — log(): timestamped logging
Line 16 — validate_fastq(): input validation
Line 25 — cleanup(): cleanup function
Line 35 — trap: ensure cleanup runs
Line 49 — Report: generate the output file

Debugging Common Errors

Undefined Variable

Input:
#!/bin/bash
set -u
echo $undefined_var
Output:
script.sh: line 3: undefined_var: unbound variable

set -u catches undefined variables immediately.

Fix: Always initialize variables, or supply a default with ${var:-default}:

Input:
value=${optional_var:-default_value}
echo $value
Output:
default_value

The default is used if the variable is unset or empty.
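Two related expansions (standard shell parameter expansion) are worth knowing: := assigns the default to the variable, and :? aborts with a message.

```shell
#!/bin/bash

unset threads

echo "${threads:-4}"    # use 4 here, but threads stays unset
echo "${threads:=4}"    # use 4 AND assign it to threads
echo "$threads"         # now prints: 4

# ${var:?message} stops the script if the variable is unset or empty:
# input=${input:?"provide an input file"}
```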

Command Not Found in Function

Input:
my_function() {
    ech 'typo'
}
my_function
Output:
script.sh: line 2: ech: command not found

Typos inside a function body are only caught when the function actually runs.

Fix: bash -n won't help here, because a misspelled command name is still valid syntax. Test each function, or run a linter such as shellcheck, to catch these before a long pipeline hits them.
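A sketch of what bash -n does and does not catch (the scratch file names are hypothetical): it reports syntax errors, but a misspelled command name parses fine and passes.

```shell
#!/bin/bash
# bash -n parses without executing.
# Scratch file names below are hypothetical.

printf 'ech "typo"\n' > typo.sh
if bash -n typo.sh; then
    echo "typo.sh: syntax OK (the typo is NOT caught)"
fi

printf 'if true\nfi\n' > broken.sh          # missing 'then'
if ! bash -n broken.sh 2>/dev/null; then
    echo "broken.sh: syntax error caught"
fi

rm -f typo.sh broken.sh
```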

Silent Pipe Failures

Input:
false | echo 'Still runs'
Output:
Still runs

Without pipefail, a pipeline's exit status is that of its last command, so failures earlier in the pipe are silently ignored!

Input:
set -o pipefail
false | echo 'Still runs'
echo "Exit code: $?"
Output:
Still runs
Exit code: 1

pipefail makes the pipeline report the exit code of the failed command.
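Bash also records every stage's status in the PIPESTATUS array, which tells you exactly which command in the pipe failed:

```shell
#!/bin/bash

false | true | true

# PIPESTATUS holds one exit status per pipeline stage;
# copy it immediately, because the next command overwrites it.
statuses=("${PIPESTATUS[@]}")

echo "Stage statuses: ${statuses[*]}"   # prints: Stage statuses: 1 0 0
```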

Quick Reference

Function Syntax

function_name() {            # Define function
    local var=$1             # Get argument
    echo "result"            # Return data via echo
    return 0                 # Return status code
}

result=$(function_name arg)  # Capture output
function_name arg            # Just run function

Error Handling

set -e            # Exit on error
set -u            # Exit on undefined variable
set -o pipefail   # Catch pipe errors
set -x            # Trace execution

die() {           # Fatal error function
    echo "ERROR: $*" >&2
    exit 1
}

cleanup() {       # Cleanup function
    rm -rf temp_*
}
trap cleanup EXIT # Run on exit

Debugging

bash -x script.sh    # Trace execution
bash -n script.sh    # Check syntax
set -x               # Enable tracing
set +x               # Disable tracing
DEBUG=1 ./script.sh  # Run with debug flag

Best Practices

Function Best Practices
  1. One purpose per function - Each function does one thing well
  2. Use local variables - Prevent variable name conflicts
  3. Validate inputs - Check arguments at function start
  4. Return meaningful codes - 0 for success, 1+ for errors
  5. Document functions - Comments explaining purpose and usage
  6. Keep functions short - If > 50 lines, consider splitting
  7. Test functions independently - Easier to debug
  8. Use consistent naming - verb_noun pattern (e.g., validate_file)
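As a sketch of several of these practices combined (the function, the awk filter, and the file names are illustrative; it assumes single-line FASTA sequences):

```shell
#!/bin/bash

# filter_fasta: print only sequences at least <min_len> bp long.
# Usage: filter_fasta <input.fasta> <min_len>
# Assumes single-line sequences; returns 0 on success, 1 on bad input.
filter_fasta() {
    # Validate inputs at the start
    [ $# -eq 2 ] || { echo "Usage: filter_fasta <fasta> <min_len>" >&2; return 1; }
    local fasta=$1
    local min_len=$2
    [ -f "$fasta" ] || { echo "ERROR: File not found: $fasta" >&2; return 1; }

    # One purpose: length-filter, nothing else
    awk -v min="$min_len" \
        '/^>/ { header = $0; next } length($0) >= min { print header; print $0 }' \
        "$fasta"
}
```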

Next Steps

You now have the tools for writing robust, maintainable scripts:

  • Functions for code reuse
  • Error handling for reliability
  • Debugging techniques for troubleshooting

The final page brings everything together with real-world bioinformatics pipelines, showing how to combine all these techniques into production-ready workflows.

Further Reading