Functions and Debugging
Functions organize code into reusable blocks. Debugging techniques help find and fix errors quickly. Together, they transform scripts from fragile one-offs into robust, maintainable tools.
Functions are like mini-scripts within your script. Write once, use many times. If you're copying and pasting code, you probably need a function.
Creating Functions
Functions encapsulate related commands under a single name:
Simple Function
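A minimal sketch of what a script such as function_demo.sh might contain, assuming a hypothetical function named say_hello. The definition comes first and the call follows:

```shell
#!/usr/bin/env bash

# Define the function (say_hello is a hypothetical name).
say_hello() {
    echo "Hello from function!"
}

# Call it. The call must come after the definition.
say_hello
```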
./function_demo.sh
Hello from function!
Functions run when called. Define before calling.
Functions with Arguments
Functions access arguments like scripts do ($1, $2, etc.):
Function with Arguments
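One way a script like count_reads_func.sh could be written is sketched here. It assumes standard four-line FASTQ records and builds two tiny demo files (hypothetical data) so it runs on its own; real files would give much larger counts:

```shell
#!/usr/bin/env bash

# count_reads FILE: print "FILE: N reads" (FASTQ stores 4 lines per read).
count_reads() {
    local file=$1
    local lines
    lines=$(wc -l < "$file")
    echo "$(basename "$file"): $((lines / 4)) reads"
}

# Demo input: two tiny FASTQ files in a temporary directory
# (hypothetical data, only so the sketch runs on its own).
demo_dir=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n' > "$demo_dir/Sample_01.fastq"
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n' > "$demo_dir/Sample_02.fastq"

# Reuse the same function for each file.
for fastq in "$demo_dir"/*.fastq; do
    count_reads "$fastq"
done

rm -rf "$demo_dir"
```

Inside the function, $1 refers to the function's first argument, not the script's.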
./count_reads_func.sh
Sample_01.fastq: 1234567 reads
Sample_02.fastq: 2345678 reads
Reuse function for multiple files.
Local vs Global Variables
Variable Scope
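A short sketch of the difference, using two hypothetical functions that assign to a variable named sample:

```shell
#!/usr/bin/env bash

sample="global_sample"

# Without local, the assignment overwrites the global variable.
set_sample_global() {
    sample="changed_inside_function"
}

# With local, the assignment is visible only inside the function.
set_sample_local() {
    local sample="local_only"
    echo "Inside function: $sample"
}

set_sample_local
echo "After local version: $sample"

set_sample_global
echo "After global version: $sample"
```

The first echo after a call still prints global_sample; only the version without local changes it.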
Always declare function variables as local. This prevents accidentally overwriting global variables with the same name and makes functions safer and more predictable.
Return Values
Functions return exit codes (0-255), not arbitrary values. To return data, use echo:
Returning Values
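Both mechanisms can be sketched side by side. The function names get_sample_name and is_fastq are hypothetical, and the path is only an example string:

```shell
#!/usr/bin/env bash

# Return data by printing it; the caller captures it with $(...).
get_sample_name() {
    local file=$1
    basename "$file" .fastq
}

# Return a status with "return"; the caller tests it with if or $?.
is_fastq() {
    case $1 in
        *.fastq) return 0 ;;
        *)       return 1 ;;
    esac
}

name=$(get_sample_name "/data/Sample_01.fastq")
echo "Sample name: $name"

if is_fastq "reads.fastq"; then
    echo "reads.fastq looks like a FASTQ file name"
fi
```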
Practical Bioinformatics Functions
Check File Format
Validate FASTQ Format
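A possible validation function is sketched here. The three checks (non-empty file, '@' and '+' markers in the first record, line count divisible by 4) are an assumed heuristic, not a full FASTQ parser, and validate_fastq is a hypothetical name:

```shell
#!/usr/bin/env bash

# validate_fastq FILE: return 0 if FILE passes three basic checks, 1 otherwise.
validate_fastq() {
    local file=$1
    local lines

    if [ ! -s "$file" ]; then
        echo "ERROR: $file is missing or empty" >&2
        return 1
    fi

    # The first record must start with '@' and its third line with '+'.
    if [ "$(head -n 1 "$file" | cut -c 1)" != "@" ]; then
        echo "ERROR: $file does not start with '@'" >&2
        return 1
    fi
    if [ "$(sed -n '3p' "$file" | cut -c 1)" != "+" ]; then
        echo "ERROR: line 3 of $file does not start with '+'" >&2
        return 1
    fi

    # Each record is 4 lines long.
    lines=$(( $(wc -l < "$file") ))
    if [ $((lines % 4)) -ne 0 ]; then
        echo "ERROR: $file has $lines lines, not a multiple of 4" >&2
        return 1
    fi
    return 0
}

# Demo on a one-record file (hypothetical data).
demo=$(mktemp)
printf '@r1\nACGT\n+\nIIII\n' > "$demo"
if validate_fastq "$demo"; then
    echo "Valid FASTQ"
fi
rm -f "$demo"
```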
Calculate Sequence Statistics
FASTA Statistics Function
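As a sketch, a statistics function could delegate the counting to awk. The function name fasta_stats, the three reported values, and the output labels are assumptions chosen for illustration:

```shell
#!/usr/bin/env bash

# fasta_stats FILE: print sequence count, total bases, and mean length.
# Header lines start with '>'; every other line is treated as sequence.
fasta_stats() {
    local file=$1
    awk '
        /^>/ { seqs++; next }          # count header lines
             { bases += length($0) }   # add up sequence characters
        END {
            mean = (seqs > 0) ? bases / seqs : 0
            printf "Sequences: %d\nTotal bases: %d\nMean length: %.1f\n", seqs, bases, mean
        }
    ' "$file"
}

# Demo on a two-sequence file (hypothetical data): 12 + 4 bases.
demo_fasta=$(mktemp)
printf '>seq1\nACGTACGT\nACGT\n>seq2\nGGCC\n' > "$demo_fasta"
fasta_stats "$demo_fasta"
rm -f "$demo_fasta"
```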
Error Handling Patterns
Die Function
A standard pattern for fatal errors:
Die Function for Errors
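A minimal sketch of the pattern. The file name is hypothetical and assumed not to exist, and the subshell is there only so the demo can print the exit status instead of ending the script:

```shell
#!/usr/bin/env bash

# die MESSAGE...: print an error to stderr and exit with status 1.
die() {
    echo "ERROR: $*" >&2
    exit 1
}

# Typical use: stop as soon as a required input is missing.
# The subshell (parentheses) is only here so the demo can report the
# exit status; a real script would call die directly.
input="no_such_file.fastq"
( [ -f "$input" ] || die "Input file not found: $input" )
echo "Exit status: $?"
```

Writing the message to stderr keeps it out of any output that a caller captures or pipes onward.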
Warn Function
For non-fatal warnings:
Warn Function
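A matching sketch for warnings. Unlike a fatal error, the script keeps going; the file names are hypothetical and assumed not to exist:

```shell
#!/usr/bin/env bash

# warn MESSAGE...: print a warning to stderr and carry on.
warn() {
    echo "WARNING: $*" >&2
}

# warn reports the problem; the loop continues with the next file.
for file in missing_a.fastq missing_b.fastq; do
    if [ ! -f "$file" ]; then
        warn "Skipping $file: not found"
        continue
    fi
    echo "Processing $file"
done
echo "Done"
```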
Debugging Techniques
Add Debug Prints
Debug Mode
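One common way to add switchable debug prints is a small helper controlled by an environment variable. This sketch assumes a variable named DEBUG and a hypothetical helper named debug:

```shell
#!/usr/bin/env bash

# DEBUG defaults to 0; run as DEBUG=1 ./script.sh to enable messages.
DEBUG=${DEBUG:-0}

# debug MESSAGE...: print to stderr only when DEBUG is 1.
debug() {
    if [ "$DEBUG" = "1" ]; then
        echo "DEBUG: $*" >&2
    fi
}

sample="Sample_01"
debug "sample is set to $sample"
echo "Processing $sample"
```

The helper uses if rather than a short-circuit test so that it returns 0 even when debugging is off, which keeps it safe to call under set -e.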
Trace Execution
bash -x script.sh
+ sample=Sample_01
+ echo 'Processing Sample_01'
Processing Sample_01
+ count_reads Sample_01.fastq
-x shows each command before running. See exactly what executes.
Add tracing to specific sections:
Selective Tracing
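A sketch of tracing only one part of a script with set -x and set +x. The variable names are hypothetical:

```shell
#!/usr/bin/env bash

sample="Sample_01"
echo "Setup is not traced"

set -x    # start tracing
output="${sample}.counts.txt"
echo "Writing to $output"
set +x    # stop tracing

echo "Cleanup is not traced"
```

Trace lines go to stderr, so they do not mix into output that is redirected to a file.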
Check Syntax Without Running
bash -n script.sh
-n checks syntax without executing anything. It catches syntax errors, such as an unclosed quote or a missing fi, before the script runs.
bash -n broken_script.sh
broken_script.sh: line 15: syntax error near unexpected token `fi'
broken_script.sh: line 15: `fi'
Syntax errors reported immediately.
Error Handling with set Flags
Robust Script Template
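A template along these lines is one reasonable starting point. The main function and the awk check are illustrative assumptions, not a required layout:

```shell
#!/usr/bin/env bash
set -e           # exit when a command fails
set -u           # exit on use of an undefined variable
set -o pipefail  # a pipeline fails if any command in it fails

die() {
    echo "ERROR: $*" >&2
    exit 1
}

main() {
    # Explicit check for a critical precondition (hypothetical example).
    command -v awk > /dev/null || die "awk is required but not installed"
    echo "All checks passed, starting work"
}

main "$@"
```

Keeping the work inside main means the whole file is parsed before anything runs.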
set -e has limitations: it is ignored in some contexts, such as commands (including function calls) that are tested by if, while, && or ||. Combine it with explicit error checking for critical operations.
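The limitation can be seen in a short sketch. The functions check_input and check_input_strict are hypothetical, and false stands in for any failing command:

```shell
#!/usr/bin/env bash
set -e

# check_input is a hypothetical function whose first command fails.
check_input() {
    false                          # fails, but see below
    echo "reached despite failure"
}

# Because check_input is tested by "if", set -e is ignored inside it:
# the failing command does not stop the function, and the function's
# status is that of its last command (echo, which succeeds).
if check_input; then
    echo "check_input reported success"
fi

# Explicit checking closes the gap.
check_input_strict() {
    false || return 1
    echo "not reached"
}

if ! check_input_strict; then
    echo "check_input_strict reported failure"
fi
```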
Trap for Cleanup
Always clean up temporary files, even if the script fails:
Cleanup with Trap
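A minimal sketch, assuming the script keeps its scratch files in one directory created with mktemp. The names tmp_dir and cleanup are hypothetical:

```shell
#!/usr/bin/env bash

# Create a private scratch directory.
tmp_dir=$(mktemp -d)

cleanup() {
    rm -rf "$tmp_dir"
}

# Run cleanup whenever the script exits, whether it succeeds or fails.
trap cleanup EXIT

echo "intermediate data" > "$tmp_dir/step1.txt"
echo "Working in a temporary directory"
# ... rest of the script; cleanup runs automatically at exit
```

Removing one directory by its exact path is safer than deleting by a wildcard pattern, which can match files the script did not create.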
Complete Example: Robust Pipeline
Production-Ready Analysis Script
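The pieces from this page can be combined as in the sketch below. It is an illustration under stated assumptions, not a definitive pipeline: the task (counting reads), the function names, and the tab-separated output are all hypothetical, and the last two lines generate a tiny input so the script runs on its own.

```shell
#!/usr/bin/env bash
# Sketch of a small read-counting pipeline. File names, the output
# format, and all function names are illustrative assumptions.
set -euo pipefail

DEBUG=${DEBUG:-0}
tmp_dir=$(mktemp -d)

die()   { echo "ERROR: $*" >&2; exit 1; }
warn()  { echo "WARNING: $*" >&2; }
debug() { if [ "$DEBUG" = "1" ]; then echo "DEBUG: $*" >&2; fi; }

cleanup() { rm -rf "$tmp_dir"; }
trap cleanup EXIT

# Return 0 if the file is non-empty and its line count is a multiple of 4.
validate_fastq() {
    local file=$1 lines
    [ -s "$file" ] || return 1
    lines=$(wc -l < "$file")
    [ $((lines % 4)) -eq 0 ]
}

# Print the number of reads (4 lines per read).
count_reads() {
    local file=$1 lines
    lines=$(wc -l < "$file")
    echo $((lines / 4))
}

# Process each FASTQ file given as an argument and print a summary.
main() {
    [ $# -gt 0 ] || die "Usage: $0 file1.fastq [file2.fastq ...]"
    local file reads total=0
    : > "$tmp_dir/counts.tsv"    # start with an empty results file

    for file in "$@"; do
        debug "checking $file"
        if ! validate_fastq "$file"; then
            warn "Skipping $file: not a valid FASTQ file"
            continue
        fi
        reads=$(count_reads "$file")
        printf '%s\t%s\n' "$(basename "$file")" "$reads" >> "$tmp_dir/counts.tsv"
        total=$((total + reads))
    done

    [ -s "$tmp_dir/counts.tsv" ] || die "No valid input files"
    cat "$tmp_dir/counts.tsv"
    echo "Total reads: $total"
}

# Demo run on a tiny generated file so the sketch executes on its own;
# a real script would end with: main "$@"
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n' > "$tmp_dir/Sample_01.fastq"
main "$tmp_dir/Sample_01.fastq" "$tmp_dir/missing.fastq"
```

A bad input file produces a warning and is skipped, while having no usable input at all is treated as fatal.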
Debugging Common Errors
Undefined Variable
#!/bin/bash
set -u
echo $undefined_var
script.sh: line 3: undefined_var: unbound variable
set -u catches undefined variables immediately.
Fix: Always initialize variables or use ${var:-default}:
value=${optional_var:-default_value}
echo "$value"
default_value
The default is used if the variable is unset or empty.
Command Not Found in Function
my_function() {
    ech 'typo'
}
my_function
script.sh: line 2: ech: command not found
Typos in function only caught when function runs.
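Fix: bash -n will not catch this, because a misspelled command name is still valid syntax. Call every function at least once in a test run, and use set -e so the script stops at the first command that fails.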
Silent Pipe Failures
false | echo 'Still runs'
Still runs
Without pipefail, a pipeline's exit status is that of its last command, so failures earlier in the pipe are ignored!
set -o pipefail
false | echo 'Still runs'
echo 'Exit code:' $?
Still runs
Exit code: 1
With pipefail, the pipeline returns the exit code of the last command in it that failed.
Quick Reference
Function Syntax
function_name() { # Define function
local var=$1 # Get argument
echo "result" # Return data via echo
return 0 # Return status code
}
result=$(function_name arg) # Capture output
function_name arg # Just run function
Error Handling
set -e # Exit on error
set -u # Exit on undefined variable
set -o pipefail # Catch pipe errors
set -x # Trace execution
die() { # Fatal error function
echo "ERROR: $*" >&2
exit 1
}
cleanup() { # Cleanup function
rm -rf temp_*
}
trap cleanup EXIT # Run on exit
Debugging
bash -x script.sh # Trace execution
bash -n script.sh # Check syntax
set -x # Enable tracing
set +x # Disable tracing
DEBUG=1 ./script.sh # Run with debug flag
Best Practices
- One purpose per function - Each function does one thing well
- Use local variables - Prevent variable name conflicts
- Validate inputs - Check arguments at function start
- Return meaningful codes - 0 for success, 1+ for errors
- Document functions - Comments explaining purpose and usage
- Keep functions short - If > 50 lines, consider splitting
- Test functions independently - Easier to debug
- Use consistent naming - verb_noun pattern (e.g., validate_file)
Next Steps
You now have the tools for writing robust, maintainable scripts:
- Functions for code reuse
- Error handling for reliability
- Debugging techniques for troubleshooting
The final page brings everything together with real-world bioinformatics pipelines, showing how to combine all these techniques into production-ready workflows.