Free Software


TimeForScience Repo on GitHub



Spreadsheet Viewers

Tools for viewing tabular data (plain text only) directly on the command line. They generally expect tab-delimited input files.

One is an interactive, terminal-based spreadsheet viewer for tab-delimited files, something like a bare-bones, read-only version of Excel.

Another pre-processes input data into neatly aligned, column-delimited tabular output that you can then pipe into less -S. It is similar to the UNIX ‘column’ command.


Recommended Tools

A “safer rm” that moves files to a new temporary directory in /tmp/ instead of removing them immediately. Beware: deleting extremely large files this way may fill up your /tmp partition.

A tool that marks duplicated cells in a tab-delimited file (just like ditto marks in an old ledger). Good for making duplicates visually obvious.
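The ditto-mark idea can be sketched in a few lines of Python. This is only an illustration, not the actual script: the function name ditto_mark is hypothetical, and it assumes that a cell counts as "duplicated" when it equals the cell directly above it in the same column, which is how ditto marks work in a ledger.

```python
def ditto_mark(rows, mark='"'):
    """Replace each cell that repeats the cell directly above it with `mark`.

    Assumption: "duplicated" means "same as the previous row, same column".
    """
    marked = []
    previous = None
    for row in rows:
        if previous is None:
            marked.append(list(row))  # the first row is never marked
        else:
            marked.append([
                mark if i < len(previous) and cell == previous[i] else cell
                for i, cell in enumerate(row)
            ])
        previous = row  # compare against original values, not the marks
    return marked


lines = ["chr1\tgeneA\t10", "chr1\tgeneA\t12", "chr2\tgeneA\t12"]
for row in ditto_mark([line.split("\t") for line in lines]):
    print("\t".join(row))
```

Comparing against the original previous row (rather than the already-marked one) means a long run of identical cells is marked all the way down, not just on alternating rows.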

This is a version of “cut” that lets you output columns in an arbitrary order. For example, -f 2,1,3- would output column 2 first, then column 1, and then columns 3 and beyond in their original order.
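To make the -f 2,1,3- example concrete, here is a minimal Python sketch of order-preserving field selection. It is not the tool's own code: select_fields is a hypothetical helper, and the spec grammar it accepts (N, N-M, N-, -M, comma-separated, 1-based) is an assumption modeled on standard cut.

```python
def select_fields(row, spec):
    """Return the cells of `row` named by a 1-based field spec, in spec order."""
    out = []
    for part in spec.split(","):
        if "-" in part:
            start, end = part.split("-", 1)
            lo = int(start) if start else 1       # "-M" starts at column 1
            hi = int(end) if end else len(row)    # "N-" runs to the last column
            out.extend(row[lo - 1:hi])
        else:
            idx = int(part)
            if 1 <= idx <= len(row):              # silently skip missing columns
                out.append(row[idx - 1])
    return out


print(select_fields(["a", "b", "c", "d"], "2,1,3-"))
```

Unlike standard cut, which always emits fields in file order regardless of how they are listed, this walks the spec left to right, so the output order follows the spec.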

A modified version of UNIX join. It can handle unsorted input and perform case-insensitive joins. It can also accept multiple input files at once.
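One common way to join unsorted input is a hash join: index one file by key, then stream the other. The Python sketch below shows that technique with optional case-insensitive keys. It is an assumption about how such a join could work, not a description of the actual script; hash_join is a hypothetical name, and it joins only two inputs on their first column.

```python
def hash_join(left_rows, right_rows, ignore_case=False):
    """Inner-join two lists of rows on their first column, in left-row order."""
    def norm(key):
        return key.lower() if ignore_case else key

    # Index the right-hand rows by key; no sorting is needed.
    index = {}
    for row in right_rows:
        index.setdefault(norm(row[0]), []).append(row[1:])

    joined = []
    for row in left_rows:
        for rest in index.get(norm(row[0]), []):
            joined.append(row + rest)
    return joined


left = [["TP53", "x"], ["brca1", "y"]]
right = [["BRCA1", "1"], ["tp53", "2"]]
print(hash_join(left, right, ignore_case=True))
```

The trade-off is memory: the indexed input has to fit in RAM, whereas standard join streams both files but requires them to be sorted first.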

Can sort compressed (gzip/bzip2) files and can accept header line(s). It uses the fast UNIX sort internally. Frequency-of-use rating: 9/10.

A script to easily verify a bunch of files against md5 checksums. It runs on both Mac and Linux and, unlike normal md5sum, can handle several types of input md5 file.



SAM/BAM → UCSC Browser (.pl) converts input BAM/SAM files into tracks for the UCSC (UC Santa Cruz) Genome Browser, and provides a track description file.

Another bioinformatics-specific tool takes a FASTA file and makes a GTF file that spans each chromosome.


Scientific / Data Processing

A job-submission tool (“queue please”) that can submit jobs to a PBS Pro queue in a user-friendly fashion. Tested with PBS Pro version 13 (August 2016). May also work with TORQUE.

Randomly chooses a certain number of lines from a file. Can sample with or without replacement. It can also pull out multi-line records (for example, in a FASTQ file, each record is actually 4 lines). Becomes very slow if files have more than 1 million lines.

Another tool can turn a 2- or 3-column file into a matrix. The matrix will either be an adjacency matrix (2-column input) or will hold the value of each edge (3-column input).

Another picks the best N items (rows) with a given key (in a user-specified column).
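For the random line sampler described above, the multi-line record behavior can be sketched as follows in Python. This is a simple in-memory illustration under stated assumptions, not the tool's implementation: sample_records and its parameters are hypothetical names, and the whole file is assumed to fit in memory.

```python
import random


def sample_records(lines, n, lines_per_record=1, replace=False, seed=None):
    """Sample n records, where each record is a block of consecutive lines."""
    rng = random.Random(seed)
    # Group lines into fixed-size records (4 lines each for FASTQ).
    # A trailing partial record, if any, is kept as a shorter final record.
    records = [
        lines[i:i + lines_per_record]
        for i in range(0, len(lines), lines_per_record)
    ]
    if not records:
        return []
    if replace:
        return [rng.choice(records) for _ in range(n)]
    # Without replacement, n is capped at the number of records available.
    return rng.sample(records, min(n, len(records)))


fastq = []
for i in range(3):
    fastq += [f"@read{i}", "ACGT", "+", "IIII"]

for record in sample_records(fastq, 2, lines_per_record=4, seed=1):
    print("\n".join(record))
```

Sampling whole records rather than individual lines keeps each FASTQ entry intact: a header line is never separated from its sequence and quality lines.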


Other programs

There are a ton of additional programs on the TimeForScience GitHub repository, some of which have even been properly documented.


Programs that aren’t on GitHub (Philips Hue lights)