
Math, Statistics, & Research Methods: Statistical software

Every statistical software package has its strengths and weaknesses, so choosing one is a difficult but important decision. The major software packages are listed below. Please be aware that the format of your data does not lock you into a package: if you have data in SAS format, for example, but prefer to use Stata (or SPSS), you can use Stat/Transfer to convert the SAS data into Stata format.

Data Analysis:

  • SPSS: This is the statistical package provided by Missouri Valley College, and it is one of the most popular overall. Its interface may be the best of the packages listed here, and it also lets you write programs. Most people find that they can do everything they need in SPSS. While most work can be done through the interface, more advanced functions require learning SPSS programming, which has a fairly steep learning curve.

  • Stata: Compared to SAS and SPSS, Stata is a relatively easy package to learn. It gives you a choice among a command-line interface, a syntax or program file (called a "do-file" in Stata), and a pull-down, fill-in-the-blank GUI. Stata is very good with time-series data and has many survival analysis routines. Stata also lets you program your own commands. One drawback is that Stata loads the entire dataset into memory, so if your dataset is very large, you may not be able to use it. This is a relatively rare occurrence, however. Generally, if you have little or no experience with any statistical package, Stata is probably your best choice.

  • SAS: SAS is the biggest of all statistical packages (and its maker is the largest privately owned software company). SAS can do just about anything you will ever need to do. However, its learning curve is considerably steeper than that of SPSS or Stata. While there is a fill-in-the-blank interface (SAS/ASSIST), it is not as well developed as those of Stata or SPSS. To make the best use of SAS, you must write programs.

  • R: R is an open-source (free) language for data management, analysis and processing. Although it is very versatile, it also has a rather steep learning curve. R is a good choice if you need to conduct unusual or highly customized analyses and if you will be doing data analysis on a regular basis (at least weekly). If you will be analyzing data only a few times a year, you are better off with another package, particularly SPSS.
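
The memory limitation noted for Stata above can be checked roughly before you commit to a package. The following is a minimal sketch in Python, not part of any package's own tooling: it multiplies the number of rows by the bytes per row, using the sizes of Stata's numeric storage types (byte, int, long, float, double). The function names, the example dataset, and the 50% headroom figure are assumptions chosen for illustration, and the estimate ignores string variables and any overhead the software adds.

```python
# Bytes per value for Stata's numeric storage types.
STATA_TYPE_BYTES = {"byte": 1, "int": 2, "long": 4, "float": 4, "double": 8}


def estimate_dataset_bytes(n_rows, column_types):
    """Estimate the in-memory size of a dataset with numeric columns only.

    column_types lists one Stata numeric storage type name per variable.
    String variables and software overhead are not counted.
    """
    bytes_per_row = sum(STATA_TYPE_BYTES[t] for t in column_types)
    return n_rows * bytes_per_row


def fits_in_memory(n_rows, column_types, available_bytes, headroom=0.5):
    """Return True if the estimate uses at most `headroom` of available memory.

    The default headroom of 0.5 is an illustrative assumption, leaving room
    for the operating system and for temporary copies made during analysis.
    """
    estimate = estimate_dataset_bytes(n_rows, column_types)
    return estimate <= available_bytes * headroom


if __name__ == "__main__":
    # Hypothetical dataset: 10 million rows, 20 double-precision variables.
    columns = ["double"] * 20
    size = estimate_dataset_bytes(10_000_000, columns)
    print(f"Estimated size: {size / 1e9:.1f} GB")
    print("Fits in 8 GB of RAM:", fits_in_memory(10_000_000, columns, 8 * 1024**3))
```

If the estimate comes close to the memory on your machine, consider storing variables in smaller types, working with a subset of the data, or choosing a package that does not need the whole dataset in memory.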

Researchers may find the following tools useful in their work. Emphasis is given to free tools (or those with at least some free components) and to online tools or services.

Electronic Lab Notebooks:

  • Electronic Lab Notebooks - Guide for prospective users: information for researchers who are interested in adopting an Electronic Lab Notebook system for documenting research and managing data.
  • ELN at Harvard Medical School - The Electronic Lab Notebook Matrix was created to help researchers identify a usable Electronic Lab Notebook solution that meets their specific research needs. Through this resource, researchers can compare and contrast the numerous solutions available today, and also explore individual options in depth.
  • RSpace - An ELN for researchers to organize, manage and collaborate on their projects.
  • Hivebench - Biology-focused experiment, lab and project management.
  • Docollab - Project management system, collaboration.
  • Benchling - Life sciences-focused experiment, lab and project management.

Data Analysis/Visualization:

  • Tableau Public - Free version of Tableau's desktop and online data visualization platform. All data uploaded to Tableau Public is available to everyone on the Internet. The paid versions allow restricted access.
  • StatCrunch - Simple online data analysis and survey package.
  • Dataviz - Data visualization for time, geographic and comparative data.
  • OpenRefine - Data cleaning and exploration tool.

Directories of Research Tools: