Data file formats


Input files

ErmineJ analyzes your data using files that you supply together with files that we provide. It accepts four types of input file, which you should read about by following these links:

  • Gene scores – supplied by you (optional for correlation scores)
  • Gene expression profiles – supplied by you (optional except for correlation scores)
  • Gene Set descriptions (GO) – supplied by the GO consortium (currently required even if you are not using GO)
  • Gene annotations – required, supplied by us (though you can provide your own). These map genes to GO terms, or at least give the human-readable names of the genes.

In addition, individual gene sets can be defined or modified by the user, and you can use non-GO gene sets. These are saved in files described here.

Important! Most problems people have with the software arise because their files are not in the right format. Please:

  • Check that your files are in text format, not Excel spreadsheets
  • Check that you do not have extra blank rows at the top of your files
  • Ensure that your gene (or probe) identifiers are the same in all your files, and match the ones used in the annotation files. Mismatched identifiers are ignored.
  • Pay attention to the status messages, which try to flag file format problems. If you miss a message, look in the log (help→view log).
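
The checks above can also be run on your files before you load them. The following is a minimal Python sketch, not part of ErmineJ. It assumes tab-delimited text files with one header row and the gene (or probe) identifier in the first column; that layout, the function names, and the file names in the usage note are assumptions made for illustration, so consult the page for each file type for the actual format.

```python
from pathlib import Path

# Leading bytes of common spreadsheet containers: ZIP (.xlsx) and OLE2 (.xls).
BINARY_SIGNATURES = (b"PK\x03\x04", b"\xd0\xcf\x11\xe0")


def looks_like_text(path):
    """Return False if the file starts with a spreadsheet signature or holds NUL bytes."""
    head = Path(path).read_bytes()[:4096]
    if head.startswith(BINARY_SIGNATURES):
        return False
    return b"\x00" not in head


def leading_blank_rows(path):
    """Count the blank rows that precede the first non-blank row."""
    count = 0
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if line.strip():
                break
            count += 1
    return count


def read_identifiers(path, delimiter="\t", skip_header=True):
    """Collect first-column identifiers (assumed layout, not the ErmineJ specification)."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        rows = [line.rstrip("\r\n") for line in handle if line.strip()]
    if skip_header:
        rows = rows[1:]
    return {row.split(delimiter)[0].strip() for row in rows}


def unmatched_identifiers(data_path, annotation_path):
    """Return identifiers in the data file that are absent from the annotation file."""
    return read_identifiers(data_path) - read_identifiers(annotation_path)
```

For example, with hypothetical files scores.txt and annotations.txt, unmatched_identifiers("scores.txt", "annotations.txt") returns the set of identifiers that would be ignored because they have no match in the annotation file.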

The output format is explained separately.

Please note that all files must be plain text files, not Excel spreadsheets or other binary-formatted files. To convert your files to text in Excel, see these instructions.