The following dictionary files are available to download. By default, the load scripts expect them to be saved under the directory /data/ETL/dictionaries. If you can use this path, you avoid having to edit the scripts and configuration files described below.
Dictionary | File | Load script | Release | Description |
---|---|---|---|---|
Entrez | entrez-dictionary-human-mouse.tar.xz | load-entrez.sh | Nov 2016 | Entrez gene ids and names for human and mouse |
Gene Ontology | geneontology-dictionary.tar.xz | load-go.sh | Nov 2016 | Gene ontology and GOA files for human |
HMDB | hmdb-dictionary.tar.xz | load-hmdb.sh | 3.6 Nov 2016 | Human metabolome database extracted data, and extraction script |
KEGG | kegg-dictionary.tar.xz | load-kegg.sh | Nov 2016 | The last public release of KEGG retrieved from an old server. More recent pathway databases should be preferred, and suggestions are welcome |
MeSH | mesh-dictionary.tar.xz | load-mesh.sh | 2016 | Medline subject headings |
MiRBase | mirbase-dictionary.tar.xz | load-mirbase.sh | Nov 2016 | MiRBase microRNA database |
Taxonomy | taxonomy-dictionary.tar.xz | (none) | Nov 2016 | Taxonomy (species name and synonyms) for human and mouse |
UniProt | uniprot-human-dictionary.tar.xz | load-uniprot.sh | Nov 2016 | Human proteins from SwissProt and TrEMBL |
Observations | reference-observations-dictionary.tar.xz | load-obs.sh | Nov 2016 | Entrez gene ids and names for human and mouse |
SNPs for GWAS | reference-vcf-dictionary.tar.xz | load-snp.sh | Nov 2016 | VCF file of SNPs for GWAS MAGIC test data |
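Unpacking the downloaded dictionary archives into the default directory could be sketched as follows. The function name `unpack_dictionaries` and its arguments are hypothetical, and the sketch assumes each archive unpacks relative to the target directory, which this page does not state. Check the resulting layout against the paths used in the load scripts.

```shell
# Hypothetical helper: unpack every downloaded *-dictionary*.tar.xz archive.
#   $1 = directory holding the downloaded archives
#   $2 = target directory (defaults to the path the load scripts expect)
unpack_dictionaries() {
    download_dir=$1
    target_dir=${2:-/data/ETL/dictionaries}
    mkdir -p "$target_dir" || return 1
    for archive in "$download_dir"/*-dictionary*.tar.xz; do
        # Skip the unexpanded pattern when no archive matches
        [ -f "$archive" ] || continue
        echo "Unpacking $archive into $target_dir"
        tar -xJf "$archive" -C "$target_dir" || return 1
    done
}
```

For example, `unpack_dictionaries ~/Downloads` would unpack into /data/ETL/dictionaries, while a second argument selects a different target directory.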
The full set of loading scripts can be downloaded from load-dictionaries.tar.xz.
Unpack it (preferably into /data/ETL) with:
tar -xJf load-dictionaries.tar.xz
If the dictionary files were not unpacked into /data/ETL/dictionaries, edit all the load*.sh scripts and conf/*.properties files, replacing /data/ETL/dictionaries with the path to your dictionary files.
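Rather than editing each file by hand, the path replacement could be done with sed. This is a minimal sketch: the function name `retarget_dictionaries` is hypothetical, and it assumes the default path appears literally as /data/ETL/dictionaries in the scripts and properties files. It keeps a .bak copy of every file it touches.

```shell
# Hypothetical helper: point the load scripts at a different dictionary path.
#   $1 = directory containing load*.sh and conf/*.properties
#   $2 = actual location of the dictionary files
# The new path must not contain '|', '&' or '\', which are special to sed here.
retarget_dictionaries() {
    scripts_dir=$1
    new_path=$2
    default_path=/data/ETL/dictionaries
    for f in "$scripts_dir"/load*.sh "$scripts_dir"/conf/*.properties; do
        # Skip the unexpanded pattern when nothing matches
        [ -f "$f" ] || continue
        # -i.bak edits in place and leaves the original as <file>.bak
        sed -i.bak "s|$default_path|$new_path|g" "$f" || return 1
    done
}
```

For example, `retarget_dictionaries /data/ETL /srv/dictionaries` would rewrite the files under /data/ETL. Review the changes against the .bak copies before running the load scripts.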
The scripts expect to find tranSMART-ETL installed in /data/ETL/tranSMART-ETL, where it has been downloaded and built by running "mvn package".
If it is installed elsewhere, you can create a symlink at the expected path, or simply edit the load*.sh scripts.
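The symlink option could look like the sketch below. The function name `link_transmart_etl` is hypothetical, and the sketch deliberately refuses to replace anything that already exists at the expected path.

```shell
# Hypothetical helper: expose an existing tranSMART-ETL checkout at the
# path the load scripts expect.
#   $1 = actual location of the built tranSMART-ETL directory
#   $2 = expected path (defaults to /data/ETL/tranSMART-ETL)
link_transmart_etl() {
    actual=$1
    expected=${2:-/data/ETL/tranSMART-ETL}
    if [ ! -d "$actual" ]; then
        echo "$actual is not a directory" >&2
        return 1
    fi
    if [ -e "$expected" ] || [ -L "$expected" ]; then
        echo "$expected already exists; not replacing it" >&2
        return 1
    fi
    mkdir -p "$(dirname "$expected")" || return 1
    ln -s "$actual" "$expected"
}
```

For example, `link_transmart_etl /opt/src/tranSMART-ETL` would create /data/ETL/tranSMART-ETL as a link to that checkout.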