Skip to content

echohenry2006/LatentStructure

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

12 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

This repository contains code and relevant files for the Latent Structure
modeling project described in:

Poldrack RA, Mumford JA, Schonberg T, Kalar D, Barman B, Yarkoni T (2012). Discovering relations between mind, brain, and mental disorders using topic mapping. PLOS Computational Biology, in press.

Preprint available at:

http://talyarkoni.org/papers/Poldrack_et_al_in_press_PLoS_CompBio.pdf

Guide to subdirectories:
CCA - contains results from CCA analysis

NIF-Disorders - contains files used for disorders topic modeling

clustering - contains files used for clustering of disorders

cogatlas - contains files used for cogatlas topic modeling

src: contains the source files used for all analyses. these require the following
external libraries:

http://numpy.scipy.org/
http://scikit-learn.org/stable/
http://rpy.sourceforge.net/rpy2.html
http://nipy.sourceforge.net/nibabel/

also note that src/utils needs to be in the python path

1_db_foci_to_image.py - obtains coordinates from neurosynth database and creates images
- uses utils/tal_to_mni.py

1.1_merge_peakfiles.sh - creates merged image and computes mask of voxels with activation
on at least 1% of papers

1.2_extract_nidag_text.py - extract full text of each article from database

2_get_cogat_concepts.py - read cognitive atlas concepts from RDF and get loading for each document

3_create_disorder_list.py - load NIF dysfunction ontology, grab synonyms, and add
additional missing terms

4_get_disorders_NIF.py - get loading for each disorder term from corpus

5_mk_cogatlas_docs.py - make documents based on cogatlas loadings

6_mk_disorders_docs.py - make documents based on disorder loadings

7_mk_pickled_data.py - created a pickle with the full dataset to make loading easier

8_mk_8fold_cogatlas.py - make files for 8-fold CV

9_mk_8fold_disorders.py - make files for 8-fold CV

10_mk_8fold_cogatlas_mallet_data.py - make mallet data from 8-fold files

11_mk_8fold_disorders_mallet_data.py - make mallet data from 8-fold files

12_mk_8fold_cogatlas_topic_models.py - create scripts to run mallet jobs

13_mk_8fold_disorders_topic_models.py - create scripts to run mallet jobs

14_get_best_topic_likelihood.py - check likelihoods to get get dimenstionality

15.1_get_disorders_dimensionality.py - generate additional topic models to get disorder dimensionality

15.3_get_best_disorder_dimensionality.py - get dimensionality that has unique topic dists

15_run_topic_models.py - generate scripts to run final topic models

16_mk_cogatlas_loadingdata.py - load topic data and save loadingdata.txt

17_mk_disorders_loadingdata.py - load topic data and save loadingdata.txt

18_mk_all_chisq_maps.py - make all chisquare maps - uses utils/mk_chisq_maps_filter.py

18.2_mk_6mm_chisq_maps.sh - make 6mm versions of topic

19_mk_slice_images_cogatlas.py - make slice images using p values

20_mk_slice_images_disorders.py - make slice images using p values

22_mk_latexreport_cogatlas.py - create latex report

23_mk_latexreport_disorders.py - create latex report

24_run_CCA_nonneg.py - run CCA analysis

25_cluster_disorders.R - run clustering on disorders data

26_make_cognitive_topic_figure.py - generate figure 2 from initial submission

27_make_disorders_topic_figure.py - generate figure 3 from initial submission

28_mk_8fold_fulltext_topic_models.py - create scripts to run fulltext topic models

28.1_mk_8fold_fulltext_mallet_data.py - make 8-fold data for full text analysis

28.2_run_all_fulltext_nfold.sh - run mallet jobs on 8-fold full text

28.3_get_best_dimensionality.py - get best dimensionality for full text

30_get_wordhist.py - get histograms of # of docs for each filtered paper

31_count_locations.py - count # of locations reported across all papers

32_doc_topic_hist.py - make histograms of docs/topic and topics/doc (for Figure 3)

33_plot_likelihood.py - plot empirical likelihood as function of # of topics (for Figure 2)

34_run_CCA_on_topics.py - run CCA directly on topic distributions

35_mk_fold1_loadingdata.py - get loadingdata for fold1 for each ntopics

36.1_mk_slice_corr_images.disorders.py - make images using correlation rather than p value

36_mk_slice_corr_images.cogatlas.py - make images using correlation rather than p value (for figure 

37_make_cognitive_topic_corr_figure.py - make Figure 4

38_make_disorder_topic_corr_figure.py - make Figure 6

39_get_topic_hierarchy.py - create graphviz .dot file for topic hierarchy (Figure 5)

About

code and data from Latent Structure modeling paper

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published