The National Institute of Mental Health (NIMH) is the lead federal agency for research on mental disorders. NIMH is one of the 27 Institutes and Centers that make up the National Institutes of Health (NIH), which is responsible for all federally funded biomedical research in the US. NIH is part of the U.S. Department of Health and Human Services (HHS). The NIH is a highly rated employer on glassdoor.com, with very competitive salary and benefits packages.
The Data Science and Sharing Team (DSST) is a new group created to develop and support data sharing and other data-intensive scientific projects within the NIMH Intramural Research Program (IRP) in Bethesda, MD. Working closely with the Office of Data Science, the DSST aims to make the NIMH IRP a leader in the open science and data sharing practices mandated by the Open Data Policy released by the White House on May 9, 2013. We are building a team to make that happen.
You will work with a team of researchers and developers to build and deploy neuroimaging data processing pipelines for investigators within the NIMH IRP. You will collaborate with and contribute to other projects throughout the world that are building standards and tools for open and reproducible neuroscience (e.g., NiPy, BIDS, Binder, RStudio). You'll have the resources of the NIH HPC Cluster at your disposal, as well as additional help from the AWS cloud. All tools and code will be open source and freely distributed.
You will work to bolster data science skills within the NIMH IRP by teaching courses to scientists on best data practices (e.g., Software & Data Carpentry) as well as on accessing and using specific neuroimaging repositories (e.g., the Human Connectome Project, OpenfMRI, UK Biobank).
There is no use building tools for open science if no one uses them. Part of the job of the DSST is to measure data sharing and open science practices within the NIMH IRP and progress toward their adoption. This will include bibliometrics for scientific publications from the NIMH IRP and other measures of data sharing and secondary data utilization. You'll work with DSST staff and external collaborators (e.g., Impactstory) to make these metrics publicly available.
You should be very comfortable on the command line and have a rock-solid handle on one or more Unix-based operating systems. You should have some experience with distributed, high-performance computing tools such as Spark, OpenStack, and Docker/Singularity, and with batch processing systems such as SLURM and SGE. You should also have experience coding in modern languages currently used in data-intensive scientific computing, such as Python, R, and JavaScript, as well as interfacing with a variety of APIs.
Ideally, we would like to see a recent degree (BS, MS, or PhD) in a STEM field, but if you can prove you have an equivalent amount of expertise with your publications, projects, or GitHub/Kaggle ranking, we’re all ears. We are also interviewing students and part-time staff if you’re still working on your degree.
Data science is moving fast – we’re looking for someone who can move faster. You should be a self-learner and a self-starter. Provide some examples of things you have worked on independently.
Email your resume, a cover letter, and a code sample that demonstrates you are all three of the above to:
The National Institutes of Health is an equal opportunity employer.