Statistics

The only rule we have is that you use a statistical language or program that documents your analysis as code (i.e., R or Python). GUI-based programs can be helpful early in your career, but they are a curse when it comes to open science and replicating your work. If you need to learn R or Python, let me know as soon as you arrive and I will help!

At CNE we have a healthy obsession with correct statistics. I trust you and do not want to review your code for every analysis (unless you want me to), but I do expect that for biostatistical analyses (regression, ANOVAs, etc.) you formally check the assumptions of each model and present those checks when you present the results (at least the first time).

You are the best statistician that you know, and there are multiple ways to analyze any given question. All you must do is justify each and every step of your analysis pipeline. If you think something is not quite right, it’s probably because it isn’t.

Some concepts that will crop up consistently are outliers, multiple comparison corrections, movement thresholds (fMRI), and ANOVA vs. mixed models (hint: they’re the same thing, but mixed models allow more flexibility, so use them!). Every project will have its own way of dealing with these concepts, but eventually we will have guidelines on at least where to start (for example, outliers in regression checked with Cook’s distance). For now, let’s discuss them each time you come across them.
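As a starting point for that discussion, here is a minimal sketch of two of these checks in plain Python (the same ideas carry over to R). It is not a lab standard: the data are made up, the function names are hypothetical, the 4/n cutoff for Cook’s distance is only a common rule of thumb, and Holm is just one of several reasonable multiple comparison corrections.

```python
from statistics import mean


def cooks_distance(x, y):
    """Cook's distance for each point in a simple linear regression y ~ x."""
    n, p = len(x), 2  # p = number of fitted parameters (intercept + slope)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in resid) / (n - p)
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    return [(e ** 2 / (p * mse)) * (h / (1 - h) ** 2)
            for e, h in zip(resid, leverage)]


def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # Multiply the k-th smallest p-value by (m - k), then enforce monotonicity.
        running_max = max(running_max, (m - rank) * p_values[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted


# Made-up data: the last point has high leverage and sits far off the trend.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]
y = [1.1, 2.0, 2.9, 4.2, 5.1, 5.8, 7.1, 8.0, 9.2, 5.0]
d = cooks_distance(x, y)
flagged = [i for i, di in enumerate(d) if di > 4 / len(x)]  # rule-of-thumb cutoff
print("Points to inspect:", flagged)

# Made-up p-values from four hypothetical comparisons.
adjusted = holm_adjust([0.01, 0.04, 0.03, 0.20])
print("Holm-adjusted p-values:", adjusted)
```

A flagged point is a prompt to look at the data and justify what you do next, not an automatic exclusion.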

p-values: Please do not ever show me a plot, graph, or table that contains ONLY p-values. READ the ASA statement on p-values every couple of months: https://amstat.tandfonline.com/doi/full/10.1080/00031305.2016.1154108 (hint: scroll down past the commentary).
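One way to avoid a p-values-only table is to report an effect size and an interval estimate next to each test. The sketch below is an illustration under assumptions, not a required method: the two groups are made-up numbers, the function names are hypothetical, and a percentile bootstrap is only one of several ways to get a confidence interval.

```python
import random
from statistics import mean, stdev


def mean_difference(a, b):
    """Raw difference in group means (a minus b)."""
    return mean(a) - mean(b)


def cohens_d(a, b):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / pooled_var ** 0.5


def bootstrap_ci(a, b, stat, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(a, b)."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    draws = sorted(
        stat(rng.choices(a, k=len(a)), rng.choices(b, k=len(b)))
        for _ in range(n_boot)
    )
    lower = draws[round((alpha / 2) * n_boot)]
    upper = draws[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Made-up scores for two hypothetical groups.
group_a = [5.1, 6.3, 5.8, 7.0, 6.1, 5.5, 6.7, 6.0]
group_b = [4.2, 5.0, 4.8, 5.9, 4.4, 5.1, 4.7, 5.3]

diff = mean_difference(group_a, group_b)
lo, hi = bootstrap_ci(group_a, group_b, mean_difference)
d_effect = cohens_d(group_a, group_b)
print(f"Mean difference = {diff:.2f}, 95% bootstrap CI [{lo:.2f}, {hi:.2f}], d = {d_effect:.2f}")
```

Whatever test produced the p-value, a line like the one printed here belongs beside it in the table or figure caption.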

Also, let us refrain from using the phrase “trending to significance” – READ THIS to find out why https://www.bmj.com/content/348/bmj.g2215 

The type of research we do will likely require you to have a firm grasp of both applied biostatistics (probability, regression, t-tests, mixed models) and some not very advanced machine learning (penalized regression, random forests, clustering). You DO NOT need to know everything when you arrive. But you should be proactive about learning (at least graduate students and postdocs), and we will work on this together by defining the projects you will work on.

Science

In TBI Lab, we believe slow science is good science. HOWEVER, I recognize that there is a fine line between doing slow science and you achieving what you need to achieve (oftentimes this means results, posters, publications), so I try my very best to strike a balance between us producing good science and you getting what you need.

Masters, grad students and postdocs: When you first arrive (or before), we will have a one-on-one meeting to define your goals for the coming year (if we haven’t already scheduled that by the time you are reading this, stop reading and organize a time with me). These goals will then help us decide not only which projects you will work on but also which areas of your skill set we can focus on and which strategies will help you improve.

In terms of projects, a loose rule is that you should always have one primary project (A) and one secondary project (B). When you cannot make advances on A, you should be working on B. When A is done, B becomes A and C (which we will define when A is going out the door), becomes B. I will do my very best to make sure that you are always involved in a project collecting data, and another project in which you are either analyzing data or writing up results. Again, I expect that you take full responsibility and ownership of your projects and that you are managing your time between projects correctly.

Our job is to learn and to generate knowledge and understanding, not endless publications. We currently publish to survive in the scientific world, but those criteria will eventually change, and being a good citizen, open science practices, mentorship, collaboration, outreach and other metrics will soon become just as important as publications in your career advancement (right now though, publications matter, so we will strike a balance!).

Coding

For each of the following steps I am working on individual SOPs and tutorials. Until then, ask me or a peer who has been in the lab longer than you and can teach you.

To run statistics and machine learning pipelines, we use R (or Python if you prefer).

To preprocess new imaging data, we use fMRIPrep (Bash and command line).

To analyze brain structure data, we use FreeSurfer (Bash and command line).

To analyze resting state functional connectivity, we use CONN (MATLAB).

To analyze task fMRI, we use FSL (Bash and command line).

To analyze EEG data, we use HAPPE and EEGLAB (MATLAB).

We store data and run most analyses on the Discovery cluster and use Git and GitHub for version control and collaboration on our code. If you want to run statistics on your local computer, that is fine too (sometimes it’s just easier for small things), but remember to push your code changes to the project’s GitHub repository.

For large projects we use the Discovery cluster. Running jobs (using Slurm) is a skill in and of itself and is worth learning. Every so often RC will run bootcamps or provide consulting services to us, and I will try to organize these when several people join the lab at a similar time. Ask me or your peers to get you up to speed in the meantime. You can always ask for an RC consultation individually too: https://rc.northeastern.edu/support/consulting/ . They are incredibly helpful.

Authorship

The policy for authorship on a publication follows the ICMJE guidelines: 

https://www.icmje.org/recommendations/browse/roles-and-responsibilities/defining-the-role-of-authors-and-contributors.html

These guidelines state that authorship is based on the following criteria:

  • Substantial contribution to the conception or design of the work; or the acquisition, analysis, or interpretation of data for the work;

AND

  • Drafting the work or revising it critically for important intellectual content; 

AND

  • Final approval of the version to be published;

AND

  • Agreement to be accountable for all aspects of the work and ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. 

Substantial contributions to data acquisition or analysis ALONE are sufficient for authorship so long as criteria 2-4 are met. As such, UGAs will be given the opportunity to satisfy criteria 2-4 if they have been working on a study for at least 2 semesters (rough rule). 

To decide the order of co-authors on papers we have conversations early and often such that everyone is as informed as they can be. Co-authors should direct any questions about their place in the author list to me, the PI, and not the first author.