Surface-based Group Analysis in FreeSurfer

Outline

- Motivation
- GLM: General Linear Model
  - Theory
  - Hypothesis
- GLM with FreeSurfer
  - Modeling
  - Command-line Stream

Motivation

- Model patterns of interactions and associations between groups.
- The parameters of the model provide measures of the strength of associations.
- A General Linear Model (GLM) focuses on estimating the parameters of the model, so that it can be applied to new data sets to make reasonable inferences.

Types of Questions

- Does a specific variable have a significant association with an outcome?
- If we control for the effects of a second variable, is the association still significant?
- Is there a group difference?
- Is a specific variable related differently between groups of individuals?

Aging Exploratory Analysis

In which areas does thickness change with age?

[Figure: cortical thickness vs. aging. Salat et al., 2004, Cerebral Cortex]

Aging Thickness Study

- N=40 (all in fsaverage space)
- p<.01

[Figure: surface maps of negative age correlation and positive age correlation]

Surface-based Measures

- Morphometric (e.g., thickness, area, volume)
- Functional
- PET
- MEG/EEG
- Diffusion (?), sampled just under the surface

GLM Theory

Is thickness correlated with age?

- Dependent variable (measurement): thickness, with values y1 (Subject 1) and y2 (Subject 2)
- Independent variable: age, with values x1 and x2 (could also be IQ, height, weight, etc.)

Of course, you would need more than two subjects.

GLM: Theory

Line equation: y = mx + b

- Intercept: b (= offset)
- Slope: m
- Dependent variable (outcome): thickness (y1, y2)
- Independent variable: age (x1, x2)

GLM: Theory

We can put this in matrix format: y = b + mx

\[
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix}
=
\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ 1 & x_3 \\ 1 & x_4 \end{bmatrix}
\begin{bmatrix} b \\ m \end{bmatrix}
\]

The matrix of 1s and x values is the design matrix; the vector (b, m) holds the regression coefficients (parameters).

- One row per data point
- Add column of 1s for the offset term (b)
- One set of parameters

Matrix Multiplication

Multiplying out the design matrix and the parameter vector gives a system of linear equations:

\[
\begin{aligned}
y_1 &= 1 \cdot b + x_1 \cdot m \\
y_2 &= 1 \cdot b + x_2 \cdot m \\
y_3 &= 1 \cdot b + x_3 \cdot m \\
y_4 &= 1 \cdot b + x_4 \cdot m
\end{aligned}
\]

More than two data points: Errors

- If we have the same m and b for all data points, we will have errors.
- GOAL: minimize the sum of the squares of the error terms when estimating our m and b.
- There are lots of ways to do this! (Beyond the scope of this talk, but FreeSurfer does it for you!)

More than two data points: Errors

Matrix formulation: Y = Xβ + n, where β = (b, m)

\[
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix}
=
\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ 1 & x_3 \\ 1 & x_4 \end{bmatrix}
\begin{bmatrix} b \\ m \end{bmatrix}
+
\begin{bmatrix} n_1 \\ n_2 \\ n_3 \\ n_4 \end{bmatrix}
\]

System of linear equations:

\[
\begin{aligned}
y_1 &= 1 \cdot b + m x_1 + n_1 \\
y_2 &= 1 \cdot b + m x_2 + n_2 \\
y_3 &= 1 \cdot b + m x_3 + n_3 \\
y_4 &= 1 \cdot b + m x_4 + n_4
\end{aligned}
\]

- Xβ: model
- n: error, noise
- Residuals: eres.mgh
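As a rough illustration of the least-squares goal above, the sketch below fits b and m for the two-column design matrix [1, x] by solving the normal equations directly. The ages and thickness values are made up for illustration, and `fit_line` is a hypothetical helper, not a FreeSurfer function; mri_glmfit does this fit for you at every vertex.

```python
# Ordinary least squares for y = b + m*x via the normal equations.
# All data values below are invented for illustration only.

def fit_line(x, y):
    """Return (b, m) minimizing the sum of squared residuals of y = b + m*x."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    # Normal equations (X^T X) [b, m]^T = X^T y for X = [1, x]:
    #   n*b  + sx*m  = sy
    #   sx*b + sxx*m = sxy
    det = n * sxx - sx * sx
    b = (sy * sxx - sx * sxy) / det
    m = (n * sxy - sx * sy) / det
    return b, m


ages = [20.0, 40.0, 60.0, 80.0]        # x: independent variable
thickness = [2.9, 2.6, 2.5, 2.2]       # y: dependent variable (mm, made up)

b, m = fit_line(ages, thickness)
# Residuals are what is left after removing the fitted line (the n_i terms).
residuals = [yi - (b + m * xi) for xi, yi in zip(ages, thickness)]

print(f"intercept b = {b:.4f}, slope m = {m:.5f}")
print("residuals:", [round(r, 4) for r in residuals])
```

With these invented numbers the slope is negative, i.e. thickness decreases with age, and the residuals sum to zero, as they must for a least-squares fit that includes an offset column.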

GLM: summarizing

Matrix formulation: Y = Xβ + n

\[
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix}
=
\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ 1 & x_3 \\ 1 & x_4 \end{bmatrix}
\begin{bmatrix} b \\ m \end{bmatrix}
+
\begin{bmatrix} n_1 \\ n_2 \\ n_3 \\ n_4 \end{bmatrix}
\]

- Y: outcome
- X: design matrix
- β = (b, m): regression coefficients/parameter estimates ("betas"), saved by mri_glmfit as beta.mgh
- n: noise term

Properties of this model:

- Two parameters (intercept b, slope m)
- One row per subject
- x is an independent variable (age)
- Column of 1s is the offset term (to multiply by b)

Each row is one linear equation: y_i = 1·b + m·x_i + n_i, for i = 1, ..., 4.

Two Groups

Each group gets its own intercept and its own age slope: b1 and m1 for group 1, b2 and m2 for group 2.

\[
\begin{bmatrix} y_{11} \\ y_{12} \\ y_{21} \\ y_{22} \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & x_{11} & 0 \\
1 & 0 & x_{12} & 0 \\
0 & 1 & 0 & x_{21} \\
0 & 1 & 0 & x_{22}
\end{bmatrix}
\begin{bmatrix} b_1 \\ b_2 \\ m_1 \\ m_2 \end{bmatrix}
+ n
\]

Number of columns = (number of groups) * (number of parameters)

\[
\begin{aligned}
y_{11} &= 1 b_1 + 0 b_2 + x_{11} m_1 + 0 m_2 + n_{11} = b_1 + x_{11} m_1 + n_{11} \\
y_{12} &= 1 b_1 + 0 b_2 + x_{12} m_1 + 0 m_2 + n_{12} = b_1 + x_{12} m_1 + n_{12} \\
y_{21} &= 0 b_1 + 1 b_2 + 0 m_1 + x_{21} m_2 + n_{21} = b_2 + x_{21} m_2 + n_{21} \\
y_{22} &= 0 b_1 + 1 b_2 + 0 m_1 + x_{22} m_2 + n_{22} = b_2 + x_{22} m_2 + n_{22}
\end{aligned}
\]
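To make the two-group layout concrete, here is a minimal sketch that builds this kind of design matrix from a list of group labels and ages: one offset column per group, followed by one age column per group. The group names, ages and the helper `two_group_design` are hypothetical and only illustrate the column structure.

```python
# Build a "separate intercept, separate slope" design matrix for grouped data.
# Group labels and ages are invented for illustration.

def two_group_design(groups, ages):
    """Return (classes, X) with columns [offset per class..., age per class...]."""
    classes = list(dict.fromkeys(groups))  # unique labels, order of first appearance
    rows = []
    for group, age in zip(groups, ages):
        offsets = [1.0 if group == c else 0.0 for c in classes]
        slopes = [age if group == c else 0.0 for c in classes]
        rows.append(offsets + slopes)
    return classes, rows


groups = ["group1", "group1", "group2", "group2"]
ages = [30.0, 50.0, 35.0, 65.0]

classes, X = two_group_design(groups, ages)
for row in X:
    print(row)

# Columns = (number of groups) * (number of parameters per group)
print("columns:", len(X[0]))
```

Each subject has non-zero entries only in the columns of its own group, which is what lets the fit estimate b1, m1 and b2, m2 separately.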

Forming a Hypothesis

- Once we fit our parameters, we can test our hypothesis.
- Is there a significant association between age and thickness?
- Formal hypothesis: the slope of age vs. thickness (m) is significantly different from zero.
- Null hypothesis: m = 0

Testing Our Hypothesis

- We test for significance and the direction of the effect.
- We do this with a contrast matrix C that isolates our parameter of interest.
- With C and β we compute g, which tells us the direction of our effect.
- If g is negative, then the direction of our effect (slope) is also negative.
- In our example, our hypothesis is about the slope m, so our contrast matrix will be C = [0 1]:

\[
g = C\beta = \begin{bmatrix} 0 & 1 \end{bmatrix} \begin{bmatrix} b \\ m \end{bmatrix} = 0 \cdot b + 1 \cdot m = m
\]

The model being tested:

\[
\begin{bmatrix} \text{thickness}_1 \\ \text{thickness}_2 \\ \text{thickness}_3 \\ \text{thickness}_4 \end{bmatrix}
=
\begin{bmatrix} 1 & \text{age}_1 \\ 1 & \text{age}_2 \\ 1 & \text{age}_3 \\ 1 & \text{age}_4 \end{bmatrix}
\begin{bmatrix} b \\ m \end{bmatrix}
+
\begin{bmatrix} n_1 \\ n_2 \\ n_3 \\ n_4 \end{bmatrix}
\]

Testing Our Hypothesis

We still need to test for significance. We'll use our contrast matrix C = [0 1] again here in a t-test:

\[
t = \frac{C\beta}{\sqrt{\sigma^2 \, C (X^T X)^{-1} C^T}}
\]

- C: contrast matrix
- β: regression coefficients
- σ²: variance of noise, var(n)
- X: design matrix

This t-value corresponds to a p-value that depends on your sample size. The p-value is between 0 and 1; values closer to 0 indicate a more significant result.

p-values

- p-value/significance:
  - value between 0 and 1
  - depends on your sample size
  - closer to 0 means more significant
- FreeSurfer stores p-values as -log10(p): p = 0.1 = 10^-1 gives sig = 1; p = 0.01 = 10^-2 gives sig = 2
- Stored in sig.mgh files
- sig is signed by the sign of g; the p-value itself is unsigned
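The contrast and t-test formulas above can be sketched for the single-group model y = b + m*x. The data are invented, and `glm_t` and `signed_sig` are hypothetical helpers; the noise variance is estimated here as the residual sum of squares divided by (number of subjects - number of parameters), which is an assumption about the estimator. Turning t into a p-value needs a t-distribution lookup that the Python standard library does not provide, so the p-value in the last step is simply supplied as an example.

```python
import math

# t statistic for a contrast C = [c0, c1] in the model y = b + m*x.
# Data values are made up for illustration.

def glm_t(x, y, C):
    """Return (g, t) where g = C*beta and t = g / sqrt(sigma^2 C (X^T X)^-1 C^T)."""
    n = len(x)
    sx, sxx = sum(x), sum(v * v for v in x)
    sy, sxy = sum(y), sum(a * b for a, b in zip(x, y))
    det = n * sxx - sx * sx
    # (X^T X)^-1 for the design matrix X = [1, x]
    inv = [[sxx / det, -sx / det],
           [-sx / det, n / det]]
    # beta = (X^T X)^-1 X^T y, with X^T y = [sy, sxy]
    beta = [inv[0][0] * sy + inv[0][1] * sxy,
            inv[1][0] * sy + inv[1][1] * sxy]
    resid = [yi - (beta[0] + beta[1] * xi) for xi, yi in zip(x, y)]
    sigma2 = sum(r * r for r in resid) / (n - 2)   # assumed estimator: RSS / (n - 2)
    g = C[0] * beta[0] + C[1] * beta[1]
    c_inv_ct = sum(C[i] * inv[i][j] * C[j] for i in range(2) for j in range(2))
    return g, g / math.sqrt(sigma2 * c_inv_ct)


def signed_sig(p, g):
    """-log10(p), carrying the sign of the contrast value g."""
    return math.copysign(-math.log10(p), g)


ages = [20.0, 40.0, 60.0, 80.0]
thickness = [2.9, 2.6, 2.5, 2.2]

g, t = glm_t(ages, thickness, C=[0, 1])   # C = [0 1] isolates the slope m
print(f"g = {g:.4f}, t = {t:.3f}")

# Example only: an illustrative p-value of 0.01 with a negative g gives sig = -2.
print("sig for p = 0.01:", signed_sig(0.01, g))
```

Because g is the slope here, its negative sign says thickness decreases with age, and the same sign is carried into the stored significance.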

Two Groups

Model: Y = Xβ + n, with β = (b1, b2, m1, m2): one intercept (b1, b2) and one age slope (m1, m2) per group. For every contrast, g = C*β.

- Do groups differ in intercept? Does b1 = b2? Does b1 - b2 = 0? C = [+1 -1 0 0]
- Do groups differ in slope? Does m1 = m2? Does m1 - m2 = 0? C = [0 0 +1 -1]
- Is the average slope different from 0? Does (m1 + m2)/2 = 0? C = [0 0 0.5 0.5]

Putting it all together

1. We used our empirical data to form a design matrix: X
2. We fit regression coefficients (b and m) to our x, y data
3. We created a contrast matrix C to test our hypothesis for:
   1. Direction of effect: g = C*β
   2. Significance of effect: t-test

Outline

- Motivation
- GLM: General Linear Model
  - Theory
  - Hypothesis
- GLM with FreeSurfer
  - Defining your model
  - Command-line Stream

Surface-based Group Analysis in FreeSurfer

- Create your own design and contrast matrices, or
- Create an FSGD file:
  - FreeSurfer creates the design matrix
  - You still have to specify contrasts
- This talk is about using FSGD files

Processing Stages

- Specify subjects and surface measures
- Assemble data:
  - Resample into common space
  - Smooth
  - Concatenate into one file
- Model and contrasts (GLM)
- Fit model (estimate)
- Correct for multiple comparisons [Next talk!]
- Visualize

Specifying Subjects

- Each subject ID (bert, fred, jenny, margaret) is a directory under $SUBJECTS_DIR
- SUBJECTS_DIR is an environment variable

FreeSurfer Directory Tree

Each subject ID (here bert) is a directory under $SUBJECTS_DIR (an environment variable):

- bert/
  - bem, stats, morph, rgb, scripts, tiff
  - mri: orig, T1, brain, wm, aseg
  - surf: lh.white, rh.white, lh.thickness, rh.thickness, lh.sphere.reg, rh.sphere.reg
  - label: lh.aparc.annot, rh.aparc.annot

Example: Thickness Study

1. $SUBJECTS_DIR/bert/surf/lh.thickness
2. $SUBJECTS_DIR/fred/surf/lh.thickness
3. $SUBJECTS_DIR/jenny/surf/lh.thickness
4. $SUBJECTS_DIR/margaret/surf/lh.thickness
5. ...

FreeSurfer Group Descriptor (FSGD) File

- Simple text file
- List of all subjects in the study
- Accompanying demographics
- Automatic design matrix creation
- You must still specify the contrast matrices

Note: Can specify design matrix explicitly with --design

FSGD Format

    GroupDescriptorFile 1
    Class Male
    Class Female
    Variables Age Weight IQ
    Input bert Male 10 100 1000
    Input fred Male 15 150 1500
    Input jenny Female 20 200 2000
    Input margaret Female 25 250 2500

- One discrete factor (Gender) with two levels (M & F)
- Three continuous variables: Age, Weight, IQ
- Class = Group

FSGDF -> X (Automatic)

\[
X = \begin{bmatrix}
1 & 0 & 10 & 0 & 100 & 0 & 1000 & 0 \\
1 & 0 & 15 & 0 & 150 & 0 & 1500 & 0 \\
0 & 1 & 0 & 20 & 0 & 200 & 0 & 2000 \\
0 & 1 & 0 & 25 & 0 & 250 & 0 & 2500
\end{bmatrix}
\]

Columns, left to right: Male group, Female group, Male Age, Female Age, Male Weight, Female Weight, Male IQ, Female IQ.

This is DODS: Different Offset, Different Slope. DOSS (Different Offset, Same Slope) is also possible; see https://surfer.nmr.mgh.harvard.edu/fswiki/DodsDoss

Contrasts

You create the contrast matrices for the design matrix X (contrast files can be written with a simple text editor):

- C = [-1 1 0 0 0 0 0 0] tests for the difference in intercept/offset between groups
- C = [0 0 -1 1 0 0 0 0] tests for the difference in age slope between groups

Normalizing Covariates

In the FSGD file shown under "FSGD Format", Age, Weight and IQ are on very different scales, which will create problems! Normalize by subtracting the mean and dividing by the standard deviation. E.g., Age mean = 17.5, stddev = 6.455, so use -1.1619, -0.3873, 0.3873, 1.1619 in the FSGD file, or use the DemeanFlag and RescaleFlag settings in the FSGD file.
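The normalization arithmetic for the Age column can be checked with a few lines of Python. This sketch assumes the sample standard deviation (dividing by N - 1), which is the choice that reproduces the stddev of 6.455 quoted above.

```python
import statistics

# Ages of the four example subjects from the FSGD file.
ages = [10, 15, 20, 25]

mean = statistics.mean(ages)
sd = statistics.stdev(ages)   # sample standard deviation (N - 1 in the denominator)

# Subtract the mean and divide by the standard deviation.
normalized = [round((a - mean) / sd, 4) for a in ages]

print("mean =", mean)
print("stddev =", round(sd, 3))
print("normalized ages:", normalized)
```

The same recipe applies to Weight and IQ, which puts all three covariates on a comparable scale.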

Covariates

    GroupDescriptorFile 1
    Class Male
    Class Female
    DemeanFlag 1
    RescaleFlag 1
    Variables Age Weight IQ
    Input bert Male 10 100 1000
    Input fred Male 15 150 1500
    Input jenny Female 20 200 2000
    Input margaret Female 25 250 2500

If you set a flag to 0 instead of 1, it is turned off.

Another FSGD Example

- Two discrete factors:
  - Gender: two levels (M & F)
  - Handedness: two levels (L & R)
- One continuous variable: Age
- Class = Group

    GroupDescriptorFile 1
    Class MaleRight
    Class MaleLeft
    Class FemaleRight
    Class FemaleLeft
    Variables Age
    Input bert MaleLeft 10
    Input fred MaleRight 15
    Input jenny FemaleRight 20
    Input margaret FemaleLeft 25
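To show how an FSGD file of this shape maps onto a design matrix, the sketch below parses the two-factor example and builds a "different offset, different slope" matrix: one offset column per class, then one column per class for each variable. The parser is a toy that only understands the keywords used in this example (`parse_fsgd` and `dods_matrix` are hypothetical names); mri_glmfit builds the real matrix itself.

```python
# Toy FSGD parser and design-matrix builder, for illustration only.

FSGD_TEXT = """\
GroupDescriptorFile 1
Class MaleRight
Class MaleLeft
Class FemaleRight
Class FemaleLeft
Variables Age
Input bert MaleLeft 10
Input fred MaleRight 15
Input jenny FemaleRight 20
Input margaret FemaleLeft 25
"""


def parse_fsgd(text):
    """Return (classes, variables, inputs) from the Class/Variables/Input lines."""
    classes, variables, inputs = [], [], []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        key = tokens[0]
        if key == "Class":
            classes.append(tokens[1])
        elif key == "Variables":
            variables = tokens[1:]
        elif key == "Input":
            subject, cls = tokens[1], tokens[2]
            values = [float(v) for v in tokens[3:]]
            inputs.append((subject, cls, values))
    return classes, variables, inputs


def dods_matrix(classes, inputs):
    """One offset column per class, then one column per class for each variable."""
    rows = []
    for _subject, cls, values in inputs:
        row = [1.0 if cls == c else 0.0 for c in classes]
        for value in values:
            row += [value if cls == c else 0.0 for c in classes]
        rows.append(row)
    return rows


classes, variables, inputs = parse_fsgd(FSGD_TEXT)
X = dods_matrix(classes, inputs)
for (subject, _cls, _values), row in zip(inputs, X):
    print(f"{subject:10s}", row)

# Four classes and one variable give 4 * (1 + 1) = 8 columns.
print("columns:", len(X[0]))
```

Any contrast written for this model therefore needs eight entries, in the same column order.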

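Interaction Contrast

- Two discrete factors (no continuous, for now):
  - Gender: two levels (M & F)
  - Handedness: two levels (L & R)
- Four regressors (offsets): MR (b1), ML (b2), FR (b3), FL (b4)

    GroupDescriptorFile 1
    Class MaleRight
    Class MaleLeft
    Class FemaleRight
    Class FemaleLeft
    Input bert MaleLeft
    Input fred MaleRight
    Input jenny FemaleLeft
    Input margaret FemaleRight

Is the gender difference the same for right-handers and left-handers? Let D1 = b3 - b1 (female minus male, right-handed) and D2 = b4 - b2 (female minus male, left-handed). The null hypothesis is D1 - D2 = 0:

\[
g = D_1 - D_2 = (b_3 - b_1) - (b_4 - b_2) = -b_1 + b_2 + b_3 - b_4, \qquad C = [-1 \;\; +1 \;\; +1 \;\; -1]
\]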
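A quick numerical check of the interaction contrast, using invented group offsets in the regressor order MR, ML, FR, FL. The values are hypothetical; the point is only that C*β equals the difference of the two gender differences.

```python
# Hypothetical fitted offsets (e.g. mean thickness in mm) per class.
beta = {"MR": 2.50, "ML": 2.40, "FR": 2.45, "FL": 2.20}   # b1, b2, b3, b4
C = [-1, +1, +1, -1]

d1 = beta["FR"] - beta["MR"]   # D1: female minus male, right-handed
d2 = beta["FL"] - beta["ML"]   # D2: female minus male, left-handed

# g = C * beta, relying on the dict preserving the order MR, ML, FR, FL.
g = sum(c * b for c, b in zip(C, beta.values()))

print(f"D1 = {d1:.2f}, D2 = {d2:.2f}, D1 - D2 = {d1 - d2:.2f}, g = {g:.2f}")
```

A non-zero g means the gender effect differs between handedness groups; whether it is significant is then a question for the t-test.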

FSGD Examples

https://surfer.nmr.mgh.harvard.edu/fswiki/FsgdExamples

Factors, Levels, Groups

Usually each Group/Class:

- Has its own intercept
- Has its own slope (for each continuous variable)

Number of regressors:

- Different Offsets, Different Slopes (DODS): NRegressors = NClasses * (NVariables + 1)
- Different Offsets, Same Slope (DOSS): NRegressors = NClasses + NVariables

Why is this important? Because you will need to create contrast matrices, and the contrast matrix must have NRegressors elements.

Factors, Levels, Groups, Classes

- Continuous variables/factors: Age, IQ, Volume, etc.
- Discrete variables/factors: Gender, Handedness, Diagnosis
- Levels of discrete variables:
  - Handedness: Left and Right
  - Gender: Male and Female
  - Diagnosis: Normal, MCI, AD
- Group or Class: specification of all discrete factors, e.g.
  - Left-handed Male MCI
  - Right-handed Female Normal

Stages

- Assemble data:
  - Resample into common space
  - Smooth
  - Concatenate into one file
- Fit model (estimate)
- Correct for multiple comparisons [Next talk!]
- Visualize

Command-line stream: Assemble Data

mris_preproc (see mris_preproc --help):

    --fsgd FSGDFile       : Specify subjects through FSGD file
    --hemi lh             : Process left hemisphere
    --meas thickness      : subjectid/surf/hemi.thickness
    --target fsaverage    : common space is subject fsaverage
    --o lh.thickness.mgh  : output "volume-encoded surface file"

- Lots of other options!
- Output: lh.thickness.mgh, a file with stacked thickness maps for all subjects
- This file is the input to the smoother or the GLM
- Can also use --meas area and --meas volume

Command-line stream: Assemble Data

Surface smoothing with mri_surf2surf (see mri_surf2surf --help):

- Loads stacked lh.thickness.mgh
- 2D surface-based smoothing
- Specify FWHM (e.g., fwhm = 10 mm)
- Saves stack lh.thickness.sm10.mgh
- One frame for each subject

Command-line stream: Fit Model

mri_glmfit:

- Reads in FSGD file and constructs X
- Reads in your contrasts (C1, C2, etc.)
- Loads data (lh.thickness.sm10.mgh)
- Fits GLM (i.e., computes β)
- Computes contrasts (g = C*β)
- Computes t or F ratios and significances
- Significance is -log10(p) (p = .01 gives 2, p = .001 gives 3)

mri_glmfit

    mri_glmfit \
      --y lh.thickness.sm10.mgh \
      --fsgd gender_age.txt \
      --C age.mtx --C gender.mtx \
      --surf fsaverage lh \
      --cortex \
      --glmdir lh.gender_age.glmdir

- --y lh.thickness.sm10.mgh: input file (output from smoothing). Stack of subjects, one frame per subject.
- See mri_glmfit --help

mri_glmfit options

The remaining options of the mri_glmfit command (see mri_glmfit --help):

- --fsgd gender_age.txt: FreeSurfer Group Descriptor File (FSGD); gives group membership and covariates
- --C age.mtx --C gender.mtx: contrast matrices; simple text/ASCII files that test hypotheses. You must create these by hand!
- --surf fsaverage lh: perform the analysis on the left hemisphere of the fsaverage subject; FWHM is computed in 2D
- --cortex: masks by fsaverage cortex.label
- --glmdir lh.gender_age.glmdir: output directory

mri_glmfit output directory

    lh.gender_age.glmdir/
      beta.mgh              parameter estimates
      rvar.mgh              residual error variance
      etc.
      age/
        sig.mgh             -log10(p), uncorrected
        gamma.mgh, F.mgh
      gender/
        sig.mgh             -log10(p), uncorrected
        gamma.mgh, F.mgh

GLM Analysis Using Aseg/Aparc Stats Files

Use --table table.txt instead of --y to specify the input:

    mri_glmfit \
      --table aparc_lh_vol_stats.txt \
      --fsgd gender_age.txt \
      --C age.mtx --C gender.mtx \
      --glmdir roi.gender_age.glmdir

- The rest of the command line is the same as you would use for a group study (e.g., FSGD file and contrasts).
- Output is a text file, sig.table.dat, that lists the significances (-log10(p)) for each ROI and contrast.

Visualization with freeview

    freeview -f $FREESURFER_HOME/subjects/fsaverage/surf/lh.pial:overlay=sig.mgh

Use the Configure Overlay tool to change thresholds for visualization (recall: a lower threshold of 1.3 will display only regions where p < 0.05).

Tutorial

- Create an FSGD file and contrasts for a thickness study (Age and Gender)
- Run:
  - mris_preproc
  - mri_surf2surf
  - mri_glmfit
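As a starting point for the tutorial, the sketch below writes a gender-and-age FSGD file plus two contrast files into a temporary directory and prints the three commands without running them. Several things here are assumptions: the subjects and ages are the toy values from the earlier slides, the contrast rows assume the DODS column order (Male offset, Female offset, Male age, Female age), and the mri_surf2surf flags do not appear on the slides and should be checked against mri_surf2surf --help. Four subjects with four regressors leave no degrees of freedom, so a real study needs many more subjects.

```python
import shlex
import tempfile
from pathlib import Path

# Toy subject list: (subject ID, class, age).
subjects = [("bert", "Male", 10), ("fred", "Male", 15),
            ("jenny", "Female", 20), ("margaret", "Female", 25)]

fsgd_lines = ["GroupDescriptorFile 1", "Class Male", "Class Female", "Variables Age"]
fsgd_lines += [f"Input {name} {cls} {age}" for name, cls, age in subjects]

# Assumed DODS regressor order: Male offset, Female offset, Male age, Female age.
contrasts = {
    "gender.mtx": "1 -1 0 0",    # do the groups differ in intercept?
    "age.mtx": "0 0 0.5 0.5",    # is the average age slope different from 0?
}

workdir = Path(tempfile.mkdtemp())
(workdir / "gender_age.txt").write_text("\n".join(fsgd_lines) + "\n")
for filename, row in contrasts.items():
    (workdir / filename).write_text(row + "\n")

commands = [
    ["mris_preproc", "--fsgd", "gender_age.txt", "--hemi", "lh",
     "--meas", "thickness", "--target", "fsaverage", "--o", "lh.thickness.mgh"],
    # Assumed flags (not shown on the slides); verify with mri_surf2surf --help.
    ["mri_surf2surf", "--hemi", "lh", "--s", "fsaverage",
     "--sval", "lh.thickness.mgh", "--fwhm", "10", "--cortex",
     "--tval", "lh.thickness.sm10.mgh"],
    ["mri_glmfit", "--y", "lh.thickness.sm10.mgh", "--fsgd", "gender_age.txt",
     "--C", "age.mtx", "--C", "gender.mtx", "--surf", "fsaverage", "lh",
     "--cortex", "--glmdir", "lh.gender_age.glmdir"],
]

print("files written to", workdir)
for cmd in commands:
    print(shlex.join(cmd))   # printed only; nothing is executed here
```

Building each command as a list keeps every flag and its value explicit, and the printed lines can be pasted into a shell once the files are in the analysis directory.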