Going into my final year of computer science at Carleton University, I have begun preparing for a transition into neuroscience and NeuroAI. I hope to continue my studies in a graduate program focused on the relationship between deep learning and biological intelligence. Part of my preparation involved filling the knowledge gap between my undergraduate degree in computer science and the field of neuroscience. This meant reading neuroscience books like Neuroplasticity by Moheb Costandi[1], attempting to read Language in Our Brain by Angela D. Friederici[2], and starting to learn the vast nomenclature and anatomical terminology used in such books. However, I also wanted hands-on experience in an educational neuroscience program before the summer ended.
While watching a YouTube interview with Patrick Mineault, in which he discussed one of his computational neuroscience blog posts about unsupervised models and the brain’s visual system, I learned of Neuromatch Academy (NMA), the summer neuroscience course he helped found. I applied for a spot in the course’s Computational Neuroscience section; there was also a Deep Learning section.
Other than the brief description given by Patrick Mineault, I didn’t know much about the course going in, such as how it would be structured. I was still working at IMRSV Data Labs as a machine learning intern, so I hoped it wouldn’t be too intensive.
As soon as the course started, I quickly discovered that it was not what I had expected. The organizers break the roughly 1,000 students into pods, each with approximately 20 students and two TAs. Each day has 7 hours of course time: 4 hours are usually spent working through tutorials, and the rest goes toward pod projects; more on those later.
Each tutorial is a Jupyter notebook that teaches its topic through a combination of pre-recorded embedded videos, text descriptions, coding exercises, and interactive widgets/demos. At the beginning of tutorial time, the TA splits the Zoom call into breakout rooms, and each group works through the notebooks together.
The highlights of the tutorials for me were the ones on dimensionality reduction and deep learning. I came away from the dimensionality reduction tutorials with a firmer understanding of principal component analysis: I had learned about it during my studies in CS but had a hard time visualizing it, and the combination of videos and interactive demos filled this gap for me. The deep learning tutorials were a highlight because I could do much of the teaching and question answering in my group. As the only computer science student, I was surrounded by students without a background in DL, and I had a lot of fun passing on the knowledge I had accumulated over the years.
While I enjoyed the collaborative tutorials, I most enjoyed the group project. Each pod was assigned a type of neural data to base its project on. My pod was the fMRI pod, meaning we had to choose from the provided fMRI datasets. My group had many musicians, so they were most interested in doing a project related to audio or sound. We settled on a language-based dataset from the Human Connectome Project. The data came from an experiment in which subjects’ behavior and fMRI activity were recorded during a language-based task, with a math task as a semantically neutral control. The language task was story comprehension: subjects were read a short story and then promptly questioned about it. The mathematics task was simple arithmetic.
Time-domain decoding:
As a group, we struggled to develop an exciting and relatively novel project for this dataset. Luckily, our helpful project TA Brendan introduced us to an interesting and relatively new modeling technique in which fMRI data is decoded in the time domain, resulting in better classification[3]. We all found it an interesting extension of the classical decoding method we had learned in week one: the general linear model (GLM).

With a standard GLM regression, you build a design matrix X whose columns represent the onsets of the various events over the experiment, usually convolved with a hemodynamic response function (HRF)[3]. The coefficients, also called beta values, are then estimated by solving the regression equation Y = Xβ + ε, where Y represents the BOLD signal data. The beta estimates resulting from least squares have one value per voxel; thus, they make up one brain image per event[3]. A classifier can then be built by fitting a logistic regression on the beta estimates and the target condition labels.

The time-domain decoding method follows a similar approach to the GLM but aims to reduce dimensionality and statistical-efficiency issues. In this method, the design matrix X has one column per stimulus type, representing that stimulus’s entire time series. Ridge regression is then performed to find beta values that predict the design matrix from the BOLD data, yielding an estimate X_hat. X_hat is then used to classify each onset within a window of t scans: “Thus, if we denote by x_hat[i:i+t] the vector formed by concatenating the rows from i to i + t of X_hat, the temporal step consists in determining, for each onset time i, its corresponding class label l_i. This is done by multiclass logistic regression.”[3]
From these methods, we came up with our first hypothesis: the time-domain decoding model (TDM) will outperform the standard GLM.
Lateralization:
As a group, we also decided we wanted to study something more brain-related instead of solely analysis methods. For this reason, we settled on the lateralization of language processing. This led to our second hypothesis: the left hemisphere will have greater predictive power than the right hemisphere for predicting semantically differentiated tasks.
Results:
For both the GLM and the time-domain decoding method, we compared the beta values from each task type (math vs. story) and visualized the results per brain region, i.e. per parcel, using the visualization tools from the nilearn Python package.
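The beta comparison itself can be sketched with plain NumPy. This is a toy illustration with simulated data standing in for the HCP fMRI, unconvolved boxcar regressors for brevity, and an arbitrary left/right voxel split; nilearn’s brain plots are omitted so the example stays self-contained.

```python
import numpy as np

rng = np.random.default_rng(1)

n_scans, n_vox = 120, 40  # toy sizes; first half "left", second half "right"

# Two-column design: story and math blocks (unconvolved boxcars for brevity)
X = np.zeros((n_scans, 2))
X[10:40, 0] = 1.0   # story block
X[60:90, 1] = 1.0   # math block

# Simulated BOLD data with known per-voxel weights
W = rng.normal(size=(2, n_vox))
Y = X @ W + rng.normal(size=(n_scans, n_vox))

# Least-squares GLM fit: one beta per regressor per voxel
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)   # shape (2, n_vox)

# Story-minus-math contrast per voxel, then split by hemisphere
contrast = beta[0] - beta[1]
left, right = contrast[: n_vox // 2], contrast[n_vox // 2:]
print(f"mean contrast  left: {left.mean():+.2f}  right: {right.mean():+.2f}")
```

On real data, each hemisphere’s contrast map would be aggregated within parcels and rendered on a brain surface rather than printed.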
We found no significant difference in accuracy between models fit to the left and right hemispheres, for both the GLM and the TDM.
To test the performance of each method, we split the main dataset into training and testing sets. On the test set, the TDM (left = 0.79, right = 0.77) had significantly higher accuracy than the standard GLM method (left = 0.71, right = 0.75) in both hemispheres.
Conclusions:
The left hemisphere does not have more predictive power than the right when differentiating a language task from a math task in fMRI data. While brain areas involved in language processing and production are lateralized to the left hemisphere, the right hemisphere may carry enough information about language processing that there is no difference in predictive power. The two hemispheres could also perform equally well at predicting the math task, reducing the difference in overall accuracy between predicting math versus language states.
When the conditions are spaced out in time throughout the scan, performing the analysis in the temporal domain retains the relevant information, making the time-decoding framework an appropriate alternative to standard GLM-based approaches. In our case, the time-decoding model outperformed the standard GLM.
A few times a week, the organizers hosted panels of established experts from academia and industry in fields related to computational neuroscience. I found one industry panel particularly enlightening: the speakers were all employed by large FAANG-scale tech companies but had backgrounds rooted in neuroscience. They spoke about how they felt their research was more impactful and better supported in industry than in academia. Most of the panelists had prior experience working in research labs at academic institutions, and they noted that the level of funding and resources at the labs of large tech companies far exceeds that of academia. I had always known this to be true for pure machine learning; Google Brain, OpenAI, and Meta have been leading research in the field for some time now. However, I wouldn’t have thought it so obviously true for computational neuroscience.
I would, and do, highly recommend NMA to anyone looking for hands-on experience in computational neuroscience and machine learning. I attended the Computational Neuroscience course; there is also a Deep Learning course that looks to be of equal quality. I met many wonderful and smart people during my time at NMA and am happy to say I made a few friendships. I found the three weeks extremely intensive while trying to balance a full-time MLE job and the course. However, after it was done, I came away feeling that I had learned more skills in those three weeks than I would have over an entire equivalent undergraduate course.
[1] Costandi, M. (2016). Neuroplasticity. Cambridge, MA: The MIT Press.
[2] Friederici, A. D., & Chomsky, N. (2017). Language in Our Brain: The Origins of a Uniquely Human Capacity. Cambridge, MA: The MIT Press.
[3] Loula, J., Varoquaux, G., & Thirion, B. (2018). Decoding fMRI activity in the time domain improves classification performance. NeuroImage, 180, 203–210. doi:10.1016/j.neuroimage.2017.08.018