Shared functional specialization in transformer-based language models and the human brain

  • Berwick, R. C., Friederici, A. D., Chomsky, N. & Bolhuis, J. J. Evolution, brain, and the nature of language. Trends Cogn. Sci. 17, 89–98 (2013).

    Article 
    PubMed 

    Google Scholar
     

  • Partee, B. Lexical semantics and compositionality. Invit. Cogn. Sci.: Lang. 1, 311–360 (1995).


    Google Scholar
     

  • Chomsky, N. Aspects of the theory of syntax. MIT Press. (1965)

  • Christiansen, M. H. & Chater, N. The now-or-never bottleneck: a fundamental constraint on language. Behav. Brain Sci. 39, e62 (2016).

    Article 
    PubMed 

    Google Scholar
     

  • Goldberg, A. E. Constructions at work: the nature of generalization in language. Oxford University Press (2006).

  • MacDonald, M. C., Pearlmutter, N. J. & Seidenberg, M. S. The lexical nature of syntactic ambiguity resolution. Psychol. Rev. 101, 676–703 (1994).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Bruner, J. S. Actual minds, possible worlds. Harvard University Press (1985).

  • Graesser, A. C., Singer, M. & Trabasso, T. Constructing inferences during narrative text comprehension. Psychol. Rev. 101, 371–395 (1994).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Martin, A. E. A compositional neural architecture for language. J. Cogn. Neurosci. 32, 1407–1427 (2020).

    Article 
    PubMed 

    Google Scholar
     

  • Martin, A. E. & Doumas, L. A. A. A mechanism for the cortical computation of hierarchical linguistic structure. PLoS Biol. 15, e2000663 (2017).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Pylkkänen, L. The neural basis of combinatory syntax and semantics. Science 366, 62–66 (2019).

    Article 
    ADS 
    PubMed 

    Google Scholar
     

  • Ding, N., Melloni, L., Zhang, H., Tian, X. & Poeppel, D. Cortical tracking of hierarchical linguistic structures in connected speech. Nat. Neurosci. 19, 158–164 (2016).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Friederici, A. D., Chomsky, N., Berwick, R. C., Moro, A. & Bolhuis, J. J. Language, mind and brain. Nat. Hum. Behav. 1, 713–722 (2017).

    Article 
    PubMed 

    Google Scholar
     

  • Hasson, U., Chen, J. & Honey, C. J. Hierarchical process memory: memory as an integral component of information processing. Trends Cogn. Sci. 19, 304–313 (2015).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Hickok, G. & Poeppel, D. The cortical organization of speech processing. Nat. Rev. Neurosci. 8, 393–402 (2007).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Price, C. J. A review and synthesis of the first 20 years of PET and fMRI studies of heard speech, spoken language and reading. NeuroImage 62, 816–847 (2012).

    Article 
    PubMed 

    Google Scholar
     

  • Vigneau, M. et al. Meta-analyzing left hemisphere language areas: phonology, semantics, and sentence processing. NeuroImage 30, 1414–1432 (2006).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Bookheimer, S. Functional MRI of language: new approaches to understanding the cortical organization of semantic processing. Annu. Rev. Neurosci. 25, 151–188 (2002).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Friederici, A. D. The brain basis of language processing: from structure to function. Physiol. Rev. 91, 1357–1392 (2011).

    Article 
    PubMed 

    Google Scholar
     

  • Nastase, S. A., Goldstein, A. & Hasson, U. Keep it real: rethinking the primacy of experimental control in cognitive neuroscience. NeuroImage 222, 117254 (2020a).

    Article 
    PubMed 

    Google Scholar
     

  • Nastase, S. A., Liu, Y. F., Hillman, H., Norman, K. A. & Hasson, U. Leveraging shared connectivity to aggregate heterogeneous datasets into a common response space. NeuroImage 217, 116865 (2020b).

    Article 
    PubMed 

    Google Scholar
     

  • Willems, R. M., Nastase, S. A. & Milivojevic, B. Narratives for neuroscience. Trends Neurosci. 43, 271–273 (2020).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Hamilton, L. S. & Huth, A. G. The revolution will not be controlled: natural stimuli in speech neuroscience. Lang. Cognit. Neurosci. 35, 573–582 (2020).

    Article 

    Google Scholar
     

  • Mitchell, T. M. et al. Predicting human brain activity associated with the meanings of nouns. Science 320, 1191–1195 (2008).

    Article 
    ADS 
    CAS 
    PubMed 

    Google Scholar
     

  • Pereira, F. et al. Toward a universal decoder of linguistic meaning from brain activation. Nat. Commun. 9, 963 (2018).

    Article 
    ADS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Wehbe, L. et al. Simultaneously uncovering the patterns of brain regions involved in different story reading subprocesses. PloS One 9, e112575 (2014).

    Article 
    ADS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Brennan, J. et al. Syntactic structure building in the anterior temporal lobe during natural story listening. Brain Lang. 120, 163–173 (2012).

    Article 
    PubMed 

    Google Scholar
     

  • Brennan, J. Naturalistic sentence comprehension in the brain. Lang. Linguist. Compass 10, 299–313 (2016).

    Article 

    Google Scholar
     

  • Hale, J. T. et al. Neurocomputational models of language processing. Annu. Rev. Linguist. 8, 427–446 (2022).

    Article 

    Google Scholar
     

  • Huth, A. G., de Heer, W. A., Griffiths, T. L., Theunissen, F. E. & Gallant, J. L. Natural speech reveals the semantic maps that tile human cerebral cortex. Nature 532, 453–458 (2016).

    Article 
    ADS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Radford, A. et al. Language models are unsupervised multitask learners. OpenAI Blog. https://www.techbooky.com/wp-content/uploads/2019/02/Better-Language-Models-and-Their-Implications.pdf (2019).

  • Vaswani, A. et al. Attention is all you need. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, & R. Garnett (Eds.), Advances in Neural Information Processing Systems (Vol. 30, pp. 6000–6010). Curran Associates Inc. https://proceedings.neurips.cc/paper_files/paper/2019/file/749a8e6c231831ef7756db230b4359c8-Paper.pdf. (2017).

  • Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 4171–4186. https://doi.org/10.18653/v1/N19-1423 (2019).

  • Elman, J. L. Finding structure in time. Cogn. Sci. 14, 179–211 (1990).

    Article 

    Google Scholar
     

  • Hochreiter, S. & Schmidhuber, J. Long short-term memory. Neural Comput. 9, 1735–1780 (1997).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. Distributed representations of words and phrases and their compositionality. In C. J. Burges, L. Bottou, M. Welling, Z. Ghahramani, & K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems (Vol. 26). Curran Associates, Inc. https://proceedings.neurips.cc/paper/2013/file/9aa42b31882ec039965f3c4923ce901b-Paper.pdf (2013).

  • Pennington, J., Socher, R., & Manning, C. GloVe: Global vectors for word representation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1532–1543. https://doi.org/10.3115/v1/d14-1162 (2014).

  • Landauer, T. K. & Dumais, S. T. A solution to Plato’s problem: the latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychol. Rev. 104, 211–240 (1997).

    Article 

    Google Scholar
     

  • Pavlick, E. Semantic structure in deep learning. Annu. Rev. Appl. Linguist. 8, 447–471 (2022).

    Article 

    Google Scholar
     

  • Piantadosi, S. Modern language models refute Chomsky’s approach to language. LingBuzz. https://lingbuzz.net/lingbuzz/007180 (2023).

  • Linzen, T. & Baroni, M. Syntactic structure from deep learning. Annu. Rev. Linguist. 7, 195–212 (2021).

    Article 

    Google Scholar
     

  • Manning, C. D., Clark, K., Hewitt, J., Khandelwal, U. & Levy, O. Emergent linguistic structure in artificial neural networks trained by self-supervision. Proc. Natl Acad. Sci. USA 117, 30046–30054 (2020).

    Article 
    ADS 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Toneva, M., & Wehbe, L. Interpreting and improving natural-language processing (in machines) with natural language-processing (in the brain). In H. Wallach, H. Larochelle, A. Beygelzimer, F. d’Alché-Buc, E. Fox, & R. Garnett (Eds.), Advances in Neural Information Processing Systems (Vol. 32, pp. 14954–14964). Curran Associates Inc. https://dl.acm.org/doi/abs/10.5555/3454287.3455626 (2019).

  • Zada, Z. et al. A shared linguistic space for transmitting our thoughts from brain to brain in natural conversations. bioRxiv. https://doi.org/10.1101/2023.06.27.546708 (2023).

  • Caucheteux, C., Gramfort, A., & King, J.-R. Model-based analysis of brain activity reveals the hierarchy of language in 305 subjects. In M.-F. Moens, X. Huang, L. Specia, & S. W. Yih (Eds.) Findings of the Association for Computational Linguistics: EMNLP 2021 (pp. 3635–3644). Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.findings-emnlp.308 (2021a).

  • Caucheteux, C., Gramfort, A., & King, J.-R. Disentangling syntax and semantics in the brain with deep networks. In M. Meila & T. Zhang (Eds.), Proceedings of the 38th International Conference on Machine Learning (Vol. 139, pp. 1336–1348). PMLR. https://proceedings.mlr.press/v139/caucheteux21a.html (2021b).

  • Caucheteux, C., Gramfort, A. & King, J.-R. Deep language algorithms predict semantic comprehension from brain activity. Sci. Rep. 12, 16327 (2022).

    Article 
    ADS 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Caucheteux, C., Gramfort, A. & King, J. R. Evidence of a predictive coding hierarchy in the human brain listening to speech. Nat. Hum. Behav. 7, 430–441 (2023).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Antonello, R., Turek, J. S., Vo, V., & Huth, A. Low-dimensional structure in the space of language representations is reflected in brain responses. In M. Ranzato, A. Beygelzimer, Y. Dauphin, P. S. Liang, & J. W. Vaughan (Eds.), Advances in neural information processing systems (Vol. 34, pp. 8332–8344). Curran Associates, Inc. https://proceedings.neurips.cc/paper/2021/file/464074179972cbbd75a39abc6954cd12-Paper.pdf (2021).

  • Goldstein, A. et al. Shared computational principles for language processing in humans and deep language models. Nat. Neurosci. 25, 369–380 (2022).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Heilbron, M., Armeni, K., Schoffelen, J.-M., Hagoort, P. & de Lange, F. P. A hierarchy of linguistic predictions during natural language comprehension. Proc. Natl Acad. Sci. USA 119, e2201968119 (2022).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Jain, S., & Huth, A. Incorporating context into language encoding models for fMRI. In S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, & R. Garnett (Eds.) Advances in Neural Information Processing Systems (Vol. 31, pp. 6628–6637). Curran Associates, Inc. http://papers.nips.cc/paper/7897-incorporating-context-into-language-encoding-models-for-fmri.pdf (2018).

  • Lyu, B., Marslen-Wilson, W. D., Fang, Y. & Tyler, L. K. Finding structure during incremental speech comprehension. eLife 12, RP89311 (2024).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Brennan, J. R., Dyer, C., Kuncoro, A. & Hale, J. T. Localizing syntactic predictions using recurrent neural network grammars. Neuropsychologia 146, 107479 (2020).

    Article 
    PubMed 

    Google Scholar
     

  • Tenney, I., Das, D., & Pavlick, E. BERT rediscovers the classical NLP pipeline. In Proceedings of the 57th annual meeting of the association for computational linguistics, 4593–4601. https://doi.org/10.18653/v1/P19-1452 (2019).

  • Clark, K., Khandelwal, U., Levy, O., & Manning, C. D. What does BERT look at? An analysis of BERT’s attention. Proceedings of the 2019 ACL Workshop BlackboxNLP: analyzing and interpreting neural networks for NLP, 276–286. https://doi.org/10.18653/v1/W19-4828 (2019).

  • Dyer, C., Kuncoro, A., Ballesteros, M., & Smith, N. A. Recurrent neural network grammars. In Knight, K., Nenkova, A., & Rambow, O. (Eds.) Proceedings of the 2016 Conference of the North American chapter of the association for computational linguistics: Human Language Technologies (pp. 199–209). https://doi.org/10.18653/v1/N16-1024 (2016).

  • Rogers, A., Kovaleva, O. & Rumshisky, A. A primer in BERTology: what we know about how BERT works. Trans. Assoc. Comput. Linguist. 8, 842–866 (2020).

    Article 

    Google Scholar
     

  • Naselaris, T., Kay, K. N., Nishimoto, S. & Gallant, J. L. Encoding and decoding in fMRI. NeuroImage 56, 400–410 (2011).

    Article 
    PubMed 

    Google Scholar
     

  • Richards, B. A. et al. A deep learning framework for neuroscience. Nat. Neurosci. 22, 1761–1770 (2019).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Yamins, D. L. K. & DiCarlo, J. J. Using goal-driven deep learning models to understand sensory cortex. Nat. Neurosci. 19, 356–365 (2016).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Mesulam, M.-M., Thompson, C. K., Weintraub, S. & Rogalski, E. J. The Wernicke conundrum and the anatomy of language comprehension in primary progressive aphasia. Brain 138, 2423–2437 (2015).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Reddy, A. J., & Wehbe, L. Can fMRI reveal the representation of syntactic structure in the brain? In M. Ranzato, A. Beygelzimer, Y. Dauphin, P. S. Liang, & J. W. Vaughan (Eds.), Advances in Neural Information Processing Systems (Vol. 34, pp. 9843–9856). Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2021/file/51a472c08e21aef54ed749806e3e6490-Paper.pdf (2021).

  • Blank, I., Balewski, Z., Mahowald, K. & Fedorenko, E. Syntactic processing is distributed across the language system. NeuroImage 127, 307–323 (2016).

    Article 
    PubMed 

    Google Scholar
     

  • Fedorenko, E., Blank, I. A., Siegelman, M. & Mineroff, Z. Lack of selectivity for syntax relative to word meanings throughout the language network. Cognition 203, 104348 (2020).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Fedorenko, E., Nieto-Castañon, A. & Kanwisher, N. Lexical and syntactic representations in the brain: an fMRI investigation with multi-voxel pattern analyses. Neuropsychologia 50, 499–513 (2012).

    Article 
    PubMed 

    Google Scholar
     

  • Schaefer, A. et al. Local-global parcellation of the human cerebral cortex from intrinsic functional connectivity MRI. Cereb. Cortex 28, 3095–3114 (2018).

    Article 
    PubMed 

    Google Scholar
     

  • Fedorenko, E., Hsieh, P.-J., Nieto-Castañón, A., Whitfield-Gabrieli, S. & Kanwisher, N. New method for fMRI investigations of language: defining ROIs functionally in individual subjects. J. Neurophysiol. 104, 1177–1194 (2010).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Dupré la Tour, T., Eickenberg, M., Nunez-Elizalde, A. O. & Gallant, J. L. Feature-space selection with banded ridge regression. NeuroImage 264, 119728 (2022).

    Article 
    PubMed 

    Google Scholar
     

  • Nastase, S. A., Gazzola, V., Hasson, U. & Keysers, C. Measuring shared responses across subjects using intersubject correlation. Soc. Cogn. Affect. Neurosci. 14, 667–685 (2019).

    PubMed 
    PubMed Central 

    Google Scholar
     

  • Abnar, S., & Zuidema, W. Quantifying attention flow in transformers. In Jurafsky, D., Chai, J., Schluter, N., & Tetreault, J. (Eds.) Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (pp. 4190–4197). https://doi.org/10.18653/v1/2020.acl-main.385 (2020).

  • DeRose, J. F., Wang, J. & Berger, M. Attention flows: analyzing and comparing attention mechanisms in language models. IEEE Trans. Vis. Computer Graph. 27, 1160–1170 (2020).

    Article 

    Google Scholar
     

  • Hawkins, R. D., Yamakoshi, T., Griffiths, T. L., & Goldberg, A. E. Investigating representations of verb bias in neural language models. In B. Webber, T. Cohn, Y. He, & Y. Liu (Eds.), Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 4653–4663). Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.emnlp-main.376 (2020).

  • Hewitt, J., & Manning, C. D. A structural probe for finding syntax in word representations. In J. Burstein, C. Doran, & T. Solorio (Eds.), Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (pp. 4129–4138). Association for Computational Linguistics. https://doi.org/10.18653/v1/N19-1419 (2019).

  • Hoover, B., Strobelt, H., & Gehrmann, S. exBERT: a visual analysis tool to explore learned representations in transformer models. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 187–196. https://doi.org/10.18653/v1/2020.acl-demos.22 (2020)

  • Liu, N. F., Gardner, M., Belinkov, Y., Peters, M. E., & Smith, N. A. Linguistic knowledge and transferability of contextual representations. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 1073–1094. https://doi.org/10.18653/v1/N19-1112 (2019).

  • Elhage, N. et al. A mathematical framework for transformer circuits. Transformer Circuits Thread. https://transformer-circuits.pub/2021/framework/index.html (2021).

  • Schrimpf, M., et al. The neural architecture of language: integrative modeling converges on predictive processing. Proc. Natl. Acad. Sci. USA 118, e2105646118 (2021).

  • Stanojević, M., Brennan, J. R., Dunagan, D., Steedman, M. & Hale, J. T. Modeling structure‐building in the brain with CCG parsing and large language models. Cogn. Sci. 47, e13312 (2023).

    Article 
    PubMed 

    Google Scholar
     

  • Caucheteux, C. & King, J.-R. Brains and algorithms partially converge in natural language processing. Commun. Biol. 5, 134 (2022).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Goldstein, A. et al. Correspondence between the layered structure of deep language models and temporal structure of natural language processing in the human brain. bioRxiv. https://doi.org/10.1101/2022.07.11.499562 (2022).

  • Ni, W. et al. An event-related neuroimaging study distinguishing form and content in sentence processing. J. Cogn. Neurosci. 12, 120–133 (2000).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Schell, M., Zaccarella, E. & Friederici, A. D. Differential cortical contribution of syntax and semantics: an fMRI study on two-word phrasal processing. Cortex 96, 105–120 (2017).

    Article 
    PubMed 

    Google Scholar
     

  • Dapretto, M. & Bookheimer, S. Y. Form and content: dissociating syntax and semantics in sentence comprehension. Neuron 24, 427–432 (1999).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Embick, D., Marantz, A., Miyashita, Y., O’Neil, W. & Sakai, K. L. A syntactic specialization for Broca’s area. Proc. Natl Acad. Sci. USA 97, 6150–6154 (2000).

    Article 
    ADS 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Friederici, A. D., Rüschemeyer, S.-A., Hahne, A. & Fiebach, C. J. The role of left inferior frontal and superior temporal cortex in sentence comprehension: localizing syntactic and semantic processes. Cereb. Cortex 13, 170–177 (2003).

    Article 
    PubMed 

    Google Scholar
     

  • Glaser, Y. G., Martin, R. C., Van Dyke, J. A., Hamilton, A. C. & Tan, Y. Neural basis of semantic and syntactic interference in sentence comprehension. Brain Lang. 126, 314–326 (2013).

    Article 
    PubMed 

    Google Scholar
     

  • Kuperberg, G. R. et al. Common and distinct neural substrates for pragmatic, semantic, and syntactic processing of spoken sentences: an fMRI study. J. Cogn. Neurosci. 12, 321–341 (2000).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Fedorenko, E., Behr, M. K. & Kanwisher, N. Functional specificity for high-level linguistic processing in the human brain. Proc. Natl Acad. Sci. USA 108, 16428–16433 (2011).

    Article 
    ADS 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Mineroff, Z., Blank, I. A., Mahowald, K. & Fedorenko, E. A robust dissociation among the language, multiple demand, and default mode networks: evidence from inter-region correlations in effect size. Neuropsychologia 119, 501–511 (2018).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Kriegeskorte, N. Deep neural networks: a new framework for modeling biological vision and brain information processing. Annu. Rev. Vis. Sci. 1, 417–446 (2015).

    Article 
    PubMed 

    Google Scholar
     

  • He K, Zhang X, Ren S, & Sun J. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), 770–778. http://openaccess.thecvf.com/content_cvpr_2016/html/He_Deep_Residual_Learning_CVPR_2016_paper.html (2016).

  • Krizhevsky, A., Sutskever, I., & Hinton, G. E. ImageNet classification with deep convolutional neural networks. In F. Pereira, C. J. Burges, L. Bottou, & K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems (Vol. 25, pp. 1097–1105). Curran Associates, Inc. https://proceedings.neurips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf (2012).

  • Dupré la Tour, T., Lu, M., Eickenberg, M., & Gallant, J. L. A finer mapping of convolutional neural network layers to the visual cortex. SVRHM 2021 Workshop @ NeurIPS. https://openreview.net/pdf?id=EcoKpq43Ul8 (2021).

  • Güçlü, U. & van Gerven, M. A. J. Deep neural networks reveal a gradient in the complexity of neural representations across the ventral stream. J. Neurosci. 35, 10005–10014 (2015).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Baroni, M. Linguistic generalization and compositionality in modern artificial neural networks. Philos. Trans. R. Soc. Lond.: Ser. B, Biol. Sci. 375, 20190307 (2020).

    Article 

    Google Scholar
     

  • Binder, J. R., Desai, R. H., Graves, W. W. & Conant, L. L. Where is the semantic system? A critical review and meta-analysis of 120 functional neuroimaging studies. Cereb. Cortex 19, 2767–2796 (2009).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Murphy, E. et al. Minimal phrase composition revealed by intracranial recordings. J. Neurosci. 42, 3216–3227 (2022).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Flick, G. & Pylkkänen, L. Isolating syntax in natural language: MEG evidence for an early contribution of left posterior temporal cortex. Cortex 127, 42–57 (2020).

    Article 
    PubMed 

    Google Scholar
     

  • Hickok, G. & Poeppel, D. Towards a functional neuroanatomy of speech perception. Trends Cogn. Sci. 4, 131–138 (2000).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Ben-Shachar, M., Hendler, T., Kahn, I., Ben-Bashat, D. & Grodzinsky, Y. The neural reality of syntactic transformations: evidence from functional magnetic resonance imaging. Psychol. Sci. 14, 433–440 (2003).

    Article 
    PubMed 

    Google Scholar
     

  • Bornkessel, I., Zysset, S., Friederici, A. D., von Cramon, D. Y. & Schlesewsky, M. Who did what to whom? The neural basis of argument hierarchies during language comprehension. NeuroImage 26, 221–233 (2005).

    Article 
    PubMed 

    Google Scholar
     

  • Vo, V. A. et al. A unifying computational account of temporal context effects in language across the human cortex. bioRxiv. https://doi.org/10.1101/2023.08.03.551886 (2023).

  • Chang, C. H. C., Nastase, S. A. & Hasson, U. Information flow across the cortical timescale hierarchy during narrative construction. Proc. Natl Acad. Sci. USA 119, e2209307119 (2022).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Lerner, Y., Honey, C. J., Silbert, L. J. & Hasson, U. Topographic mapping of a hierarchy of temporal receptive windows using a narrated story. J. Neurosci. 31, 2906–2915 (2011).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Vandenberghe, R., Nobre, A. C. & Price, C. J. The response of left temporal cortex to sentences. J. Cogn. Neurosci. 14, 550–560 (2002).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Ferstl, E. C., Neumann, J., Bogler, C. & von Cramon, D. Y. The extended language network: a meta-analysis of neuroimaging studies on text comprehension. Hum. Brain Mapp. 29, 581–593 (2008).

    Article 
    PubMed 

    Google Scholar
     

  • Baldassano, C., Hasson, U. & Norman, K. A. Representation of real-world event schemas during narrative perception. J. Neurosci. 38, 9689–9699 (2018).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Bašnáková, J., Weber, K., Petersson, K. M., van Berkum, J. & Hagoort, P. Beyond the language given: the neural correlates of inferring speaker meaning. Cereb. Cortex 24, 2572–2578 (2014).

    Article 
    PubMed 

    Google Scholar
     

  • Maguire, E. A., Frith, C. D. & Morris, R. G. The functional neuroanatomy of comprehension and memory: the importance of prior knowledge. Brain 122, 1839–1850 (1999).

    Article 
    PubMed 

    Google Scholar
     

  • Makuuchi, M., Bahlmann, J., Anwander, A. & Friederici, A. D. Segregating the core computational faculty of human language from working memory. Proc. Natl Acad. Sci. USA 106, 8362–8367 (2009).

    Article 
    ADS 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Braga, R. M., DiNicola, L. M., Becker, H. C. & Buckner, R. L. Situating the left-lateralized language network in the broader organization of multiple specialized large-scale distributed networks. J. Neurophysiol. 124, 1415–1448 (2020).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Fedorenko, E. & Blank, I. A. Broca’s area is not a natural kind. Trends Cogn. Sci. 24, 270–284 (2020).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Matchin, W. & Hickok, G. The cortical organization of syntax. Cereb. Cortex 30, 1481–1498 (2020).

    Article 
    PubMed 

    Google Scholar
     

  • Schaeffer, R., Khona, M. & Fiete, I. No free lunch from deep learning in neuroscience: a case study through models of the entorhinal-hippocampal circuit. In Advances in Neural Information Processing Systems 35 (eds Koyejo, S. et al.) 16052–16067 (Curran Associates, Inc., 2022).

  • Antonello, R. & Huth, A. Predictive coding or just feature discovery? An alternative account of why language models fit brain data. Neurobiol. Lang. 5, 64–79 (2024).

  • Guest, O. & Martin, A. E. On Logical Inference over Brains, Behaviour, and Artificial Neural Networks. Comput. Brain. Behav. 6, 213–227 (2023).

  • Hasson, U., Nastase, S. A. & Goldstein, A. Direct fit to nature: an evolutionary perspective on biological and artificial neural networks. Neuron 105, 416–434 (2020).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Wang, A. et al. SuperGLUE: a stickier benchmark for general-purpose language understanding systems. In Advances in Neural Information Processing Systems 32 (eds Wallach, H. et al.) (Curran Associates, Inc., 2019).

  • Wang, A. et al. GLUE: a multi-task benchmark and analysis platform for natural language understanding. In Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP (eds Linzen, T., Chrupała, G. & Alishahi, A.) 353–355 (Association for Computational Linguistics, 2018).

  • Warstadt, A. et al. BLiMP: The benchmark of linguistic minimal pairs for English. Transactions of the Association for Computational Linguistics 8, 377–392 (2020).

  • Mahowald, K. et al. Dissociating language and thought in large language models. Trends. Cogn. Sci. 28, 517–540 (2024).

  • Raghu, M., Unterthiner, T., Kornblith, S., Zhang, C., & Dosovitskiy, A. Do vision transformers see like convolutional neural networks? In M. Ranzato, A. Beygelzimer, Y. Dauphin, P. S. Liang, & J. W. Vaughan (Eds.), Advances in Neural Information Processing Systems (Vol. 34, pp. 12116–12128). Curran Associates, Inc. https://proceedings.neurips.cc/paper/2021/file/652cf38361a209088302ba2b8b7f51e0-Paper.pdf (2021).

  • Santoro, R. et al. Encoding of natural sounds at multiple spectral and temporal resolutions in the human auditory cortex. PLoS Comput. Biol. 10, e1003412 (2014).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • de Heer, W. A., Huth, A. G., Griffiths, T. L., Gallant, J. L. & Theunissen, F. E. The hierarchical cortical organization of human speech processing. J. Neurosci. 37, 6539–6557 (2017).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Millet, J. et al. Toward a realistic model of speech processing in the brain with self-supervised learning. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, & A. Oh (Eds.) Advances in Neural Information Processing Systems (Vol. 35) (pp. 33428–33443). Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2022/file/d81ecfc8fb18e833a3fa0a35d92532b8-Paper-Conference.pdf (2022).

  • Vaidya, A. R., Jain, S. & Huth, A. G. Self-supervised models of audio effectively explain human cortical responses to speech. Proc. 39th Int. Conf. Mach. Learn. 162, 21927–21944 (2022).


    Google Scholar
     

  • Goldstein, A. et al. Deep speech-to-text models capture the neural basis of spontaneous speech in everyday conversations. bioRxiv https://www.biorxiv.org/content/10.1101/2023.06.26.546557v1 (2023).

  • Li, Y. et al. Dissecting neural computations in the human auditory pathway using deep neural networks for speech. Nat.) Neurosci. 26, 2213–2225 (2023).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Saur, D. et al. Ventral and dorsal pathways for language. Proc. Natl Acad. Sci. USA 105, 18035–18040 (2008).

    Article 
    ADS 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Catani, M., Jones, D. K. & Ffytche, D. H. Perisylvian language networks of the human brain. Ann. Neurol. 57, 8–16 (2005).

    Article 
    PubMed 

    Google Scholar
     

  • Dick, A. S. & Tremblay, P. Beyond the arcuate fasciculus: consensus and controversy in the connectional anatomy of language. Brain 135, 3529–3550 (2012).

    Article 
    PubMed 

    Google Scholar
     

  • McClelland, J. L. et al. Letting structure emerge: connectionist and dynamical systems approaches to cognition. Trends Cogn. Sci. 14, 348–356 (2010).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Nasr, K., Viswanathan, P., & Nieder, A. Number detectors spontaneously emerge in a deep neural network designed for visual object recognition. Sci. Adv. 5, eaav7903 (2019).

  • Yang, G. R., Joglekar, M. R., Song, H. F., Newsome, W. T. & Wang, X.-J. Task representations in neural networks trained to perform many cognitive tasks. Nat. Neurosci. 22, 297–306 (2019).

  • Dobs, K., Martinez, J., Kell, A. J. E. & Kanwisher, N. Brain-like functional specialization emerges spontaneously in deep neural networks. Sci. Adv. 8, eabl8913 (2022).

  • Nastase, S. A. et al. The “Narratives” fMRI dataset for evaluating models of naturalistic language comprehension. Sci. Data 8, 250 (2021).

  • Gorgolewski, K. J. et al. The brain imaging data structure, a format for organizing and describing outputs of neuroimaging experiments. Sci. Data 3, 160044 (2016).

  • Esteban, O. et al. fMRIPrep: a robust preprocessing pipeline for functional MRI. Nat. Methods 16, 111–116 (2019).

  • Cox, R. W. AFNI: software for analysis and visualization of functional magnetic resonance neuroimages. Comput. Biomed. Res. Int. J. 29, 162–173 (1996).

  • Behzadi, Y., Restom, K., Liau, J. & Liu, T. T. A component based noise correction method (CompCor) for BOLD and perfusion based fMRI. NeuroImage 37, 90–101 (2007).

  • Baldassano, C. et al. Discovering event structure in continuous narrative perception and memory. Neuron 95, 709–721 (2017).

  • Abraham, A. et al. Machine learning for neuroimaging with scikit-learn. Front. Neuroinform. 8, 14 (2014).

  • Hunter, J. D. Matplotlib: a 2D graphics environment. Comput. Sci. Eng. 9, 90–95 (2007).

  • Waskom, M. L. seaborn: statistical data visualization. J. Open Source Softw. 6, 3021 (2021).

  • Honnibal, M., Montani, I., Van Landeghem, S., & Boyd, A. SpaCy: industrial-strength natural language processing in python. Zenodo. https://doi.org/10.5281/zenodo.1212303 (2020).

  • LeCun, Y., Bottou, L., Bengio, Y. & Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 (1998).

  • Brown, T. et al. Language models are few-shot learners. In Advances in Neural Information Processing Systems 33 (eds Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M. F. & Lin, H.) 1877–1901 (Curran Associates, Inc., 2020).

  • Carden, G. Backwards anaphora in discourse context. J. Linguist. 18, 361–387 (1982).

  • Meng, K., Bau, D., Andonian, A., & Belinkov, Y. Locating and editing factual associations in GPT. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, & A. Oh (Eds.) Advances in neural information processing systems (Vol. 35) (pp. 17359–17372). Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2022/file/6f1d43d5a82a37e89b0665b33bf3a182-Paper-Conference.pdf (2022).

  • Wolf, T. et al. Transformers: state-of-the-art natural language processing. In Proceedings of the 2020 conference on empirical methods in natural language processing: system demonstrations, 38–45 (2020).

  • Vig, J., & Belinkov, Y. Analyzing the structure of attention in a transformer language model. In Proceedings of the 2019 ACL Workshop BlackboxNLP: analyzing and interpreting neural networks for NLP, 63–76 (2019).

  • Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830, http://www.jmlr.org/papers/v12/pedregosa11a.html (2011).

  • Brodersen, K. H., Ong, C. S., Stephan, K. E. & Buhmann, J. M. The balanced accuracy and its posterior distribution. 20th Int. Conf. Pattern Recognit. 2010, 3121–3124 (2010).


  • Nunez-Elizalde, A. O., Huth, A. G. & Gallant, J. L. Voxelwise encoding models with non-spherical multivariate normal priors. NeuroImage 197, 482–492 (2019).

  • Lee Masson, H. & Isik, L. Functional selectivity for social interaction perception in the human superior temporal sulcus during natural viewing. NeuroImage 245, 118741 (2021).

  • Aly, M., Chen, J., Turk-Browne, N. B. & Hasson, U. Learning naturalistic temporal structure in the posterior medial network. J. Cogn. Neurosci. 30, 1345–1365 (2018).

  • Nili, H. et al. A toolbox for representational similarity analysis. PLoS Comput. Biol. 10, e1003553 (2014).

  • LeBel, A. et al. A natural language fMRI dataset for voxelwise encoding models. Sci. Data 10, 555 (2023).

  • Van Uden, C. E. et al. Modeling semantic encoding in a common neural representational space. Front. Neurosci. 12, 437 (2018).

  • Hall, P. & Wilson, S. R. Two guidelines for bootstrap hypothesis testing. Biometrics 47, 757–762 (1991).

  • Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc.: Ser. B Stat. Methodol. 57, 289–300 (1995).

  • Huth, A. G., Nishimoto, S., Vu, A. T. & Gallant, J. L. A continuous semantic space describes the representation of thousands of object and action categories across the human brain. Neuron 76, 1210–1224 (2012).

  Source link: https://www.nature.com/articles/s41467-024-49173-5
