Vowel Sound Synthesis from Electroencephalography during Listening and Recalling
-
- Wataru Akashi
- Institute of Innovative Research, Tokyo Institute of Technology, 4259 Nagatsuta-cho, Midori-ku, Yokohama 226-8503, Japan
-
- Hiroyuki Kambara
- Institute of Innovative Research, Tokyo Institute of Technology, 4259 Nagatsuta-cho, Midori-ku, Yokohama 226-8503, Japan
-
- Yousuke Ogata
- Institute of Innovative Research, Tokyo Institute of Technology, 4259 Nagatsuta-cho, Midori-ku, Yokohama 226-8503, Japan
-
- Yasuharu Koike
- Institute of Innovative Research, Tokyo Institute of Technology, 4259 Nagatsuta-cho, Midori-ku, Yokohama 226-8503, Japan
-
- Ludovico Minati
- Institute of Innovative Research, Tokyo Institute of Technology, 4259 Nagatsuta-cho, Midori-ku, Yokohama 226-8503, Japan
-
- Natsue Yoshimura
- Institute of Innovative Research, Tokyo Institute of Technology, 4259 Nagatsuta-cho, Midori-ku, Yokohama 226-8503, Japan
Description
Recent advances in brain imaging technology have furthered our knowledge of the neural basis of auditory and speech processing, often via contributions from invasive brain signal recording and stimulation studies conducted intraoperatively. Herein, an approach for synthesizing vowel sounds straightforwardly from scalp-recorded electroencephalography (EEG), a noninvasive neurophysiological recording method, is demonstrated. Given cortical current signals derived from EEG acquired while human participants listen to and recall (i.e., imagine) two vowels, /a/ and /i/, sound parameters are estimated by a convolutional neural network (CNN). The speech synthesized from the estimated parameters is sufficiently natural to achieve recognition rates >85% during a subsequent sound discrimination task. Notably, the CNN identifies the involvement of the brain areas mediating the "what" auditory stream, namely the superior temporal, middle temporal, and Heschl's gyri, demonstrating the efficacy of the computational method in extracting auditory-related information from neuroelectrical activity. Differences in cortical sound representation between listening and recalling are further revealed: the fusiform, calcarine, and anterior cingulate gyri contribute during listening, whereas the inferior occipital gyrus is engaged during recollection. The proposed approach can expand the scope of EEG in decoding auditory perception that requires high spatial and temporal resolution.
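The abstract describes a pipeline in which a CNN maps cortical current signals (estimated from scalp EEG) to sound parameters that are then used to synthesize vowels. Below is a minimal, hypothetical sketch of such a regression CNN in PyTorch; the number of cortical sources, time points, and sound parameters, as well as all layer sizes, are illustrative assumptions and do not reflect the authors' actual architecture.

```python
# Hypothetical sketch: cortical current time series -> CNN -> sound parameters.
# All dimensions and layer choices are assumptions for illustration only.
import torch
import torch.nn as nn

class VowelParameterCNN(nn.Module):
    def __init__(self, n_sources=100, n_timepoints=500, n_sound_params=13):
        super().__init__()
        # 1-D convolutions over time, treating cortical current sources as channels
        self.features = nn.Sequential(
            nn.Conv1d(n_sources, 64, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(64, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        # Regression head mapping pooled features to vowel sound parameters
        self.head = nn.Linear(32, n_sound_params)

    def forward(self, x):
        # x: (batch, n_sources, n_timepoints) estimated cortical current signals
        h = self.features(x).squeeze(-1)   # (batch, 32)
        return self.head(h)                # (batch, n_sound_params)

# Usage with random data standing in for estimated cortical currents
model = VowelParameterCNN()
dummy_sources = torch.randn(8, 100, 500)
sound_params = model(dummy_sources)
print(sound_params.shape)  # torch.Size([8, 13])
```

The estimated parameters would then be passed to a speech synthesizer to produce the /a/ and /i/ sounds evaluated in the discrimination task; that synthesis step is outside the scope of this sketch.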
Journal
-
- Advanced Intelligent Systems
-
Advanced Intelligent Systems 3 (2), 2021-01-07
Wiley
Keywords
- Neuroinformatics
- brain activity signals
- cortical current source estimations
- deep-learning
- speech syntheses
- electroencephalography
- EBRAINS
- Computer engineering. Computer hardware (TK7885-7895)
- Control engineering systems. Automatic machinery (General) (TJ212-225)
- General Earth and Planetary Sciences
- General Environmental Science
Details
-
- CRID
- 1360290617527876736
-
- ISSN
- 2640-4567
-
- Article Type
- journal article
-
- Data Source
-
- Crossref
- KAKEN
- OpenAIRE