The original paper appears to be Open Access, and is an interesting read:
(Nature) Scientific Reports: Identifying musical pieces from fMRI data using encoding and decoding models
The original voxel size is 3.75 × 3.75 × 4.75 mm, i.e. roughly 66.8 mm³ per voxel. According to the Wikipedia article on fMRI, "A voxel typically contains a few million neurons and tens of billions of synapses, with the actual number depending on voxel size and the area of the brain being imaged." So you are correlating music with the aggregate behaviour of a few million neurons, smoothed over a number of seconds, for each data point.
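For the record, that figure is just the product of the in-plane resolution and slice thickness (a quick sanity check, nothing more):

```python
# Voxel volume from in-plane resolution and slice thickness (mm).
dx, dy, dz = 3.75, 3.75, 4.75
volume = dx * dy * dz  # mm³
print(f"{volume:.1f} mm³ per voxel")  # → 66.8 mm³ per voxel
```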
I'll stress I am not an expert in this field, but I am slightly worried that they have not made reference to the 'multiple comparisons' problem highlighted in the 'Dead Salmon' paper, and have not used the key phrase "(Bonferroni) correction of Pearson's correlation" or similar. There is a background on this here:
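To make the worry concrete, here is a toy sketch (my own simulated data, not anything from the paper): correlate one random "stimulus" time course against many voxels of pure noise, and some voxels will look "significant" at p &lt; 0.05 by chance alone. Bonferroni correction divides the threshold by the number of tests:

```python
import numpy as np
from scipy.stats import pearsonr

# Toy multiple-comparisons demo: pure-noise "voxels", no real signal.
rng = np.random.default_rng(0)
n_timepoints, n_voxels = 30, 200
stimulus = rng.standard_normal(n_timepoints)
voxels = rng.standard_normal((n_voxels, n_timepoints))

# p-value of Pearson's r for each voxel against the stimulus.
p_values = np.array([pearsonr(stimulus, v)[1] for v in voxels])

alpha = 0.05
raw_hits = int((p_values < alpha).sum())               # uncorrected
bonf_hits = int((p_values < alpha / n_voxels).sum())   # Bonferroni

print(f"uncorrected 'significant' voxels: {raw_hits} of {n_voxels}")
print(f"Bonferroni-corrected:             {bonf_hits} of {n_voxels}")
```

With pure noise you expect around alpha × n_voxels uncorrected "hits" (here ~10 of 200), which is exactly the salmon effect; the corrected threshold should cull essentially all of them.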
The thing about using Pearson's correlation is that it assumes a linear relation between the things being tested. This does not always hold, as Anscombe's Quartet amply demonstrates. In other words, you need to justify why testing for a linear relationship is the correct choice.
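The quartet is easy to reproduce: four small datasets (values from Anscombe's 1973 paper) with wildly different shapes, yet Pearson's r comes out essentially identical for all four:

```python
import numpy as np

# Anscombe's Quartet: same Pearson r, very different structure.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = [
    (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
     [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
]

correlations = [np.corrcoef(x, y)[0, 1] for x, y in quartet]
for i, r in enumerate(correlations, 1):
    print(f"dataset {i}: r = {r:.3f}")  # all four ≈ 0.816
```

Dataset 2 is a clean parabola and dataset 4 is a single outlier dragging a vertical line, yet both score r ≈ 0.816; the number alone tells you nothing about whether a linear model is sensible.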
Now this is all the kind of stuff you would expect a referee to check before a paper is published. Sadly it often does not happen, and referees themselves can be blissfully ignorant of the correct statistical methods to use.
Essentially, what fMRI studies are saying is: we can see changes in blood flow in real time to particular regions of the brain. When we present a stimulus to people in fMRI machines, we see the blood flow changes. We think that there is a correlation between the blood flow, brain region, and stimulus, which we demonstrate with a lot of processing of noisy information.
What we don't see a lot of is how people have rigorously excluded false positives, nor do we see many papers reporting cases where no correlation was found. It would also help if experiments were replicated by independent research groups with similar results.
All that said, I will underline that I am not an expert, and the paper may well be founded on excruciatingly correct statistical methods, and I am insufficiently knowledgeable to tell. It is like cryptography - I am not a cryptographer, and at some point I have to trust that people who are cryptographers (and programmers) are doing their job well. And, just like in cryptography where someone in their ignorance might design a weak cipher, experimenters can be ignorant of crucial things too. It makes things hard to decide.
Having wrenched things sort-of back on topic I will now close this interesting side discussion.