Google recently added support for rich results for podcasts, but that doesn't mean Google can understand what you are saying in those audio files. If you record a 30-minute podcast, Googlebot is not sitting through the whole 30 minutes, listening to what you say and then parsing out the words for indexing and ranking.
Rich results are driven by markup: you mark up your podcasts so Google can enrich the search results for pages that have podcasts embedded on them. That does not mean Googlebot can understand the actual audio content in the file.
Gary Illyes clarified this on Twitter in one word, saying "No" when asked if Googlebot understands the words in audio files.
@kirwanseo No
— Gary Illyes ᕕ( ᐛ )ᕗ (@methode) April 24, 2017
We do see YouTube do an okay job with automatic transcription, but it is often wrong, especially with how fast I talk through my weekly SEO videos. I did notice, though, that the transcripts get better and better over time.
Of course, it would be awesome if Google were able to accurately parse, understand, index and rank audio files. It could then potentially offer a jump-to feature that takes you to the part of the podcast relevant to your query. We saw Google test this with videos recently.
That would be cool, no?
Forum discussion at Twitter.