F. Huettig and R. Hartsuiker have a paper in Language and Cognitive Processes, “Listening to yourself is like listening to others: External, but not internal, verbal self-monitoring is based on speech perception”. The abstract is below.
Theories of verbal self-monitoring generally assume an internal (pre-articulatory) monitoring channel, but there is debate about whether this channel relies on speech perception or on production-internal mechanisms. Perception-based theories predict that listening to one’s own inner speech has similar behavioural consequences as listening to someone else’s speech. Our experiment therefore registered eye-movements while speakers named objects accompanied by phonologically related or unrelated written words. The data showed that listening to one’s own speech drives eye-movements to phonologically related words, just as listening to someone else’s speech does in perception experiments. The time-course of these eye-movements was very similar to that in other-perception (starting 300 ms post-articulation), which demonstrates that these eye-movements were driven by the perception of overt speech, not inner speech. We conclude that external, but not internal monitoring, is based on speech perception.
This appears quite complex. The paper differentiates between our consciousness of our speech when it is spoken aloud and when it is only produced internally. The implication is that we produce and monitor our speech, but are not consciously aware of spoken speech until we hear it. We become conscious of our internal, unspoken speech in a different way. This makes consciousness simpler but language more complicated. Consciousness is again a question of perception. But as BPS Research Digest puts it:
It’s important to clarify: we definitely do monitor our speech internally. For example, speakers can detect their speech errors even when their vocal utterances are masked by noise. What this new research suggests is that this internal monitoring isn’t done perceptually - we don’t ‘hear’ a pre-release copy of our own utterances. What’s the alternative? Huettig and Hartsuiker said error-checking is somehow built into the speech production system, but they admit: ‘there are presently no elaborated theories of [this] alternative viewpoint.’