I record myself without monitoring, that is, without hearing myself while I speak. One reason is that if the monitoring has even a slight delay, my brain gets extremely confused and I begin to stutter. The other is that hearing your own voice in real time isn't the same as hearing it as if it came from a different person. And that is how viewers perceive your voice: as a different person's.
So I made test and training recordings of myself and listened to them afterwards, to learn how I sound when I speak one way or another. I took some text and read it aloud, which also taught me how to read in a way that sounds alive rather than dull. Or I just commented on random, meaningless stuff ad hoc. I tried different speaking patterns, and I tried how I sound when I speak louder or more quietly. I learned to keep a constant volume while speaking. In the end, I knew how I would sound speaking one way or another, so monitoring was no longer necessary.
The only thing you still need to do is the general volume adjustment of your mic or mic source. Of course, the mic volume has to match the volume of all the other sound sources you might include. You do this before starting any stream or recording, and you usually never need to adjust it again as long as you use the same microphone.
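
If you'd rather check that match by numbers than by ear, here is a rough sketch of how the level of a short test recording could be compared against another source. It is not how I do it, just an illustration: the function names `rms_dbfs` and `level_difference_db` are made up for this example, and it assumes you already have the audio as a list of float samples between -1.0 and 1.0 (how you get them out of your recording software is not shown).

```python
import math

def rms_dbfs(samples):
    """RMS level in dB relative to full scale.

    Assumes samples are floats in the range [-1.0, 1.0].
    Returns -inf for empty or completely silent input.
    """
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")
    # 10*log10 of the mean square equals 20*log10 of the RMS value.
    return 10.0 * math.log10(mean_square)

def level_difference_db(voice, other):
    """How many dB louder (positive) or quieter (negative) the voice is."""
    return rms_dbfs(voice) - rms_dbfs(other)

def sine(amplitude, freq_hz=440.0, rate=48000, seconds=1.0):
    """Hypothetical stand-in for real audio: a plain test tone."""
    n = int(rate * seconds)
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * i / rate)
            for i in range(n)]

if __name__ == "__main__":
    voice = sine(0.5)    # pretend this is the mic test recording
    game = sine(0.25)    # pretend this is another sound source
    print(f"voice: {rms_dbfs(voice):.1f} dBFS")
    print(f"game:  {rms_dbfs(game):.1f} dBFS")
    print(f"voice is {level_difference_db(voice, game):+.1f} dB relative to game")
```

A single RMS number over a whole clip is only a crude guide, since pauses in speech pull it down, so the final judgment is still listening to the test recording afterwards.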