OCR - Text Recognition & Detection built-in OBS

OCR - Text Recognition & Detection built-in OBS v0.0.8

royshilkrot

Member
royshilkrot updated OCR - Text Recognition & Detection built-in OBS with a new update entry:

v0.0.6 - Binarization! Preview! Scaling, dilation... the good stuff!

This update adds several binarization methods: threshold, Otsu/Triangle, and adaptive,
as well as a way to preview the binarized image before it goes to Tesseract for processing.
It also adds "Rescaling" to an input size that is optimal for Tesseract.
And finally, a Dilate operation, which can help "fuse together" broken characters.

Enjoy!
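
For readers wondering what the Otsu and Dilate options do conceptually, here is a minimal pure-Python sketch. It is not the plugin's code (the plugin's internals are not shown in this thread); the function names `otsu_threshold`, `text_mask`, and `dilate` are made up for illustration, and the tiny image assumes dark text on a light background.

```python
def otsu_threshold(pixels):
    """Pick the 8-bit threshold that maximizes between-class variance (Otsu)."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * n for i, n in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels at or below the candidate threshold
    sum_bg = 0  # sum of their intensities
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        w_fg = total - w_bg
        if w_bg == 0:
            continue
        if w_fg == 0:
            break
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def text_mask(gray, threshold):
    """Mark dark pixels (assumed to be text) as 1 and everything else as 0."""
    return [[1 if p <= threshold else 0 for p in row] for row in gray]


def dilate(mask):
    """Grow every text pixel into its 3x3 neighbourhood."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        out[ny][nx] = 1
    return out


# Dark text (20) on a light background (230), with a one-pixel break in the stroke.
gray = [
    [230, 230, 230, 230, 230],
    [230, 20, 230, 20, 230],
    [230, 230, 230, 230, 230],
]
t = otsu_threshold([p for row in gray for p in row])
mask = text_mask(gray, t)
fused = dilate(mask)
```

In this toy image the two dark pixels are separated by a gap; after dilation the gap is filled, which is the "fuse together" effect described above. The same growth also thickens strokes, so too much dilation can merge neighbouring characters.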

 

AntDX316

New Member
Would it be possible to expose an OCR area through an API, have the result saved automatically via a hotkey, or at least copied to the clipboard for a quick Ctrl+V?
 

monkeejuice

New Member
Hey Roy,

I just got to run this plugin through some paces over the last week. I really like it. Let me give you a quick rundown:

I am pulling 5 different texts with the OCR. Each one has a mask to close in on it. I am pulling from an extended monitor on which I have the source URL displayed. (I tried pulling straight from the URL, but it appears that I have to do this for each text I pull from the same URL. This stacked up my CPU usage pretty quickly: approximately 7% per OCR.)

So rather, it seemed to run better from one browser tab on an extended display.

What's peculiar is that while the OCR data is presented to the PGM output, it runs at about 10% CPU usage. However, when I change the PGM to cut away from the OCR scene, the CPU goes up to 25%.

Since I am using this to populate an overlay for real-time data, I will put this scene in a DSK to cut it from the PGM output. When I do this, the CPU load shoots up.
Is there a better way to do this, or is this something that you think should be addressed?
 

royshilkrot

Member
@monkeejuice this sounds strange, but it's hard to visualize and fully understand the problem from the description. Do you want to send some info like screenshots etc. on our Discord channel? https://discord.gg/W7fEgzdSY4
 

FonixVR

New Member
First of all, this is dope if you would like to get Spotify lyrics on stream. Could you add an option where, if no text is detected, it just clears out the line and makes the text source blank?
 

royshilkrot

Member
That's an awesome use case.

I believe it should clear out.
However, there's an option to ignore empty values, which you can disable.
Then empty values would still be pushed through.
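
To make the "ignore empty values" behaviour concrete, here is a minimal sketch of the logic as described in this reply. It is an assumption about how such an option behaves, not the plugin's actual implementation; `next_text` and its parameters are hypothetical names.

```python
def next_text(current, detected, ignore_empty=True):
    """Decide what the text source should show after one OCR pass.

    current      -- text currently shown in the text source
    detected     -- raw OCR result for this frame (may be empty or whitespace)
    ignore_empty -- hypothetical flag mirroring the "ignore empty values" option
    """
    cleaned = detected.strip()
    if cleaned == "" and ignore_empty:
        # Empty result is dropped, so the previous text stays on screen.
        return current
    # Otherwise the result is pushed through, including an empty string,
    # which blanks the text source.
    return cleaned


shown = "last lyric line"
kept = next_text(shown, "   ", ignore_empty=True)
cleared = next_text(shown, "   ", ignore_empty=False)
```

With the flag enabled the old line is kept; with it disabled an empty detection blanks the source, which is the behaviour asked for above.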
 

FonixVR

New Member
(Discord link to video) This is how I plan to use it. However, when it goes to "..." while reading from the app I have running, it just freezes and keeps the words from last time on there.
 

FonixVR

New Member
Could you maybe add an option where it turns off the text source if no value is detected (or readable)? I would love to be able to have the text fade in and out if it's an instrumental or, like, between songs.
 

royshilkrot

Member
I created an issue on GitHub for this.
 

JumpinBeans

New Member
Hey Roy,

Love the work.
I'm looking to set up a dynamic stream overlay for FPS games and hoping you can provide your recommendations/constructive criticism for my use of OCR.
Firstly, as there is currently no websocket, I was going to use streamerbot to detect a file change as the trigger.
I was looking to use OCR to detect my in-game name in the kill feed (detection on the kill and death sides), then give an output that streamerbot can use to tally them up and display on screen through OBS. I am also looking to do the same with the win/loss scene at the end of a round, detecting whether I won or lost and giving an output that can be tracked.
I don't think I will have much difficulty implementing those, though I was also looking to identify the rank symbol at the end of the game. I would like to record my current rank so I can display it on my overlay. I am looking to display it via streamerbot toggling the visibility of a corresponding source in OBS. My issue is that I'm not sure how I could get OCR to detect and identify the correct rank and give an output I could use in streamerbot.
I was also considering trying to detect as much as I can through one OCR scene, so I was going to create a nested scene with a PNG to mask out unnecessary areas of the game screen.

Any assistance would be greatly appreciated.
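
The file-change and tally idea described above could be sketched roughly as follows. This is only an illustration under several assumptions: that the OCR output for the kill side and the death side each lands in its own text file, that the player name appears verbatim in the recognized text, and that a kill feed entry should be counted once per change of text. `read_if_changed` and `KillFeedTally` are hypothetical names, and nothing here is streamerbot's or the plugin's API.

```python
from dataclasses import dataclass
from pathlib import Path


def read_if_changed(path, last_mtime_ns):
    """Return (text, mtime_ns) if the file changed since last_mtime_ns, else (None, last_mtime_ns)."""
    mtime_ns = Path(path).stat().st_mtime_ns
    if mtime_ns == last_mtime_ns:
        return None, last_mtime_ns
    return Path(path).read_text(encoding="utf-8").strip(), mtime_ns


@dataclass
class KillFeedTally:
    player: str
    kills: int = 0
    deaths: int = 0
    last_kill_text: str = ""
    last_death_text: str = ""

    def _mentions_player(self, text):
        # Case-insensitive match, since OCR may get letter case wrong.
        return self.player.lower() in text.lower()

    def update(self, kill_text, death_text):
        """Count a kill or death only when the OCR text changed and names the player."""
        if kill_text != self.last_kill_text:
            if self._mentions_player(kill_text):
                self.kills += 1
            self.last_kill_text = kill_text
        if death_text != self.last_death_text:
            if self._mentions_player(death_text):
                self.deaths += 1
            self.last_death_text = death_text


tally = KillFeedTally(player="JumpinBeans")
tally.update("JumpinBeans", "SomeoneElse")  # new kill entry naming the player
tally.update("JumpinBeans", "SomeoneElse")  # same frame again, not re-counted
tally.update("Other", "jumpinbeans")        # new death entry naming the player
```

One limitation of this sketch: two back-to-back entries that produce identical OCR text are counted once unless the text changes or clears in between.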
 

JumpinBeans

New Member
Once I can pull the critical information from the game reliably, I can make the stream react to the game. Camera pulse and flash red on deaths. Counters for wins/losses and kills/deaths. Automatic scene switching when entering and exiting a match. Etc.
 

Imagineer

New Member
Very nice tool, works quite well on text recognition, but I'm struggling with my particular use case: dice.

I'm trying to use the OCR function for my DnD dice. The update from 0.0.5 to 0.0.7 has already helped a lot with recognition, but it still struggles.

My setup:
- DnD dice with numbers written on them (no dots, just the 1,2,3...6 d6 for testing, but eventually needing up to d20)
- clear feed of a Full HD stream of my dice tray
- binary cleaned (this update helped a lot!!) to only show a full black background and a white square with the number in black
- one character (other modes won't recognize anything at all)
- valid characters 012346789
- English language
(- dakboard and scoreboard didn't work at all for this)

This route seemed to work best, but only when the die is exactly horizontal to the feed and only when the setting is 'one character'. Using 'one word' and word length 2 doesn't help.
As soon as the die is turned even slightly, recognition fails.
Using 2 dice in the same frame also won't work. Using a die with numbers 10..20 won't work either. (These are probably due to the setting 'one character'.)

Any suggestions on what to change in my settings?
 