I came across a strange issue. I wrote one program in C# and one in JavaScript to extract the numbers from an image, but only the JavaScript version, which uses the tesseract.js library, successfully reads the text. The image given to both programs is identical, and both use the same model: I took it from the Tesseract.JS GitHub repository to make sure of that. The model can be found here
I suspected that tesseract.js might be altering the image in some way, so I looked through its source code, but I didn't find anything.
JS library: Tesseract.JS
C# library: Tesseract.Net.SDK
Image I used:
Here is the image I gave each program
C# code:
using System.Drawing;
using Patagames.Ocr;

using var objOcr = OcrApi.Create();
// Restrict recognition to digits only
objOcr.SetVariable("tessedit_char_whitelist", "0123456789");
objOcr.Init(Patagames.Ocr.Enums.Languages.English);
Bitmap image = new Bitmap("image.png");
var text = objOcr.GetTextFromImage(image);
JS code:
import Tesseract from 'tesseract.js';
Tesseract.recognize(
  'image.png',
  'eng',
  { logger: m => console.log(m) }
).then(({ data: { text } }) => {
  console.log(text);
});
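One way to back up the claim that both programs receive an identical image is to hash the file each one reads and compare the digests. The following is only a minimal sketch, assuming Node.js; `sha256Hex` is a hypothetical helper name, not part of tesseract.js. It checks the input bytes only, so it cannot show whether either library preprocesses the image internally.

```javascript
// Sketch: compute a SHA-256 digest of some bytes so two files can be compared.
// Uses only Node's built-in crypto module.
const crypto = require('crypto');

// Hypothetical helper: returns the hex-encoded SHA-256 digest of a Buffer.
function sha256Hex(buffer) {
  return crypto.createHash('sha256').update(buffer).digest('hex');
}

// Example usage (assumes image.png exists next to the script):
//   const fs = require('fs');
//   console.log(sha256Hex(fs.readFileSync('image.png')));
```

If the digest printed here matches the one computed for the file the C# program opens, the input can be ruled out as the source of the difference.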