Is there a TrueType font file for a 'raster font'?

smwikipedia

I am using Tesseract to do OCR on some screenshots. The characters in the screenshots are rendered in raster fonts, but Tesseract requires a TrueType font file for training.

I can find many TrueType font files in the Windows/Fonts folder. Is there one for raster fonts?

Mike 'Pomax' Kamermans

"raster fonts" aren't a real thing though: OpenType (of which truetype is one of the two internal encodings) are true fonts, conforming to a highly detailed, authoritative specification, but raster fonts are pretty much "there is no single spec, you can invent whatever you want, as long as your program knows how to unpack the thing you made". There's a whole bunch of different ways to define a raster/bitmap font, and all of them are basically of the form bitmap image + header that says which letter maps to which x/y/w/h rectangle in the image.

OCR won't want to work with them because bitmap fonts cannot be scaled. The simplest reason is "there is no official bitmap font spec", but even if there were one, if you're trying to match a bitmap font to a scan, then the page being even 1 pixel off in width or height with respect to what your bitmap font needs can lead to no text being matchable at all. Bitmap fonts are encoded at fixed font sizes (usually only one, sometimes more than one, but still rigidly fixed), so if the scanned document isn't exactly the right size, none of the pixels will perfectly overlap. That leads to ridiculous things like an O or a V in the scan matching either the V or the O glyph with the same reliability, because a tiny vertical pixel shift can make V and O overlap with the same number of error pixels.
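A toy illustration of that fragility, assuming naive pixel-by-pixel comparison (real OCR engines are more sophisticated than this). The 5x5 glyphs and the helper names are invented for the example.

```python
def mismatches(a, b):
    """Count pixels that differ between two same-sized bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def shift_down(bitmap):
    """Shift a bitmap one pixel down, padding the top with background."""
    width = len(bitmap[0])
    return ["." * width] + bitmap[:-1]

# Hypothetical 5x5 glyph templates: '#' is ink, '.' is background.
O = [
    ".###.",
    "#...#",
    "#...#",
    "#...#",
    ".###.",
]
V = [
    "#...#",
    "#...#",
    "#...#",
    ".#.#.",
    "..#..",
]

scanned_o = shift_down(O)  # the same O, but one pixel lower in the scan

print(mismatches(O, O))          # 0: perfect alignment
print(mismatches(scanned_o, O))  # 13: over half of the 25 pixels disagree
print(mismatches(scanned_o, V))  # 14: the V template scores almost the same
```

With a one-pixel shift, the O template barely beats the V template for what is actually an O.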

OpenType fonts, on the other hand, use vector outlines, and can be scaled to best-match with a variety of extremely successful algorithms. Unless the document you scanned in is drastically too small, vector transforms will yield 90-100% matching without any problems.
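As a rough sketch of why outlines scale: assume an outline is just a list of points on a design grid. The 1000 units-per-em value, the rectangle "stem" and the function name are assumptions for illustration; real fonts store curves and declare their own grid size.

```python
UNITS_PER_EM = 1000  # assumed design grid size; real fonts declare their own

def scale_outline(points, pixel_size, units_per_em=UNITS_PER_EM):
    """Map outline points from font units to pixels for the requested size."""
    return [(x * pixel_size / units_per_em, y * pixel_size / units_per_em)
            for x, y in points]

# A hypothetical straight-edged outline (a plain rectangle) in font units.
stem = [(100, 0), (200, 0), (200, 700), (100, 700)]

small = scale_outline(stem, 12)  # the same shape at 12 px per em
large = scale_outline(stem, 48)  # and at 48 px per em
print(small)
print(large)
```

The one outline yields coordinates for any size, which is what lets a matcher fit the font to the scan instead of requiring the scan to fit the font.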

What you want to do instead is hit up something like MyFonts.com's WhatTheFont, drop in a crop of your scanned document with a sentence, maybe two, have it tell you which font is the closest match, and then simply use that font for your OCR training. Super effective!
