void tesseract::TessBaseAPI::AdaptToCharacter(const char *unichar_repr, int length, float baseline, float xheight, float descender, float ascender) [protected, inherited]
Adapt to recognize the current image as the given character. The image must be preloaded and must contain exactly one character.
bool tesseract::TessBaseAPI::CreateCubeObjects() [protected, inherited]
Create the necessary Cube objects.
CubeLineObject** tesseract::TessBaseAPI::CreateLineObjects(Pixa *pixa_lines) [protected, inherited]
Create a Cube line object for each line.
TBOX* tesseract::TessBaseAPI::CreatePhraseBoxes(Boxa *boxa_lines, CubeLineObject **line_objs, int *phrase_cnt) [protected, inherited]
Create a TBOX array corresponding to the phrases in the array of line objects.
int tesseract::TessBaseAPI::Cube() [protected, inherited]
Call the Cube OCR engine. Takes the region, line, and word segmentation information from Tesseract as input, and modifies or populates the output PAGE_RES object, which contains the recognition results. The behavior of this function depends on the current language and on the value of tessedit_accuracyvspeed:

For English (and other Latin-based scripts): if the accuracyvspeed flag is set to any value other than AVS_FASTEST, Cube uses the word information passed by Tesseract. Cube runs on a subset of the words segmented and recognized by Tesseract. The value of accuracyvspeed and Tesseract's confidence in a word determine whether Cube runs on that word and whether Cube's results override Tesseract's.

For Arabic and Hindi: Cube uses the region information passed by Tesseract and then performs its own line segmentation. This will change once Tesseract's line segmentation works for Arabic. Cube then segments each line into phrases, and each phrase is recognized in phrase mode, which allows spaces in the results. Note that, at this point, the line segmentation algorithm may have problems with poorly spaced Arabic documents.
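The word-selection rule described for Latin-based scripts can be illustrated with a small sketch. This is not the Tesseract implementation: only the name AVS_FASTEST comes from the text above. The other enum values, the WordResult struct, the ShouldRunCube function, and both thresholds are hypothetical. The sketch also assumes that Cube is skipped entirely under AVS_FASTEST, which the text implies but does not state.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the accuracy-vs-speed setting. Only the name
// AVS_FASTEST appears in the documentation; the other values are invented.
enum AccuracyVSpeed {
  AVS_FASTEST = 0,
  AVS_BALANCED_HYPOTHETICAL = 1,
  AVS_MOST_ACCURATE_HYPOTHETICAL = 2
};

// Hypothetical per-word record holding Tesseract's confidence in [0, 1].
struct WordResult {
  float tess_confidence;
};

// Returns true if this sketch would hand the word to Cube.
// Assumption: AVS_FASTEST skips Cube; otherwise Cube runs only on words
// whose Tesseract confidence falls below a mode-dependent threshold.
bool ShouldRunCube(AccuracyVSpeed mode, const WordResult& word) {
  if (mode == AVS_FASTEST) return false;
  // Illustrative thresholds, not values from the Tesseract source.
  const float threshold =
      (mode == AVS_MOST_ACCURATE_HYPOTHETICAL) ? 0.95f : 0.70f;
  return word.tess_confidence < threshold;
}

// Counts how many words of a page this sketch would re-run through Cube.
int CountCubeWords(AccuracyVSpeed mode, const std::vector<WordResult>& words) {
  int count = 0;
  for (const WordResult& word : words) {
    if (ShouldRunCube(mode, word)) ++count;
  }
  return count;
}
```

In this sketch a stricter mode raises the threshold, so more words are re-examined by Cube at the cost of speed. Whether Cube's result then replaces Tesseract's would be a second, similar decision that is left out here.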
int tesseract::TessBaseAPI::CubePostProcessWords() [protected, inherited]
Run Cube on a subset of the words already present in the page_res_ object. The subset, and whether Cube overrides the results, is determined by the SpeedVsAccuracy flag.
void tesseract::TessBaseAPI::DeleteBlockList(BLOCK_LIST *block_list) [static, protected, inherited]
Delete a block list. This keeps the BLOCK_LIST pointer opaque and avoids including the other headers.
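The opaque-pointer idea behind a static delete function can be sketched in a few lines. The names below (BlockList, Api, MakeBlockList, MakeOwned, live_lists) are hypothetical stand-ins and are not Tesseract types; the point is only that callers see a forward declaration and release the object through a function whose definition lives next to the full type.

```cpp
#include <cassert>
#include <memory>

// What a public header would expose: a forward declaration only, so
// callers never need the header that defines the list type.
class BlockList;

class Api {
 public:
  static BlockList* MakeBlockList(int block_count);   // hypothetical factory
  static void DeleteBlockList(BlockList* block_list); // opaque deleter
  static int live_lists;  // demo-only counter of outstanding lists
};

// What the implementation file would contain: the full definition.
class BlockList {
 public:
  explicit BlockList(int n) : block_count(n) {}
  int block_count;
};

int Api::live_lists = 0;

BlockList* Api::MakeBlockList(int block_count) {
  ++live_lists;
  return new BlockList(block_count);
}

void Api::DeleteBlockList(BlockList* block_list) {
  if (block_list == nullptr) return;
  --live_lists;
  delete block_list;  // legal here because the type is complete
}

// Caller-side convenience: own the opaque pointer with a custom deleter.
using BlockListPtr = std::unique_ptr<BlockList, void (*)(BlockList*)>;

BlockListPtr MakeOwned(int block_count) {
  return BlockListPtr(Api::MakeBlockList(block_count), &Api::DeleteBlockList);
}
```

Because the deleter is an ordinary function pointer, code that holds a BlockListPtr compiles against the forward declaration alone, which is the header-hiding benefit the description refers to.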
BLOCK_LIST * tesseract::TessBaseAPI::FindLinesCreateBlockList() [protected, inherited]
Find lines from the image making the BLOCK_LIST.
PAGE_RES * tesseract::TessBaseAPI::RecognitionPass1(BLOCK_LIST *block_list) [protected, inherited]
Recognize text doing one pass only, using settings for a given pass.
PAGE_RES * tesseract::TessBaseAPI::RecognitionPass2(BLOCK_LIST *block_list, PAGE_RES *pass1_result) [protected, inherited]
bool tesseract::TessBaseAPI::RecognizePhrase(CubeObject *phrase, PAGE_RES_IT *result) [protected, inherited]
Recognize a single phrase, saving the results to the page_res_ object.
bool tesseract::TessBaseAPI::RecognizePhrases(int line_cnt, int phrase_cnt, CubeLineObject **line_objs, TBOX *phrase_boxes) [protected, inherited]
Recognize the phrases, saving the results to the page_res_ object.
int tesseract::TessBaseAPI::RunCubeOnLines() [protected, inherited]
Run Cube on the lines extracted by Tesseract.
int tesseract::TessBaseAPI::TesseractExtractResult(char **text, int **lengths, float **costs, int **x0, int **y0, int **x1, int **y1, PAGE_RES *page_res) [static, protected, inherited]
Extract the OCR results, costs (penalty points for uncertainty), and the bounding boxes of the characters.
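Parallel output arrays of this shape are typically consumed by walking them together. The sketch below makes several assumptions that the description above does not confirm: that text is one concatenated UTF-8 buffer, that lengths[i] is the byte length of the i-th character within it, that the remaining arrays hold one entry per character, and that the caller knows the character count n. CharResult and SplitResults are hypothetical helpers, not part of the API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical per-character record assembled from the parallel arrays.
struct CharResult {
  std::string utf8;   // bytes of this character (may be multi-byte)
  float cost;         // penalty points for uncertainty
  int x0, y0, x1, y1; // bounding box corners
};

// Splits a concatenated text buffer into per-character records.
// Assumption: lengths[i] is the byte length of character i in `text`,
// and every array holds exactly n entries.
std::vector<CharResult> SplitResults(const char* text, const int* lengths,
                                     const float* costs, const int* x0,
                                     const int* y0, const int* x1,
                                     const int* y1, int n) {
  std::vector<CharResult> out;
  if (n <= 0) return out;
  out.reserve(static_cast<std::size_t>(n));
  int offset = 0;  // byte offset of the current character in `text`
  for (int i = 0; i < n; ++i) {
    assert(lengths[i] > 0);
    CharResult r;
    r.utf8.assign(text + offset, static_cast<std::size_t>(lengths[i]));
    r.cost = costs[i];
    r.x0 = x0[i];
    r.y0 = y0[i];
    r.x1 = x1[i];
    r.y1 = y1[i];
    out.push_back(r);
    offset += lengths[i];
  }
  return out;
}
```

Keeping a running byte offset, rather than indexing text by character number, is what lets multi-byte UTF-8 characters line up with their costs and boxes.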