Tesseract 3.01
Advanced API

Functions

void tesseract::TessBaseAPI::SetImage (const unsigned char *imagedata, int width, int height, int bytes_per_pixel, int bytes_per_line)
void tesseract::TessBaseAPI::SetImage (const Pix *pix)
void tesseract::TessBaseAPI::SetRectangle (int left, int top, int width, int height)
void tesseract::TessBaseAPI::SetThresholder (ImageThresholder *thresholder)
Pix * tesseract::TessBaseAPI::GetThresholdedImage ()
Boxa * tesseract::TessBaseAPI::GetRegions (Pixa **pixa)
Boxa * tesseract::TessBaseAPI::GetTextlines (Pixa **pixa, int **blockids)
Boxa * tesseract::TessBaseAPI::GetWords (Pixa **pixa)
Boxa * tesseract::TessBaseAPI::GetConnectedComponents (Pixa **cc)
Boxa * tesseract::TessBaseAPI::GetComponentImages (PageIteratorLevel level, Pixa **pixa, int **blockids)
void tesseract::TessBaseAPI::DumpPGM (const char *filename)
PageIterator * tesseract::TessBaseAPI::AnalyseLayout ()
int tesseract::TessBaseAPI::Recognize (ETEXT_DESC *monitor)
int tesseract::TessBaseAPI::RecognizeForChopTest (ETEXT_DESC *monitor)
bool tesseract::TessBaseAPI::ProcessPages (const char *filename, const char *retry_config, int timeout_millisec, STRING *text_out)
bool tesseract::TessBaseAPI::ProcessPage (Pix *pix, int page_index, const char *filename, const char *retry_config, int timeout_millisec, STRING *text_out)
ResultIterator * tesseract::TessBaseAPI::GetIterator ()
char * tesseract::TessBaseAPI::GetUTF8Text ()
char * tesseract::TessBaseAPI::GetHOCRText (int page_number)
char * tesseract::TessBaseAPI::GetBoxText (int page_number)
char * tesseract::TessBaseAPI::GetUNLVText ()
int tesseract::TessBaseAPI::MeanTextConf ()
int * tesseract::TessBaseAPI::AllWordConfidences ()
bool tesseract::TessBaseAPI::AdaptToWordStr (PageSegMode mode, const char *wordstr)
void tesseract::TessBaseAPI::Clear ()
void tesseract::TessBaseAPI::End ()
int tesseract::TessBaseAPI::IsValidWord (const char *word)
bool tesseract::TessBaseAPI::GetTextDirection (int *out_offset, float *out_slope)
void tesseract::TessBaseAPI::SetDictFunc (DictFunc f)
void tesseract::TessBaseAPI::SetProbabilityInContextFunc (ProbabilityInContextFunc f)
bool tesseract::TessBaseAPI::DetectOS (OSResults *)
void tesseract::TessBaseAPI::GetFeaturesForBlob (TBLOB *blob, const DENORM &denorm, INT_FEATURE_ARRAY int_features, int *num_features, int *FeatureOutlineIndex)
static ROW * tesseract::TessBaseAPI::FindRowForBox (BLOCK_LIST *blocks, int left, int top, int right, int bottom)
void tesseract::TessBaseAPI::RunAdaptiveClassifier (TBLOB *blob, const DENORM &denorm, int num_max_matches, int *unichar_ids, float *ratings, int *num_matches_returned)
const char * tesseract::TessBaseAPI::GetUnichar (int unichar_id)
const Dawg * tesseract::TessBaseAPI::GetDawg (int i) const
int tesseract::TessBaseAPI::NumDawgs () const
const char * tesseract::TessBaseAPI::GetLastInitLanguage () const
static ROW * tesseract::TessBaseAPI::MakeTessOCRRow (float baseline, float xheight, float descender, float ascender)
static TBLOB * tesseract::TessBaseAPI::MakeTBLOB (Pix *pix)
static void tesseract::TessBaseAPI::NormalizeTBLOB (TBLOB *tblob, ROW *row, bool numeric_mode, DENORM *denorm)
Tesseract *const tesseract::TessBaseAPI::tesseract () const
void tesseract::TessBaseAPI::InitTruthCallback (TruthCallback *cb)
CubeRecoContext * tesseract::TessBaseAPI::GetCubeRecoContext () const
void tesseract::TessBaseAPI::set_min_orientation_margin (double margin)
void tesseract::TessBaseAPI::GetBlockTextOrientations (int **block_orientation, bool **vertical_writing)
BLOCK_LIST * tesseract::TessBaseAPI::FindLinesCreateBlockList ()
static void tesseract::TessBaseAPI::DeleteBlockList (BLOCK_LIST *block_list)

Detailed Description

The following methods break TesseractRect into pieces, so you can get hold of the thresholded image, get the text in different formats, get bounding boxes, confidences etc.


Function Documentation

bool tesseract::TessBaseAPI::AdaptToWordStr ( PageSegMode  mode,
const char *  wordstr 
)

Applies the given word to the adaptive classifier if possible. The word must be SPACE-DELIMITED UTF-8 (l i k e t h i s) so that the boundaries of the graphemes can be determined. Assumes that SetImage/SetRectangle have been used to set the image to the given word. The mode argument should be PSM_SINGLE_WORD or PSM_CIRCLE_WORD, as it is used to control layout analysis. The currently set PageSegMode is preserved. Returns false if adaptation was not possible for some reason.
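The space-delimited form can be produced mechanically. The following is a minimal sketch, not part of the Tesseract API: SpaceDelimitUtf8 is a hypothetical helper that inserts a space before every UTF-8 lead byte. It assumes one code point per grapheme, which does not hold for scripts that rely on combining marks.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not a Tesseract function): turns "like" into "l i k e".
// Assumption: every UTF-8 code point is its own grapheme.
std::string SpaceDelimitUtf8(const std::string& word) {
  std::string out;
  for (std::size_t i = 0; i < word.size(); ++i) {
    const unsigned char c = static_cast<unsigned char>(word[i]);
    // Continuation bytes have the bit pattern 10xxxxxx and must stay
    // attached to the lead byte that precedes them.
    const bool is_continuation = (c & 0xC0) == 0x80;
    if (!is_continuation && !out.empty()) out.push_back(' ');
    out.push_back(word[i]);
  }
  // Only spaces are added, so the output can never be shorter than the input.
  assert(out.size() >= word.size());
  return out;
}
```

The result would then be passed as wordstr, for example AdaptToWordStr(tesseract::PSM_SINGLE_WORD, spaced.c_str()), once SetImage/SetRectangle have selected the image of that word.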

int * tesseract::TessBaseAPI::AllWordConfidences ( )

Returns all word confidences (between 0 and 100) in an array, terminated by -1. The caller must delete [] the array after use. The number of confidences should correspond to the number of space-delimited words in GetUTF8Text.
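Because the array carries no explicit length, the caller has to walk it up to the -1 sentinel. A minimal sketch, assuming a hypothetical helper named CollectConfidences that copies the values into a std::vector so the raw array can be released immediately:

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not a Tesseract function): copies a -1-terminated
// confidence array, as described for AllWordConfidences, into a vector.
std::vector<int> CollectConfidences(const int* confidences) {
  std::vector<int> result;
  if (confidences == nullptr) return result;
  for (const int* p = confidences; *p != -1; ++p) {
    // Documented range of a word confidence.
    assert(*p >= 0 && *p <= 100);
    result.push_back(*p);
  }
  return result;
}
```

A typical call sequence would be: int* raw = api.AllWordConfidences(); std::vector<int> confs = CollectConfidences(raw); delete [] raw;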

PageIterator * tesseract::TessBaseAPI::AnalyseLayout ( )

void tesseract::TessBaseAPI::Clear ( )

Free up recognition results and any stored image data, without actually freeing any recognition data that would be time-consuming to reload. Afterwards, you must call SetImage or TesseractRect before doing any Recognize or Get* operation.

void tesseract::TessBaseAPI::DeleteBlockList ( BLOCK_LIST *  block_list) [static]

Delete a block list. This keeps the BLOCK_LIST pointer opaque, so that callers do not need to include the other headers.

bool tesseract::TessBaseAPI::DetectOS ( OSResults *  osr)

Estimates the Orientation And Script of the image.

Returns:
true if the image was processed successfully.

void tesseract::TessBaseAPI::DumpPGM ( const char *  filename)

Dump the internal binary image to a PGM file.

Deprecated:
Use GetThresholdedImage and write the image using pixWrite instead if possible.

void tesseract::TessBaseAPI::End ( )

Close down tesseract and free up all memory. End() is equivalent to destructing and reconstructing your TessBaseAPI. Once End() has been used, none of the other API functions may be used other than Init and anything declared above it in the class definition.

BLOCK_LIST * tesseract::TessBaseAPI::FindLinesCreateBlockList ( )

Find lines from the image making the BLOCK_LIST.

ROW * tesseract::TessBaseAPI::FindRowForBox ( BLOCK_LIST *  blocks,
int  left,
int  top,
int  right,
int  bottom 
) [static]

void tesseract::TessBaseAPI::GetBlockTextOrientations ( int **  block_orientation,
bool **  vertical_writing 
)

char * tesseract::TessBaseAPI::GetBoxText ( int  page_number)

The recognized text is returned as a char* which is coded in the same format as a box file used in training. Returned string must be freed with the delete [] operator. Constructs coordinates in the original image - not just the rectangle. page_number is a 0-based page index that will appear in the box file.
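The returned string can be split into lines and parsed one entry at a time. The sketch below assumes the conventional box file layout of one entry per line in the order glyph, left, bottom, right, top, page; that field order is not stated in this section, and BoxEntry and ParseBoxLine are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical record for one box file line.
// Assumed field order: glyph left bottom right top page.
struct BoxEntry {
  std::string glyph;
  int left;
  int bottom;
  int right;
  int top;
  int page;
};

// Parses a single line; returns false and leaves *entry untouched if any
// of the six fields is missing or not numeric where a number is expected.
bool ParseBoxLine(const std::string& line, BoxEntry* entry) {
  assert(entry != nullptr);
  std::istringstream in(line);
  BoxEntry parsed;
  if (!(in >> parsed.glyph >> parsed.left >> parsed.bottom >> parsed.right >>
        parsed.top >> parsed.page)) {
    return false;
  }
  *entry = parsed;
  return true;
}
```

Glyphs that themselves contain whitespace would need different handling; the sketch treats the first whitespace-delimited token as the glyph.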

Boxa * tesseract::TessBaseAPI::GetComponentImages ( PageIteratorLevel  level,
Pixa **  pixa,
int **  blockids 
)

Boxa * tesseract::TessBaseAPI::GetConnectedComponents ( Pixa **  cc)

CubeRecoContext * tesseract::TessBaseAPI::GetCubeRecoContext ( ) const

const Dawg * tesseract::TessBaseAPI::GetDawg ( int  i) const

Return the pointer to the i-th dawg loaded into the tesseract_ object.

void tesseract::TessBaseAPI::GetFeaturesForBlob ( TBLOB *  blob,
const DENORM &  denorm,
INT_FEATURE_ARRAY  int_features,
int *  num_features,
int *  FeatureOutlineIndex 
)

This method returns the features associated with the input image.

char * tesseract::TessBaseAPI::GetHOCRText ( int  page_number)

Make an HTML-formatted string with hOCR markup from the internal data structures. page_number is 0-based but will appear in the output as 1-based.

ResultIterator * tesseract::TessBaseAPI::GetIterator ( )

const char * tesseract::TessBaseAPI::GetLastInitLanguage ( ) const

Return the language used in the last valid initialization.

Boxa * tesseract::TessBaseAPI::GetRegions ( Pixa **  pixa)

Get the result of page layout analysis as a leptonica-style Boxa, Pixa pair, in reading order. Can be called before or after Recognize.

bool tesseract::TessBaseAPI::GetTextDirection ( int *  out_offset,
float *  out_slope 
)

Boxa * tesseract::TessBaseAPI::GetTextlines ( Pixa **  pixa,
int **  blockids 
)

Get the textlines as a leptonica-style Boxa, Pixa pair, in reading order. Can be called before or after Recognize. If blockids is not NULL, the block-id of each line is also returned as an array of one element per line. The caller must delete [] the blockids array after use.
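Since blockids holds one element per line, it can be used to regroup the returned lines by block. A minimal sketch, assuming a hypothetical helper GroupLinesByBlock and that the number of lines is taken from the returned Boxa (for example with leptonica's boxaGetCount):

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical helper (not a Tesseract function): maps each block id to the
// indices of the text lines that belong to it, preserving reading order.
std::map<int, std::vector<int> > GroupLinesByBlock(const int* blockids,
                                                   int num_lines) {
  assert(num_lines >= 0);
  std::map<int, std::vector<int> > lines_by_block;
  if (blockids == nullptr) return lines_by_block;
  for (int line = 0; line < num_lines; ++line) {
    lines_by_block[blockids[line]].push_back(line);
  }
  return lines_by_block;
}
```

The line indices stored in the map are positions in the Boxa and Pixa returned by GetTextlines, so they can be used directly to look up the matching box or image.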

Pix * tesseract::TessBaseAPI::GetThresholdedImage ( )

Get a copy of the internal thresholded image from Tesseract. Caller takes ownership of the Pix and must pixDestroy it. May be called any time after SetImage, or after TesseractRect.

const char * tesseract::TessBaseAPI::GetUnichar ( int  unichar_id)

char * tesseract::TessBaseAPI::GetUNLVText ( )

The recognized text is returned as a char* which is coded as UNLV format Latin-1 with specific reject and suspect codes and must be freed with the delete [] operator.

char * tesseract::TessBaseAPI::GetUTF8Text ( )

The recognized text is returned as a char* which is coded as UTF8 and must be freed with the delete [] operator.

Boxa * tesseract::TessBaseAPI::GetWords ( Pixa **  pixa)

Get the words as a leptonica-style Boxa, Pixa pair, in reading order. Can be called before or after Recognize.

void tesseract::TessBaseAPI::InitTruthCallback ( TruthCallback *  cb) [inline]

int tesseract::TessBaseAPI::IsValidWord ( const char *  word)

Check whether a word is valid according to Tesseract's language model.

Returns:
0 if the word is invalid, non-zero if valid.
Warning:
Temporary! This function will be removed from here and placed in a separate API at some future time.

TBLOB * tesseract::TessBaseAPI::MakeTBLOB ( Pix *  pix) [static]

ROW * tesseract::TessBaseAPI::MakeTessOCRRow ( float  baseline,
float  xheight,
float  descender,
float  ascender 
) [static]

int tesseract::TessBaseAPI::MeanTextConf ( )

Returns the (average) confidence value between 0 and 100.

void tesseract::TessBaseAPI::NormalizeTBLOB ( TBLOB *  tblob,
ROW *  row,
bool  numeric_mode,
DENORM *  denorm 
) [static]

int tesseract::TessBaseAPI::NumDawgs ( ) const

Return the number of dawgs loaded into the tesseract_ object.

bool tesseract::TessBaseAPI::ProcessPage ( Pix *  pix,
int  page_index,
const char *  filename,
const char *  retry_config,
int  timeout_millisec,
STRING *  text_out 
)

Recognizes a single page for ProcessPages, appending the text to text_out. The pix is the image processed; filename and page_index are metadata used by side-effect processes, such as reading a box file or formatting as hOCR. If timeout_millisec is non-zero, processing terminates after the timeout. If retry_config is non-NULL and non-empty and the page fails for some reason, the page is reprocessed with the retry_config config file, which is useful for interactively debugging a bad page. The text is returned in text_out. Returns false on error.

bool tesseract::TessBaseAPI::ProcessPages ( const char *  filename,
const char *  retry_config,
int  timeout_millisec,
STRING *  text_out 
)

Recognizes all the pages in the named file, which may be a multi-page tiff, a single-page image in another file format, or a plain text list of images to read, and produces the appropriate kind of text according to the parameters tessedit_create_boxfile, tessedit_make_boxes_from_boxes, tessedit_write_unlv and tessedit_create_hocr. Calls ProcessPage on each page in the input. If tessedit_page_number is non-negative, processing begins at that page of the multi-page tiff file or file list. If timeout_millisec is non-zero, processing of a single page terminates after the timeout. If retry_config is non-NULL and non-empty and some page fails for some reason, that page is reprocessed with the retry_config config file, which is useful for interactively debugging a bad page. The text is returned in text_out. Returns false on error.

int tesseract::TessBaseAPI::Recognize ( ETEXT_DESC *  monitor)

Recognize the image from SetAndThresholdImage, generating Tesseract internal structures. Returns 0 on success. Optional. The Get*Text functions below will call Recognize if needed. After Recognize, the output is kept internally until the next SetImage.

int tesseract::TessBaseAPI::RecognizeForChopTest ( ETEXT_DESC *  monitor)

Methods to retrieve information after SetAndThresholdImage(), Recognize() or TesseractRect(). (Recognize is called implicitly if needed.) Variant on Recognize used for testing chopper.

void tesseract::TessBaseAPI::RunAdaptiveClassifier ( TBLOB *  blob,
const DENORM &  denorm,
int  num_max_matches,
int *  unichar_ids,
float *  ratings,
int *  num_matches_returned 
)

void tesseract::TessBaseAPI::set_min_orientation_margin ( double  margin)

void tesseract::TessBaseAPI::SetDictFunc ( DictFunc  f)

Sets Dict::letter_is_okay_ function to point to the given function.

void tesseract::TessBaseAPI::SetImage ( const unsigned char *  imagedata,
int  width,
int  height,
int  bytes_per_pixel,
int  bytes_per_line 
)

Provide an image for Tesseract to recognize. Format is as TesseractRect above. Does not copy the image buffer, or take ownership. The source image may be destroyed after Recognize is called, either explicitly or implicitly via one of the Get*Text functions. SetImage clears all recognition results, and sets the rectangle to the full image, so it may be followed immediately by a GetUTF8Text, and it will automatically perform recognition.
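Since the buffer is not copied, a mismatch between bytes_per_line and the actual row layout goes unnoticed until recognition reads the data. The sketch below computes the smallest valid row stride. It assumes the TesseractRect convention, not restated in this section, that bytes_per_pixel of 0 denotes a byte-packed 1-bit-per-pixel image; MinBytesPerLine and IsRowStrideValid are hypothetical helper names.

```cpp
#include <cassert>

// Hypothetical helper: smallest bytes_per_line that can hold one row.
// Assumption: bytes_per_pixel == 0 means 1 bit per pixel, packed 8 per byte.
int MinBytesPerLine(int width, int bytes_per_pixel) {
  assert(width > 0 && bytes_per_pixel >= 0);
  if (bytes_per_pixel == 0) return (width + 7) / 8;  // round up to whole bytes
  return width * bytes_per_pixel;
}

// Rows may carry trailing padding, so any stride at or above the minimum
// is accepted.
bool IsRowStrideValid(int width, int bytes_per_pixel, int bytes_per_line) {
  return bytes_per_line >= MinBytesPerLine(width, bytes_per_pixel);
}
```

A check such as IsRowStrideValid(width, bytes_per_pixel, bytes_per_line) before calling SetImage is cheap and catches the common mistake of passing the unpadded width for a padded buffer.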

void tesseract::TessBaseAPI::SetImage ( const Pix *  pix)

Provide an image for Tesseract to recognize. As with SetImage above, Tesseract doesn't take a copy or ownership or pixDestroy the image, so it must persist until after Recognize. Pix vs raw, which to use? Use Pix where possible. A future version of Tesseract may choose to use Pix as its internal representation and discard IMAGE altogether. Because of that, an implementation that sources and targets Pix may end up with fewer copies than an implementation that does not.

void tesseract::TessBaseAPI::SetProbabilityInContextFunc ( ProbabilityInContextFunc  f)

Sets Dict::probability_in_context_ function to point to the given function.

void tesseract::TessBaseAPI::SetRectangle ( int  left,
int  top,
int  width,
int  height 
)

Restrict recognition to a sub-rectangle of the image. Call after SetImage. Each SetRectangle clears the recognition results, so multiple rectangles can be recognized with the same image.
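When rectangles come from user input or another detector, they may extend past the image. The following sketch clips a rectangle to the image bounds before it is handed to SetRectangle. Rect and ClampToImage are hypothetical names, and whether SetRectangle itself tolerates out-of-range values is not stated here, so clipping beforehand is a defensive assumption.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical rectangle type matching the SetRectangle argument order.
struct Rect {
  int left;
  int top;
  int width;
  int height;
};

// Clips r to the area [0, image_width) x [0, image_height).
// A rectangle entirely outside the image comes back with zero width
// and/or height.
Rect ClampToImage(const Rect& r, int image_width, int image_height) {
  assert(image_width > 0 && image_height > 0);
  const int left = std::max(r.left, 0);
  const int top = std::max(r.top, 0);
  const int right = std::min(r.left + r.width, image_width);
  const int bottom = std::min(r.top + r.height, image_height);
  Rect out = {left, top, std::max(right - left, 0), std::max(bottom - top, 0)};
  return out;
}
```

Each clipped rectangle with non-zero width and height can then be passed to SetRectangle in turn, with GetUTF8Text called after each one, reusing the image set by a single SetImage call.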

void tesseract::TessBaseAPI::SetThresholder ( ImageThresholder *  thresholder) [inline]

In extreme cases only, usually with a subclass of Thresholder, it is possible to provide a different Thresholder. The Thresholder may be preloaded with an image, settings etc, or they may be set after. Note that Tesseract takes ownership of the Thresholder and will delete it when it is replaced or the API is destructed.

Tesseract* const tesseract::TessBaseAPI::tesseract ( ) const [inline]