acronymDecomposer::Acro Class Reference

Inherits AD_token::Token.

Public Member Functions

def __init__
def __str__
def findAcroCandidate
def computeSearchSpace
def checkTrueExp
def checkDigitDict
def countFirstChar
def __init__
def tagTextWithMT

Public Attributes

 acronym
 expansion
 candidate
 expCandidate
 position
 digitDict
 token
 tag
 tagSetDict

Detailed Description

class Acro:
    arguments:
    acronym: the acronym itself
    expansion: the resulting expansion
    expCandidate: the candidate string for the expansion
    candidate: the candidate string for the acronym
    position: the current position, needed to calculate the search space
    digitDict: possible matches for some digits

Definition at line 16 of file acronymDecomposer.py.


Member Function Documentation

def AD_token::Token::__init__ (   self,
  token,
  tag 
) [inherited]

Definition at line 13 of file AD_token.py.

def acronymDecomposer::Acro::__init__ (   self  ) 

Definition at line 19 of file acronymDecomposer.py.

def acronymDecomposer::Acro::__str__ (   self  ) 

Representation from Acro for printing.

Reimplemented from AD_token::Token.

Definition at line 28 of file acronymDecomposer.py.

def acronymDecomposer::Acro::checkDigitDict (   self,
  ac,
  ec 
)

Check whether a digit in the acronym stands for
a word, like "for" (4) or "to" (2).

Definition at line 310 of file acronymDecomposer.py.
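
The digit check described above can be pictured as a dictionary lookup. The following is a minimal sketch and not the code from acronymDecomposer.py: the names DIGIT_WORDS and digit_matches_word are hypothetical, and only the pairs 4/"for" and 2/"to" are taken from the description. The real digitDict may hold other entries.

```python
# Hypothetical mapping; only 4 -> "for" and 2 -> "to" come from the description.
DIGIT_WORDS = {"4": {"for"}, "2": {"to"}}


def digit_matches_word(ac, ec):
    """Return True if the acronym character `ac` is a digit that may stand for the word `ec`."""
    # Characters without an entry (letters, other digits) never match.
    return ec.lower() in DIGIT_WORDS.get(ac, set())
```

With this sketch, digit_matches_word("4", "for") is true, while digit_matches_word("4", "to") is false.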

def acronymDecomposer::Acro::checkTrueExp (   self,
  candidate,
  expCandidate,
  tagSetDict 
)

Compares the Acronym-Candidate and the Expansion-Candidate backwards.
Each character from the Acronym-Candidate must appear in one of the tokens in the Expansion-Candidate
in the same order as in the Acronym-Candidate; the first character of the Acronym-Candidate must match
a character in the initial position of the first word in the Expansion-Candidate.

Returns an Acro instance.        

Definition at line 175 of file acronymDecomposer.py.
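
To make the backward comparison concrete, here is a simplified sketch that works on plain strings. It is an illustration under stated assumptions, not the method's actual body: the function name match_backwards is hypothetical, part-of-speech tags (tagSetDict) are ignored, and the sketch returns a list of words or None instead of an Acro instance.

```python
def match_backwards(acronym, words):
    """Return the shortest trailing run of `words` that expands `acronym`, or None.

    Simplified, tag-free sketch: acronym characters are matched right to left
    against the joined words; the first acronym character must start a word.
    """
    chars = [c.lower() for c in acronym if c.isalnum()]
    if not chars:
        return None
    text = " ".join(words)
    t = len(text) - 1
    for a in range(len(chars) - 1, -1, -1):
        ch = chars[a]
        # Walk left until ch is found. For the first acronym character (a == 0)
        # the match must also sit at the beginning of a word.
        while t >= 0 and (
            text[t].lower() != ch or (a == 0 and t > 0 and text[t - 1] != " ")
        ):
            t -= 1
        if t < 0:
            return None
        t -= 1
    # t + 1 is the index matched by the first acronym character.
    return text[t + 1:].split()
```

For the acronym "HMM" and the words "a hidden Markov model", the sketch returns the words "hidden", "Markov", "model"; for "AB" and the single word "cab" it returns None, because "a" does not begin a word.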

def acronymDecomposer::Acro::computeSearchSpace (   self,
  candidate,
  position,
  tokenList,
  tagSetDict 
)

Computes the search space for Acronym-Candidates:
- if the Acronym-Candidate is longer than 5 characters, the search space is defined to be the length of the Acronym-Candidate + 5;
- if it is shorter than 5 characters, the search space is the length * 2.

Returns the Acronym-Candidate as a list of AD_Token instances.

Definition at line 142 of file acronymDecomposer.py.
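
The size rule above amounts to a two-branch formula. The sketch below only computes the size; the name search_space_size is hypothetical, and how the method applies the size to tokenList is not covered here. The description does not state what happens at exactly 5 characters, but both branches give 10 in that case, so the choice of branch does not change the result.

```python
def search_space_size(candidate):
    """Return the search-space size for an acronym candidate string."""
    n = len(candidate)
    # Longer than 5 characters: length + 5; otherwise: length * 2.
    # At exactly 5 characters both formulas yield 10.
    return n + 5 if n > 5 else n * 2
```

For example, a 4-character candidate gets a search space of 8, and a 6-character candidate gets 11.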

def acronymDecomposer::Acro::countFirstChar (   self,
  ac,
  acPos,
  ec,
  ecPos 
)

If the acronym starts with a digit or contains one, count the words
in the expansion that start with the character preceding or following
the digit. If that count equals the digit, return 1.

Definition at line 323 of file acronymDecomposer.py.
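
The counting rule can be sketched as follows. This is an illustration only: the function name digit_counts_initials and its parameters are hypothetical and do not mirror the ac, acPos, ec, ecPos signature, and returning 0 when the rule does not apply is an assumption, since the description only mentions the return value 1.

```python
def digit_counts_initials(acronym, digit_pos, words):
    """Return 1 if the digit at acronym[digit_pos] equals the number of words
    starting with the letter before or after it, else 0 (assumed fallback)."""
    digit = acronym[digit_pos]
    if digit not in "0123456789":
        return 0
    # Collect the acronym characters directly next to the digit.
    neighbours = []
    if digit_pos > 0:
        neighbours.append(acronym[digit_pos - 1])
    if digit_pos + 1 < len(acronym):
        neighbours.append(acronym[digit_pos + 1])
    for letter in neighbours:
        count = sum(1 for w in words if w[:1].lower() == letter.lower())
        if count == int(digit):
            return 1
    return 0
```

With the illustrative input "W3C" and the words "World", "Wide", "Web", "Consortium", three words start with "W", so the sketch returns 1.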

def acronymDecomposer::Acro::findAcroCandidate (   self,
  text 
)

Searches for Acronym-Candidates: uppercase or capitalized tokens
- in parentheses
- in front of parentheses
- in front of ", or"
- after ", or"

Returns Candidates list.

Definition at line 35 of file acronymDecomposer.py.
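
The four contexts listed above can be approximated with regular expressions. This is a rough sketch, assuming a candidate is a single token that starts with an uppercase letter; the names TOKEN, PATTERNS and find_candidates are hypothetical, and the actual method may tokenize the text and apply further checks.

```python
import re

# Assumption: a candidate is one token starting with an uppercase letter.
TOKEN = r"[A-Z][A-Za-z0-9]*"

PATTERNS = [
    re.compile(r"\((" + TOKEN + r")\)"),         # in parentheses
    re.compile(r"\b(" + TOKEN + r")\s*\("),      # in front of parentheses
    re.compile(r"\b(" + TOKEN + r")\s*, or\b"),  # in front of ", or"
    re.compile(r", or\s+(" + TOKEN + r")\b"),    # after ", or"
]


def find_candidates(text):
    """Return the candidate tokens found in `text`, without duplicates."""
    found = []
    for pattern in PATTERNS:
        for match in pattern.finditer(text):
            token = match.group(1)
            if token not in found:
                found.append(token)
    return found
```

Applied to "The hidden Markov model (HMM) is common.", the sketch yields the single candidate "HMM".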

def AD_token::Token::tagTextWithMT (   self,
  text 
) [inherited]

The input text is tokenized and tagged using the Penn-Treebank Tag Set.

Returns list of tokens.

Definition at line 31 of file AD_token.py.


Member Data Documentation

acronymDecomposer::Acro::acronym

Definition at line 21 of file acronymDecomposer.py.

acronymDecomposer::Acro::candidate

Definition at line 23 of file acronymDecomposer.py.

acronymDecomposer::Acro::digitDict

Definition at line 26 of file acronymDecomposer.py.

acronymDecomposer::Acro::expansion

Definition at line 22 of file acronymDecomposer.py.

acronymDecomposer::Acro::expCandidate

Definition at line 24 of file acronymDecomposer.py.

acronymDecomposer::Acro::position

Definition at line 25 of file acronymDecomposer.py.

AD_token::Token::tag [inherited]

Definition at line 17 of file AD_token.py.

AD_token::Token::tagSetDict [inherited]

Definition at line 18 of file AD_token.py.

AD_token::Token::token [inherited]

Definition at line 16 of file AD_token.py.


The documentation for this class was generated from the following file:
acronymDecomposer.py

Generated on Fri Aug 11 17:55:37 2006 for AcronymDecomposer by doxygen 1.4.7