AcronymDecomposer: acronymDecomposer::Acro Class Reference

class Acro:
    arguments:
    acronym: the acronym itself
    expansion: the resulting expansion
    expCandidate: the candidate string for the expansion
    candidate: the candidate string for the acronym
    position: the current position, needed to calculate the search space
    digitDict: possible matches for some digits

Member Function Documentation

def AD_token::Token::__init__	(	self,
		token,
		tag
	)		`[inherited]`

Definition at line 13 of file AD_token.py.

def acronymDecomposer::Acro::__init__ ( self )

Definition at line 19 of file acronymDecomposer.py.

def acronymDecomposer::Acro::__str__ ( self )

String representation of Acro, used for printing.

Reimplemented from AD_token::Token.

Definition at line 28 of file acronymDecomposer.py.

def acronymDecomposer::Acro::checkDigitDict	(	self,
		ac,
		ec
	)

Checks whether a digit in the acronym stands for
a word, such as "for" (4) or "to" (2).

Definition at line 310 of file acronymDecomposer.py.
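
The digit check described above can be illustrated with a small lookup table. This is only a sketch: the actual contents of digitDict are not shown in this reference, so the table DIGIT_WORDS and the function digit_stands_for_word below are hypothetical.

```python
# Hypothetical digit-to-word table; the real Acro.digitDict may differ.
DIGIT_WORDS = {
    "2": {"to", "too", "two"},
    "4": {"for", "four"},
}


def digit_stands_for_word(digit_char, word):
    """Return True if the digit is a known stand-in for the given word."""
    return word.lower() in DIGIT_WORDS.get(digit_char, set())


print(digit_stands_for_word("4", "for"))
print(digit_stands_for_word("2", "for"))
```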

def acronymDecomposer::Acro::checkTrueExp	(	self,
		candidate,
		expCandidate,
		tagSetDict
	)

Compares the Acronym-Candidate and the Expansion-Candidate backwards.
Each character from the Acronym-Candidate must appear in one of the tokens in the Expansion-Candidate
in the same order as in the Acronym-Candidate; the first character of the Acronym-Candidate must match
a character in the initial position of the first word in the Expansion-Candidate.

Returns an Acro instance.

Definition at line 175 of file acronymDecomposer.py.
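
A minimal sketch of the backward comparison described above, assuming the Expansion-Candidate is a plain list of word strings. The function name match_backwards is hypothetical, and unlike checkTrueExp it returns a string (or None) rather than an Acro instance and ignores tagSetDict.

```python
def match_backwards(acronym, words):
    """Scan acronym and expansion from right to left.

    Every acronym character must be found, in order, in the expansion text;
    the first acronym character must sit at the start of a word.
    Returns the matched expansion text, or None if there is no match.
    """
    text = " ".join(words)
    lowered = text.lower()
    t = len(lowered) - 1
    for a in range(len(acronym) - 1, -1, -1):
        ch = acronym[a].lower()
        # Move left until the character matches; for the first acronym
        # character, additionally require a word-initial position.
        while t >= 0 and (
            lowered[t] != ch
            or (a == 0 and t > 0 and lowered[t - 1] != " ")
        ):
            t -= 1
        if t < 0:
            return None
        t -= 1
    return text[t + 1:]


print(match_backwards("HMM", ["the", "hidden", "Markov", "model"]))
```

With this input the leading "the" is dropped, because the scan stops at the word whose first character matches the first acronym character.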

def acronymDecomposer::Acro::computeSearchSpace	(	self,
		candidate,
		position,
		tokenList,
		tagSetDict
	)

Computes the search space for Acronym-Candidates:
- if the Acronym-Candidate is longer than 5 characters, the search space is defined to be the length of the Acronym-Candidate + 5;
- if it is shorter than 5 characters, the search space is the length * 2.

Returns the Acronym-Candidate as a list of AD_Token instances.

Definition at line 142 of file acronymDecomposer.py.
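
The two size rules above can be sketched as follows. The reference does not say what happens at exactly 5 characters; both rules yield 10 there, so the choice of branch makes no difference. The helper names are hypothetical, and the window function assumes that the search space is counted in tokens preceding the candidate, which this reference does not confirm.

```python
def search_space_size(candidate):
    """Size of the search space for an acronym candidate string."""
    n = len(candidate)
    if n > 5:
        return n + 5
    # Covers lengths below 5; at exactly 5 both rules give 10.
    return n * 2


def search_window(tokens, position, candidate):
    """Assumption: the window is the run of tokens just before `position`."""
    size = search_space_size(candidate)
    return tokens[max(0, position - size):position]


print(search_space_size("NATO"))
print(search_space_size("UNESCO"))
```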

def acronymDecomposer::Acro::countFirstChar	(	self,
		ac,
		acPos,
		ec,
		ecPos
	)

If the acronym starts with a digit or contains one, counts the words
in the expansion that start with the character preceding or following
the digit; if that count is equal to the digit, returns 1.

Definition at line 323 of file acronymDecomposer.py.
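
As an illustration of the counting rule above, consider "W3C" for "World Wide Web Consortium": three words start with the character preceding the digit. The sketch below uses a hypothetical name and signature (the real method takes ac, acPos, ec, ecPos), and returning 0 when the count does not match is an assumption, since the reference only states the return value 1.

```python
def digit_counts_initials(acronym, digit_pos, words):
    """Return 1 if the digit equals the number of expansion words that
    start with the character before or after the digit, else 0."""
    digit = int(acronym[digit_pos])
    neighbours = set()
    if digit_pos > 0:
        neighbours.add(acronym[digit_pos - 1].lower())
    if digit_pos + 1 < len(acronym):
        neighbours.add(acronym[digit_pos + 1].lower())
    for ch in neighbours:
        count = sum(1 for w in words if w.lower().startswith(ch))
        if count == digit:
            return 1
    return 0  # assumption: the non-matching return value is not documented


print(digit_counts_initials("W3C", 1, ["World", "Wide", "Web", "Consortium"]))
```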

def acronymDecomposer::Acro::findAcroCandidate	(	self,
		text
	)

Searches for Acronym-Candidates: uppercase or capitalized tokens
- in parentheses
- in front of parentheses
- in front of ", or"
- after ", or"

Returns the list of candidates.

Definition at line 35 of file acronymDecomposer.py.
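
The four contexts listed above can be approximated with regular expressions. This is a rough sketch, not the method's actual implementation: the token pattern, the pattern list, and the function name find_candidates are all assumptions, and the real method may work on tagged tokens rather than raw text.

```python
import re

# Assumed token shape: starts with an uppercase letter.
TOKEN = r"[A-Z][\w-]*"

PATTERNS = [
    re.compile(r"\((" + TOKEN + r")\)"),     # in parentheses
    re.compile(r"\b(" + TOKEN + r")\s*\("),  # in front of parentheses
    re.compile(r"\b(" + TOKEN + r"), or\b"),  # in front of ", or"
    re.compile(r", or (" + TOKEN + r")\b"),   # after ", or"
]


def find_candidates(text):
    """Collect candidate tokens from all four contexts, without duplicates."""
    found = []
    for pattern in PATTERNS:
        for match in pattern.finditer(text):
            token = match.group(1)
            if token not in found:
                found.append(token)
    return found


print(find_candidates("the hidden Markov model (HMM) is common"))
print(find_candidates("World Wide Web, or WWW"))
```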

def AD_token::Token::tagTextWithMT	(	self,
		text
	)		`[inherited]`

The input text is tokenized and tagged using the Penn-Treebank Tag Set.

Returns list of tokens.

Definition at line 31 of file AD_token.py.

Member Data Documentation

acronymDecomposer::Acro::acronym

Definition at line 21 of file acronymDecomposer.py.

acronymDecomposer::Acro::candidate

Definition at line 23 of file acronymDecomposer.py.

acronymDecomposer::Acro::digitDict

Definition at line 26 of file acronymDecomposer.py.

acronymDecomposer::Acro::expansion

Definition at line 22 of file acronymDecomposer.py.

acronymDecomposer::Acro::expCandidate

Definition at line 24 of file acronymDecomposer.py.

acronymDecomposer::Acro::position

Definition at line 25 of file acronymDecomposer.py.

AD_token::Token::tag [inherited]

Definition at line 17 of file AD_token.py.

AD_token::Token::tagSetDict [inherited]

Definition at line 18 of file AD_token.py.

AD_token::Token::token [inherited]

Definition at line 16 of file AD_token.py.


Public Member Functions
def __init__
def __str__
def findAcroCandidate
def computeSearchSpace
def checkTrueExp
def checkDigitDict
def countFirstChar
def __init__ [inherited]
def tagTextWithMT [inherited]

Public Attributes
acronym
expansion
candidate
expCandidate
position
digitDict
token [inherited]
tag [inherited]
tagSetDict [inherited]