Inheritance diagram for acronymDecomposer::Acro (diagram not shown): Acro inherits from AD_token::Token.
Public Member Functions
def __init__
def __str__
def findAcroCandidate
def computeSearchSpace
def checkTrueExp
def checkDigitDict
def countFirstChar
def __init__ [inherited from AD_token::Token]
def tagTextWithMT [inherited from AD_token::Token]
Public Attributes
acronym
expansion
candidate
expCandidate
position
digitDict
token [inherited from AD_token::Token]
tag [inherited from AD_token::Token]
tagSetDict [inherited from AD_token::Token]
class Acro
Arguments:
- acronym: the acronym itself
- expansion: the resulting expansion
- expCandidate: the candidate string for the expansion
- candidate: the candidate string for the acronym
- position: the current position, needed to calculate the search space
- digitDict: possible matches for some digits
Definition at line 16 of file acronymDecomposer.py.
def AD_token::Token::__init__(self, token, tag) [inherited]
Definition at line 13 of file AD_token.py.
def acronymDecomposer::Acro::__init__(self)
Definition at line 19 of file acronymDecomposer.py.
def acronymDecomposer::Acro::__str__(self)
String representation of an Acro instance, used for printing.
Reimplemented from AD_token::Token.
Definition at line 28 of file acronymDecomposer.py.
def acronymDecomposer::Acro::checkDigitDict(self, ac, ec)
Checks whether a digit in the acronym stands for a word, such as "for" (4) or "to" (2).
Definition at line 310 of file acronymDecomposer.py.
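The digit lookup described above can be sketched as follows. This is a minimal illustration, not the code from acronymDecomposer.py: the table contents, the name `DIGIT_DICT`, and the standalone function `digit_matches_word` are assumptions made for the example.

```python
# Hypothetical digit-to-word table; the real digitDict contents are not
# shown in this documentation.
DIGIT_DICT = {
    "2": {"to", "too", "two"},
    "4": {"for", "four"},
}


def digit_matches_word(ac, ec, digit_dict=DIGIT_DICT):
    """Return True if the acronym character `ac` is a digit that can
    stand for the expansion word `ec` (compared case-insensitively)."""
    return ac.isdigit() and ec.lower() in digit_dict.get(ac, set())
```

With this table, `digit_matches_word("4", "for")` is true, while a non-digit acronym character never matches.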
def acronymDecomposer::Acro::checkTrueExp(self, candidate, expCandidate, tagSetDict)
Compares the Acronym-Candidate and the Expansion-Candidate backwards. Each character from the Acronym-Candidate must appear in one of the tokens in the Expansion-Candidate in the same order as in the Acronym-Candidate; the first character of the Acronym-Candidate must match a character in the initial position of the first word in the Expansion-Candidate. Returns an Acro instance.
Definition at line 175 of file acronymDecomposer.py.
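The backward comparison described above can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the method's actual code: the function name `match_backwards` is hypothetical, it works on plain strings rather than tagged tokens, it ignores tagSetDict, and it returns a list of words instead of an Acro instance.

```python
def match_backwards(acronym, words):
    """Scan acronym and expansion right to left.

    Every acronym character must be found, in order, somewhere in the
    expansion words; the first acronym character must sit at the start
    of a word. Returns the matching words, or None if there is no match.
    """
    if not acronym:
        return None
    text = " ".join(words).lower()
    t = len(text) - 1  # current position in the expansion text
    for a in range(len(acronym) - 1, -1, -1):
        ch = acronym[a].lower()
        # Move left until the character matches; for the first acronym
        # character (a == 0) additionally require a word-initial position.
        while t >= 0 and not (
            text[t] == ch and (a > 0 or t == 0 or text[t - 1] == " ")
        ):
            t -= 1
        if t < 0:
            return None
        t -= 1
    # Number of spaces before the match start = index of the first word.
    first_word = text[: t + 1].count(" ")
    return words[first_word:]
```

Scanning from the right keeps the expansion as short as possible, because each acronym character is matched at its rightmost admissible position.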
def acronymDecomposer::Acro::computeSearchSpace(self, candidate, position, tokenList, tagSetDict)
Computes the search space for Acronym-Candidates:
- if the Acronym-Candidate is longer than 5 characters, the search space is defined to be the length of the Acronym-Candidate + 5;
- if it is shorter than 5 characters, the search space is length * 2.
Returns the Acronym-Candidate as a list of AD_Token instances.
Definition at line 142 of file acronymDecomposer.py.
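The size rule above can be written as a short sketch. The function name `search_space_size` is hypothetical, and the sketch only computes the window size; it does not select tokens from tokenList. The description does not say what happens at exactly 5 characters, but both rules give 10 there, so the choice of branch does not matter.

```python
def search_space_size(candidate):
    """Return the search-space size for an acronym candidate string."""
    n = len(candidate)
    # Longer than 5 characters: length + 5; otherwise: length * 2.
    # At n == 5 both formulas yield 10.
    return n + 5 if n > 5 else n * 2
```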
def acronymDecomposer::Acro::countFirstChar(self, ac, acPos, ec, ecPos)
If the acronym starts with a digit, or contains a digit, counts the words in the expansion that start with the character preceding or following the digit. Returns 1 if that count equals the digit.
Definition at line 323 of file acronymDecomposer.py.
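One way to read the counting rule above is sketched here. The function name `digit_counts_initials` and its parameters are hypothetical and simpler than the documented signature (ac, acPos, ec, ecPos); returning 0 when the rule does not apply is also an assumption, since the description only states the return value 1.

```python
def digit_counts_initials(acronym, digit_pos, words):
    """Return 1 if the digit at `digit_pos` equals the number of words
    that start with the character next to it in the acronym, else 0."""
    digit = acronym[digit_pos]
    if not digit.isdecimal():
        return 0
    # Collect the characters directly before and after the digit.
    neighbours = set()
    if digit_pos > 0:
        neighbours.add(acronym[digit_pos - 1].lower())
    if digit_pos + 1 < len(acronym):
        neighbours.add(acronym[digit_pos + 1].lower())
    for ch in neighbours:
        count = sum(1 for w in words if w.lower().startswith(ch))
        if count == int(digit):
            return 1
    return 0
```

For example, in "W3C" the digit 3 is preceded by "W", and "World Wide Web Consortium" contains three words starting with "w", so the sketch returns 1.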
def acronymDecomposer::Acro::findAcroCandidate(self, text)
Searches for Acronym-Candidates: uppercase or capitalized tokens
- in parentheses
- in front of parentheses
- in front of ", or"
- after ", or"
Returns a list of candidates.
Definition at line 35 of file acronymDecomposer.py.
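The four contexts listed above can be approximated with regular expressions. This is a rough sketch working on raw text, whereas the documented method presumably operates on tagged tokens; the names `TOKEN`, `PATTERNS`, and `find_candidates` are hypothetical, and the token pattern (an uppercase letter followed by letters, digits, or hyphens) is an assumption.

```python
import re

# Assumed shape of an "uppercase or capitalized" token.
TOKEN = r"[A-Z][A-Za-z0-9-]*"

PATTERNS = [
    re.compile(r"\((" + TOKEN + r")\)"),      # in parentheses
    re.compile(r"\b(" + TOKEN + r")\s*\("),   # in front of parentheses
    re.compile(r"\b(" + TOKEN + r"), or\b"),  # in front of ", or"
    re.compile(r", or (" + TOKEN + r")\b"),   # after ", or"
]


def find_candidates(text):
    """Return candidate tokens in order of appearance in `text`."""
    found = set()
    for pattern in PATTERNS:
        for m in pattern.finditer(text):
            found.add((m.start(1), m.group(1)))
    return [token for _, token in sorted(found)]
```

Candidates that begin with a digit, such as "3M", are not covered by this token pattern.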
def AD_token::Token::tagTextWithMT(self, text) [inherited]
The input text is tokenized and tagged using the Penn-Treebank Tag Set. Returns list of tokens.
Definition at line 31 of file AD_token.py.
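acronymDecomposer::Acro::acronym
Definition at line 21 of file acronymDecomposer.py.
acronymDecomposer::Acro::candidate
Definition at line 23 of file acronymDecomposer.py.
acronymDecomposer::Acro::digitDict
Definition at line 26 of file acronymDecomposer.py.
acronymDecomposer::Acro::expansion
Definition at line 22 of file acronymDecomposer.py.
acronymDecomposer::Acro::expCandidate
Definition at line 24 of file acronymDecomposer.py.
acronymDecomposer::Acro::position
Definition at line 25 of file acronymDecomposer.py.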
AD_token::Token::tag [inherited]
Definition at line 17 of file AD_token.py.
AD_token::Token::tagSetDict [inherited]
Definition at line 18 of file AD_token.py.
AD_token::Token::token [inherited]
Definition at line 16 of file AD_token.py.