HFST - Helsinki Finite-State Transducer Technology - Python API  version 3.12.2
HfstBasicTransducer Class Reference

A simple transducer class with tropical weights.

Public Member Functions

def __enumerate__ (self)
 Return an enumeration of the states and transitions of the transducer.
 
def __init__ (self)
 Create an empty transducer: one initial state, numbered zero, that is not final.
 
def __init__ (self, transducer)
 Create a transducer equivalent to transducer.
 
def __str__ (self)
 Return a string representation of the transducer.
 
def add_state (self)
 Add a new state to this transducer and return its number.
 
def add_state (self, state)
 Add state state to this transducer.
 
def add_symbol_to_alphabet (self, symbol)
 Explicitly add symbol to the alphabet of the graph.
 
def add_symbols_to_alphabet (self, symbols)
 Explicitly add symbols to the alphabet of the graph.
 
def add_transition (self, state, transition, add_symbols_to_alphabet=True)
 Add a transition transition to state state; add_symbols_to_alphabet defines whether the transition symbols are added to the alphabet.
 
def add_transition (self, source, target, input, output, weight=0)
 Add a transition from state source to state target with input symbol input, output symbol output and weight weight.
 
def disjunct (self, stringpairpath, weight)
 Disjunct this transducer with a one-path transducer defined by the consecutive string pairs in stringpairpath and having weight weight.
 
def get_alphabet (self)
 The symbols in the alphabet of the transducer.
 
def get_final_weight (self, state)
 Get the final weight of state state in this transducer.
 
def get_max_state (self)
 Get the biggest state number in use.
 
def get_transition_pairs (self)
 Get a list of all input/output symbol pairs used in the transitions of this transducer.
 
def harmonize (self, another)
 Harmonize this transducer and another.
 
def insert_freely (self, symbol_pair, weight)
 Insert freely any number of symbol_pair in the transducer with weight weight.
 
def insert_freely (self, transducer)
 Insert freely any number of transducer in this transducer.
 
def is_final_state (self, state)
 Whether state state is final.
 
def is_infinitely_ambiguous (self)
 Whether the transducer is infinitely ambiguous.
 
def is_lookup_infinitely_ambiguous (self, str)
 Whether the transducer is infinitely ambiguous with input str.
 
def longest_path_size (self)
 The length of the longest path in transducer.
 
def lookup (self, input, kwargs)
 
def prune_alphabet (self)
 Remove all symbols that do not occur in transitions of the transducer from its alphabet.
 
def read_att (f, epsilon_symbol, linecount)
 Read a transducer in AT&T format from file f.
 
def read_prolog (f, linecount)
 Read a transducer from prolog file f.
 
def remove_symbol_from_alphabet (self, symbol)
 Remove symbol symbol from the alphabet of the graph.
 
def remove_symbols_from_alphabet (self, symbols)
 Remove symbols symbols from the alphabet of the graph.
 
def remove_transition (self, s, transition, remove_symbols_from_alphabet=False)
 Remove transition transition from state s.
 
def set_final_weight (self, state, weight)
 Set the final weight of state state in this transducer to weight.
 
def sort_arcs (self)
 Sort the arcs of this transducer according to input and output symbols.
 
def states (self)
 The states of the transducer.
 
def states_and_transitions (self)
 The states and transitions of the transducer.
 
def substitute (self, s, S=None, kwargs)
 Substitute symbols or transitions in the transducer.
 
def symbols_used (self)
 Get a list of all symbols used in the transitions of this transducer.
 
def transitions (self, state)
 Get the transitions of state state in this transducer.
 
def write_att (self, f, bool, write_weights=True)
 Write this transducer in AT&T format to file f; write_weights defines whether weights are written.
 
def write_prolog (self, f, name, write_weights=True)
 Write the transducer in prolog format to file f.
 
def write_xfst (self, f, write_weights=True)
 Write the transducer in xfst format to file f.
 

Detailed Description

A simple transducer class with tropical weights.

An example of creating an HfstBasicTransducer [foo:bar baz:baz] with weight 0.4 from scratch:

  import hfst

  # Create an empty transducer
  # The transducer has initially one start state (number zero)
  # that is not final
  fsm = hfst.HfstBasicTransducer()
  # Add two states to the transducer
  fsm.add_state(1)
  fsm.add_state(2)
  # Create a transition [foo:bar] leading to state 1 with weight 0.1
  tr = hfst.HfstBasicTransition(1, 'foo', 'bar', 0.1)
  # and add it to state zero
  fsm.add_transition(0, tr)
  # Add a transition [baz:baz] with weight 0 from state 1 to state 2
  fsm.add_transition(1, hfst.HfstBasicTransition(2, 'baz', 'baz', 0.0))
  # Set state 2 as final with weight 0.3
  fsm.set_final_weight(2, 0.3)
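
The total weight of 0.4 follows from tropical-weight arithmetic, where the weights along a path are added together. The sketch below is plain Python and does not call hfst; the variable names are illustrative only.

```python
import math

# Weights taken from the example above: [foo:bar] has 0.1, [baz:baz] has 0.0,
# and the final weight of state 2 is 0.3.
transition_weights = [0.1, 0.0]
final_weight = 0.3

# With tropical weights, the weight of a path is the sum of its transition
# weights plus the final weight of the state where the path ends.
path_weight = sum(transition_weights) + final_weight

print(math.isclose(path_weight, 0.4))
```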

An example of iterating through the states and transitions of the above transducer when printing them in AT&T format to standard output:

  # Go through all states
  for state, arcs in enumerate(fsm):
    for arc in arcs:
      print('%i ' % (state), end='')
      print(arc)
    if fsm.is_final_state(state):
      print('%i %f' % (state, fsm.get_final_weight(state)) )
See also
hfst.HfstBasicTransition

Constructor & Destructor Documentation

def __init__ (   self)

Create a transducer with one initial state that has state number zero and is not a final state, i.e. create an empty transducer.

 tr = hfst.HfstBasicTransducer()
def __init__ (   self,
  transducer 
)

Create a transducer equivalent to transducer.

Parameters
transducer: The transducer to be copied, an hfst.HfstBasicTransducer or an hfst.HfstTransducer.
 tr = hfst.regex('foo') # creates an HfstTransducer
 TR = hfst.HfstBasicTransducer(tr)
 TR2 = hfst.HfstBasicTransducer(TR)

Member Function Documentation

def __enumerate__ (   self)

Return an enumeration of the states and transitions of the transducer.

 for state, arcs in enumerate(fsm):
   for arc in arcs:
     print('%i ' % (state), end='')
     print(arc)
   if fsm.is_final_state(state):
     print('%i %f' % (state, fsm.get_final_weight(state)) )
def __str__ (   self)

Return a string representation of the transducer.

 print(fsm)
def add_state (   self)

Add a new state to this transducer and return its number.

Returns
The next (smallest) free state number.
def add_state (   self,
  state 
)

Add state state to this transducer.

Parameters
state: The number of the state to be added.
Returns
state

If the state already exists, it is not added again. All states with a state number smaller than state are also added to the transducer if they did not exist before.
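
The behaviour described above can be modelled in a few lines of plain Python. This is only a sketch of the documented semantics, not the library's implementation; the list-of-arc-lists representation and the free function `add_state` are assumptions made for illustration.

```python
def add_state(states, state):
    """Model of add_state(state); `states` holds one arc list per state number."""
    # Create every missing state up to and including `state`.
    while len(states) <= state:
        states.append([])
    return state

states = [[]]                    # a new transducer has only state 0
returned = add_state(states, 3)  # also creates states 1 and 2
add_state(states, 1)             # state 1 already exists, nothing changes
print(returned, len(states))
```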

def add_symbol_to_alphabet (   self,
  symbol 
)

Explicitly add symbol to the alphabet of the graph.

Note
Usually the user does not have to take care of the alphabet of a graph. This function can be useful in some special cases.
Parameters
symbol: The string to be added.
def add_symbols_to_alphabet (   self,
  symbols 
)

Explicitly add symbols to the alphabet of the graph.

Note
Usually the user does not have to take care of the alphabet of a graph. This function can be useful in some special cases.
Parameters
symbols: A tuple of strings to be added.
def add_transition (   self,
  state,
  transition,
  add_symbols_to_alphabet = True 
)

Add a transition transition to state state, add_symbols_to_alphabet defines whether the transition symbols are added to the alphabet.

Parameters
state: The number of the state where the transition is added. If it does not exist, it is created.
transition: An hfst.HfstBasicTransition that is added to state.
add_symbols_to_alphabet: Whether the transition symbols are added to the alphabet of the transducer. (In special cases this is not wanted.)
def add_transition (   self,
  source,
  target,
  input,
  output,
  weight = 0 
)

Add a transition from state source to state target with input symbol input, output symbol output and weight weight.

Parameters
source: The number of the state where the transition is added. If it does not exist, it is created.
target: The number of the state where the transition leads. If it does not exist, it is created. (?)
input: The input symbol of the transition.
output: The output symbol of the transition.
weight: The weight of the transition.
def disjunct (   self,
  stringpairpath,
  weight 
)

Disjunct this transducer with a one-path transducer defined by the consecutive string pairs in stringpairpath and having weight weight.

Precondition
This graph must be a trie where all weights are in final states, i.e. all transitions have a zero weight.

There is no way to test whether a graph is a trie, so the use of this function is probably limited to fast construction of a lexicon. Here is an example:

 lexicon = hfst.HfstBasicTransducer()
 tok = hfst.HfstTokenizer()
 lexicon.disjunct(tok.tokenize('dog'), 0.3)
 lexicon.disjunct(tok.tokenize('cat'), 0.5)
 lexicon.disjunct(tok.tokenize('elephant'), 1.6)
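
To make the trie precondition concrete, here is a plain-Python sketch of adding one-path entries to a trie whose weights live only in final states. It does not use hfst. The data layout, the helper `tokenize` (a hypothetical stand-in for HfstTokenizer.tokenize) and the choice to keep the smaller weight when a path is added twice are all assumptions.

```python
def disjunct(trie, finals, path, weight):
    """trie: list indexed by state of {(input, output): target}; finals: state -> weight."""
    state = 0
    for pair in path:
        if pair not in trie[state]:
            trie.append({})                     # create a fresh target state
            trie[state][pair] = len(trie) - 1
        state = trie[state][pair]
    # Assumption: a repeated path keeps the smaller of the two weights.
    finals[state] = min(weight, finals.get(state, weight))

def tokenize(word):
    """Hypothetical stand-in for a tokenizer: one identity pair per character."""
    return [(ch, ch) for ch in word]

trie, finals = [{}], {}
disjunct(trie, finals, tokenize("dog"), 0.3)
disjunct(trie, finals, tokenize("dot"), 0.5)    # shares the prefix "do"
print(len(trie), finals)
```

Because "dog" and "dot" share a prefix, only one new state is created for the second word, which is what makes this construction fast for lexicons.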
def get_alphabet (   self)

The symbols in the alphabet of the transducer.

The symbols do not necessarily occur in any transitions of the transducer. Epsilon, unknown and identity symbols are always included in the alphabet.

Returns
A tuple of strings.
def get_final_weight (   self,
  state 
)

Get the final weight of state state in this transducer.

Parameters
state: The number of the state. If it does not exist, a StateIsNotFinalException is thrown.
Exceptions
hfst.exceptions.StateIsNotFinalException.
def get_max_state (   self)

Get the biggest state number in use.

Returns
The biggest state number in use.
def get_transition_pairs (   self)

Get a list of all input/output symbol pairs used in the transitions of this transducer.

def harmonize (   self,
  another 
)

Harmonize this transducer and another.

In harmonization the unknown and identity symbols in transitions of both graphs are expanded according to the symbols that are previously unknown to the graph.

For example the graphs

 [a:b ?:?]
 [c:d ? ?:c]

are expanded to

 [ a:b [?:? | ?:c | ?:d | c:d | d:c | c:? | d:?] ]
 [ c:d [? | a | b] [?:c | a:c | b:c] ]

when harmonized.

The symbol '?' means hfst.UNKNOWN in either or both sides of a transition (transitions of type [?:x], [x:?] and [?:?]). The transition [?] means hfst.IDENTITY.

Note
This function is always called for all transducer arguments of functions that take two or more graphs as their arguments, unless otherwise said.
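
The expansion of [?:?] and [?] in the example above can be reproduced with a small plain-Python sketch. It does not call hfst: the strings `UNKNOWN` and `IDENTITY` are illustrative markers rather than the library's constants, and both function names are hypothetical.

```python
UNKNOWN = "?"        # illustrative marker for an unknown symbol
IDENTITY = "?=?"     # illustrative marker for the identity symbol

def expand_unknown_pair(new_symbols):
    """Expand a [?:?] transition for symbols that are new to the graph."""
    pairs = {(UNKNOWN, UNKNOWN)}
    for x in new_symbols:
        pairs.add((UNKNOWN, x))
        pairs.add((x, UNKNOWN))
        # Unknown-to-unknown never maps a symbol to itself, so skip x:x.
        pairs.update((x, y) for y in new_symbols if x != y)
    return pairs

def expand_identity(new_symbols):
    """Expand an identity transition [?] for symbols that are new to the graph."""
    return {(IDENTITY, IDENTITY)} | {(x, x) for x in new_symbols}

unknown_expansion = expand_unknown_pair({"c", "d"})   # first graph learns c and d
identity_expansion = expand_identity({"a", "b"})      # second graph learns a and b
print(sorted(unknown_expansion))
print(sorted(identity_expansion))
```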
def insert_freely (   self,
  symbol_pair,
  weight 
)

Insert freely any number of symbol_pair in the transducer with weight weight.

Parameters
symbol_pair: A string pair to be inserted.
weight: The weight of the inserted symbol pair.
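
One common way to realize free insertion of a symbol pair is to add a self-loop carrying that pair to every state. The plain-Python sketch below illustrates that idea; whether HFST implements it exactly this way is an assumption, and the arc-tuple layout is hypothetical.

```python
def insert_freely(arcs, symbol_pair, weight):
    """arcs: list indexed by state of (target, input, output, weight) tuples."""
    isym, osym = symbol_pair
    for state, out_arcs in enumerate(arcs):
        # A self-loop lets the pair occur any number of times at this state.
        out_arcs.append((state, isym, osym, weight))
    return arcs

arcs = [[(1, "a", "a", 0.0)], []]
insert_freely(arcs, ("x", "y"), 0.5)
print(arcs)
```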
def insert_freely (   self,
  transducer 
)

Insert freely any number of transducer in this transducer.

Parameters
transducer: An HfstBasicTransducer to be inserted.

def is_final_state (   self,
  state 
)

Whether state state is final.

Parameters
state: The state whose finality is returned.
def is_infinitely_ambiguous (   self)

Whether the transducer is infinitely ambiguous.

A transducer is infinitely ambiguous if there exists an input that will yield infinitely many results, i.e. there are input epsilon loops that are traversed with that input.
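
The notion of an input epsilon loop can be illustrated with a depth-first search over arcs whose input symbol is epsilon. This plain-Python sketch is a simplification: it ignores whether the loop is reachable or lies on an accepting path, the arc layout is hypothetical, and the value of `EPSILON` is assumed to match hfst.EPSILON.

```python
EPSILON = "@_EPSILON_SYMBOL_@"   # assumed to match hfst.EPSILON

def has_input_epsilon_loop(arcs):
    """arcs: list indexed by state of (target, input, output) tuples."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on the DFS stack, finished
    color = [WHITE] * len(arcs)

    def visit(state):
        color[state] = GREY
        for target, isym, _osym in arcs[state]:
            if isym != EPSILON:
                continue                   # only input-epsilon arcs can form such a loop
            if color[target] == GREY:
                return True                # back edge: a cycle of input-epsilon arcs
            if color[target] == WHITE and visit(target):
                return True
        color[state] = BLACK
        return False

    return any(color[s] == WHITE and visit(s) for s in range(len(arcs)))

looping = [[(1, EPSILON, "x")], [(0, EPSILON, "y")]]
acyclic = [[(1, EPSILON, "x"), (0, "a", "a")], []]
print(has_input_epsilon_loop(looping), has_input_epsilon_loop(acyclic))
```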

def is_lookup_infinitely_ambiguous (   self,
  str 
)

Whether the transducer is infinitely ambiguous with input str.

Parameters
str: The input.

A transducer is infinitely ambiguous with a given input if the input yields infinitely many results, i.e. there are input epsilon loops that are traversed with the input.

def longest_path_size (   self)

The length of the longest path in transducer.

Length of a path means number of arcs on that path.
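
As an illustration of counting arcs on the longest path, the following plain-Python sketch computes it for an acyclic graph with memoized recursion. It assumes the graph has no cycles and measures paths from state 0 regardless of finality; how the library treats cyclic transducers or final states is not stated here.

```python
from functools import lru_cache

def longest_path_size(arcs):
    """arcs: list indexed by state of target-state lists; assumes no cycles."""
    @lru_cache(maxsize=None)
    def longest_from(state):
        # Each outgoing arc contributes one to the length of the path.
        return max((1 + longest_from(t) for t in arcs[state]), default=0)
    return longest_from(0)

arcs = [[1, 2], [2], [3], []]   # longest path: 0 -> 1 -> 2 -> 3
size = longest_path_size(arcs)
print(size)
```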

def lookup (   self,
  input,
  kwargs 
)
def prune_alphabet (   self)

Remove all symbols that do not occur in transitions of the transducer from its alphabet.

Epsilon, unknown and identity symbols are always included in the alphabet.
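
The pruning rule can be sketched in plain Python as a set intersection that always keeps the three special symbols. This is an illustration only: the function and data layout are hypothetical, and the special-symbol strings are assumed to match hfst.EPSILON, hfst.UNKNOWN and hfst.IDENTITY.

```python
# Assumed string values of the special symbols.
SPECIAL = {"@_EPSILON_SYMBOL_@", "@_UNKNOWN_SYMBOL_@", "@_IDENTITY_SYMBOL_@"}

def prune_alphabet(alphabet, arcs):
    """arcs: list indexed by state of (target, input, output) tuples."""
    used = set()
    for out_arcs in arcs:
        for _target, isym, osym in out_arcs:
            used.update((isym, osym))
    # Keep symbols that occur in transitions, plus the special symbols.
    return (set(alphabet) & used) | SPECIAL

alphabet = {"a", "b", "z"} | SPECIAL
arcs = [[(1, "a", "b")], []]
pruned = prune_alphabet(alphabet, arcs)
print(sorted(pruned))
```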

def read_att (   f,
  epsilon_symbol,
  linecount 
)

Read a transducer in AT&T format from file f.

epsilon_symbol defines the symbol used for epsilon, linecount is incremented as lines are read.

Returns
A transducer constructed by reading from file f. This function is static.
def read_prolog (   f,
  linecount 
)

Read a transducer from prolog file f.

linecount is incremented as lines are read (is it in python?).

Returns
A transducer constructed by reading from file f. This function is static.
def remove_symbol_from_alphabet (   self,
  symbol 
)

Remove symbol symbol from the alphabet of the graph.

Note
Use with care, removing symbols that occur in the transitions of the graph can have unexpected results.
Parameters
symbol: The string to be removed.
def remove_symbols_from_alphabet (   self,
  symbols 
)

Remove symbols symbols from the alphabet of the graph.

Note
Use with care, removing symbols that occur in the transitions of the graph can have unexpected results.
Parameters
symbols: A tuple of strings to be removed.
def remove_transition (   self,
  s,
  transition,
  remove_symbols_from_alphabet = False 
)

Remove transition transition from state s.

Parameters
s: The state which transition belongs to.
transition: The transition to be removed.
remove_symbols_from_alphabet: Whether
def set_final_weight (   self,
  state,
  weight 
)

Set the final weight of state state in this transducer to weight.

If the state does not exist, it is created.

def sort_arcs (   self)

Sort the arcs of this transducer according to input and output symbols.

Returns
This transducer.
def states (   self)

The states of the transducer.

Returns
A tuple of state numbers.

An example:

 for state in fsm.states():
     for arc in fsm.transitions(state):
         print('%i ' % (state), end='')
         print(arc)
     if fsm.is_final_state(state):
         print('%i %f' % (state, fsm.get_final_weight(state)))
def states_and_transitions (   self)

The states and transitions of the transducer.

Returns
A tuple of tuples of HfstBasicTransitions.
See also
hfst.HfstBasicTransducer.__enumerate__
def substitute (   self,
  s,
  S = None,
  kwargs 
)

Substitute symbols or transitions in the transducer.

Parameters
s: The symbol or transition to be substituted. Can also be a dictionary of substitutions, if S == None.
S: The symbol, transition, a tuple of transitions or a transducer (hfst.HfstBasicTransducer) that substitutes s.
kwargs: Arguments recognized are 'input' and 'output'; their values can be False or True, True being the default. These arguments are valid only if s and S are strings, else they are ignored.
input: Whether substitution is performed on input side, defaults to True. Valid only if s and S are strings.
output: Whether substitution is performed on output side, defaults to True. Valid only if s and S are strings.

Possible combinations of arguments and their types are:

(1) substitute(str, str, input=bool, output=bool): substitute symbol with symbol on input, output or both sides of each transition in the transducer.
(2) substitute(strpair, strpair): substitute transition with transition.
(3) substitute(strpair, strpairtuple): substitute transition with several transitions.
(4) substitute(strpair, transducer): substitute transition with a transducer.
(5) substitute(dict): perform several symbol-to-symbol substitutions.
(6) substitute(dict): perform several transition-to-transition substitutions.

Examples:

(1) tr.substitute('a', 'A', input=True, output=False): substitute lowercase a with uppercase A on the input side.
(2) tr.substitute(('a','b'),('A','B')): substitute transitions that map lowercase a into lowercase b with transitions that map uppercase A into uppercase B.
(3) tr.substitute(('a','b'), (('A','B'),('a','B'),('A','b'))): change either or both sides of a transition [a:b] to uppercase.
(4) tr.substitute(('a','b'), hfst.regex('[a:b]+')): change an [a:b] transition into one or more consecutive [a:b] transitions.
(5) tr.substitute({'a':'A', 'b':'B', 'c':'C'}): change lowercase a, b and c into their uppercase variants.
(6) tr.substitute( {('a','a'):('A','A'), ('b','b'):('B','B'), ('c','c'):('C','C')} ): change lowercase a, b and c into their uppercase variants.

In case (4), epsilon transitions are used to attach copies of transducer S between the SOURCE and TARGET state of each transition that is substituted. The transition itself is deleted, but its weight is copied to the epsilon transition leading from SOURCE to the initial state of S. Each final state of S is made non-final and an epsilon transition leading to TARGET is attached to it. The final weight is copied to the epsilon transition.
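
The steps described for case (4) can be followed in a plain-Python model. This is a sketch of the procedure as described in the paragraph above, not the library's code: the arc-tuple layout and the function name are hypothetical, the value of `EPS` is assumed to match hfst.EPSILON, and state 0 of S is taken to be its initial state.

```python
EPS = "@_EPSILON_SYMBOL_@"   # assumed to match hfst.EPSILON

def substitute_with_transducer(arcs, pair, sub_arcs, sub_finals):
    """arcs, sub_arcs: lists indexed by state of (target, input, output, weight).
    sub_finals: dict mapping each final state of S to its final weight."""
    result = [[] for _ in arcs]
    for source, out_arcs in enumerate(arcs):
        for target, isym, osym, weight in out_arcs:
            if (isym, osym) != pair:
                result[source].append((target, isym, osym, weight))
                continue
            offset = len(result)                    # first state of this copy of S
            result.extend([] for _ in sub_arcs)
            # Epsilon arc from SOURCE to the copy's initial state keeps the old weight.
            result[source].append((offset, EPS, EPS, weight))
            for s, s_arcs in enumerate(sub_arcs):
                for t, i, o, w in s_arcs:
                    result[offset + s].append((offset + t, i, o, w))
            # Final states of the copy become non-final and lead to TARGET.
            for final, final_weight in sub_finals.items():
                result[offset + final].append((target, EPS, EPS, final_weight))
    return result

arcs = [[(1, "a", "b", 0.5)], []]
plus_arcs = [[(1, "a", "b", 0.0)], [(1, "a", "b", 0.0)]]   # models [a:b]+
result = substitute_with_transducer(arcs, ("a", "b"), plus_arcs, {1: 0.0})
print(result)
```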

def symbols_used (   self)

Get a list of all symbols used in the transitions of this transducer.

def transitions (   self,
  state 
)

Get the transitions of state state in this transducer.

If the state does not exist, a StateIndexOutOfBoundsException is thrown.

Returns
A tuple of HfstBasicTransitions.
 for state in fsm.states():
     for arc in fsm.transitions(state):
         print('%i ' % (state), end='')
         print(arc)
     if fsm.is_final_state(state):
         print('%i %f' % (state, fsm.get_final_weight(state)))
def write_att (   self,
  f,
  bool,
  write_weights = True 
)

Write this transducer in AT&T format to file f, write_weights defines whether weights are written.
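
For orientation, AT&T format lists one transition per line as tab-separated fields (source, target, input, output, weight) and one line per final state (state, weight). The plain-Python sketch below produces such lines for the example transducer from the detailed description. It does not call hfst; the function name, the data layout and the "%f" weight formatting are assumptions.

```python
def att_lines(arcs, finals, write_weights=True):
    """arcs: list indexed by state of (target, input, output, weight) tuples.
    finals: dict mapping final states to their final weights."""
    lines = []
    for state, out_arcs in enumerate(arcs):
        for target, isym, osym, weight in out_arcs:
            fields = [str(state), str(target), isym, osym]
            if write_weights:
                fields.append("%f" % weight)
            lines.append("\t".join(fields))
        if state in finals:
            fields = [str(state)]
            if write_weights:
                fields.append("%f" % finals[state])
            lines.append("\t".join(fields))
    return lines

arcs = [[(1, "foo", "bar", 0.1)], [(2, "baz", "baz", 0.0)], []]
finals = {2: 0.3}
weighted = att_lines(arcs, finals)
unweighted = att_lines(arcs, finals, write_weights=False)
print("\n".join(weighted))
```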

def write_prolog (   self,
  f,
  name,
  write_weights = True 
)

Write the transducer in prolog format to file f.

Name the transducer name.

def write_xfst (   self,
  f,
  write_weights = True 
)

Write the transducer in xfst format to file f.

