HFST - Helsinki Finite-State Transducer Technology - Python API
version 3.12.2

A synchronous finite-state transducer.
Public Member Functions
def __init__ (self)
Create an empty transducer.
def __init__ (self, another)
Create a deep copy of HfstTransducer another or a transducer equivalent to HfstBasicTransducer another.
def __init__ (self, t, type)
Create an HFST transducer equivalent to HfstBasicTransducer t.
def __str__ (self)
An AT&T representation of the transducer.
def compare (self, another)
Whether this transducer and another are equivalent.
def compose (self, another)
Compose this transducer with another.
def compose_intersect (self, v, invert=False)
Compose this transducer with the intersection of transducers in v.
def concatenate (self, another)
Concatenate this transducer with another.
def conjunct (self, another)
Alias for intersect.
def convert (self, type, options='')
Convert the transducer into an equivalent transducer in format type.
def copy (self)
Return a deep copy of the transducer.
def cross_product (self, another)
Make cross product of this transducer with another.
def determinize (self)
Determinize the transducer.
def disjunct (self, another)
Disjunct this transducer with another.
def eliminate_flag (self, symbol)
Eliminate flag diacritic symbol from the transducer.
def eliminate_flags (self, symbols)
Eliminate flag diacritics listed in symbols from the transducer.
def extract_longest_paths (self, kwargs)
Extract longest paths of the transducer.
def extract_paths (self, kwargs)
Extract paths that are recognized by the transducer.
def extract_shortest_paths (self)
Extract shortest paths of the transducer.
def get_alphabet (self)
Get the alphabet of the transducer.
def get_name (self)
Get the name of the transducer.
def get_properties (self)
Get all properties from the transducer.
def get_property (self, property)
Get arbitrary string property property.
def get_type (self)
The implementation type of the transducer.
def has_flag_diacritics (self)
Whether the transducer has flag diacritics in its transitions.
def input_project (self)
Extract the input language of the transducer.
def insert_freely (self, ins)
Freely insert a transition or a transducer into the transducer.
def insert_to_alphabet (self, symbol)
Explicitly insert symbol to the alphabet of the transducer.
def intersect (self, another)
Intersect this transducer with another.
def invert (self)
Swap the input and output symbols of each transition in the transducer.
def is_automaton (self)
Whether each transition in the transducer has equivalent input and output symbols.
def is_cyclic (self)
Whether the transducer is cyclic.
def is_implementation_type_available (type)
Whether HFST is linked to the transducer library needed by implementation type type.
def is_infinitely_ambiguous (self)
Whether the transducer is infinitely ambiguous.
def is_lookup_infinitely_ambiguous (self, tok_input)
Whether lookup of path input will have infinite results.
def lenient_composition (self, another)
Perform a lenient composition on this transducer and another.
def longest_path_size (self, kwargs)
Get length of longest path of the transducer.
def lookup_optimize (self)
Lookup string input.
def minimize (self)
Minimize the transducer.
def minus (self, another)
Alias for subtract.
def n_best (self, n)
Extract n best paths of the transducer.
def number_of_arcs (self)
The number of transitions in the transducer.
def number_of_states (self)
The number of states in the transducer.
def optionalize (self)
Disjunct the transducer with an epsilon transducer.
def output_project (self)
Extract the output language of the transducer.
def priority_union (self, another)
Make priority union of this transducer with another.
def prune (self)
Make transducer coaccessible.
def push_weights_to_end (self)
Push weights towards final state(s).
def push_weights_to_start (self)
Push weights towards initial state.
def remove_epsilons (self)
Remove all epsilon:epsilon transitions from the transducer so that the resulting transducer is equivalent to the original one.
def remove_from_alphabet (self, symbol)
Remove symbol from the alphabet of the transducer.
def remove_optimization (self)
Remove lookup optimization.
def repeat_n (self, n)
A concatenation of n transducers.
def repeat_n_minus (self, n)
A concatenation of N transducers where N is any number from zero to n, inclusive.
def repeat_n_plus (self, n)
A concatenation of N transducers where N is any number from n to infinity, inclusive.
def repeat_n_to_k (self, n, k)
A concatenation of N transducers where N is any number from n to k, inclusive.
def repeat_plus (self)
A concatenation of N transducers where N is any number from one to infinity.
def repeat_star (self)
A concatenation of N transducers where N is any number from zero to infinity.
def reverse (self)
Reverse the transducer.
def set_final_weights (self, weight)
Set the weights of all final states to weight.
def set_name (self, name)
Rename the transducer to name.
def set_property (self, property, value)
Set arbitrary string property property to value.
def shuffle (self, another)
Shuffle this transducer with transducer another.
def substitute (self, s, S=None, kwargs)
Substitute symbols or transitions in the transducer.
def subtract (self, another)
Subtract transducer another from this transducer.
def write (self, ostr)
Write the transducer in binary format to ostr.
def write_att (self, f, write_weights=True)
Write the transducer in AT&T format to file f; write_weights defines whether weights are written.
def write_att (self, ofile, write_weights=True)
Write the transducer in AT&T format to file ofile; write_weights defines whether weights are written.
def write_att (self, filename, write_weights=True)
Write the transducer in AT&T format to file named filename.
def write_prolog (self, f, name, write_weights=True)
Write the transducer in Prolog format with name name to file f; write_weights defines whether weights are written.
A synchronous finite-state transducer.
Transducer functions modify their calling object and return a reference to the calling object after modification, unless otherwise mentioned. Transducer arguments are usually not modified.
    # transducer is reversed
    transducer.reverse()
    # transducer2 is not modified, but a copy of it is disjuncted with
    # transducer1
    transducer1.disjunct(transducer2)
    # a chain of functions is possible
    transducer.reverse().determinize().reverse().determinize()
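The in-place convention described above is what makes chaining possible. The following is a minimal sketch with a hypothetical stand-in class (ToyTransducer is not part of HFST and only stores a list of strings) that mimics the behaviour: each operation mutates the caller and returns it, while the argument is left untouched.

```python
class ToyTransducer:
    """Hypothetical stand-in for illustration only; not an HFST class."""

    def __init__(self, strings):
        self.strings = list(strings)

    def reverse(self):
        # Modify the calling object in place ...
        self.strings = [s[::-1] for s in self.strings]
        # ... and return it, so that calls can be chained.
        return self

    def disjunct(self, another):
        # Only a copy of the argument's contents is used;
        # `another` itself is not modified.
        self.strings.extend(list(another.strings))
        return self


t1 = ToyTransducer(["abc"])
t2 = ToyTransducer(["xy"])
result = t1.reverse().disjunct(t2).reverse()
```

Here result is the same object as t1, and t2 still holds only its original string.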
Currently, an HfstTransducer has three implementation types that are well supported. When an HfstTransducer is created, its type is defined with an argument. For functions that take a transducer as an argument, the type of the calling transducer must be the same as the type of the argument transducer:
    # this will cause a TransducerTypeMismatchException:
    tropical_transducer.disjunct(foma_transducer)
    # this works, but weights are lost in the conversion
    tropical_transducer.convert(hfst.ImplementationType.SFST_TYPE).disjunct(sfst_transducer)
    # this works, information is not lost
    tropical_transducer.disjunct(sfst_transducer.convert(hfst.ImplementationType.TROPICAL_OPENFST_TYPE))
With HfstTransducer constructors it is possible to create empty, epsilon, one-transition and single-path transducers. Transducers can also be created from scratch with hfst.HfstBasicTransducer and converted to an HfstTransducer. More complex transducers can be combined from simple ones with various functions.
def __init__  (  self  ) 
Create an empty transducer.
    tr = hfst.HfstTransducer()
    assert(tr.compare(hfst.empty_fst()))
def __init__ (self, another)
Create a deep copy of HfstTransducer another or a transducer equivalent to HfstBasicTransducer another.
another  An HfstTransducer or HfstBasicTransducer. 
An example:
    tr1 = hfst.regex('foo bar foo')
    tr2 = hfst.HfstTransducer(tr1)
    tr2.substitute('foo','FOO')
    tr1.concatenate(tr2)
def __init__ (self, t, type)
Create an HFST transducer equivalent to HfstBasicTransducer t.
The type of the created transducer is defined by type.
t  An HfstBasicTransducer. 
type  The type of the resulting transducer. If you want to use the default type, you can just call hfst.HfstTransducer(t) 
def __str__  (  self  ) 
An AT&T representation of the transducer.
Defined for print command. An example:
    >>> print(hfst.regex('[foo:bar::2]+'))
    0       1       foo     bar     2.000000
    1       1       foo     bar     2.000000
    1       0.000000
def compare (self, another)
Whether this transducer and another are equivalent.
another  The compared transducer. 
Two transducers are equivalent iff they accept the same input/output string pairs with the same weights and the same alignments.
def compose (self, another)
Compose this transducer with another.
another  The second argument in the composition. Not modified. 
def compose_intersect (self, v, invert=False)
Compose this transducer with the intersection of transducers in v.
If invert is true, then compose the intersection of the transducers in v with this transducer.
The algorithm used by this function is faster than intersecting all transducers one by one and then composing this transducer with the intersection.
v  A tuple of transducers. 
invert  Whether the intersection of the transducers in v is composed with this transducer. 
def concatenate (self, another)
Concatenate this transducer with another.
def conjunct (self, another)
Alias for intersect.
def convert (self, type, options='')
Convert the transducer into an equivalent transducer in format type.
If a weighted transducer is converted into an unweighted one, all weights are lost. In the reverse case, all weights are initialized to the semiring's one.
A transducer of type hfst.ImplementationType.SFST_TYPE, hfst.ImplementationType.TROPICAL_OPENFST_TYPE, hfst.ImplementationType.LOG_OPENFST_TYPE or hfst.ImplementationType.FOMA_TYPE can be converted into an hfst.ImplementationType.HFST_OL_TYPE or hfst.ImplementationType.HFST_OLW_TYPE transducer, but an hfst.ImplementationType.HFST_OL_TYPE or hfst.ImplementationType.HFST_OLW_TYPE transducer cannot be converted to any other type.
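The conversion rule above can be summarised as a small lookup. This is only a sketch of the rule as stated in the text: the type names are plain strings here (in HFST they are members of hfst.ImplementationType), and can_convert is a hypothetical helper, not an HFST function.

```python
# Type names as plain strings, taken from the paragraph above.
CONVERTIBLE = {"SFST_TYPE", "TROPICAL_OPENFST_TYPE", "LOG_OPENFST_TYPE", "FOMA_TYPE"}
OPTIMIZED_LOOKUP = {"HFST_OL_TYPE", "HFST_OLW_TYPE"}


def can_convert(source, target):
    """Return True if the rule stated above allows converting source into target."""
    if source in OPTIMIZED_LOOKUP:
        # HFST_OL_TYPE and HFST_OLW_TYPE cannot be converted to any other type.
        return source == target
    return source in CONVERTIBLE and target in (CONVERTIBLE | OPTIMIZED_LOOKUP)
```

For instance, can_convert("FOMA_TYPE", "HFST_OL_TYPE") holds, while can_convert("HFST_OLW_TYPE", "TROPICAL_OPENFST_TYPE") does not.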
def copy  (  self  ) 
Return a deep copy of the transducer.
    tr = hfst.regex('[foo:bar::0.3]*')
    TR = tr.copy()
    assert(tr.compare(TR))
def cross_product (self, another)
Make cross product of this transducer with another.
It pairs every string of this transducer with every string of another. If the strings are not of the same length, epsilon padding is added at the end of the shorter string.
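The epsilon padding can be pictured with plain strings. The sketch below is a toy model, not HFST code: align_pair is a hypothetical helper, each character is treated as one symbol, and "@0@" is used for epsilon because that is how the AT&T output in this reference prints it.

```python
from itertools import zip_longest

EPSILON = "@0@"  # assumption: reuse the AT&T spelling of epsilon for display


def align_pair(input_string, output_string):
    """Pair the symbols of two strings, padding the shorter one with epsilons at the end."""
    return list(zip_longest(input_string, output_string, fillvalue=EPSILON))


def cross_product(inputs, outputs):
    """Pair every input string with every output string."""
    return {(i, o): align_pair(i, o) for i in inputs for o in outputs}
```

For example, align_pair("ab", "x") gives [("a", "x"), ("b", "@0@")].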
def determinize  (  self  ) 
Determinize the transducer.
Determinizing a transducer yields an equivalent transducer that has no state with two or more transitions whose input:output symbol pairs are the same.
def disjunct (self, another)
Disjunct this transducer with another.
def eliminate_flag (self, symbol)
Eliminate flag diacritic symbol from the transducer.
symbol  The flag to be eliminated. TODO: explain more. 
Returns an equivalent transducer in which the flag symbol no longer occurs.
def eliminate_flags (self, symbols)
Eliminate flag diacritics listed in symbols from the transducer.
symbols  The flags to be eliminated. TODO: explain more. 
Returns an equivalent transducer in which none of the flags listed in symbols occur.
def extract_longest_paths (self, kwargs)
Extract longest paths of the transducer.
def extract_paths (self, kwargs)
Extract paths that are recognized by the transducer.
kwargs  Arguments recognized are filter_flags, max_cycles, max_number, obey_flags, output, random. 
filter_flags  Whether flag diacritics are filtered out from the result (default True). 
max_cycles  Indicates how many times a cycle will be followed, with negative numbers indicating unlimited (default -1, i.e. unlimited). 
max_number  The total number of resulting strings is capped at this value, with 0 or negative indicating unlimited (default -1, i.e. unlimited). 
obey_flags  Whether flag diacritics are validated (default True). 
output  Output format. Values recognized: 'text', 'raw', 'dict' (the default). 'text' returns a string where paths are separated by newlines and each path is represented as input_string + ":" + output_string + "\t" + weight. 'raw' yields a tuple of all paths where each path is a 2-tuple consisting of a weight and a tuple of all transition symbol pairs, each symbol pair being a 2-tuple of an input and an output symbol. 'dict' gives a dictionary that maps each input string into a list of possible outputs, each output being a 2-tuple of an output string and a weight. 
random  Whether result strings are fetched randomly (default False). 
An example:
    >>> tr = hfst.regex('a:b+ (a:c+)')
    >>> print(tr)
    0       1       a       b       0.000000
    1       1       a       b       0.000000
    1       2       a       c       0.000000
    1       0.000000
    2       2       a       c       0.000000
    2       0.000000

    >>> print(tr.extract_paths(max_cycles=1, output='text'))
    a:b     0
    aa:bb   0
    aaa:bbc 0
    aaaa:bbcc       0
    aa:bc   0
    aaa:bcc 0

    >>> print(tr.extract_paths(max_number=4, output='text'))
    a:b     0
    aa:bc   0
    aaa:bcc 0
    aaaa:bccc       0

    >>> print(tr.extract_paths(max_cycles=1, max_number=4, output='text'))
    a:b     0
    aa:bb   0
    aa:bc   0
    aaa:bcc 0
Exceptions: TransducerIsCyclicException
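The relationship between the 'text' and 'dict' output formats can be illustrated without HFST. The sketch below is a hypothetical helper (text_to_dict is not part of the API) that turns a string in the 'text' layout described above into the 'dict' layout. It assumes that no symbol contains a colon or a tab.

```python
def text_to_dict(text):
    """Convert 'input:output<TAB>weight' lines into {input: [(output, weight), ...]}."""
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip empty lines
        pair, weight = line.rsplit("\t", 1)
        input_string, output_string = pair.split(":", 1)
        result.setdefault(input_string, []).append((output_string, float(weight)))
    return result


sample = "a:b\t0\naa:bb\t0\naa:bc\t0\n"
paths = text_to_dict(sample)
```

With the sample above, the input 'aa' maps to two outputs, 'bb' and 'bc', each with weight 0.0.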
def extract_shortest_paths  (  self  ) 
Extract shortest paths of the transducer.
def get_alphabet  (  self  ) 
Get the alphabet of the transducer.
The alphabet is defined as the set of symbols known to the transducer.
def get_name  (  self  ) 
Get the name of the transducer.
def get_properties  (  self  ) 
Get all properties from the transducer.
def get_property (self, property)
Get arbitrary string property property.
property  The name of the property whose value is returned. get_property('name') works like get_name(). 
def get_type  (  self  ) 
The implementation type of the transducer.
def has_flag_diacritics  (  self  ) 
Whether the transducer has flag diacritics in its transitions.
def input_project  (  self  ) 
Extract the input language of the transducer.
All transition symbol pairs isymbol:osymbol are changed to isymbol:isymbol.
def insert_freely (self, ins)
Freely insert a transition or a transducer into the transducer.
ins  The transition or transducer to be inserted. 
If ins is a transition, i.e. a 2-tuple of strings: A transition is added to each state in this transducer. The transition leads from that state to itself with input and output symbols defined by ins. The weight of the transition is zero.
If ins is an hfst.HfstTransducer: A copy of ins is attached with epsilon transitions to each state of this transducer. After the operation, for each state S in this transducer, there is an epsilon transition that leads from state S to the initial state of ins, and for each final state of ins, there is an epsilon transition that leads from that final state to state S in this transducer. The weights of the final states in ins are copied to the epsilon transitions leading to state S.
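The transition case can be sketched on a toy representation. Everything here is an assumption for illustration: transitions are plain tuples (source, target, input, output, weight), and insert_freely_toy is a hypothetical function that, unlike the HFST method, returns a new list instead of modifying a transducer.

```python
def insert_freely_toy(states, transitions, pair):
    """Add a zero-weight self-loop labelled with pair to every state."""
    isymbol, osymbol = pair
    loops = [(state, state, isymbol, osymbol, 0.0) for state in states]
    return transitions + loops


states = [0, 1]
transitions = [(0, 1, "a", "b", 0.0)]
with_loops = insert_freely_toy(states, transitions, ("x", "y"))
```

After the call, both state 0 and state 1 have an x:y loop with weight zero, and the original a:b transition is unchanged.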
def insert_to_alphabet (self, symbol)
Explicitly insert symbol to the alphabet of the transducer.
symbol  The symbol (string) to be inserted. 
def intersect (self, another)
Intersect this transducer with another.
def invert  (  self  ) 
Swap the input and output symbols of each transition in the transducer.
def is_automaton  (  self  ) 
Whether each transition in the transducer has equivalent input and output symbols.
def is_cyclic  (  self  ) 
Whether the transducer is cyclic.
def is_implementation_type_available  (  type  ) 
Whether HFST is linked to the transducer library needed by implementation type type.
def is_infinitely_ambiguous  (  self  ) 
Whether the transducer is infinitely ambiguous.
A transducer is infinitely ambiguous if there exists an input that will yield infinitely many results, i.e. there are input epsilon loops that are traversed with that input.
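The input epsilon loops mentioned above can be searched for with an ordinary cycle detection over the transitions whose input symbol is epsilon. This is a simplified sketch, not the HFST algorithm: transitions are tuples (source, target, input, output), epsilon is written "@0@", and the function does not check whether the loop is reachable or leads to a final state.

```python
EPSILON = "@0@"  # assumption: AT&T spelling of epsilon


def has_input_epsilon_cycle(transitions):
    """Return True if the transitions with epsilon input form a cycle."""
    graph = {}
    for source, target, isymbol, _osymbol in transitions:
        if isymbol == EPSILON:
            graph.setdefault(source, []).append(target)

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {}

    def visit(state):
        colour[state] = GREY
        for nxt in graph.get(state, []):
            seen = colour.get(nxt, WHITE)
            if seen == GREY:
                return True  # back edge: an epsilon-input loop
            if seen == WHITE and visit(nxt):
                return True
        colour[state] = BLACK
        return False

    return any(colour.get(s, WHITE) == WHITE and visit(s) for s in list(graph))
```

A transducer with a transition from state 1 to itself reading epsilon and writing 'c' would be flagged, since that loop can be followed any number of times without consuming input.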
def is_lookup_infinitely_ambiguous (self, tok_input)
Whether lookup of path input will have infinite results.
Currently, this function will return whether the transducer is infinitely ambiguous on any lookup path found in the transducer, i.e. the argument input is ignored.
def lenient_composition (self, another)
Perform a lenient composition on this transducer and another.
TODO: explain more.
def longest_path_size (self, kwargs)
Get length of longest path of the transducer.
def lookup_optimize  (  self  ) 
Lookup string input.
input  The input. A string or a pretokenized tuple of symbols (i.e. a tuple of strings). 
kwargs  Possible parameters and their default values are: obey_flags=True, max_number=-1, time_cutoff=0.0, output='tuple' 
obey_flags  Whether flag diacritics are obeyed. Always True for HFST_OL(W)_TYPE transducers. 
max_number  Maximum number of results returned, defaults to -1, i.e. infinity. 
time_cutoff  How long the function can search for results before returning, expressed in seconds. Defaults to 0.0, i.e. infinitely. Always 0.0 for transducers that are not of HFST_OL(W)_TYPE. 
output  Possible values are 'tuple', 'text' and 'raw', 'tuple' being the default. 
def minimize  (  self  ) 
Minimize the transducer.
Minimizing a transducer yields an equivalent transducer with the smallest number of states.
def minus (self, another)
Alias for subtract.
def n_best (self, n)
Extract n best paths of the transducer.
In the case of a weighted transducer (hfst.ImplementationType.TROPICAL_OPENFST_TYPE or hfst.ImplementationType.LOG_OPENFST_TYPE), best paths are defined as paths with the lowest weight. In the case of an unweighted transducer (hfst.ImplementationType.SFST_TYPE or hfst.ImplementationType.FOMA_TYPE), the function returns random paths.
This function is not implemented for hfst.ImplementationType.FOMA_TYPE or hfst.ImplementationType.SFST_TYPE. If this function is called by an HfstTransducer of type hfst.ImplementationType.FOMA_TYPE or hfst.ImplementationType.SFST_TYPE, it is converted to hfst.ImplementationType.TROPICAL_OPENFST_TYPE, paths are extracted and it is converted back to hfst.ImplementationType.FOMA_TYPE or hfst.ImplementationType.SFST_TYPE. If HFST is not linked to the OpenFst library, an hfst.exceptions.ImplementationTypeNotAvailableException is thrown.
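For the weighted case, "best" means lowest weight. The sketch below shows that selection on a plain list of (path, weight) pairs. It is a toy model with a hypothetical helper name (n_best_paths), whereas the HFST method operates on the transducer itself.

```python
import heapq


def n_best_paths(paths, n):
    """Return the n paths with the lowest weights, best first."""
    return heapq.nsmallest(n, paths, key=lambda path: path[1])


paths = [("a:b", 0.5), ("c:d", 0.1), ("e:f", 0.3)]
best_two = n_best_paths(paths, 2)
```

Here the paths with weights 0.1 and 0.3 are kept and the path with weight 0.5 is dropped.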
def number_of_arcs  (  self  ) 
The number of transitions in the transducer.
def number_of_states  (  self  ) 
The number of states in the transducer.
def optionalize  (  self  ) 
Disjunct the transducer with an epsilon transducer.
def output_project  (  self  ) 
Extract the output language of the transducer.
All transition symbol pairs isymbol:osymbol are changed to osymbol:osymbol.
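Both projections (input_project and output_project) rewrite each transition's symbol pair as described. A minimal sketch on a list of (isymbol, osymbol) pairs follows. The function names are hypothetical and the functions return new lists, whereas the HFST methods modify the calling transducer.

```python
def input_project_pairs(pairs):
    """Change every isymbol:osymbol pair to isymbol:isymbol."""
    return [(isymbol, isymbol) for isymbol, _osymbol in pairs]


def output_project_pairs(pairs):
    """Change every isymbol:osymbol pair to osymbol:osymbol."""
    return [(osymbol, osymbol) for _isymbol, osymbol in pairs]


pairs = [("a", "b"), ("c", "d")]
```

With these pairs, the input projection keeps 'a' and 'c' on both sides, and the output projection keeps 'b' and 'd'.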
def priority_union (self, another)
Make priority union of this transducer with another.
For the operation t1.priority_union(t2), the result is a union of t1 and t2, except that whenever t1 and t2 have the same string on the left side, the path in t2 overrides the path in t1.
Example
    Transducer 1 (t1):
    a : a
    b : b

    Transducer 2 (t2):
    b : B
    c : C

    Result ( t1.priority_union(t2) ):
    a : a
    b : B
    c : C
For more information, read fsmbook.
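The override behaviour in the example resembles updating one dictionary with another. The sketch below models each transducer as a mapping from input string to output string, which is a simplification (a real transducer can relate one input to several outputs), and priority_union_toy is a hypothetical helper.

```python
def priority_union_toy(t1, t2):
    """Union of two mappings where entries of t2 override entries of t1."""
    result = dict(t1)   # start from a copy of t1
    result.update(t2)   # same left side: the entry from t2 wins
    return result


t1 = {"a": "a", "b": "b"}
t2 = {"b": "B", "c": "C"}
union = priority_union_toy(t1, t2)
```

The result maps a to a, b to B and c to C, matching the example above.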
def prune  (  self  ) 
Make transducer coaccessible.
A transducer is coaccessible iff there is a path from every state to a final state.
def push_weights_to_end  (  self  ) 
Push weights towards final state(s).
If the HfstTransducer is of unweighted type (hfst.ImplementationType.SFST_TYPE or hfst.ImplementationType.FOMA_TYPE), nothing is done.
An example:
    >>> import hfst
    >>> tr = hfst.regex('[a::1 a:b::0.3 (b::0)]::0.7;')
    >>> tr.push_weights_to_end()
    >>> print(tr)
    0       1       a       a       0.000000
    1       2       a       b       0.000000
    2       3       b       b       0.000000
    2       2.000000
    3       2.000000
def push_weights_to_start  (  self  ) 
Push weights towards initial state.
If the HfstTransducer is of unweighted type (hfst.ImplementationType.SFST_TYPE or hfst.ImplementationType.FOMA_TYPE), nothing is done.
An example:
    >>> import hfst
    >>> tr = hfst.regex('[a::1 a:b::0.3 (b::0)]::0.7;')
    >>> tr.push_weights_to_start()
    >>> print(tr)
    0       1       a       a       2.000000
    1       2       a       b       0.000000
    2       3       b       b       0.000000
    2       0.000000
    3       0.000000
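Pushing weights moves them along the paths without changing the total weight of any path. The arithmetic behind the two examples (push_weights_to_end and push_weights_to_start) can be checked as follows, assuming tropical weights (which add along a path) and assuming that the regular expression places 1 and 0.3 on the two transitions and 0.7 on the final state. The helper path_weight is hypothetical.

```python
import math


def path_weight(transition_weights, final_weight):
    """Total weight of a path in the tropical semiring: the sum of its weights."""
    return sum(transition_weights) + final_weight


# Path a, a:b of the example, under the assumed original weight layout.
original = path_weight([1.0, 0.3], 0.7)
# After push_weights_to_end: all weight sits on the final state.
pushed_to_end = path_weight([0.0, 0.0], 2.0)
# After push_weights_to_start: all weight sits on the first transition.
pushed_to_start = path_weight([2.0, 0.0], 0.0)
```

All three values agree at 2.0, up to floating-point rounding.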
def remove_epsilons  (  self  ) 
Remove all epsilon:epsilon transitions from the transducer so that the resulting transducer is equivalent to the original one.
def remove_from_alphabet (self, symbol)
Remove symbol from the alphabet of the transducer.
symbol  The symbol (string) to be removed. 
def remove_optimization  (  self  ) 
Remove lookup optimization.
This effectively converts the transducer (back) into the default FST type.
def repeat_n (self, n)
A concatenation of n transducers.
def repeat_n_minus (self, n)
A concatenation of N transducers where N is any number from zero to n, inclusive.
def repeat_n_plus (self, n)
A concatenation of N transducers where N is any number from n to infinity, inclusive.
def repeat_n_to_k (self, n, k)
A concatenation of N transducers where N is any number from n to k, inclusive.
def repeat_plus  (  self  ) 
A concatenation of N transducers where N is any number from one to infinity.
def repeat_star  (  self  ) 
A concatenation of N transducers where N is any number from zero to infinity.
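The bounded repeat functions can be modelled on a finite set of strings. The sketch below is a toy model with hypothetical helper names: a "language" is a Python set of strings, and only the bounded variants are covered, since repeat_star, repeat_plus and repeat_n_plus describe infinite sets.

```python
def concatenate(lang1, lang2):
    """All strings formed by a string of lang1 followed by a string of lang2."""
    return {a + b for a in lang1 for b in lang2}


def repeat_n(lang, n):
    """Concatenation of n copies of lang; zero copies give the empty string."""
    result = {""}
    for _ in range(n):
        result = concatenate(result, lang)
    return result


def repeat_n_to_k(lang, n, k):
    """Union of repeat_n(lang, N) for every N from n to k, inclusive."""
    result = set()
    for count in range(n, k + 1):
        result |= repeat_n(lang, count)
    return result


def repeat_n_minus(lang, n):
    """From zero to n copies, inclusive."""
    return repeat_n_to_k(lang, 0, n)
```

For the one-string language {"ab"}, repeat_n_to_k with n=1 and k=2 gives {"ab", "abab"}, and repeat_n_minus with n=1 gives {"", "ab"}.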
def reverse  (  self  ) 
Reverse the transducer.
A reversed transducer accepts the string 'n(0) n(1) ... n(N)' iff the original transducer accepts the string 'n(N) n(N-1) ... n(0)'.
def set_final_weights (self, weight)
Set the weights of all final states to weight.
If the HfstTransducer is of unweighted type (hfst.ImplementationType.SFST_TYPE or hfst.ImplementationType.FOMA_TYPE), nothing is done.
def set_name (self, name)
Rename the transducer to name.
def set_property (self, property, value)
Set arbitrary string property property to value.
property  A string naming the property. 
value  A string expressing the value of property. 
set_property('name', 'name of the transducer') equals set_name('name of the transducer').
def shuffle (self, another)
Shuffle this transducer with transducer another.
If transducer A accepts string 'foo' and transducer B string 'bar', the transducer that results from shuffling A and B accepts all strings [(f|b)(o|a)(o|r)].
def substitute (self, s, S=None, kwargs)
Substitute symbols or transitions in the transducer.
s  The symbol or transition to be substituted. Can also be a dictionary of substitutions, if S == None. 
S  The symbol, transition, a tuple of transitions or a transducer (hfst.HfstTransducer) that substitutes s. 
kwargs  Arguments recognized are 'input' and 'output', their values can be False or True, True being the default. These arguments are valid only if s and S are strings, else they are ignored. 
input  Whether substitution is performed on input side, defaults to True. Valid only if s and S are strings. 
output  Whether substitution is performed on output side, defaults to True. Valid only if s and S are strings. 
For more information, see hfst.HfstBasicTransducer.substitute. The function works similarly, with the exception of argument S, which must be hfst.HfstTransducer instead of hfst.HfstBasicTransducer.
def subtract (self, another)
Subtract transducer another from this transducer.
def write (self, ostr)
Write the transducer in binary format to ostr.
ostr  An hfst.HfstOutputStream where the transducer is written. 
def write_att (self, f, write_weights=True)
Write the transducer in AT&T format to file f; write_weights defines whether weights are written.
f  A Python file where the transducer is written. 
write_weights  Whether weights are written. 
def write_att (self, ofile, write_weights=True)
Write the transducer in AT&T format to file ofile; write_weights defines whether weights are written.
The fields in the resulting AT&T format are separated by tabulator characters.
NOTE: If the transition symbols contain space characters, the spaces are printed as '@_SPACE_@' because whitespace characters are used as field separators in AT&T format. Epsilon symbols are printed as '@0@'.
If several transducers are written in the same file, they must be separated by a line of two consecutive hyphens "--", so that they will be read correctly by hfst.read_att.
An example:
    tr1 = hfst.regex('[foo:bar baz:0 " "]::0.3')
    tr2 = hfst.empty_fst()
    tr3 = hfst.epsilon_fst(0.5)
    tr4 = hfst.regex('[foo]')
    tr5 = hfst.empty_fst()

    f = hfst.hfst_open('testfile.att', 'w')
    for tr in [tr1, tr2, tr3, tr4]:
        tr.write_att(f)
        # separate consecutive transducers with a line of two hyphens
        f.write('--\n')
    tr5.write_att(f)
    f.close()
This will yield a file 'testfile.att' that looks as follows:
    0       1       foo     bar     0.299805
    1       2       baz     @0@     0.000000
    2       3       @_SPACE_@       @_SPACE_@       0.000000
    3       0.000000
    --
    --
    0       0.500000
    --
    0       1       foo     foo     0.000000
    1       0.000000
    --
Exceptions: StreamCannotBeWrittenException, StreamIsClosedException
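The file layout described above (tab-separated fields, a "--" line between transducers) can be taken apart with a few lines of plain Python. This sketch only illustrates the format: split_att is a hypothetical helper and not a replacement for hfst.read_att, and it returns raw field lists without building transducers.

```python
def split_att(text):
    """Split AT&T text into one block per transducer; each block is a list of field lists."""
    blocks, current = [], []
    for line in text.splitlines():
        if line.strip() == "--":
            # separator line: close the current transducer
            blocks.append(current)
            current = []
        elif line.strip():
            current.append(line.split("\t"))
    blocks.append(current)  # the last transducer has no trailing separator
    return blocks


sample = (
    "0\t1\tfoo\tbar\t0.299805\n"
    "1\t0.000000\n"
    "--\n"
    "--\n"
    "0\t0.500000\n"
)
blocks = split_att(sample)
```

In the sample, the empty block between the two separator lines corresponds to an empty transducer, which has no transitions and no final states to print.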
def write_att (self, filename, write_weights=True)
Write the transducer in AT&T format to file named filename.
write_weights defines whether weights are written.
If the file exists, it is overwritten. If the file does not exist, it is created.
def write_prolog (self, f, name, write_weights=True)
Write the transducer in Prolog format with name name to file f; write_weights defines whether weights are written.
f  A Python file where the transducer is written. 
name  The name of the transducer that must be given in a Prolog file. 
write_weights  Whether weights are written. 