HFST - Helsinki Finite-State Transducer Technology - Python API
version 3.12.3 (under development)

After installing HFST on your computer, start Python and execute 'import hfst'.
For example, the following simple program
 import hfst
 tr1 = hfst.regex('foo:bar')
 tr2 = hfst.regex('bar:baz')
 tr1.compose(tr2)
 print(tr1)
should print the following text to standard output when run:
0 1 foo baz 0
1 0
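The same result can also be examined as Python data instead of the AT&T-style text above. The following is a minimal sketch that uses extract_paths with the 'dict' output format (the same format used in the longer example further below) to get a dictionary mapping each input string to its output strings and weights:
 import hfst
 tr1 = hfst.regex('foo:bar')
 tr2 = hfst.regex('bar:baz')
 tr1.compose(tr2)
 # extract_paths(output='dict') returns {input: [(output, weight), ...]}.
 for input_str, outputs in tr1.extract_paths(output='dict').items():
     for output, weight in outputs:
         print('%s:%s %f' % (input_str, output, weight))
This should print a single pair, foo mapped to baz with weight zero.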
The HFST API is located in the package 'hfst', which includes classes such as HfstTransducer, HfstBasicTransducer and HfstTokenizer.
There are also functions in the package 'hfst' that are not part of any class, for example hfst.fst and hfst.regex (see the sketch below).
There are also submodules, such as hfst.exceptions.
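As a quick illustration of the module-level functions, the sketch below builds a transducer for the word 'cat' in two ways: hfst.fst constructs it directly from the string, and hfst.regex compiles it from a regular expression in which each character is written as a separate symbol. The claim that the two results are equivalent, checked here with compare, is an assumption that should be verified against the function documentation:
 import hfst
 # Build a transducer that accepts the word 'cat', one character per symbol.
 t1 = hfst.fst('cat')
 # Compile the same language from a regular expression.
 t2 = hfst.regex('c a t')
 # compare tests whether two transducers are equivalent.
 print(t1.compare(t2))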
An example of creating a simple transducer from scratch, converting between transducer formats, testing transducer properties and handling exceptions:
 import hfst
 # Create as HFST basic transducer [a:b] with transition weight 0.3 and final weight 0.5.
 t = hfst.HfstBasicTransducer()
 t.add_state(1)
 t.add_transition(0, 1, 'a', 'b', 0.3)
 t.set_final_weight(1, 0.5)
 # Convert to tropical OpenFst format (the default) and push weights toward final state.
 T = hfst.HfstTransducer(t)
 T.push_weights_to_end()
 # Convert back to HFST basic transducer.
 tc = hfst.HfstBasicTransducer(T)
 try:
     # Rounding might affect the precision.
     if (0.79 < tc.get_final_weight(1)) and (tc.get_final_weight(1) < 0.81):
         print("TEST PASSED")
         exit(0)
     else:
         print("TEST FAILED")
         exit(1)
 # If the state does not exist or is not final
 except hfst.exceptions.HfstException as e:
     print("TEST FAILED: An exception was thrown.")
     exit(1)
An example of creating transducers from strings, applying rules to them and printing the string pairs recognized by the resulting transducer:
 import hfst
 hfst.set_default_fst_type(hfst.ImplementationType.FOMA_TYPE) # use the foma implementation, as no weights are involved
 # Create a simple lexicon transducer [[foo bar foo] | [foo bar baz]].
 tok = hfst.HfstTokenizer()
 tok.add_multichar_symbol('foo')
 tok.add_multichar_symbol('bar')
 tok.add_multichar_symbol('baz')
 words = hfst.tokenized_fst(tok.tokenize('foobarfoo'))
 t = hfst.tokenized_fst(tok.tokenize('foobarbaz'))
 words.disjunct(t)
 # Create a rule transducer that optionally replaces 'bar' with 'baz' between 'foo' and 'foo'.
 rule = hfst.regex('bar (->) baz || foo _ foo')
 # Apply the rule transducer to the lexicon.
 words.compose(rule)
 words.minimize()
 # Extract all string pairs from the result and print them to standard output.
 results = 0
 try:
     # Extract paths and remove tokenization
     results = words.extract_paths(output='dict')
 except hfst.exceptions.TransducerIsCyclicException as e:
     # This should not happen because the transducer is not cyclic.
     print("TEST FAILED")
     exit(1)
 for input, outputs in results.items():
     print('%s:' % input)
     for output in outputs:
         print('  %s\t%f' % (output[0], output[1]))
The output:
foobarfoo:
  foobarfoo    0.000000
  foobazfoo    0.000000
foobarbaz:
  foobarbaz    0.000000
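A result like the ones above can also be inspected state by state by converting it back to an HfstBasicTransducer. The following minimal sketch prints each transition and each final state of the foo:bar transducer; it assumes the HfstBasicTransducer methods states(), transitions() and is_final_state() and the HfstBasicTransition accessors get_target_state(), get_input_symbol(), get_output_symbol() and get_weight(), so check the class documentation for the exact interface:
 import hfst
 tr = hfst.regex('foo:bar')
 fsm = hfst.HfstBasicTransducer(tr)
 # Print the transducer in an AT&T-like format, one transition per line.
 for state in fsm.states():
     for arc in fsm.transitions(state):
         print('%i %i %s %s %f' % (state, arc.get_target_state(),
                                   arc.get_input_symbol(),
                                   arc.get_output_symbol(),
                                   arc.get_weight()))
     if fsm.is_final_state(state):
         print('%i %f' % (state, fsm.get_final_weight(state)))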
 