HFST - Helsinki Finite-State Transducer Technology - C++ API  version 3.9.1
HfstTransducer Class Reference

A synchronous finite-state transducer.

#include <HfstTransducer.h>

Public Member Functions

HFSTDLL bool compare (const HfstTransducer &another, bool harmonize=true) const
 Whether this transducer and another are equivalent.

HFSTDLL HfstTransducer & compose (const HfstTransducer &another, bool harmonize=true)
 Compose this transducer with another.

HFSTDLL HfstTransducer & compose_intersect (const HfstTransducerVector &v, bool invert=false, bool harmonize=true)
 Compose this transducer with the intersection of transducers in v. If invert is true, then compose the intersection of the transducers in v with this transducer.

HFSTDLL HfstTransducer & concatenate (const HfstTransducer &another, bool harmonize=true)
 Concatenate this transducer with another.

HFSTDLL HfstTransducer & convert (ImplementationType type, std::string options="")
 Convert the transducer into an equivalent transducer in format type.

HFSTDLL HfstTransducer & cross_product (const HfstTransducer &another, bool harmonize=true)
 Make cross product of this transducer with another. It pairs every string of this with every string of another.

HFSTDLL HfstTransducer & determinize ()
 Determinize the transducer.

HFSTDLL HfstTransducer & disjunct (const HfstTransducer &another, bool harmonize=true)
 Disjunct this transducer with another.

HFSTDLL void extract_paths (HfstTwoLevelPaths &results, int max_num=-1, int cycles=-1) const
 Extract a maximum of max_num paths that are recognized by the transducer following a maximum of cycles cycles and store the paths into results.

HFSTDLL void extract_paths_fd (HfstTwoLevelPaths &results, int max_num=-1, int cycles=-1, bool filter_fd=true) const
 Extract a maximum of max_num paths that are recognized by the transducer and are not invalidated by flag diacritic rules following a maximum of cycles cycles and store the paths into results. filter_fd defines whether the flag diacritics themselves are filtered out of the result strings.

HFSTDLL StringSet get_alphabet () const
 Get the alphabet of the transducer.

HFSTDLL StringSet get_first_input_symbols () const
 Get first input level symbols of strings recognized (or rejected, if they end in a non-final state) by the transducer.

HFSTDLL std::string get_name () const
 Get the name of the transducer.

HFSTDLL const std::map< std::string, std::string > & get_properties () const
 Get all properties from the transducer.

HFSTDLL std::string get_property (const std::string &property) const
 Get arbitrary string property property. get_property("name") works like get_name.

HFSTDLL ImplementationType get_type (void) const
 The implementation type of the transducer.

HFSTDLL void harmonize (HfstTransducer &another)
 Harmonize transducers this and another.
 
HFSTDLL HfstTransducer ()
 Create an uninitialized transducer (use with care).

HFSTDLL HfstTransducer (ImplementationType type)
 Create an empty transducer, i.e. a transducer that does not recognize any string. The type of the transducer is defined by type.

HFSTDLL HfstTransducer (const std::string &utf8_str, const HfstTokenizer &multichar_symbol_tokenizer, ImplementationType type)
 Create a transducer by tokenizing the UTF-8 string utf8_str with tokenizer multichar_symbol_tokenizer. The type of the transducer is defined by type.

HFSTDLL HfstTransducer (const std::string &input_utf8_str, const std::string &output_utf8_str, const HfstTokenizer &multichar_symbol_tokenizer, ImplementationType type)
 Create a transducer by tokenizing the UTF-8 input string input_utf8_str and output string output_utf8_str with tokenizer multichar_symbol_tokenizer. The type of the transducer is defined by type.

HFSTDLL HfstTransducer (HfstInputStream &in)
 Read a binary transducer from transducer stream in.

HFSTDLL HfstTransducer (const HfstTransducer &another)
 Create a deep copy of transducer another.

HFSTDLL HfstTransducer (const hfst::implementations::HfstBasicTransducer &t, ImplementationType type)
 Create an HFST transducer equivalent to HFST basic transducer t. The type of the created transducer is defined by type.

HFSTDLL HfstTransducer (const std::string &symbol, ImplementationType type)
 Create a transducer that recognizes the string pair <"symbol","symbol">, i.e. [symbol:symbol]. The type of the transducer is defined by type.

HFSTDLL HfstTransducer (const std::string &isymbol, const std::string &osymbol, ImplementationType type)
 Create a transducer that recognizes the string pair <"isymbol","osymbol">, i.e. [isymbol:osymbol]. The type of the transducer is defined by type.

HFSTDLL HfstTransducer (FILE *ifile, ImplementationType type, const std::string &epsilon_symbol, unsigned int &linecount)
 Create a transducer of type type as defined in AT&T format in FILE ifile. epsilon_symbol defines how epsilons are represented.
 
HFSTDLL HfstTransducer & input_project ()
 Extract the input language of the transducer.

HFSTDLL HfstTransducer & insert_freely (const StringPair &symbol_pair, bool harmonize=true)
 Freely insert symbol pair symbol_pair into the transducer.

HFSTDLL HfstTransducer & insert_freely (const HfstTransducer &tr, bool harmonize=true)
 Freely insert a copy of tr into the transducer.

HFSTDLL void insert_to_alphabet (const std::string &symbol)
 Explicitly insert symbol to the alphabet of the transducer.

HFSTDLL HfstTransducer & intersect (const HfstTransducer &another, bool harmonize=true)
 Intersect this transducer with another.

HFSTDLL HfstTransducer & invert ()
 Swap the input and output symbols of each transition in the transducer.

HFSTDLL bool is_automaton (void) const
 Whether the transducer is an automaton.

HFSTDLL bool is_cyclic (void) const
 Whether the transducer is cyclic.

HFSTDLL bool is_lookdown_infinitely_ambiguous (const StringVector &s) const
 (Not implemented) Whether lookdown of path s will have infinite results.

HFSTDLL bool is_lookup_infinitely_ambiguous (const StringVector &s) const
 Whether lookup of path s will have infinite results.

HFSTDLL HfstTransducer & lenient_composition (const HfstTransducer &another, bool harmonize=true)
 Make lenient composition of this transducer with another. A .O. B = [ A .o. B ] .P. A.

HFSTDLL HfstOneLevelPaths * lookdown (const StringVector &s, ssize_t limit=-1) const
 (Not implemented) Lookdown a single string s and return a maximum of limit results.

HFSTDLL HfstOneLevelPaths * lookdown_fd (StringVector &s, ssize_t limit=-1) const
 (Not implemented) Lookdown a single string minding flag diacritics properly.

HFSTDLL HfstOneLevelPaths * lookup (const StringVector &s, ssize_t limit=-1, double time_cutoff=0.0) const
 Lookup or apply a single tokenized string s and return a maximum of limit results.

HFSTDLL HfstOneLevelPaths * lookup (const std::string &s, ssize_t limit=-1, double time_cutoff=0.0) const
 Lookup or apply a single string s and return a maximum of limit results.

HFSTDLL HfstOneLevelPaths * lookup (const HfstTokenizer &tok, const std::string &s, ssize_t limit=-1, double time_cutoff=0.0) const
 Lookup or apply a single string s and return a maximum of limit results. tok defines how s is tokenized.

HFSTDLL HfstOneLevelPaths * lookup_fd (const StringVector &s, ssize_t limit=-1, double time_cutoff=0.0) const
 Lookup or apply a single tokenized string s minding flag diacritics properly and return a maximum of limit results.

HFSTDLL HfstOneLevelPaths * lookup_fd (const std::string &s, ssize_t limit=-1, double time_cutoff=0.0) const
 Lookup or apply a single string s minding flag diacritics properly and return a maximum of limit results.

HFSTDLL HfstOneLevelPaths * lookup_fd (const HfstTokenizer &tok, const std::string &s, ssize_t limit=-1, double time_cutoff=0.0) const
 Lookup or apply a single string s minding flag diacritics properly and return a maximum of limit results. tok defines how s is tokenized.
 
HFSTDLL HfstTransducer & minimize ()
 Minimize the transducer.

HFSTDLL HfstTransducer & n_best (unsigned int n)
 Extract n best paths of the transducer.

HFSTDLL HfstTransducer & operator= (const HfstTransducer &another)
 Assign this transducer a new value equivalent to transducer another.

HFSTDLL HfstTransducer & optionalize ()
 Disjunct the transducer with an epsilon transducer.

HFSTDLL HfstTransducer & output_project ()
 Extract the output language of the transducer.

HFSTDLL HfstTransducer & priority_union (const HfstTransducer &another)
 Make priority union of this transducer with another.

HFSTDLL HfstTransducer & prune ()
 Make transducer coaccessible.

HFSTDLL HfstTransducer & prune_alphabet (bool force=true)
 Remove all symbols that do not occur in transitions of the transducer from its alphabet.

HFSTDLL HfstTransducer & push_weights (PushType type)
 Push weights towards initial or final state(s) as defined by type.

HFSTDLL HfstTransducer & remove_epsilons ()
 Remove all epsilon:epsilon transitions from the transducer so that the transducer remains equivalent.

HFSTDLL void remove_from_alphabet (const std::string &symbol)
 Remove symbol from the alphabet of the transducer. CURRENTLY NOT IMPLEMENTED.

HFSTDLL HfstTransducer & repeat_n (unsigned int n)
 A concatenation of n transducers.

HFSTDLL HfstTransducer & repeat_n_minus (unsigned int n)
 A concatenation of N transducers where N is any number from zero to n, inclusive.

HFSTDLL HfstTransducer & repeat_n_plus (unsigned int n)
 A concatenation of N transducers where N is any number from n to infinity, inclusive.

HFSTDLL HfstTransducer & repeat_n_to_k (unsigned int n, unsigned int k)
 A concatenation of N transducers where N is any number from n to k, inclusive.

HFSTDLL HfstTransducer & repeat_plus ()
 A concatenation of N transducers where N is any number from one to infinity.

HFSTDLL HfstTransducer & repeat_star ()
 A concatenation of N transducers where N is any number from zero to infinity.

HFSTDLL HfstTransducer & reverse ()
 Reverse the transducer.
 
HFSTDLL HfstTransducer & set_final_weights (float weight, bool increment=false)
 Set the weights of all final states to weight. increment defines whether the old weight is incremented by weight or overwritten.

HFSTDLL void set_name (const std::string &name)
 Rename the transducer name.

HFSTDLL void set_property (const std::string &property, const std::string &value)
 Set arbitrary string property property to value. set_property("name") equals set_name(string&).

HFSTDLL HfstTransducer & substitute (bool(*func)(const StringPair &sp, StringPairSet &sps))
 Substitute each transition sp with transitions sps as defined by function func.

HFSTDLL HfstTransducer & substitute (const std::string &old_symbol, const std::string &new_symbol, bool input_side=true, bool output_side=true)
 Substitute all transition symbols equal to old_symbol with symbol new_symbol. input_side and output_side define whether the substitution is made on input and output sides.

HFSTDLL HfstTransducer & substitute (const StringPair &old_symbol_pair, const StringPair &new_symbol_pair)
 Substitute all transition symbol pairs equal to old_symbol_pair with new_symbol_pair.

HFSTDLL HfstTransducer & substitute (const StringPair &old_symbol_pair, const StringPairSet &new_symbol_pair_set)
 Substitute all transitions equal to old_symbol_pair with a set of transitions equal to new_symbol_pair_set.

HFSTDLL HfstTransducer & substitute (const HfstSymbolSubstitutions &substitutions)
 Substitute all transition symbols as defined in substitutions.

HFSTDLL HfstTransducer & substitute (const HfstSymbolPairSubstitutions &substitutions)
 Substitute all transition symbol pairs as defined in substitutions.

HFSTDLL HfstTransducer & substitute (const StringPair &symbol_pair, HfstTransducer &transducer, bool harmonize=true)
 Substitute all transitions equal to symbol_pair with a copy of transducer transducer.

HFSTDLL HfstTransducer & subtract (const HfstTransducer &another, bool harmonize=true)
 Subtract transducer another from this transducer.

HFSTDLL HfstTransducer & transform_weights (float(*func)(float))
 Transform all transition and state weights as defined in func.

HFSTDLL void write_in_att_format (FILE *ofile, bool write_weights=true) const
 Write the transducer in AT&T format to FILE ofile. write_weights defines whether weights are written.

HFSTDLL void write_in_att_format (const std::string &filename, bool write_weights=true) const
 Write the transducer in AT&T format to the file named filename. write_weights defines whether weights are written.

virtual HFSTDLL ~HfstTransducer (void)
 Destructor.
 

Static Public Member Functions

static HFSTDLL HfstTransducer identity_pair (ImplementationType type)
 Create identity pair transducer of type.

static HFSTDLL HfstTransducer * read_lexc_ptr (const std::string &filename, ImplementationType type, bool verbose)
 Compile a lexc file in file filename into an HfstTransducer of type type and return the transducer.

static HFSTDLL HfstTransducer universal_pair (ImplementationType type)
 Create universal pair transducer of type.
 

Protected Member Functions

HfstTransducer & apply (SFST::Transducer *(*sfst_funct)(SFST::Transducer *), fst::StdVectorFst *(*tropical_ofst_funct)(fst::StdVectorFst *), fsm *(*foma_funct)(fsm *), bool dummy)
 declarations for HFST functions that take two or more parameters
 

Friends

HFSTDLL friend std::ostream & operator<< (std::ostream &out, const HfstTransducer &t)
 Write transducer t in AT&T format to ostream out.
 

Detailed Description

A synchronous finite-state transducer.

Argument handling

Transducer functions modify their calling object and return 
a reference to the calling object after modification, 
unless otherwise mentioned.
Transducer arguments are usually not modified.
    // transducer is reversed
    transducer.reverse();
    // transducer2 is not modified, but a copy of it is disjuncted with
    // transducer1 
    transducer1.disjunct(transducer2);                                       
    // a chain of functions is possible
    transducer.reverse().determinize().reverse().determinize();      
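
The chaining shown above relies on each operation modifying its calling object and returning a reference to it. As a minimal standalone sketch of that convention, the hypothetical ToyTransducer class below (not part of HFST; it only holds a list of symbols) follows the same pattern, leaving its argument untouched:

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a transducer: just a sequence of symbols.
// It exists only to illustrate the "modify and return *this" convention.
class ToyTransducer {
public:
    explicit ToyTransducer(std::vector<std::string> symbols)
        : symbols_(std::move(symbols)) {}

    // Modifies the calling object and returns a reference to it.
    ToyTransducer &reverse() {
        std::reverse(symbols_.begin(), symbols_.end());
        return *this;
    }

    // The argument is taken by const reference, so it is not modified;
    // only a copy of its contents is appended to the calling object.
    ToyTransducer &concatenate(const ToyTransducer &another) {
        symbols_.insert(symbols_.end(),
                        another.symbols_.begin(), another.symbols_.end());
        return *this;
    }

    const std::vector<std::string> &symbols() const { return symbols_; }

private:
    std::vector<std::string> symbols_;
};
```

With this class, a call such as a.reverse().concatenate(b) changes a twice and leaves b as it was.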

Implementation types

Currently, an HfstTransducer has four implementation types as 
defined by the enumeration ImplementationType.
When an HfstTransducer is created, its type is defined with an 
ImplementationType argument.
For functions that take a transducer as an argument, the type of 
the calling transducer
must be the same as the type of the argument transducer:
    // this will cause an error
    log_transducer.disjunct(sfst_transducer);                        
    // this works, but weights are lost in the conversion
    log_transducer.convert(SFST_TYPE).disjunct(sfst_transducer);     
    // this works, information is not lost
    log_transducer.disjunct(sfst_transducer.convert(LOG_OPENFST_TYPE)); 

Creating transducers

With HfstTransducer constructors it is possible to create empty, 
epsilon, one-transition and single-path transducers.
Transducers can also be created from scratch with HfstBasicTransducer
and converted to an HfstTransducer.
More complex transducers can be combined from simple ones with various 
functions.


Special symbols

The HFST transducers support transitions with epsilon, unknown 
and identity symbols.
The special symbols are explained in the documentation of datatype String.

An example:

  // In the xerox formalism used here, "?" means the unknown symbol
  // and "?:?" the identity pair 

  HfstBasicTransducer tr1;
  tr1.add_state(1);
  tr1.set_final_weight(1, 0);
  tr1.add_transition
    (0, HfstBasicTransition(1, "@_UNKNOWN_SYMBOL_@", "foo", 0) );

  // tr1 is now [ ?:foo ]
  
  HfstBasicTransducer tr2;
  tr2.add_state(1);
  tr2.add_state(2);
  tr2.set_final_weight(2, 0);
  tr2.add_transition
    (0, HfstBasicTransition(1, "@_IDENTITY_SYMBOL_@", 
                    "@_IDENTITY_SYMBOL_@", 0) );
  tr2.add_transition
    (1, HfstBasicTransition(2, "bar", "bar", 0) );

  // tr2 is now [ [ ?:? ] [ bar:bar ] ]

  ImplementationType type = SFST_TYPE;
  HfstTransducer Tr1(tr1, type);
  HfstTransducer Tr2(tr2, type);
  Tr1.disjunct(Tr2);

  // Tr1 is now [ [ ?:foo | bar:foo ]  |  [[ ?:? | foo:foo ] [ bar:bar ]] ]

Constructor & Destructor Documentation

HfstTransducer ( )

Create an uninitialized transducer (use with care).

Note
This constructor leaves the backend implementation variable uninitialized. An uninitialized transducer is likely to cause a TransducerHasWrongTypeException unless it is assigned a value before it is used.

HfstTransducer ( ImplementationType  type)

Create an empty transducer, i.e. a transducer that does not recognize any string. The type of the transducer is defined by type.

Note
Use HfstTransducer("@_EPSILON_SYMBOL_@", type) to create an epsilon transducer.
HfstTransducer ( const std::string &  utf8_str,
const HfstTokenizer &  multichar_symbol_tokenizer,
ImplementationType  type 
)

Create a transducer by tokenizing the UTF-8 string utf8_str with tokenizer multichar_symbol_tokenizer. The type of the transducer is defined by type.

utf8_str is read one token at a time and for each token a new transition is created in the resulting transducer. The input and output symbols of that transition are the same as the token read.

An example:

       std::string ustring = "foobar";
       HfstTokenizer TOK;
       HfstTransducer tr(ustring, TOK, LOG_OPENFST_TYPE);
       // tr now contains one path [f o o b a r]
See also
HfstTokenizer
HfstTransducer ( const std::string &  input_utf8_str,
const std::string &  output_utf8_str,
const HfstTokenizer &  multichar_symbol_tokenizer,
ImplementationType  type 
)

Create a transducer by tokenizing the UTF-8 input string input_utf8_str and output string output_utf8_str with tokenizer multichar_symbol_tokenizer. The type of the transducer is defined by type.

input_utf8_str and output_utf8_str are read one token at a time and for each token a new transition is created in the resulting transducer. The input and output symbols of that transition are the same as the input and output tokens read. If one string contains fewer tokens than the other, epsilons are used as transition symbols for the shorter string.

An example:

       std::string input = "foo";
       std::string output = "barr";
       HfstTokenizer TOK;
       HfstTransducer tr(input, output, TOK, SFST_TYPE);
       // tr now contains one path [f:b o:a o:r 0:r]
See also
HfstTokenizer
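
The epsilon padding described for this constructor can be sketched without HFST. The function below is a hypothetical helper, not part of the library: it assumes the two strings have already been tokenized and pairs the tokens position by position, substituting the epsilon symbol once the shorter sequence runs out.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Pair two token sequences position by position. When one sequence is
// shorter, the epsilon symbol fills its missing positions.
// (Hypothetical helper for illustration; tokenization is done elsewhere.)
std::vector<std::pair<std::string, std::string>>
pair_with_epsilon_padding(const std::vector<std::string> &input,
                          const std::vector<std::string> &output,
                          const std::string &epsilon = "@_EPSILON_SYMBOL_@")
{
    std::vector<std::pair<std::string, std::string>> path;
    const std::size_t length = std::max(input.size(), output.size());
    for (std::size_t i = 0; i < length; ++i) {
        const std::string &in = (i < input.size()) ? input[i] : epsilon;
        const std::string &out = (i < output.size()) ? output[i] : epsilon;
        path.emplace_back(in, out);
    }
    return path;
}
```

For the tokens of "foo" and "barr" this gives four pairs, the last of which has the epsilon symbol on the input side, matching the path [f:b o:a o:r 0:r] in the example above.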

HfstTransducer ( HfstInputStream &  in)

Read a binary transducer from transducer stream in.

The stream can contain HFST transducers or OpenFst, foma or SFST transducers without an HFST header. If the backend implementations are used as such, they are converted into HFST transducers.

For more information on transducer conversions and the HFST header structure, see here.

Precondition
( !in.is_eof() && !in.is_bad() && in.is_fst() ). Otherwise, an exception is thrown.
Exceptions
NotTransducerStreamException
StreamNotReadableException
StreamIsClosedException
TransducerTypeMismatchException
MissingOpenFstInputSymbolTableException
See also
HfstInputStream
HfstTransducer ( const HfstTransducer &  another)

Create a deep copy of transducer another.

HfstTransducer ( const hfst::implementations::HfstBasicTransducer &  t,
ImplementationType  type 
)

Create an HFST transducer equivalent to HFST basic transducer t. The type of the created transducer is defined by type.

HfstTransducer ( const std::string &  symbol,
ImplementationType  type 
)

Create a transducer that recognizes the string pair <"symbol","symbol">, i.e. [symbol:symbol]. The type of the transducer is defined by type.

See also
String
HfstTransducer ( const std::string &  isymbol,
const std::string &  osymbol,
ImplementationType  type 
)

Create a transducer that recognizes the string pair <"isymbol","osymbol">, i.e. [isymbol:osymbol]. The type of the transducer is defined by type.

See also
String
HfstTransducer ( FILE *  ifile,
ImplementationType  type,
const std::string &  epsilon_symbol,
unsigned int &  linecount 
)

Create a transducer of type type as defined in AT&T format in FILE ifile. epsilon_symbol defines how epsilons are represented.

In AT&T format, the transition lines are of the form (source state, target state, input symbol, output symbol and an optional weight, separated by whitespace):

        [0-9]+\s+[0-9]+\s+\S+\s+\S+(\s+-?[0-9]+(\.[0-9]+)?)?

and final state lines (state and an optional weight):

        [0-9]+(\s+-?[0-9]+(\.[0-9]+)?)?

If several transducers are listed in the same file, they are separated by lines of two consecutive hyphens "--". If the weight is missing, the transition or final state is given a zero weight.

NOTE: If transition symbols contain spaces, they must be escaped as "@_SPACE_@" because spaces are used as field separators. Both "@0@" and "@_EPSILON_SYMBOL_@" are always interpreted as epsilons.

An example:

0      1      foo      bar      0.3
1      0.5
--
0      0.0
--
--
0      0.0
0      0      a        <eps>    0.2
The example lists four transducers in AT&T format: one transducer accepting the string pair <"foo","bar">, one epsilon transducer, one empty transducer and one transducer that accepts any number of 'a's and produces an empty string in all cases. The transducers can be read with the following commands (from a file named "testfile.att"):
std::vector<HfstTransducer> transducers;
unsigned int linecount = 0;  // passed as the constructor's linecount argument
FILE * ifile = fopen("testfile.att", "rb");
try {
  while (!feof(ifile))
    {
    HfstTransducer t(ifile, TROPICAL_OPENFST_TYPE, "<eps>", linecount);
    transducers.push_back(t);
    printf("read one transducer\n");
    }
} catch (const NotValidAttFormatException &e) {
    printf("Error reading transducer: not valid AT&T format.\n"); }
fclose(ifile);
fprintf(stderr, "Read %i transducers in total.\n", (int)transducers.size());

Epsilon will be represented as "@_EPSILON_SYMBOL_@" in the resulting transducer. The argument epsilon_symbol only denotes how epsilons are represented in ifile.
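
To make the line format above concrete, here is a minimal standalone sketch of classifying a single AT&T line. It is not HFST's reader: the AttLine struct and the parse_att_line function are hypothetical names, the sketch splits on whitespace only, and it does not handle the "@_SPACE_@" escape. It does apply the two rules stated above: a missing weight becomes zero, and the file's epsilon symbol, "@0@" and "@_EPSILON_SYMBOL_@" all map to "@_EPSILON_SYMBOL_@".

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical record for one parsed line of an AT&T file.
struct AttLine {
    enum Kind { Transition, FinalState, Separator };
    Kind kind = Separator;
    unsigned int source = 0;
    unsigned int target = 0;
    std::string input;
    std::string output;
    float weight = 0.0f;   // zero when the line carries no weight
};

// Map every accepted epsilon spelling to the internal epsilon symbol.
static std::string normalize_symbol(const std::string &symbol,
                                    const std::string &epsilon_symbol) {
    if (symbol == epsilon_symbol || symbol == "@0@" ||
        symbol == "@_EPSILON_SYMBOL_@")
        return "@_EPSILON_SYMBOL_@";
    return symbol;
}

// Classify one line as a separator, a final state line (1 or 2 fields)
// or a transition line (4 or 5 fields).
AttLine parse_att_line(const std::string &text,
                       const std::string &epsilon_symbol) {
    AttLine line;
    if (text == "--")
        return line;  // kind is Separator by default

    std::istringstream stream(text);
    std::vector<std::string> fields;
    for (std::string field; stream >> field; )
        fields.push_back(field);

    if (fields.size() == 1 || fields.size() == 2) {
        line.kind = AttLine::FinalState;
        line.source = static_cast<unsigned int>(std::stoul(fields[0]));
        if (fields.size() == 2)
            line.weight = std::stof(fields[1]);
    } else if (fields.size() == 4 || fields.size() == 5) {
        line.kind = AttLine::Transition;
        line.source = static_cast<unsigned int>(std::stoul(fields[0]));
        line.target = static_cast<unsigned int>(std::stoul(fields[1]));
        line.input = normalize_symbol(fields[2], epsilon_symbol);
        line.output = normalize_symbol(fields[3], epsilon_symbol);
        if (fields.size() == 5)
            line.weight = std::stof(fields[4]);
    } else {
        throw std::invalid_argument("not valid AT&T format: " + text);
    }
    return line;
}
```

Calling parse_att_line on each line of the example file with "<eps>" as the epsilon symbol classifies the lines one by one; grouping them into transducers at each separator is left out of the sketch.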

Exceptions
NotValidAttFormatException
StreamNotReadableException
StreamIsClosedException
See also
write_in_att_format(FILE*,bool)const
String
~HfstTransducer ( void  )
virtual

Destructor.

Member Function Documentation

HfstTransducer& apply ( SFST::Transducer *(*)(SFST::Transducer *)  sfst_funct,
fst::StdVectorFst *(*)(fst::StdVectorFst *)  tropical_ofst_funct,
fsm *(*)(fsm *)  foma_funct,
bool  dummy 
)
protected

declarations for HFST functions that take two or more parameters

bool compare ( const HfstTransducer &  another,
bool  harmonize = true 
) const

Whether this transducer and another are equivalent.

Two transducers are equivalent iff they accept the same input/output string pairs with the same weights and the same alignments.

HfstTransducer & compose ( const HfstTransducer &  another,
bool  harmonize = true 
)

Compose this transducer with another.

HfstTransducer & compose_intersect ( const HfstTransducerVector &  v,
bool  invert = false,
bool  harmonize = true 
)

Compose this transducer with the intersection of transducers in v. If invert is true, then compose the intersection of the transducers in v with this transducer.

The algorithm used by this function is faster than intersecting all transducers one by one and then composing this transducer with the intersection.

Precondition
The transducers in v are deterministic and epsilon-free.
HfstTransducer & concatenate ( const HfstTransducer &  another,
bool  harmonize = true 
)

Concatenate this transducer with another.

HfstTransducer & convert ( ImplementationType  type,
std::string  options = "" 
)

Convert the transducer into an equivalent transducer in format type.

If a weighted transducer is converted into an unweighted one, all weights are lost. In the reverse case, all weights are initialized to the semiring's one.

A transducer of type SFST_TYPE, TROPICAL_OPENFST_TYPE, LOG_OPENFST_TYPE or FOMA_TYPE can be converted into an HFST_OL_TYPE or HFST_OLW_TYPE transducer, but an HFST_OL_TYPE or HFST_OLW_TYPE transducer cannot be converted to any other type.

Note
For conversion between implementations::HfstTransitionGraph and HfstTransducer, see HfstTransducer(const hfst::implementations::HfstBasicTransducer&, ImplementationType) and hfst::implementations::HfstTransitionGraph::HfstTransitionGraph(const hfst::HfstTransducer&).
HfstTransducer & cross_product ( const HfstTransducer &  another,
bool  harmonize = true 
)

Make cross product of this transducer with another. It pairs every string of this with every string of another.

Both transducers must be automata, i.e. map strings onto themselves.

If the strings are not the same length, epsilon padding is added to the end of the shorter string.

HfstTransducer & determinize ( )

Determinize the transducer.

Determinizing a transducer yields an equivalent transducer that has no state with two or more transitions whose input:output symbol pairs are the same.

HfstTransducer & disjunct ( const HfstTransducer &  another,
bool  harmonize = true 
)

Disjunct this transducer with another.

void extract_paths ( HfstTwoLevelPaths &  results,
int  max_num = -1,
int  cycles = -1 
) const

Extract a maximum of max_num paths that are recognized by the transducer following a maximum of cycles cycles and store the paths into results.

Parameters
results    The extracted paths are inserted here.
max_num    The total number of resulting strings is capped at max_num, with 0 or negative indicating unlimited.
cycles     Indicates how many times a cycle will be followed, with negative numbers indicating unlimited.

This version of extract_paths handles flag diacritics as ordinary symbols and does not validate the sequences before outputting them, as opposed to extract_paths_fd(HfstTwoLevelPaths &, int, int, bool) const.

If this function is called on a cyclic transducer with unlimited values for both max_num and cycles, an exception will be thrown.

This example

    ImplementationType type = SFST_TYPE;
    HfstTransducer tr1("a", "b", type);
    tr1.repeat_star();
    HfstTransducer tr2("c", "d", type);
    tr2.repeat_star();
    tr1.concatenate(tr2).minimize();
    HfstTwoLevelPaths results;
    tr1.extract_paths(results, MAX_NUM, CYCLES);

    // Go through all paths.
    for (HfstTwoLevelPaths::const_iterator it = results.begin();
         it != results.end(); it++)
      {
        std::string istring;
        std::string ostring;

        for (StringPairVector::const_iterator IT = it->second.begin();
             IT != it->second.end(); IT++)
          {
            istring.append(IT->first);
            ostring.append(IT->second);
          }
        // Print input and output strings of each path
        std::cerr << istring << ":" << ostring; 
        // and optionally the weight of the path.
        //std::cerr << "\t" << it->first;
        std::cerr << std::endl; 
      }
prints with values MAX_NUM == -1 and CYCLES == 1 all paths that have no consecutive cycles:
a : b
ac : bd
acc : bdd
c : d
cc : dd
and with values MAX_NUM == 7 and CYCLES == 2 a maximum of 7 paths that follow a cycle a maximum of 2 times (there are 11 such paths, but MAX_NUM limits their number to 7):
a : b
aa : bb
aac : bbd
aacc : bbdd
c : d
cc : dd
ccc : ddd
Bug:
Does not work for HFST_OL_TYPE or HFST_OLW_TYPE?
Exceptions
TransducerIsCyclicException
See also
n_best
hfst::HfstTransducer::extract_paths_fd(hfst::HfstTwoLevelPaths&, int, int, bool) const
void extract_paths_fd ( HfstTwoLevelPaths &  results,
int  max_num = -1,
int  cycles = -1,
bool  filter_fd = true 
) const

Extract a maximum of max_num paths that are recognized by the transducer and are not invalidated by flag diacritic rules following a maximum of cycles cycles and store the paths into results. filter_fd defines whether the flag diacritics themselves are filtered out of the result strings.

Parameters
results    The extracted paths are inserted here.
max_num    The total number of resulting strings is capped at max_num, with 0 or negative indicating unlimited.
cycles     Indicates how many times a cycle will be followed, with negative numbers indicating unlimited.
filter_fd  Whether the flag diacritics are filtered out of the result strings.

If this function is called on a cyclic transducer with unlimited values for both max_num and cycles, an exception will be thrown.

Flag diacritics are of the form @[PNDRCU][.][A-Z]+([.][A-Z]+)?@

For example the transducer

[[@P.FEATURE.FOO@ foo] | [@P.FEATURE.BAR@ bar]]  |  [[foo @U.FEATURE.FOO@] | [bar @U.FEATURE.BAR@]]
will yield the paths [foo foo] and [bar bar]. [foo bar] and [bar foo] are invalidated by the flag diacritics so they will not be included in results.
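
As a rough standalone illustration of the flag diacritic handling described here, the sketch below recognizes symbols of the stated form (assuming the closing "@" that the example symbols carry), strips them from a path as filter_fd == true would, and checks a path against the P (set a value) and U (unify with a value) operators only. The function names are hypothetical and the other operators are deliberately ignored, so this is not HFST's validation logic.

```cpp
#include <cstddef>
#include <map>
#include <regex>
#include <string>
#include <vector>

// True if symbol has the flag diacritic form, e.g. "@P.FEATURE.FOO@".
bool is_flag_diacritic(const std::string &symbol) {
    static const std::regex pattern("@[PNDRCU][.][A-Z]+([.][A-Z]+)?@");
    return std::regex_match(symbol, pattern);
}

// Copy of path without its flag diacritics (the effect of filter_fd).
std::vector<std::string> filter_flags(const std::vector<std::string> &path) {
    std::vector<std::string> filtered;
    for (const std::string &symbol : path)
        if (!is_flag_diacritic(symbol))
            filtered.push_back(symbol);
    return filtered;
}

// Validate a path against P and U flags only (other operators are skipped
// in this sketch). P sets a feature; U fails if the feature is already set
// to a different value and otherwise sets it.
bool path_is_valid(const std::vector<std::string> &path) {
    std::map<std::string, std::string> features;
    for (const std::string &symbol : path) {
        if (!is_flag_diacritic(symbol))
            continue;
        const char op = symbol[1];
        // Text between "@X." and the closing "@": FEATURE or FEATURE.VALUE
        const std::string body = symbol.substr(3, symbol.size() - 4);
        const std::size_t dot = body.find('.');
        if (dot == std::string::npos)
            continue;  // no value given, nothing to check in this sketch
        const std::string feature = body.substr(0, dot);
        const std::string value = body.substr(dot + 1);
        if (op == 'P') {
            features[feature] = value;
        } else if (op == 'U') {
            auto found = features.find(feature);
            if (found == features.end())
                features[feature] = value;
            else if (found->second != value)
                return false;
        }
    }
    return true;
}
```

Under these assumptions, a path that sets FEATURE to FOO and later unifies it with FOO is valid, while one that later unifies it with BAR is rejected.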
Bug:
Does not work for HFST_OL_TYPE or HFST_OLW_TYPE?
Exceptions
TransducerIsCyclicException
See also
extract_paths(HfstTwoLevelPaths&, int, int) const
StringSet get_alphabet ( ) const

Get the alphabet of the transducer.

The alphabet is defined as the set of symbols known to the transducer.

StringSet get_first_input_symbols ( ) const

Get first input level symbols of strings recognized (or rejected, if they end in a non-final state) by the transducer.

std::string get_name ( ) const

Get the name of the transducer.

See also
set_name
const std::map< string, string > & get_properties ( ) const

Get all properties from the transducer.

string get_property ( const std::string &  property) const

Get arbitrary string property property. get_property("name") works like get_name.

ImplementationType get_type ( void  ) const

The implementation type of the transducer.

void harmonize ( HfstTransducer &  another)

Harmonize transducers this and another.

Note
In harmonization, the symbol-to-number correspondences of this transducer are recoded so that they are equivalent to the ones used in transducer another. Then the unknown and identity symbols are expanded in both transducers. If this and another have type FOMA_TYPE, nothing is done, since foma takes care of harmonization.
HfstTransducer identity_pair ( ImplementationType  type)
static

Create identity pair transducer of type.

The transducer has only one state, and it accepts: Identity:Identity

Transducer weight is 0.

HfstTransducer & input_project ( )

Extract the input language of the transducer.

All transition symbol pairs isymbol:osymbol are changed to isymbol:isymbol.

HfstTransducer & insert_freely ( const StringPair &  symbol_pair,
bool  harmonize = true 
)

Freely insert symbol pair symbol_pair into the transducer.

To each state in this transducer is added a transition that leads from that state to itself with input and output symbols defined by symbol_pair.

If harmonize is true, then identity and unknown symbols in the transducer will be expanded by the symbols in symbol_pair. Otherwise they are not.
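
The self-loop construction described above can be sketched on a toy graph. ToyGraph, ToyTransition and the free function insert_freely below are hypothetical stand-ins, not HFST types, and the sketch leaves out harmonization and weights entirely.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical minimal graph: states are numbered 0 .. state_count - 1.
struct ToyTransition {
    std::size_t source;
    std::size_t target;
    std::string input;
    std::string output;
};

struct ToyGraph {
    std::size_t state_count = 0;
    std::vector<ToyTransition> transitions;
};

// Add to every state a transition that leads back to the same state and
// carries the given input:output symbol pair.
ToyGraph &insert_freely(ToyGraph &graph,
                        const std::string &input,
                        const std::string &output) {
    for (std::size_t state = 0; state < graph.state_count; ++state)
        graph.transitions.push_back({state, state, input, output});
    return graph;
}
```

After the call, the graph has one extra transition per state, so the inserted pair can occur any number of times at any position of a path.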

HfstTransducer & insert_freely ( const HfstTransducer &  tr,
bool  harmonize = true 
)

Freely insert a copy of tr into the transducer.

A copy of tr is attached with epsilon transitions to each state of this transducer. After the operation, for each state S in this transducer, there is an epsilon transition that leads from state S to the initial state of tr, and for each final state of tr, there is an epsilon transition that leads from that final state to state S in this transducer. The weights of the final states in tr are copied to the epsilon transitions leading to state S.

Implemented only for implementations::HfstBasicTransducer. Conversion is carried out for an HfstTransducer, if this function is called.

void insert_to_alphabet ( const std::string &  symbol)

Explicitly insert symbol to the alphabet of the transducer.

Note
Usually this function is not needed since new symbols are added to the alphabet by default.
HfstTransducer & intersect ( const HfstTransducer another,
bool  harmonize = true 
)

Intersect this transducer with another.

HfstTransducer & invert ( )

Swap the input and output symbols of each transition in the transducer.

bool is_automaton ( void  ) const

Whether the transducer is an automaton.

bool is_cyclic ( void  ) const

Whether the transducer is cyclic.

bool is_lookdown_infinitely_ambiguous ( const StringVector s) const

(Not implemented) Whether lookdown of path s will have infinite results.

bool is_lookup_infinitely_ambiguous ( const StringVector & s) const

Whether lookup of path s will have infinite results.

Currently, this function will return whether the transducer is infinitely ambiguous on any lookup path found in the transducer, i.e. the argument s is ignored.

See also
lookup(HfstOneLevelPaths&, const StringVector&, ssize_t) const
HfstTransducer & lenient_composition ( const HfstTransducer another,
bool  harmonize = true 
)

Make lenient composition of this transducer with another. A .O. B = [ A .o. B ] .P. A, where .o. denotes composition and .P. denotes priority union.

HfstOneLevelPaths * lookdown ( const StringVector s,
ssize_t  limit = -1 
) const

(Not implemented) Lookdown a single string s and return a maximum of limit results.

Traverse all paths on logical second level of the transducer to produce all possible inputs on the first. This is in effect a fast composition of single path from left hand side.

Parameters
s: String to look down.
limit: Number of strings to extract. -1 tries to extract all and may get stuck if infinitely ambiguous.
Returns
output parameter to store unique results
HfstOneLevelPaths * lookdown_fd ( StringVector s,
ssize_t  limit = -1 
) const

(Not implemented) Lookdown a single string minding flag diacritics properly.

This is a version of lookdown that handles flag diacritics as epsilons and validates the sequences prior to outputting.

See also
lookdown
HfstOneLevelPaths * lookup ( const StringVector & s,
ssize_t  limit = -1,
double  time_cutoff = 0.0 
) const

Lookup or apply a single tokenized string s and return a maximum of limit results.

This is a version of lookup that handles flag diacritics as ordinary symbols and does not validate the sequences prior to outputting. Currently, this function calls lookup_fd.

Todo:
Handle flag diacritics as ordinary symbols instead of calling lookup_fd.
See also
lookup_fd
HfstOneLevelPaths * lookup ( const std::string &  s,
ssize_t  limit = -1,
double  time_cutoff = 0.0 
) const

Lookup or apply a single string s and return a maximum of limit results.

This is an overloaded lookup function that leaves tokenizing to the transducer.

HfstOneLevelPaths * lookup ( const HfstTokenizer tok,
const std::string &  s,
ssize_t  limit = -1,
double  time_cutoff = 0.0 
) const

Lookup or apply a single string s and return a maximum of limit results. tok defines how s is tokenized.

This function is the same as lookup(const StringVector&, ssize_t, double) const, but lookup is done using a string and a tokenizer instead of a StringVector.

HfstOneLevelPaths * lookup_fd ( const StringVector & s,
ssize_t  limit = -1,
double  time_cutoff = 0.0 
) const

Lookup or apply a single string s minding flag diacritics properly and return a maximum of limit results.

Traverse all paths on logical first level of the transducer to produce all possible outputs on the second. This is in effect a fast composition of single path from left hand side.

This is a version of lookup that handles flag diacritics as epsilons and validates the sequences prior to outputting. Epsilons on the second level are represented by empty strings in results. For an example of flag diacritics, see hfst::HfstTransducer::extract_paths_fd(hfst::HfstTwoLevelPaths&, int, int, bool) const

Precondition
The transducer must be of type HFST_OL_TYPE or HFST_OLW_TYPE. This function is not implemented for other transducer types.
Parameters
s: String to look up. The weight is ignored.
limit: (Currently ignored.) Number of strings to look up. -1 tries to look up all and may get stuck if infinitely ambiguous.
time_cutoff: Number of seconds that can pass before lookup is stopped.
Returns
A pointer to a HfstOneLevelPaths container allocated by callee
See also
HfstTokenizer::tokenize_one_level
is_lookup_infinitely_ambiguous(const StringVector&) const
Todo:
Do not ignore argument limit.
HfstOneLevelPaths * lookup_fd ( const std::string &  s,
ssize_t  limit = -1,
double  time_cutoff = 0.0 
) const

Lookup or apply a single string s minding flag diacritics properly and return a maximum of limit results.

This is an overloaded lookup_fd that leaves tokenizing to the transducer.

Warning
This function will convert the transducer into HFST_OLW_TYPE, which may be very slow for large transducers. Lookup speed can therefore be extremely slow.
Parameters
s: String to look up. The weight is ignored.
limit: (Currently ignored.) Number of strings to look up. -1 tries to look up all and may get stuck if infinitely ambiguous.
time_cutoff: Number of seconds that can pass before lookup is stopped.
Returns
A pointer to a HfstOneLevelPaths container allocated by callee
See also
lookup_fd
HfstOneLevelPaths * lookup_fd ( const HfstTokenizer tok,
const std::string &  s,
ssize_t  limit = -1,
double  time_cutoff = 0.0 
) const

Lookup or apply a single string s minding flag diacritics properly and return a maximum of limit results. tok defines how s is tokenized.

The same as lookup_fd(const StringVector&, ssize_t, double) const but uses a tokenizer and a string instead of a StringVector.

HfstTransducer & minimize ( )

Minimize the transducer.

Minimizing a transducer yields an equivalent transducer with the smallest number of states.

Bug:
OpenFst's minimization algorithm seems to add epsilon transitions to weighted transducers?
HfstTransducer & n_best ( unsigned int  n)

Extract n best paths of the transducer.

In the case of a weighted transducer (TROPICAL_OPENFST_TYPE or LOG_OPENFST_TYPE), best paths are defined as paths with the lowest weight. In the case of an unweighted transducer (SFST_TYPE or FOMA_TYPE), the function returns random paths.

This function is not implemented for FOMA_TYPE or SFST_TYPE. If this function is called by an HfstTransducer of type FOMA_TYPE or SFST_TYPE, it is converted to TROPICAL_OPENFST_TYPE, paths are extracted and it is converted back to FOMA_TYPE or SFST_TYPE. If HFST is not linked to OpenFst library, an ImplementationTypeNotAvailableException is thrown.
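
The "lowest weight is best" ordering for weighted transducers can be sketched on an explicit list of paths. This is a plain C++ model rather than HFST code: WeightedPath and n_best_model are hypothetical names, and a real transducer may encode infinitely many paths, which a finite list cannot represent.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a path: (weight, string).
using WeightedPath = std::pair<float, std::string>;

// Model of n-best extraction: keep the n paths with the lowest weight.
std::vector<WeightedPath> n_best_model(std::vector<WeightedPath> paths, unsigned int n) {
    // Sort by ascending weight; stable_sort keeps the input order among equal weights.
    std::stable_sort(paths.begin(), paths.end(),
                     [](const WeightedPath& a, const WeightedPath& b) {
                         return a.first < b.first;
                     });
    if (paths.size() > n) {
        paths.erase(paths.begin() + n, paths.end());
    }
    return paths;
}
```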

HfstTransducer & operator= ( const HfstTransducer another)

Assign this transducer a new value equivalent to transducer another.

HfstTransducer & optionalize ( )

Disjunct the transducer with an epsilon transducer.

HfstTransducer & output_project ( )

Extract the output language of the transducer.

All transition symbol pairs isymbol:osymbol are changed to osymbol:osymbol.

HfstTransducer & priority_union ( const HfstTransducer another)

Make priority union of this transducer with another.

For the operation t1.priority_union(t2), the result is a union of t1 and t2, except that whenever t1 and t2 have the same string on the upper side, the path in t1 overrides the path in t2.

Example

Transducer 1 (t1): a : a b : b

Transducer 2 (t2): b : B c : C

Result ( t1.priority_union(t2) ): a : a b : b c : C

For more information, read: www.fsmbook.com
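
The override rule can be reproduced with a small model in plain C++ (not HFST code). It assumes a simplification: each upper-side string maps to exactly one lower-side string, so a relation fits in a std::map. Relation and priority_union_model are hypothetical names.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a relation: upper-side string -> lower-side string.
using Relation = std::map<std::string, std::string>;

// Model of t1.priority_union(t2): the union of both relations, except that an
// upper-side string present in t1 keeps its t1 mapping.
Relation priority_union_model(const Relation& t1, const Relation& t2) {
    Relation result = t1;
    // std::map::insert does not overwrite existing keys, so entries from t1 win.
    result.insert(t2.begin(), t2.end());
    return result;
}
```

With t1 = {a:a, b:b} and t2 = {b:B, c:C}, the model yields {a:a, b:b, c:C}, matching the example above.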

HfstTransducer & prune ( )

Make transducer coaccessible.

HfstTransducer & prune_alphabet ( bool  force = true)

Remove all symbols that do not occur in transitions of the transducer from its alphabet.

If unknown or identity symbols occur in transitions of the transducer, pruning is not carried out by default.

Parameters
force: Whether unused symbols are removed even if unknown or identity symbols occur in transitions.

Epsilon, unknown and identity symbols are always included in the alphabet.

HfstTransducer & push_weights ( PushType  type)

Push weights towards initial or final state(s) as defined by type.

If the HfstTransducer is of unweighted type (SFST_TYPE or FOMA_TYPE), nothing is done.

See also
PushType
HfstTransducer * read_lexc_ptr ( const std::string &  filename,
ImplementationType  type,
bool  verbose 
)
static

Compile the lexc file filename into an HfstTransducer of type type and return a pointer to the transducer.

HfstTransducer & remove_epsilons ( )

Remove all epsilon:epsilon transitions from the transducer so that the transducer remains equivalent.

void remove_from_alphabet ( const std::string &  symbol)

Remove symbol from the alphabet of the transducer. CURRENTLY NOT IMPLEMENTED.

Precondition
symbol does not occur in any transition of the transducer.
Note
Use with care, removing a symbol that occurs in a transition of the transducer can have unexpected results.
HfstTransducer & repeat_n ( unsigned int  n)

A concatenation of n transducers.

HfstTransducer & repeat_n_minus ( unsigned int  n)

A concatenation of N transducers where N is any number from zero to n, inclusive.

HfstTransducer & repeat_n_plus ( unsigned int  n)

A concatenation of N transducers where N is any number from n to infinity, inclusive.

HfstTransducer & repeat_n_to_k ( unsigned int  n,
unsigned int  k 
)

A concatenation of N transducers where N is any number from n to k, inclusive.
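
The inclusive bounds can be made concrete for a transducer that accepts a single string w. The sketch below is plain C++, not HFST code, and repeat_n_to_k_model is a hypothetical name; it lists the strings that n to k copies of w produce.

```cpp
#include <string>
#include <vector>

// Model of repeat_n_to_k for a language containing the single string w:
// returns w repeated N times for every N with n <= N <= k (empty if n > k).
std::vector<std::string> repeat_n_to_k_model(const std::string& w,
                                             unsigned int n,
                                             unsigned int k) {
    std::vector<std::string> result;
    std::string current;  // holds w repeated i times at the top of each iteration
    for (unsigned int i = 0; i <= k; ++i) {
        if (i >= n) {
            result.push_back(current);
        }
        current += w;
    }
    return result;
}
```

In this model, repeat_n_minus(n) corresponds to the call with bounds 0 and n, so the empty string (zero copies) is included.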

HfstTransducer & repeat_plus ( )

A concatenation of N transducers where N is any number from one to infinity.

HfstTransducer & repeat_star ( )

A concatenation of N transducers where N is any number from zero to infinity.

HfstTransducer & reverse ( )

Reverse the transducer.

A reversed transducer accepts the string "n(0) n(1) ... n(N)" iff the original transducer accepts the string "n(N) n(N-1) ... n(0)".

HfstTransducer & set_final_weights ( float  weight,
bool  increment = false 
)

Set the weights of all final states to weight. increment defines whether the old weight is incremented by weight or overwritten.

If the HfstTransducer is of unweighted type (SFST_TYPE or FOMA_TYPE), nothing is done.
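
The effect of the increment flag can be sketched on a bare list of final weights. This is a plain C++ model, not HFST code: set_final_weights_model is a hypothetical name, and "incremented" is assumed to mean ordinary addition of the two weights.

```cpp
#include <vector>

// Model of set_final_weights: overwrite each final weight with `weight`,
// or add `weight` to it when `increment` is true.
void set_final_weights_model(std::vector<float>& final_weights,
                             float weight,
                             bool increment = false) {
    for (float& w : final_weights) {
        w = increment ? w + weight : weight;
    }
}
```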

void set_name ( const std::string &  name)

Rename the transducer to name.

See also
get_name
void set_property ( const std::string &  property,
const std::string &  value 
)

Set arbitrary string property property to value. set_property("name", value) is equivalent to set_name(value).

Note
While this function is capable of creating endless amounts of arbitrary metadata, it is suggested that property names are drawn from a central repository, or prefixed with "x-". A property that does not follow this convention may affect the behavior of the transducer in future releases.
HfstTransducer & substitute ( bool(*)(const StringPair &sp, StringPairSet &sps)  func)

Substitute each transition sp with the transitions sps as defined by function func.

Parameters
func: A pointer to a function that takes as its argument a StringPair sp and inserts to StringPairSet sps all StringPairs with which sp is to be substituted. Returns whether any substituting string pairs were inserted in sps, i.e. whether there is a need to perform substitution on transition sp.

An example:

bool function(const StringPair &sp, StringPairSet &sps) 
{
  // Only transitions whose input and output symbols are equal are substituted.
  if (sp.second.compare(sp.first) != 0)
    return false;

  std::string isymbol = sp.first;
  std::string osymbol;

  if (sp.second.compare("a") == 0 ||
      sp.second.compare("o") == 0 ||
      sp.second.compare("u") == 0)
    osymbol = std::string("<back_vowel>");
  else if (sp.second.compare("e") == 0 ||
           sp.second.compare("i") == 0)
    osymbol = std::string("<front_vowel>");
  else
    return false;  // Not a vowel: leave the transition unchanged.

  sps.insert(StringPair(isymbol, osymbol));
  return true;
}

...

// For all transitions in transducer t whose input and output vowels 
// are equivalent, substitute the output vowel with a symbol that defines
// whether the vowel in question is a front or back vowel.
t.substitute(&function);
See also
String
HfstTransducer & substitute ( const std::string &  old_symbol,
const std::string &  new_symbol,
bool  input_side = true,
bool  output_side = true 
)

Substitute all transition symbols equal to old_symbol with symbol new_symbol. input_side and output_side define whether the substitution is made on input and output sides.

Parameters
old_symbol: Symbol to be substituted.
new_symbol: The substituting symbol.
input_side: Whether the substitution is made on the input side of a transition.
output_side: Whether the substitution is made on the output side of a transition.

The transition weights remain the same.

See also
String
HfstTransducer & substitute ( const StringPair old_symbol_pair,
const StringPair new_symbol_pair 
)

Substitute all transition symbol pairs equal to old_symbol_pair with new_symbol_pair.

The transition weights remain the same.

Implemented only for TROPICAL_OPENFST_TYPE and LOG_OPENFST_TYPE. If this function is called by an unweighted HfstTransducer, it is converted to a weighted one, substitution is made and the transducer is converted back to the original format.

See also
String
HfstTransducer & substitute ( const StringPair old_symbol_pair,
const StringPairSet new_symbol_pair_set 
)

Substitute all transitions equal to old_symbol_pair with a set of transitions equal to new_symbol_pair_set.

The weight of the original transition is copied to all new transitions.

Implemented only for TROPICAL_OPENFST_TYPE and LOG_OPENFST_TYPE. If this function is called by an unweighted HfstTransducer (SFST_TYPE or FOMA_TYPE), it is converted to TROPICAL_OPENFST_TYPE, substitution is done and it is converted back to the original format.

See also
String
HfstTransducer & substitute ( const HfstSymbolSubstitutions substitutions)

Substitute all transition symbols as defined in substitutions.

Each symbol old_symbol is substituted with symbol new_symbol iff substitutions contains an entry that maps old_symbol to new_symbol, i.e. substitutions.find(old_symbol) != substitutions.end(). Otherwise, old_symbol remains the same.

This function performs all substitutions at the same time, so it is more efficient than calling substitute separately for each substitution.
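
A single-pass substitution over a map can be sketched as follows. This is a plain C++ model, not HFST code: substitute_all is a hypothetical name, and a transducer is reduced to a flat list of symbols.

```cpp
#include <map>
#include <string>
#include <vector>

// Model of simultaneous substitution: each symbol is looked up once in the
// substitution map, using its original value, and replaced if an entry exists.
std::vector<std::string> substitute_all(
        std::vector<std::string> symbols,
        const std::map<std::string, std::string>& substitutions) {
    for (std::string& symbol : symbols) {
        std::map<std::string, std::string>::const_iterator it = substitutions.find(symbol);
        if (it != substitutions.end()) {
            symbol = it->second;
        }
    }
    return symbols;
}
```

In this model a swap such as {a -> b, b -> a} works as expected, because no symbol is rewritten twice. Whether HFST treats overlapping substitutions identically is not stated in the text above.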

HfstTransducer & substitute ( const HfstSymbolPairSubstitutions substitutions)

Substitute all transition symbol pairs as defined in substitutions.

Each symbol pair old_isymbol:old_osymbol is substituted with symbol pair new_isymbol:new_osymbol iff substitutions contains an entry that maps old_isymbol:old_osymbol to new_isymbol:new_osymbol. Otherwise, old_isymbol:old_osymbol remains the same.

This function performs all substitutions at the same time, so it is more efficient than calling substitute separately for each substitution.

HfstTransducer & substitute ( const StringPair symbol_pair,
HfstTransducer transducer,
bool  harmonize = true 
)

Substitute all transitions equal to symbol_pair with a copy of transducer transducer.

A copy of transducer is attached (using epsilon transitions) between the source and target states of the transition to be substituted. The weight of the original transition is copied to the epsilon transition leaving from the source state.

Implemented only for TROPICAL_OPENFST_TYPE and LOG_OPENFST_TYPE. If this function is called by an unweighted HfstTransducer (SFST_TYPE or FOMA_TYPE), it is converted to TROPICAL_OPENFST_TYPE, substitution is done and it is converted back to the original format.

See also
String
HfstTransducer & subtract ( const HfstTransducer another,
bool  harmonize = true 
)

Subtract transducer another from this transducer.

HfstTransducer & transform_weights ( float(*)(float)  func)

Transform all transition and state weights as defined in func.

Parameters
func: A pointer to a function that takes a weight as its argument and returns a weight that will be the new value of the weight given as the argument.

An example:

float func(float f) { 
  return 2*f + 0.5f; 
}

...

// All transition and final weights are multiplied by two and summed with 0.5.
transducer.transform_weights(&func);

If the HfstTransducer is of unweighted type (SFST_TYPE or FOMA_TYPE), nothing is done.
HfstTransducer universal_pair ( ImplementationType  type)
static

Create universal pair transducer of type.

The transducer has only one state, and it accepts: Identity:Identity, Unknown:Unknown, Unknown:Epsilon and Epsilon:Unknown

Transducer weight is 0.

void write_in_att_format ( FILE *  ofile,
bool  write_weights = true 
) const

Write the transducer in AT&T format to FILE ofile. write_weights defines whether weights are written.

The fields in the resulting AT&T format are separated by tabulator characters.

NOTE: If the transition symbols contain space characters, the spaces are printed as "@_SPACE_@" because whitespace characters are used as field separators in AT&T format. Epsilon symbols are printed as "@0@".

If several transducers are written in the same file, they must be separated by a line of two consecutive hyphens "--", so that they will be read correctly by HfstTransducer(FILE*, ImplementationType, const std::string&).

An example:

ImplementationType type = FOMA_TYPE;
HfstTransducer foobar("foo","bar",type);
HfstTransducer epsilon("@_EPSILON_SYMBOL_@",type);
HfstTransducer empty(type);
HfstTransducer a_star("a",type);
a_star.repeat_star();

FILE * ofile = fopen("testfile.att", "wb");
foobar.write_in_att_format(ofile);
fprintf(ofile, "--\n");
epsilon.write_in_att_format(ofile);
fprintf(ofile, "--\n");
empty.write_in_att_format(ofile);
fprintf(ofile, "--\n");
a_star.write_in_att_format(ofile);
fclose(ofile);

This will yield a file "testfile.att" that looks as follows:

0    1    foo  bar  0.0
1    0.0
--
0    0.0
--
--
0    0.0
0    0    a    a    0.0

Exceptions
StreamCannotBeWrittenException
StreamIsClosedException
See also
operator<<(std::ostream &out, const HfstTransducer &t)
HfstTransducer(FILE*, ImplementationType, const std::string&)
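
The symbol escaping and tab-separated layout described for the AT&T format can be sketched without HFST. The helpers att_symbol and att_transition_line below are hypothetical names in plain C++; they assume that epsilon is represented internally as "@_EPSILON_SYMBOL_@" (as in the example above) and they leave out the weight field.

```cpp
#include <string>

// Escape one symbol for AT&T output: epsilon becomes "@0@" and each space
// becomes "@_SPACE_@", since whitespace separates fields in the format.
std::string att_symbol(const std::string& symbol) {
    if (symbol == "@_EPSILON_SYMBOL_@") {
        return "@0@";
    }
    std::string escaped;
    for (char c : symbol) {
        if (c == ' ') {
            escaped += "@_SPACE_@";
        } else {
            escaped += c;
        }
    }
    return escaped;
}

// Build one transition line (without weight): source, target, input, output,
// separated by tabulator characters.
std::string att_transition_line(int source, int target,
                                const std::string& input,
                                const std::string& output) {
    return std::to_string(source) + "\t" + std::to_string(target) + "\t" +
           att_symbol(input) + "\t" + att_symbol(output);
}
```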
void write_in_att_format ( const std::string &  filename,
bool  write_weights = true 
) const

Write the transducer in AT&T format to the file named filename. write_weights defines whether weights are written.

If the file exists, it is overwritten. If the file does not exist, it is created.

See also
write_in_att_format

Friends And Related Function Documentation

HFSTDLL friend std::ostream& operator<< ( std::ostream &  out,
const HfstTransducer & t 
)
friend

Write transducer t in AT&T format to ostream out.

The same as hfst::HfstTransducer::write_in_att_format(FILE*, bool) const with ostreams. Weights are written if the type of t is weighted.


The documentation for this class was generated from the following files: