HFST - Helsinki Finite-State Transducer Technology - C++ API
version 3.9.1
A simple transition graph format that consists of states and transitions between those states. More...
#include <HfstTransitionGraph.h>
Public Types | |
typedef HfstStates::const_iterator | const_iterator |
A const iterator type that points to a state in a graph. More... | |
typedef C::SymbolType | HfstSymbol |
Datatype for a symbol in a transition. More... | |
typedef std::pair< HfstSymbol, HfstSymbol > | HfstSymbolPair |
Datatype for a symbol pair in a transition. More... | |
typedef std::set< HfstSymbolPair > | HfstSymbolPairSet |
A set of symbol pairs. More... | |
typedef std::vector < HfstSymbolPair > | HfstSymbolPairVector |
A vector of symbol pairs. More... | |
typedef std::set< HfstSymbol > | HfstSymbolSet |
A set of symbols. More... | |
typedef std::set< HfstSymbol > | HfstTransitionGraphAlphabet |
Datatype for the alphabet of a graph. More... | |
typedef std::vector < HfstTransition< C > > | HfstTransitions |
Datatype for the transitions of a state in a graph. More... | |
Public Member Functions | |
HFSTDLL HfstState | add_state (void) |
Add a new state to this graph and return its number. More... | |
HFSTDLL HfstState | add_state (HfstState s) |
Add a state s to this graph. More... | |
HFSTDLL void | add_symbol_to_alphabet (const HfstSymbol &symbol) |
Explicitly add symbol to the alphabet of the graph. More... | |
HFSTDLL void | add_symbols_to_alphabet (const HfstSymbolSet &symbols) |
Same as add_symbol_to_alphabet for each symbol in symbols. More... | |
HFSTDLL void | add_transition (HfstState s, const HfstTransition< C > &transition, bool add_symbols_to_alphabet=true) |
Add a transition transition to state s. More... | |
HFSTDLL iterator | begin () |
Get an iterator to the beginning of the states in the graph. More... | |
HFSTDLL const_iterator | begin () const |
Get a const iterator to the beginning of states in the graph. More... | |
HFSTDLL HfstTransitionGraph & | disjunct (const StringPairVector &spv, typename C::WeightType weight) |
Disjunct this graph with a one-path graph defined by string pair vector spv with weight weight. More... | |
HFSTDLL iterator | end () |
Get an iterator to the end of states (last state + 1) in the graph. More... | |
HFSTDLL const_iterator | end () const |
Get a const iterator to the end of states (last state + 1) in the graph. More... | |
HFSTDLL const HfstTransitionGraphAlphabet & | get_alphabet () const |
Get the set of HfstSymbols in the alphabet of the graph. More... | |
HFSTDLL C::WeightType | get_final_weight (HfstState s) const |
HFSTDLL HfstState | get_max_state () const |
Get the biggest state number in use. More... | |
HFSTDLL HfstTransitionGraph & | harmonize (HfstTransitionGraph &another) |
Harmonize this HfstTransitionGraph and another. More... | |
HFSTDLL | HfstTransitionGraph (void) |
Create a graph with one initial state that has state number zero and is not a final state, i.e. create an empty graph. More... | |
HFSTDLL | HfstTransitionGraph (const HfstTransitionGraph &graph) |
Create a deep copy of HfstTransitionGraph graph. More... | |
HFSTDLL | HfstTransitionGraph (const hfst::HfstTransducer &transducer) |
Create an HfstTransitionGraph equivalent to HfstTransducer transducer. FIXME: move to a separate file. More... | |
HFSTDLL HfstTransitionGraph & | insert_freely (const HfstSymbolPair &symbol_pair, typename C::WeightType weight) |
Insert freely any number of symbol_pair in the graph with weight weight. More... | |
HFSTDLL HfstTransitionGraph & | insert_freely (const HfstSymbolPairSet &symbol_pairs, typename C::WeightType weight) |
Insert freely any number of any symbol pair in symbol_pairs in the graph with weight weight. More... | |
HFSTDLL HfstTransitionGraph & | insert_freely (const HfstTransitionGraph &graph) |
Insert freely any number of graph in this graph. More... | |
HFSTDLL bool | is_final_state (HfstState s) const |
Whether state s is final. FIXME: return positive infinity instead if not final. More... | |
HFSTDLL int | longest_path_size () |
HFSTDLL HfstTransitionGraph & | operator= (const HfstTransitionGraph &graph) |
The assignment operator. More... | |
HFSTDLL const HfstTransitions & | operator[] (HfstState s) const |
Get the set of transitions of state s in this graph. More... | |
HFSTDLL std::vector< unsigned int > | path_sizes () |
HFSTDLL void | prune_alphabet (bool force=true) |
Remove all symbols that do not occur in transitions of the graph from its alphabet. More... | |
HFSTDLL void | remove_symbol_from_alphabet (const HfstSymbol &symbol) |
Remove symbol symbol from the alphabet of the graph. More... | |
HFSTDLL void | remove_transition (HfstState s, const HfstTransition< C > &transition, bool remove_symbols_from_alphabet=false) |
Remove transition transition from state s. remove_symbols_from_alphabet defines whether symbols in transition are removed from the alphabet if they are no longer used in the graph. More... | |
HFSTDLL void | set_final_weight (HfstState s, const typename C::WeightType &weight) |
Set the final weight of state s in this graph to weight. More... | |
HFSTDLL HfstTransitionGraph & | sort_arcs (void) |
Sort the arcs of this transducer according to input and output symbols. More... | |
std::vector< HfstState > | states () const |
The states of the graph. More... | |
HfstBasicStates | states_and_transitions () const |
The states of the graph and their transitions. More... | |
HFSTDLL HfstTransitionGraph & | substitute (const HfstSymbol &old_symbol, const HfstSymbol &new_symbol, bool input_side=true, bool output_side=true) |
Substitute old_symbol with new_symbol in all transitions. input_side and output_side define whether the substitution is made on input and output sides. More... | |
HfstTransitionGraph & | substitute (const HfstSymbolSubstitutions &substitutions) |
Substitute all transitions as defined in substitutions. More... | |
HFSTDLL HfstTransitionGraph & | substitute (const HfstSymbolPairSubstitutions &substitutions) |
Substitute all transitions as defined in substitutions. More... | |
HFSTDLL HfstTransitionGraph & | substitute (const HfstSymbolPair &sp, const HfstSymbolPairSet &sps) |
Substitute all transitions sp with a set of transitions sps. More... | |
HFSTDLL HfstTransitionGraph & | substitute (const HfstSymbolPair &old_pair, const HfstSymbolPair &new_pair) |
Substitute all transitions old_pair with new_pair. More... | |
HFSTDLL HfstTransitionGraph & | substitute (bool(*func)(const HfstSymbolPair &sp, HfstSymbolPairSet &sps)) |
Substitute all transitions with a set of transitions as defined by function func. More... | |
HFSTDLL HfstTransitionGraph & | substitute (const HfstSymbolPair &sp, const HfstTransitionGraph &graph) |
Substitute all transitions sp with a copy of graph. More... | |
HFSTDLL const HfstTransitions & | transitions (HfstState s) const |
Alternative name for operator[]. More... | |
HFSTDLL HfstTransitions & | transitions (HfstState s) |
Get mutable transitions. More... | |
HFSTDLL void | write_in_att_format (std::ostream &os, bool write_weights=true) |
Write the graph in AT&T format to ostream os. write_weights defines whether weights are printed. More... | |
HFSTDLL void | write_in_att_format (FILE *file, bool write_weights=true) |
Write the graph in AT&T format to FILE file. write_weights defines whether weights are printed. More... | |
HFSTDLL void | write_in_att_format_number (FILE *file, bool write_weights=true) |
Write the graph in AT&T format to FILE file using numbers instead of symbol names. write_weights defines whether weights are printed. More... | |
HFSTDLL void | write_in_prolog_format (FILE *file, const std::string &name, bool write_weights=true) |
Write the graph in prolog format to FILE file. write_weights defines whether weights are printed (todo). More... | |
HFSTDLL void | write_in_prolog_format (std::ostream &os, const std::string &name, bool write_weights=true) |
Write the graph in prolog format to ostream os. write_weights defines whether weights are printed (todo). More... | |
HFSTDLL void | write_in_xfst_format (std::ostream &os, bool write_weights=true) |
Write the graph in xfst text format to ostream os. write_weights defines whether weights are printed (todo). More... | |
HFSTDLL void | write_in_xfst_format (FILE *file, bool write_weights=true) |
Write the graph in xfst text format to FILE file. write_weights defines whether weights are printed (todo). More... | |
Static Public Member Functions | |
static HFSTDLL HfstTransitionGraph | read_in_att_format (std::istream &is, std::string epsilon_symbol, unsigned int &linecount) |
Create an HfstTransitionGraph as defined in AT&T transducer format in istream is. epsilon_symbol defines how epsilon is represented. More... | |
static HFSTDLL HfstTransitionGraph | read_in_att_format (FILE *file, std::string epsilon_symbol, unsigned int &linecount) |
Create an HfstTransitionGraph as defined in AT&T transducer format in FILE file. epsilon_symbol defines how epsilon is represented. More... | |
Public Attributes | |
std::string | name |
The name of the graph. More... | |
A simple transition graph format that consists of states and transitions between those states.
Probably the easiest way to use this template is to choose the implementations HfstBasicTransducer (HfstTransitionGraph<HfstTropicalTransducerTransitionData>) and HfstBasicTransition (HfstTransition<HfstTropicalTransducerTransitionData>). The class HfstTropicalTransducerTransitionData contains an input string, an output string and a float weight. HfstBasicTransducer is the implementation that is used as an example in this documentation.
An example of creating a HfstBasicTransducer [foo:bar baz:baz] with weight 0.4 from scratch:
```cpp
// Create an empty transducer
// The transducer has initially one start state (number zero)
// that is not final
HfstBasicTransducer fsm;
// Add two states to the transducer
fsm.add_state(1);
fsm.add_state(2);
// Create a transition [foo:bar] leading to state 1 with weight 0.1 ...
HfstBasicTransition tr(1, "foo", "bar", 0.1);
// ... and add it to state zero
fsm.add_transition(0, tr);
// Add a transition [baz:baz] with weight 0 from state 1 to state 2
fsm.add_transition(1, HfstBasicTransition(2, "baz", "baz", 0.0));
// Set state 2 as final with weight 0.3
fsm.set_final_weight(2, 0.3);
```
An example of iterating through a HfstBasicTransducer's states and transitions when printing it in AT&T format to stderr:
```cpp
// The first state is always number zero.
unsigned int source_state = 0;

// Go through all states
for (HfstBasicTransducer::const_iterator it = fsm.begin();
     it != fsm.end(); it++)
{
    // Go through all transitions
    for (HfstBasicTransducer::HfstTransitions::const_iterator tr_it
             = it->begin();
         tr_it != it->end(); tr_it++)
    {
        std::cerr << source_state << "\t"
                  << tr_it->get_target_state() << "\t"
                  << tr_it->get_input_symbol() << "\t"
                  << tr_it->get_output_symbol() << "\t"
                  << tr_it->get_weight() << std::endl;
    }

    if (fsm.is_final_state(source_state))
    {
        std::cerr << source_state << "\t"
                  << fsm.get_final_weight(source_state) << std::endl;
    }

    // the next state is numbered source_state + 1
    source_state++;
}
```
typedef HfstStates::const_iterator const_iterator |
A const iterator type that points to a state in a graph.
The value pointed to by the iterator is of type HfstTransitions.
typedef C::SymbolType HfstSymbol |
Datatype for a symbol in a transition.
typedef std::pair<HfstSymbol, HfstSymbol> HfstSymbolPair |
Datatype for a symbol pair in a transition.
typedef std::set<HfstSymbolPair> HfstSymbolPairSet |
A set of symbol pairs.
typedef std::vector<HfstSymbolPair> HfstSymbolPairVector |
A vector of symbol pairs.
typedef std::set<HfstSymbol> HfstSymbolSet |
A set of symbols.
typedef std::set<HfstSymbol> HfstTransitionGraphAlphabet |
Datatype for the alphabet of a graph.
typedef std::vector<HfstTransition<C> > HfstTransitions |
Datatype for the transitions of a state in a graph.
HFSTDLL HfstTransitionGraph (void) |
inline |
Create a graph with one initial state that has state number zero and is not a final state, i.e. create an empty graph.
HFSTDLL HfstTransitionGraph (const HfstTransitionGraph &graph) |
inline |
Create a deep copy of HfstTransitionGraph graph.
HFSTDLL HfstTransitionGraph (const hfst::HfstTransducer &transducer) |
inline |
Create an HfstTransitionGraph equivalent to HfstTransducer transducer. FIXME: move to a separate file.
HFSTDLL HfstState add_state (void) |
inline |
Add a new state to this graph and return its number.
HFSTDLL HfstState add_state (HfstState s) |
Add a state s to this graph.
If the state already exists, it is not added again. All states with state number smaller than s are also added to the graph if they did not exist before.
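The rule that adding state s also creates every missing state with a smaller number can be pictured with a small stand-in. The sketch below is not HFST code: `ToyGraph`, `ToyTransition` and their members are hypothetical names, and the sketch assumes that states are kept in a vector indexed by state number.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only, NOT the HFST types.
struct ToyTransition {
    unsigned int target;
    std::string input;
    std::string output;
    float weight;
};

struct ToyGraph {
    // One entry per state; the index is the state number.
    std::vector<std::vector<ToyTransition> > states;

    // Mirrors the documented default constructor: one state, number zero.
    ToyGraph() : states(1) {}

    // Add a fresh state and return its number.
    unsigned int add_state() {
        states.push_back(std::vector<ToyTransition>());
        return static_cast<unsigned int>(states.size() - 1);
    }

    // Add state s. Every missing state below s is created as well;
    // an existing state is left untouched.
    unsigned int add_state(unsigned int s) {
        if (s >= states.size()) {
            states.resize(static_cast<std::size_t>(s) + 1);
        }
        assert(s < states.size());
        return s;
    }

    // The biggest state number in use.
    unsigned int get_max_state() const {
        return static_cast<unsigned int>(states.size() - 1);
    }
};
```

In this model, calling `add_state(3)` on a fresh graph leaves states 0 to 3 in place, and a later `add_state(1)` changes nothing.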
HFSTDLL void add_symbol_to_alphabet (const HfstSymbol &symbol) |
inline |
Explicitly add symbol to the alphabet of the graph.
HFSTDLL void add_symbols_to_alphabet (const HfstSymbolSet &symbols) |
inline |
Same as add_symbol_to_alphabet for each symbol in symbols.
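HFSTDLL void add_transition (HfstState s, const HfstTransition< C > &transition, bool add_symbols_to_alphabet=true) |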
inline |
Add a transition transition to state s.
If state s does not exist, it is created.
HFSTDLL iterator begin () |
inline |
Get an iterator to the beginning of the states in the graph.
For an example, see the class description of HfstTransitionGraph.
HFSTDLL const_iterator begin () const |
inline |
Get a const iterator to the beginning of states in the graph.
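HFSTDLL HfstTransitionGraph & disjunct (const StringPairVector &spv, typename C::WeightType weight) |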
inline |
Disjunct this graph with a one-path graph defined by string pair vector spv with weight weight.
There is no way to test whether a graph is a trie, so the use of this function is probably limited to fast construction of a lexicon. Here is an example:
```cpp
HfstBasicTransducer lexicon;
HfstTokenizer TOK;
lexicon.disjunct(TOK.tokenize("dog"), 0.3);
lexicon.disjunct(TOK.tokenize("cat"), 0.5);
lexicon.disjunct(TOK.tokenize("elephant"), 1.6);
```
HFSTDLL iterator end () |
inline |
Get an iterator to the end of states (last state + 1) in the graph.
HFSTDLL const_iterator end () const |
inline |
Get a const iterator to the end of states (last state + 1) in the graph.
HFSTDLL const HfstTransitionGraphAlphabet & get_alphabet () const |
inline |
Get the set of HfstSymbols in the alphabet of the graph.
The HfstSymbols do not necessarily occur in any transitions of the graph. Epsilon, unknown and identity symbols are always included in the alphabet.
HFSTDLL C::WeightType get_final_weight (HfstState s) const |
inline |
Get the final weight of state s in this graph.
HFSTDLL HfstState get_max_state () const |
inline |
Get the biggest state number in use.
HFSTDLL HfstTransitionGraph & harmonize (HfstTransitionGraph &another) |
inline |
Harmonize this HfstTransitionGraph and another.
In harmonization the unknown and identity symbols in transitions of both graphs are expanded according to the symbols that are previously unknown to the graph.
For example the graphs
[a:b ?:?]
[c:d ? ?:c]
are expanded to
[ a:b [?:? | ?:c | ?:d | c:d | d:c | c:? | d:?] ]
[ c:d [? | a | b] [?:c | a:c | b:?] ]
when harmonized. The symbol "?" means @_UNKNOWN_SYMBOL_@ in either or both sides of a transition (transitions of type [?:x], [x:?] and [?:?]). The transition [?] means [@_IDENTITY_SYMBOL_@].
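The expansion is driven by the symbols that each graph has not seen before. As a rough illustration (not the HFST implementation), the sketch below computes that set as a plain set difference of two alphabets; `ToyAlphabet` and `previously_unknown` are hypothetical names, and special symbols are left out of the alphabets for simplicity.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>

// Hypothetical stand-in for an alphabet, NOT the HFST type.
typedef std::set<std::string> ToyAlphabet;

// Symbols that occur in `other` but not in `own`, i.e. the symbols that
// are previously unknown to the graph whose alphabet is `own`.
ToyAlphabet previously_unknown(const ToyAlphabet &own,
                               const ToyAlphabet &other) {
    ToyAlphabet result;
    std::set_difference(other.begin(), other.end(),
                        own.begin(), own.end(),
                        std::inserter(result, result.begin()));
    // A difference can never be larger than the set it was taken from.
    assert(result.size() <= other.size());
    return result;
}
```

For the two graphs in the example, the alphabets are {a, b} and {c, d}. The first graph therefore learns c and d, which is why its [?:?] transition gains the alternatives built from c and d, and the second graph learns a and b.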
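HFSTDLL HfstTransitionGraph & insert_freely (const HfstSymbolPair &symbol_pair, typename C::WeightType weight) |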
inline |
Insert freely any number of symbol_pair in the graph with weight weight.
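HFSTDLL HfstTransitionGraph & insert_freely (const HfstSymbolPairSet &symbol_pairs, typename C::WeightType weight) |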
inline |
Insert freely any number of any symbol pair in symbol_pairs in the graph with weight weight.
HFSTDLL HfstTransitionGraph & insert_freely (const HfstTransitionGraph &graph) |
inline |
Insert freely any number of graph in this graph.
HFSTDLL bool is_final_state (HfstState s) const |
inline |
Whether state s is final. FIXME: return positive infinity instead if not final.
HFSTDLL int longest_path_size () |
inline |
The length of the longest string accepted by this graph. If no string is accepted, return -1.
HFSTDLL HfstTransitionGraph & operator= (const HfstTransitionGraph &graph) |
inline |
The assignment operator.
HFSTDLL const HfstTransitions & operator[] (HfstState s) const |
inline |
Get the set of transitions of state s in this graph.
If the state does not exist, a StateIndexOutOfBoundsException is thrown.
HFSTDLL std::vector< unsigned int > path_sizes () |
inline |
The lengths of strings accepted by this graph, in descending order. If no string is accepted, return an empty vector.
HFSTDLL void prune_alphabet (bool force=true) |
inline |
Remove all symbols that do not occur in transitions of the graph from its alphabet.
Parameters
force | Whether unused symbols are removed even if unknown or identity symbols occur in transitions. |
Epsilon, unknown and identity symbols are always included in the alphabet.
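The pruning rule can be sketched on plain containers. This is not HFST code: `ToyAlphabet`, `ToySymbolPair` and `toy_prune_alphabet` are hypothetical names, the epsilon spelling is an assumption, and the handling of `force` shown here (skip pruning altogether when unknown or identity symbols occur and `force` is false) is one plausible reading of the parameter description rather than confirmed behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins, NOT the HFST types.
typedef std::set<std::string> ToyAlphabet;
typedef std::pair<std::string, std::string> ToySymbolPair;

// The unknown and identity spellings appear in this documentation;
// the epsilon spelling is an assumption made for this sketch.
static const char *const TOY_EPSILON = "@_EPSILON_SYMBOL_@";
static const char *const TOY_UNKNOWN = "@_UNKNOWN_SYMBOL_@";
static const char *const TOY_IDENTITY = "@_IDENTITY_SYMBOL_@";

ToyAlphabet toy_prune_alphabet(const ToyAlphabet &alphabet,
                               const std::vector<ToySymbolPair> &transitions,
                               bool force = true) {
    // Collect the symbols that actually occur in transitions.
    ToyAlphabet used;
    bool has_unknown_or_identity = false;
    for (std::size_t i = 0; i < transitions.size(); ++i) {
        const std::string &in = transitions[i].first;
        const std::string &out = transitions[i].second;
        used.insert(in);
        used.insert(out);
        if (in == TOY_UNKNOWN || in == TOY_IDENTITY ||
            out == TOY_UNKNOWN || out == TOY_IDENTITY) {
            has_unknown_or_identity = true;
        }
    }

    // Assumed reading of `force`: without it, nothing is removed when
    // unknown or identity symbols occur in some transition.
    if (!force && has_unknown_or_identity) {
        return alphabet;
    }

    // Keep only the symbols that are in use ...
    ToyAlphabet pruned;
    for (ToyAlphabet::const_iterator it = alphabet.begin();
         it != alphabet.end(); ++it) {
        if (used.count(*it) == 1) {
            pruned.insert(*it);
        }
    }
    // ... plus the three special symbols, which always stay.
    pruned.insert(TOY_EPSILON);
    pruned.insert(TOY_UNKNOWN);
    pruned.insert(TOY_IDENTITY);
    assert(pruned.count(TOY_EPSILON) == 1);
    return pruned;
}
```

The reason for being careful here is that what an unknown or identity transition matches depends on the alphabet, so removing symbols can change the meaning of such transitions.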
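static HFSTDLL HfstTransitionGraph read_in_att_format (std::istream &is, std::string epsilon_symbol, unsigned int &linecount) |
inline static |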
Create an HfstTransitionGraph as defined in AT&T transducer format in istream is. epsilon_symbol defines how epsilon is represented.
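static HFSTDLL HfstTransitionGraph read_in_att_format (FILE *file, std::string epsilon_symbol, unsigned int &linecount) |
inline static |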
Create an HfstTransitionGraph as defined in AT&T transducer format in FILE file. epsilon_symbol defines how epsilon is represented.
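To make the role of epsilon_symbol concrete, here is a rough sketch of parsing a single line of AT&T text. It is not the HFST reader: `ToyAttLine` and `toy_parse_att_line` are hypothetical, the fields are assumed to be tab-separated as in the printing example of the class description, the internal epsilon spelling is an assumption, and number fields are not validated.

```cpp
#include <cassert>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one parsed line; NOT an HFST type.
struct ToyAttLine {
    bool is_final;        // true for a "state [weight]" line
    unsigned int source;  // source state, or the final state itself
    unsigned int target;  // unused for final-state lines
    std::string input;
    std::string output;
    float weight;         // 0 when the weight column is absent
};

// Internal epsilon spelling: an assumption of this sketch.
static const char *const TOY_EPSILON = "@_EPSILON_SYMBOL_@";

// Parse one tab-separated line. Returns false if the number of fields
// matches neither a transition line (4 or 5) nor a final-state line (1 or 2).
bool toy_parse_att_line(const std::string &line,
                        const std::string &epsilon_symbol,
                        ToyAttLine &out) {
    assert(!epsilon_symbol.empty());

    std::vector<std::string> fields;
    std::istringstream stream(line);
    std::string field;
    while (std::getline(stream, field, '\t')) {
        fields.push_back(field);
    }

    const std::size_t n = fields.size();
    if (n != 1 && n != 2 && n != 4 && n != 5) {
        return false;
    }

    out.is_final = (n <= 2);
    out.source = static_cast<unsigned int>(
        std::strtoul(fields[0].c_str(), NULL, 10));
    out.target = 0;
    out.input.clear();
    out.output.clear();
    out.weight = 0.0f;

    if (out.is_final) {
        if (n == 2) {
            out.weight = static_cast<float>(std::atof(fields[1].c_str()));
        }
        return true;
    }

    out.target = static_cast<unsigned int>(
        std::strtoul(fields[1].c_str(), NULL, 10));
    // Whatever the file uses for epsilon is mapped to one internal spelling.
    out.input = (fields[2] == epsilon_symbol) ? std::string(TOY_EPSILON)
                                              : fields[2];
    out.output = (fields[3] == epsilon_symbol) ? std::string(TOY_EPSILON)
                                               : fields[3];
    if (n == 5) {
        out.weight = static_cast<float>(std::atof(fields[4].c_str()));
    }
    return true;
}
```

With an epsilon_symbol of, say, "@0@" (an example value only), a line whose input field is "@0@" yields an epsilon input, while every other symbol is kept as written.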
HFSTDLL void remove_symbol_from_alphabet (const HfstSymbol &symbol) |
inline |
Remove symbol symbol from the alphabet of the graph.
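HFSTDLL void remove_transition (HfstState s, const HfstTransition< C > &transition, bool remove_symbols_from_alphabet=false) |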
inline |
Remove transition transition from state s. remove_symbols_from_alphabet defines whether symbols in transition are removed from the alphabet if they are no longer used in the graph.
If state or transition does not exist, nothing is done.
HFSTDLL void set_final_weight (HfstState s, const typename C::WeightType &weight) |
inline |
Set the final weight of state s in this graph to weight.
If the state does not exist, it is created.
HFSTDLL HfstTransitionGraph & sort_arcs (void) |
inline |
Sort the arcs of this transducer according to input and output symbols.
std::vector< HfstState > states () const |
inline |
The states of the graph.
HfstBasicStates states_and_transitions () const |
inline |
The states of the graph and their transitions.
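HFSTDLL HfstTransitionGraph & substitute (const HfstSymbol &old_symbol, const HfstSymbol &new_symbol, bool input_side=true, bool output_side=true) |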
inline |
Substitute old_symbol with new_symbol in all transitions. input_side and output_side define whether the substitution is made on input and output sides.
HfstTransitionGraph & substitute (const HfstSymbolSubstitutions &substitutions) |
inline |
Substitute all transitions as defined in substitutions.
HFSTDLL HfstTransitionGraph & substitute (const HfstSymbolPairSubstitutions &substitutions) |
inline |
Substitute all transitions as defined in substitutions.
For each transition x:y, substitutions is searched and if a mapping x:y -> X:Y is found, the transition x:y is replaced with X:Y. If no mapping is found, the transition remains the same.
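The lookup rule can be shown on plain containers. The sketch below is not HFST code: `ToySymbolPair`, `ToyPairSubstitutions` and `toy_substitute_pairs` are hypothetical names, and it assumes that the substitutions behave like a map from symbol pairs to symbol pairs.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins, NOT the HFST types.
typedef std::pair<std::string, std::string> ToySymbolPair;
typedef std::map<ToySymbolPair, ToySymbolPair> ToyPairSubstitutions;

// Replace every pair x:y that has a mapping x:y -> X:Y and leave all
// other pairs unchanged. Returns the number of replaced pairs.
std::size_t toy_substitute_pairs(std::vector<ToySymbolPair> &transitions,
                                 const ToyPairSubstitutions &substitutions) {
    std::size_t replaced = 0;
    for (std::size_t i = 0; i < transitions.size(); ++i) {
        ToyPairSubstitutions::const_iterator it =
            substitutions.find(transitions[i]);
        if (it != substitutions.end()) {
            transitions[i] = it->second;
            ++replaced;
        }
    }
    assert(replaced <= transitions.size());
    return replaced;
}
```

Note that the whole pair is the lookup key: a mapping for a:b does not affect a transition a:c.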
HFSTDLL HfstTransitionGraph & substitute (const HfstSymbolPair &sp, const HfstSymbolPairSet &sps) |
inline |
Substitute all transitions sp with a set of transitions sps.
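HFSTDLL HfstTransitionGraph & substitute (const HfstSymbolPair &old_pair, const HfstSymbolPair &new_pair) |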
inline |
Substitute all transitions old_pair with new_pair.
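HFSTDLL HfstTransitionGraph & substitute (bool(*func)(const HfstSymbolPair &sp, HfstSymbolPairSet &sps)) |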
inline |
Substitute all transitions with a set of transitions as defined by function func.
func takes as its argument a transition sp and inserts into the set of transitions sps the transitions with which the original transition sp must be replaced. func returns a value indicating whether any substitution must be made, i.e. whether any transition was inserted into sps.
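A callback of this shape might look as follows. The sketch uses stand-in types so that it compiles on its own: `ToySymbolPair`, `ToySymbolPairSet`, `toy_expand_a_b` and the driver `toy_substitute` are hypothetical and only imitate the contract described above on a flat list of symbol pairs.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins, NOT the HFST types.
typedef std::pair<std::string, std::string> ToySymbolPair;
typedef std::set<ToySymbolPair> ToySymbolPairSet;

// Example callback: replace a:b with the two transitions a:c and a:d.
// Returns true only if something was inserted into sps.
bool toy_expand_a_b(const ToySymbolPair &sp, ToySymbolPairSet &sps) {
    if (sp.first == "a" && sp.second == "b") {
        sps.insert(ToySymbolPair("a", "c"));
        sps.insert(ToySymbolPair("a", "d"));
        return true;
    }
    return false;
}

// Hypothetical driver: apply func to each pair and collect the result.
std::vector<ToySymbolPair> toy_substitute(
        const std::vector<ToySymbolPair> &transitions,
        bool (*func)(const ToySymbolPair &sp, ToySymbolPairSet &sps)) {
    std::vector<ToySymbolPair> result;
    for (std::size_t i = 0; i < transitions.size(); ++i) {
        ToySymbolPairSet sps;
        if (func(transitions[i], sps)) {
            // A true return value promises at least one replacement.
            assert(!sps.empty());
            result.insert(result.end(), sps.begin(), sps.end());
        } else {
            // No substitution requested: keep the original pair.
            result.push_back(transitions[i]);
        }
    }
    return result;
}
```

Keeping the return value in step with the contents of sps matters, since the caller relies on it to decide whether the original transition is replaced at all.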
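HFSTDLL HfstTransitionGraph & substitute (const HfstSymbolPair &sp, const HfstTransitionGraph &graph) |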
inline |
Substitute all transitions sp with a copy of graph.
Copies of graph are attached to this graph with epsilon transitions.
The weights of the substituted transitions are copied to epsilon transitions that lead from the source state of each substituted transition to the initial state of a copy of graph.
The final weights in graph are copied to epsilon transitions that lead from the final states of graph (which are non-final after the substitution) to the target states of the substituted transitions sp in this graph.
HFSTDLL const HfstTransitions & transitions (HfstState s) const |
inline |
Alternative name for operator[].
The Python interface uses this function because '[]' is not a legal name.
HFSTDLL HfstTransitions & transitions (HfstState s) |
inline |
Get mutable transitions.
HFSTDLL void write_in_att_format (std::ostream &os, bool write_weights=true) |
inline |
Write the graph in AT&T format to ostream os. write_weights defines whether weights are printed.
HFSTDLL void write_in_att_format (FILE *file, bool write_weights=true) |
inline |
Write the graph in AT&T format to FILE file. write_weights defines whether weights are printed.
HFSTDLL void write_in_att_format_number (FILE *file, bool write_weights=true) |
inline |
Write the graph in AT&T format to FILE file using numbers instead of symbol names. write_weights defines whether weights are printed.
HFSTDLL void write_in_prolog_format (FILE *file, const std::string &name, bool write_weights=true) |
inline |
Write the graph in prolog format to FILE file. write_weights defines whether weights are printed (todo).
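HFSTDLL void write_in_prolog_format (std::ostream &os, const std::string &name, bool write_weights=true) |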
inline |
Write the graph in prolog format to ostream os. write_weights defines whether weights are printed (todo).
HFSTDLL void write_in_xfst_format (std::ostream &os, bool write_weights=true) |
inline |
Write the graph in xfst text format to ostream os. write_weights defines whether weights are printed (todo).
HFSTDLL void write_in_xfst_format (FILE *file, bool write_weights=true) |
inline |
Write the graph in xfst text format to FILE file. write_weights defines whether weights are printed (todo).
std::string name |
The name of the graph.