This page lists questions about automata and other rational/regular expressions that were asked on Stackoverflow, and where Vcsn seems to be an appropriate tool to compute the answer.
The set of all strings beginning with 101 and ending with 01010.
First, let's define our "context": we work with "labels are letters" (lal), on the alphabet $\{0, 1\}$. We don't use weights, or rather, we use the traditional Boolean weights: $\mathbb{B}$.
import vcsn
c = vcsn.context('lal_char(01), b')
c
Then, we build our expression using an unusual operator: $\mathsf{E} \& \mathsf{F}$ denotes the conjunction of expressions $\mathsf{E}$ and $\mathsf{F}$. In this case (unweighted/Boolean automata), it denotes exactly the intersection of languages.
e = c.expression('(101[01]*)&([01]*01010)')
e
We want to normalize this extended expression (it has conjunction and complement operators) into a basic expression. To this end, we first convert it to an automaton.
a = e.automaton()
a
and then we convert this automaton into a basic expression:
a.expression()
Or, in ASCII:
print(a.expression())
I'd like to know if it's possible to match lines that don't contain a specific word (e.g. hede) using a regular expression?
First, let's define that alphabet we work on: from $a$ to $z$ for instance.
import vcsn
c = vcsn.context('lal_char(a-z), b')
c
Then we define our expression, which is extended (it uses the complement operator), so to normalize it, we first convert it into automaton (with _expression_.automaton
), from which we extract a basic expression (with _automaton_.expresion
).
e = c.expression('(hede){c}')
e
a = e.automaton()
a
a.expression()
Or, in ASCII (+
is usually denoted |
; \e
denotes the empty word; and [^]
denotes any character, usually written .
):
print(a.expression())
Is there a tool (or an algorithm) to convert a finite state machine into a regular expression?
Vcsn is tool for rational expressions and automata. See http://vcsn.lrde.epita.fr.
import vcsn
Build a random automaton $a$ of 4 states, labeled by letters to choose in ${a, b, c}$. Then "lift" it into an automaton $b$ labeled by expressions.
a = vcsn.context('lal(abc), b').random_automaton(4)
b = a.lift()
b
Eliminate state 2, then 3, etc. The order does not matter for correction (any order gives a correct result), but the quality of the result does depend on this order.
b = b.eliminate_state(2)
b
b = b.eliminate_state(3); b
b = b.eliminate_state(1); b
Eliminate the last state, and read the answer.
b = b.eliminate_state(0); b
Alternatively, you can use the method automaton.expression
to ask for the result.
a.expression()
You may see this algorithm run "interactively" using %demo
.
a = vcsn.context('lal(abc), b').random(10)
b = a.lift()
%demo b eliminate_state
I'm trying to construct a regular expression from a Finite Automaton but found my self completely stuck with this one.
%%automaton a
$ -> q3
q1 -> q2 a
q1 -> q3 b
q2 -> q2 a, b
q3 -> q4 a
q3 -> q3 b
q4 -> q4 a
q4 -> q1 b
q3 -> $
q4 -> $
a.expression()