Naming conventions
The short names below apply to local variables. Use meaningful (not hyper-short) names for long-lived entities (e.g., function names, members, etc.).
automaton: aut, aut_, Aut, Automaton
LabelSet: ls, labelset, LabelSet
GenSet: gs, GenSet
context: ctx, Context, Ctx
WeightSet: ws, weightset, WeightSet
Weight/word: w
label: l
state: s
transition: t
ostream: os, o
istream: is
The printing functions take the printee first, then the stream.
c++filt: demangles C++ symbol names.
valgrind: detects memory issues.
nm: lists the symbols in object files. Use nm -C to demangle C++ names.
ldd: lists the libraries an object file depends upon.
It is possible to use ASAN, but there are quite a few tricky issues.
Put -fsanitize=address in CXX, not in CXXFLAGS, because otherwise libtool tends to move it. This works for the vcsn command-line tools (vcsn standard -Ee a, for instance, properly loads ASAN), but not for Python, since it is unlikely that your Python was linked against ASAN. So you need to preload it, e.g., LD_PRELOAD=/usr/lib/clang/3.8.0/lib/linux/libclang_rt.asan-x86_64.so. Use ASAN_OPTIONS=detect_leaks=0 to disable LSAN.
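If you prefer to drive this from a script rather than from the shell, a minimal sketch combining the preload and the LSAN switch with the test wrapper used later in this document (the .so path is the example above; adjust both paths to your installation):

import os
import subprocess

# Preload the ASAN runtime into the Python process and disable LSAN,
# as described above.
env = dict(os.environ,
           LD_PRELOAD='/usr/lib/clang/3.8.0/lib/linux/libclang_rt.asan-x86_64.so',
           ASAN_OPTIONS='detect_leaks=0')
subprocess.check_call(['./_build/tests/bin/vcsn', '-e',
                       'tests/python/determinize.py'],
                      env=env)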
A number of environment variables (such as VCSN_DEBUG and VCSN_PLUGINDIR) tune Vcsn's behavior at runtime and in the test suite:
Don't remove temporary files (this is especially useful to keep debug symbols in plugins).
Augment output with debugging information.
Display information about the registration of and queries for dyn algorithms.
Where data is stored at runtime. See VCSN_PLUGINDIR.
Specify that "power" should perform the naive iterative multiplicative approach, instead of the squaring one.
Force the display of useless parentheses.
Where the runtime context instantiations are generated and compiled.
Force a more verbose display of expressions (XML-like), where one can actually see the nesting of the structures.
The Python directory.
Disable the generation of a random seed and stick to the compile-time default seed. This ensures that two successive runs are identical.
Path to the folder in which compiled contexts should be stored.
When reporting compilation errors, include the full compiler error message. If defined as an integer greater than or equal to 2, dyn compilations are also reported.
Make the test suite more verbose.
Set to enable Bison parser/Flex scanner tracing. Can be an integer to denote nesting (useful for instance for dot parsing, which can trigger expression parsing: specify how many layers you want to make verbose).
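Since these variables are read from the process environment, for quick experiments they can also be set from Python before the feature that consults them is used. A minimal sketch with VCSN_DEBUG (described above and used again in the debugging section below); it assumes the variable is consulted when the runtime compilation is actually triggered:

import os
os.environ['VCSN_DEBUG'] = '1'   # keep temporary files (assumed to be read lazily)

import vcsn
# The first operation that triggers a runtime compilation should now
# leave its temporary files (and thus debug symbols) around.
vcsn.context('lal_char(ab), b').expression('[^]*a').derived_term()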
To run the test suite with 4 processes in parallel (e.g., if you have 4 threads, as with an i7), run
$ make -j4 check
To run some of the tests (e.g., tafkit/determinize.chk), run
$ make -j4 tests/python/determinize.log
or
$ make -j4 check TESTS=tests/python/determinize.py
The latter will create both tests/test-suite.log and tests/python/determinize.log; the former creates only tests/python/determinize.log.
Both will run make all
, which can take a while. If you know what you are doing (for instance you want to check something which is compiled at runtime), you can avoid this make all
by running the test by hand:
$ ./_build/tests/bin/vcsn -e tests/python/determinize.py
You may also run only the unit/rat/demo/tafkit tests:
$ make -j4 check-tafkit
Creating a new test case is quite easy: select the proper file (usually the name of the tests/python/*.py file is clear enough, but check with git grep ALGO tests/python that the ALGO is not actually tested in some less obvious place), then follow what the other tests do.
Avoid creating too many files. For instance, prefix and suffix do not deserve two different files; they can both live in prefix.py.
To add a new test file such as tests/python/foo.py:
create tests/python/foo.py by copying some existing file (see the sketch after this list);
make sure it is executable:
$ chmod +x tests/python/foo.py
add this file to the repository:
$ git add tests/python/foo.py
add this file to the list of tests in tests/python/local.mk, respecting the alphabetical order;
run just this test to make sure that everything is ok:
$ ./_build/35s/tests/bin/vcsn -e tests/python/foo.py
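For reference, a minimal sketch of what such a file might contain. It assumes, based on copying an existing test, that tests are plain Python scripts using the vcsn module; the local test helper module and its CHECK_EQ function are assumptions here, so take the exact boilerplate from a real test:

#! /usr/bin/env python

import vcsn
from test import *   # assumed local helper module providing CHECK_EQ, etc.

# Illustrative check: determinizing a standard automaton yields a
# deterministic automaton.
a = vcsn.context('lal_char(ab), b').expression('(a+b)*a').standard()
CHECK_EQ(True, a.determinize().is_deterministic())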
The script _build/35s/tests/bin/vcsn
defines the environment needed to run a non-installed version of Vcsn. To debug a Python program, use this:
./_build/35s/tests/bin/vcsn run gdb --args python -c "import vcsn;
vcsn.context('lal_char(ab), b').expression('[^]*a').derived_term()"
for instance. To run a TAF-Kit command under Valgrind:
./_build/35s/tests/bin/vcsn run valgrind ./_build/35s/bin/vcsn-tafkit cat -Ee a
Note that the debug symbols (at least on OS X) are in the *.o
file, not the *.so
file. And the *.o
is removed when the compilation succeeds. To keep it, be sure to enable VCSN_DEBUG
:
VCSN_DEBUG=1 ./_build/35s/tests/bin/vcsn run gdb --args python -c "import vcsn;
vcsn.context('lal_char(ab), b').expression('[^]*a').derived_term()"
Also, if your build is optimized, some variables might be optimized away. So use VCSN_CXXFLAGS
to change that:
VCSN_CXXFLAGS='-std=c++14 -O0 -ggdb' \
VCSN_DEBUG=1 \
./_build/35s/tests/bin/vcsn run gdb --args python -c "import vcsn;
vcsn.context('lal_char(ab), b').expression('[^]*a').derived_term()"
Be sure to compile with -O3 -DNDEBUG for benchmarks. Be sure that the tools compared against (typically OpenFST) are also compiled this way!
When benching against OpenFST, do not use the convenient automaton.fstFOO
functions. For instance:
vcsn = [time(lambda : a.synchronize()) for a in auts]
ofst = [time(lambda : a.fstsynchronize()) for a in auts]
is completely unfair to OpenFST. Do this instead:
ofst = [time(lambda : subprocess.call(['fstsynchronize', fst.name, '/dev/null']))
        for fst in [a.as_fst() for a in auts]]
as it converts to FST format only once (and outside the benchmark command), and does not try to convert the result back to Vcsn. Actually, it does not even try to save the result at all (/dev/null
), since Vcsn does not try to save its result either.
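The time helper used in these snippets is not defined here; a minimal sketch, assuming it simply measures the wall-clock duration of a single call to the given function:

import timeit

def time(fun):
    # Wall-clock duration, in seconds, of a single call to fun.
    return timeit.timeit(fun, number=1)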
There are a number of rules to follow:
Prefix the commit title with the component it affects: doc, dyn, determinize, expression, style (for space changes, etc.)...
Include vcsn score results that demonstrate that there is no regression/an improvement.
Be sure to read other commit logs to understand the style and to copy it. Here are a few examples:
commit ab30db4e253cf88e167baf7209036e62df2820ce
Author: Akim Demaille <akim@lrde.epita.fr>
Date: Tue Jul 22 13:05:38 2014 +0200
translate: issue a parse error when reading an invalid context
Currently, when parsing a context, we actually get ready to accept any
type (including automata, ints, or pure junk). As a result, we also
try to compile the corresponding make_context, even for obvious parse
errors. This is slow, and provides a bad error message.
Speed is especially an issue with the current %automaton IPython magic
line, as it tries to compile the result at each key stroke, which
results in incredibly slow interaction, as we wait for the compiler to
fail...
* vcsn/dyn/context-parser.hh, lib/vcsn/dyn/context-parser.cc
(parse_context): New.
(context_): New overload to factor some calls.
* lib/vcsn/dyn/translate.cc: Use parse_context.
Remove duplicate handling of EOF.
commit e105139747ae4efc83e6e4b61545feaa25d2efec
Author: Akim Demaille <akim@lrde.epita.fr>
Date: Mon Jul 21 18:57:47 2014 +0200
doc: determinize: be clearer
* share/vcsn/notebooks/algos/determinize.ipynb: Show examples in Q.
commit 5308a3c6679efea3076b5de645b85f85bed1cb9a
Author: Akim Demaille <akim@lrde.epita.fr>
Date: Wed Jul 16 10:28:59 2014 +0200
polynomials: get back the print_weight_ function
This basically reverts
commit 61802295079835b2f71bb4b35545f5c8dcc2a7b7
Author: Akim Demaille <akim@lrde.epita.fr>
Date: Fri Jul 11 18:32:34 2014 +0200
determinize: use our angle-bracket notation for "weighted" states
I see polynomials of states in our future...
* vcsn/weightset/polynomialset.hh (print_): Move to...
* vcsn/weightset/weightset.hh (print_weight): here.
since we no longer need it outside polynomials, as we use
polynomialset's print function.
* vcsn/weightset/weightset.hh (print_weight): Move back to...
* vcsn/weightset/polynomialset.hh (print_weight_): here.
commit 8e183c358dd10246ef57dc25fabe122f4ddb0bb0
Author: Canh Luu <ldcanh@gmail.com>
Date: Wed Jul 2 10:13:07 2014 +0200
determinize: remove complete
Benchs show that even when _not_ asking for a complete result, the
mere fact of checking whether not to complete does incur a small
performance hit.
Worse yet: the next commit completely changes to core loop of the
algorithm, and making the automaton complete really adds cruft to the
algorithm. So instead of trying to merge two algorithms into one,
stop supporting the 'complete' feature in determinization, just call
complete() afterwards if needed.
On erebus, previous commit and then this one:
n-1 n
7.93s 7.80s: a.determinize() # a = ladybird(21)
6.64s 6.45s: a.determinize() # a = lal(a-zA-Z0-9).ladybird(18)
* vcsn/algos/determinize.hh, lib/vcsn/algos/determinize.cc,
* python/vcsn_cxx.cc, tests/python/determinize.py
* vcsn/algos/are-equivalent.hh, bin/vcsn-tafkit.cc,
* vcsn/ctx/lal_char, b.hh, vcsn/dyn/algos.hh: Here.
commit 044bf26c52b825ff155df4e103586cef68092102
Author: Luca Saiu <positron@gnu.org>
Date: Fri Jun 27 12:08:47 2014 +0200
subset automaton bugfix: copy shared_ptr instead of using reference to it
(Reported by Alexandre Duret-Lutz)
I was mistakenly storing a reference to a shared_ptr object, instead
of copying the shared_ptr into another shared_ptr, thus updating the
reference counter. Of course as soon as the referred object was
destroyed, hell broke loose.
* vcsn/core/partition-automaton.hh: One-character fix.
* TODO.txt: Write down that using a *const* shared_ptr for the referred
* automaton, as I did here, is often a good idea in decorators.
commit a17e4e65d6a67dda647776164da0a5e2c6a5799b
Author: Luca Saiu <positron@gnu.org>
Date: Thu Jun 5 16:49:14 2014 +0200
add new partition-automaton decorator, and use it for minimization
Build partition-automaton, a decorator related to the *shape* of an
automaton rather than to the algorithm which produced it, which
reflects a slight change of directions in decorators. Use it in
quotient to build a decorated automaton from classes -- without making
everything inefficient: the algorithm remains identical; it could have
been slightly simplified, but at the cost of introducing dynamic
lookups for state sets.
The decorator is intended to be useful for determinization as well.
Don't accept "brzozowski" as an algorithm variant at the static level;
rather, let the user call minimize_brzozowski directly (Rationale: the
decorator would be very different, and typically uninteresting).
Accept "brzozowski" as a variant at the Dyn and Python levels.
No change in score from the previous commit. On minsky:
1.64s 1.64s a.minimize("moore") # a = std([a-k]{2000})
2.34s 2.32s a.minimize("signature") # a = std([a-g]{800})
* vcsn/core/partition-automaton.hh: New.
* vcsn/algos/minimize.hh: Disable brzozowski; don't strip input.
* vcsn/algos/quotient.hh: Remove the VCSN_ORIGINS kludge, now obsolete here.
* vcsn/algos/minimize-moore.hh, vcsn/algos/minimize-signature.hh,
* vcsn/algos/minimize-weighted.hh: Change result types.
* lib/vcsn/dyn/context-parser.cc: Here.
* lib/vcsn/dyn/context-printer.cc: Here.
* vcsn/local.mk: Here.
* tests/python/minimize.py, tests/python/minimize.dir/intricate.exp.gv,
* tests/python/minimize.dir/nonlal.exp.gv,
* tests/python/minimize.dir/redundant.exp.gv,
* tests/python/minimize.dir/small-nfa.exp.gv,
* tests/python/minimize.dir/small-weighted.exp.gv,
* tests/python/local.mk: Update testsuite.
* vcsn/algos/minimize-brzozowski.hh: Write "brzozowski" with a 'w'.
* NEWS.txt: Brag.