Transducers, also called k-tape automata, are finite state machines whose transitions are labeled on several tapes. The labelset of a transducer is a Cartesian product of the labelsets of each tape: $L = L_1 \times \dots \times L_k$.
Usually, it is common to manipulate 2-tape transducers, and to consider one as the input tape, and the other as the output tape. For example, we can define a 2-tape transducer with the first tape accepting letters in [a-c], and the same for the second tape:
import vcsn
ctx = vcsn.context("lat<lal_char(abc), lal_char(abc)>, b")
ctx
Now we can define a transducer that will transform every a into b, and keep the rest of the letters. When writing the expression, to delimit the labels (a letter for each tape), we have to use simple quotes.
r = ctx.expression("(a|b+b|b+c|c)*")
r
r.automaton()
Similarly, it is possible to define weighted transducers, as for weighted automata:
ctxw = vcsn.context("lat<lan_char(ab), lan_char(xy)>, z")
ctxw
r = ctxw.expression("(a|x)*((a|y)(b|x))*(b|y)*")
r
r.thompson()
This transducer transforms the as at the beginning into xs, then ab into yx, then bs into ys. As you can see, it's possible to have $\varepsilon$-transitions in a transducer.
Keep in mind that while it is the common use-case, transducers are not limited to 2 tapes, but can have an arbitrary number of tapes. The notion of input tape and output tape becomes fuzzy, and the problem will have to be addressed in the algorithms' interface.