# Technical Reports

## HPL-2012-10

**Type Classes of Context Trees**

*Martin, Alvaro; Seroussi, Gadiel; Weinberger, Marcelo J.*

HP Laboratories

HPL-2012-10

**Keyword(s):** context trees; method of types; enumeration; Markov chains; data compression

**Abstract:** It is well known that a tree model does not always admit a finite-state machine (FSM) representation with the same (minimal) number of parameters. Therefore, known characterizations of type classes for FSMs do not apply, in general, to tree models. In this paper, the type class of a sequence with respect to a given context tree $T$ is studied. An exact formula is derived for the size of the class, extending Whittle's formula for type classes with respect to FSMs. The derivation is more intricate than in the FSM case, since some basic properties of FSM types do not hold in general for tree types. The derivation also yields an efficient enumeration of the tree type class. A formula for the number of type classes with respect to $T$ is also derived. The formula is asymptotically tight up to a multiplicative constant and also extends the corresponding result for FSMs. The asymptotic behavior of the number of type classes, and of the size of a class, is expressed in terms of the so-called minimal canonical extension of $T$, a tree that is generally larger than $T$ but smaller than its FSM closure.

35 Pages

Additional Publication Information: Submitted to IEEE Transactions on Information Theory

External Posting Date: January 21, 2012. Approved for External Publication

Internal Posting Date: January 21, 2012