--------------------- Arithmetic Coding Example Routines ------------------ Copyright 1990 Massachusetts Institute of Technology All rights reserved. Permission to use, copy, or modify this software and its documentation for educational and research purposes only and without fee is hereby granted, provided that this copyright notice appear on all copies and supporting documentation. For any other uses of this software, in original or modified form, including but not limited to distribution in whole or in part, specific prior permission must be obtained from M.I.T. and the author. These programs shall not be used, rewritten, or adapted as the basis of a commercial software or hardware product without first obtaining appropriate licenses from M.I.T. M.I.T. and the author make no representations about the suitability of this software for any purpose. It is provided "as is" without express or implied warranty. ----------------------------------------------------------------------- For a demo, type: make acms_example acms_example 100 4.0 The file "acms.c" contains C language subroutines to perform K-ary arithmetic coding, an efficient form of data compression. The details of the technique are provided in chapter 3 of: Popat, Ashok C., "Scalar Quantization with Arithmetic Coding," SM thesis, MIT Dept. of Elec. Eng. and Comp. Sci., Cambridge, Massachusetts, June 1990. The thesis, in postscript form, is available by anonymous ftp from media-lab.media.mit.edu (18.85.0.2): pub/k-arith-code/doc?.ps.Z If you can't ftp, let me know and I'll email it to you in small chunks. The algorithm doesn't require any division. Encoding requires two multiplications, two additions, and one barrel-shift per source letter. Decoding requires one barrel-shift and not more than ceiling(log_2(K))+1 multiplications and additions per source letter, where K is the alphabet size. Currently, the routines "encode" and "decode" are set up to allow efficient encoding of a sequence of independent identically distributed (iid) discrete random variables. It is not set up for adaptive coding, although extension to the adaptive case is straightforward. Basically, you'd set up a probability matrix rather than a probability vector. You'd switch from row to row of the matrix, according to some state variable in your model. Every row would have to have been rounded to admissible values using, for example, the supplied routine optrnd(). Usage: Here, the source random variables will be called "letters," and a particular sequence of them will be called a "message". N will denote the number of letters in the message; that is, its length. The "alphabet" for the source is the set of integers between 0 to K-1, inclusive. Thus, the alphabet size is K. The letters, being identically distributed, are described by a single probability mass function p[k], k=0,...,K-1. To encode a message: encode(K,p,message,N,16,15,codebuf,&nbits); where Inputs: - K: the alphabet size, type "int". - p: the probability mass function array, type "double *" - message: an array of letters to be encoded, type = "int *" - N: the number of letters in the message, type = "int" - (the values 16 and 15 shouldn't be touched for now) Outputs: - codebuf: the resulting array of code bits, type = "int *" - nbits: set to the number of bits in codebuf, type = "int *" To recover the message from codebuf, the following call to "decode" can be used: decode(K,p,recovered_message,nlet,16,15,codebuf,nbits); where Inputs: - K: the alphabet size, type "int". - p: the probability mass function array, type "double *" - N: the number of letters in the message, type = "int" - (the values 16 and 15 shouldn't be touched for now) - codebuf: the array of code bits, type = "int *" - nbits: the number of bits in codebuf, type = "int" Outputs: - recovered_message: the decoded array of letters, type = "int *" EXAMPLE: See the file "acms_example.c" for an example of how to use the routines. See my thesis for an explanation of what's really being done. Good luck. If you find mistakes in the thesis or code, or if you make any improvements to the code, I'd like to hear about it. Kris Popat popat@media-lab.mit.edu MIT Rm. E15-355 Cambridge, MA 02139