Good probabilistic models are needed in data compression and many other applications. A good model must exploit contextual information, which requires high-order conditioning. As the number of conditioning variables increases, direct estimation of the distribution becomes exponentially more difficult. To circumvent this, we consider a means of adaptively combining several low-order conditional probability distributions into a single higher-order estimate, based on their degree of agreement. Though the technique is broadly applicable, image compression is singled out as a testing ground of its abilities. Good performance is demonstrated by experimental results.