Ending Spam: Bayesian Content Filtering and the Art of Statistical Language Classification
Reviewed by Robert Pritchett
Author: Jonathan A. Zdziarski
No Starch Press
555 De Haro St., Suite 250
San Francisco, CA 94107
1 800 420 7240 or 1 415 863 9900
Fax: 415 863 9950
$40 USD, $54 CAD, £28 Net UK, €35 Euro
Published: July 2005
For SPAM Geeks or wannabes.
Strengths: Anti-SPAM solutions for programmers.
Weaknesses: None found.
Ending Spam: Bayesian Content Filtering and the Art of Statistical Language Classification by
Jonathan A. Zdziarski goes soup to nuts through the history, examples, concepts and algorithms
used to counteract the nefarious (yes, nefarious!) activities of those who inundate us with
unsolicited advertising for products we have no business ever buying.
For the most part, the book is built like a charcoal-filtered water system. It guides us through the
fine parts and pieces (statistical filtering) needed to reduce the flow of SPAM without cutting off
legitimate Email, the drinkable water, at the other end. Jonathan Zdziarski does a great job educating
us on the logic and thought that go into combating this SPAM blight on the Internet.
Chapter 7 gets into problems, solutions and why they are important: what works, what does not
and, more importantly, why.
He is interested in getting as close to 100% effectiveness as possible. Interested in tokenization,
Bayesian analysis or Markovian discrimination? It's all here.
There are 3 Parts on filtering, spread over 14 Chapters, plus one Appendix that gets into open source
solutions, including one the author has created himself (DSPAM). That Appendix is where the meat is
closest to the bone. There is a process for beating back the hordes, and this book provides the
methods required to get there.
MPN, LLC 2005 macCompanion
August 2005, Volume 3 Issue 8