Playful Technology Limited

Is It A Mushroom or Is It A Toadstool?

Using Bayesian Belief Networks to classify fungus edibility

The UCI Machine Learning Mushroom Classification Dataset on Kaggle tabulates discrete features of around 8000 specimens of fungi. There are 23 species represented, and the challenge is to classify which are edible and which are poisonous. Since the data are all categorical, I decided that a Bayesian Belief Network would be a suitable model, and used an ad-hoc clustering algorithm to infer a hidden variable.

The results seemed promising, but I wanted to see if I could do even better. This time I used Mutual Information to infer two hidden variables.
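As a rough illustration of the quantity involved (this is not the code from the notebooks), the mutual information between two categorical columns can be estimated from co-occurrence counts. The function name `mutual_information` and the toy columns below are made up for this sketch and are not taken from the dataset.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Estimate I(X;Y) in bits from paired categorical observations."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written in terms of counts
        mi += (count / n) * log2(count * n / (px[x] * py[y]))
    return mi


# Toy data, invented for illustration only
odor = ["foul", "none", "foul", "none", "almond", "almond"]
cap = ["red", "red", "brown", "brown", "red", "brown"]
label = ["poisonous", "edible", "poisonous", "edible", "edible", "edible"]

mi_odor = mutual_information(odor, label)  # odor fully determines label here
mi_cap = mutual_information(cap, label)    # cap is independent of label here
print(f"odor vs label: {mi_odor:.4f} bits")
print(f"cap vs label:  {mi_cap:.4f} bits")
```

In this toy example the odor column carries all of the information about the label, while the cap column carries none. Features that share a lot of information with one another are natural candidates for grouping under a common hidden variable, although how the notebooks actually do this is not shown here.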

WARNING: This is intended solely as a technology demonstration. Playful Technology Limited cannot accept any liability if you pick wild mushrooms on the basis of these notebooks. If you want to forage for wild mushrooms, find an experienced guide.

If you are interested in classification problems, contact me.

By Dr Peter J Bleackley
Tags: #Kaggle, #classification, #clustering, #data science, #fungi, #bayesian, #information theory, #mutual information