The world wide web has grown exponentially over the previous decade in terms of its size that is currently over a billion sties, as well as the number of users. In fact, web usage has become pervasive to touch all aspects of our life, economy and education. These rapid advances have also significantly increase the vulnerabilities of websites that are being hacked on a daily basis. According to White Hat security's '2015 Website Security Statistics Report' more than 86% of all websites have one or more critical vulnerability and the likelihood of information leakage is 56%. With no effective website security measures in place, one can expect the website security to be even more critical. The main research goal of this paper is to overcome this challenge by presenting an online anomaly behavior analysis of websites (e.g., HTML files) to detect any malicious codes or pages that have been injected by web attacks. Our anomaly analysis approach utilizes feature selection, data mining, data analytics and statistical techniques to identify accurately the webpage contents that have been compromised or can be exploited by attacks such as phishing attacks, cross site scripting attacks, html injection attacks, malware insertion attacks, just to name a few. We have validated our approach on more than 10,000 files and showed that our approach can detect malicious HTML files with a true positive rate of 99% and a false positive rate of 0.8% for abnormal files.