
This investigation explores data mining with the open-source software WEKA in a health care application. Cluster analysis is used to study the effects of diabetes, obesity, and hypertension in a database obtained from the University of Virginia School of Medicine. The simple k-means technique is adopted to form ten clusters that clearly distinguish the differences among these risk factors. The number of clusters was chosen by trial and error while keeping the sum of squared errors (SSE) as low as possible. The SSE decreases as the number of clusters increases, and fewer than ten clusters did not yield distinguishable information. Each cluster reveals important information about diabetes, obesity, hypertension, and their interrelation:
Cluster 0: Diabetes ? Obesity ? Hypertension = Healthy patient
Cluster 1: Diabetes ? Obesity ? Hypertension = Healthy patient
Cluster 2: Diabetes ? Obesity ? Hypertension = Obesity
Cluster 3: Diabetes ? Obesity ? Hypertension = Patients with obesity and hypertension
Cluster 4: Borderline diabetes ? Obesity ? Hypertension = Severe obesity
Cluster 5: Obesity ? Hypertension ? Diabetes = Hypertension
Cluster 6: Borderline obese ? Borderline hypertension ? Diabetes = No serious complications
Cluster 7: Obesity ? Hypertension ? Diabetes = Healthy patients
Cluster 8: Obesity ? Hypertension ? Diabetes = Healthy patients
Cluster 9: Diabetes ? Hypertension ? Obesity = High-risk unhealthy patients
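The trial-and-error search over the number of clusters can be illustrated with a minimal sketch. It is not WEKA's implementation and does not use the University of Virginia data: the function names `kmeans` and `sq_dist`, the three synthetic group centers (loosely read as glucose, BMI, and systolic blood pressure), and the noise level are all assumptions made for illustration. The sketch runs plain k-means for k = 1 to 10 and reports the SSE for each k.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100, seed=0):
    """Plain k-means with random initial centroids; returns (centroids, sse)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct data points as starts
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[j].append(p)
        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    # SSE: squared distance from every point to its nearest centroid.
    sse = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, sse


# Synthetic, made-up data: three groups of 20 points each (assumption,
# not the diabetes data set used in the study).
data_rng = random.Random(42)
centers = [(90.0, 22.0, 115.0), (110.0, 31.0, 135.0), (180.0, 36.0, 160.0)]
points = [
    tuple(data_rng.gauss(m, 3.0) for m in center)
    for center in centers
    for _ in range(20)
]

# Try k = 1..10 and record the SSE for each, as in a trial-and-error search.
sse_by_k = {k: kmeans(points, k)[1] for k in range(1, 11)}
for k, sse in sse_by_k.items():
    print(f"k={k:2d}  SSE={sse:10.2f}")
```

Because each run starts from a single random initialization, the SSE for a given k can settle in a local minimum, so the printed values need not fall strictly at every step even though the overall trend is downward. A low SSE alone therefore does not fix the number of clusters; the abstract pairs it with the requirement that the clusters be distinguishable.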

References

  1. Frank, Eibe, Mark Hall, Len Trigg, Geoffrey Holmes, and Ian H. Witten. "Data mining in bioinformatics using Weka." Bioinformatics 20, no. 15 (2004): 2479-2481.
  2. Islam, Md Zahidul, and Ljiljana Brankovic. "Privacy preserving data mining: A noise addition framework using a novel clustering technique." Knowledge-Based Systems 24, no. 8 (2011): 1214-1223.
  3. Rajagopal, Dr. "Customer data clustering using data mining technique." International Journal of Database Management Systems (IJDMS) 3, no. 4 (November 2011).
  4. Arora, Rakesh Kumar, and D. Badal. "Admission Management through Data Mining using WEKA." International Journal of Advanced Research in Computer Science and Software Engineering 3, no. 10 (2013): 674-678.
  5. Mourya, Murlidhar, and Phani Prasad. "An effective execution of diabetes dataset using WEKA." Inter J Comput Sci Inf Technol 4, no. 5 (2013): 681-682.
  6. Manikandan, R. Nithya Dr D. RamyachitraP. "An Efficient Bayes Classifiers Algorithm on 10-fold Cross Validation for Heart Disease Dataset." International Journal of Computational Intelligence and Informatics, Vol. 5: No. 3, December 2015.
  7. Suman and Pooja Mittal, Comparison and Analysis of Various Clustering Methods in Data mining On Education data set Using the weak tool, International Journal of Emerging Trends & Technology in Computer Science Volume 3, Issue 2, 2014, pp. 240-244.
  8. T. Soni Madhulatha, “An overview on clustering methods” IOSR Journal of Engineering, 2012, Vol. 2(4) pp: 719-725.
  9. Durairaj, M., and C. Vijitha. "Educational Data mining for Prediction of Student Performance Using Clustering Algorithms." International Journal of Computer Science and Information Technologies 5, no. 4 (2014): 5987-5991.
  10. Chakraborty, Sanjay, and N. K. Nagwani. "Performance evaluation of incremental K-means clustering algorithm." arXiv preprint arXiv:1406.4737 (2014).
  11. Kaladhar, D. S. V. G. K., Bharath Kumar Pottumuthu, Padmanabhuni V. Nageswara Rao, Varahalarao Vadlamudi, A. Krishna Chaitanya, and R. Harikrishna Reddy. "The Elements of Statistical Learning in Colon Cancer Datasets: Data Mining, Inference and Prediction." Algorithms Research 2, no. 1 (2013): 8-17.
  12. Sharma, Narendra, Aman Bajpai, and Mr Ratnesh Litoriya. "Comparison the various clustering algorithms of weka tools." International Journal of Emerging Technologies in Computational and Applied Sciences 4, no. 7 (2012).
  13. Agarwal, Jyoti, Renuka Nagpal, and Rajni Sehgal. "Crime analysis using K-means clustering." International Journal of Computer Applications 83, no. 4 (2013).
  14. Karmaker, Amitava, and Syed Rahman. "Outlier detection in spatial databases using clustering data mining." In Information Technology: New Generations, 2009. ITNG'09. Sixth International Conference on, pp. 1657-1658. IEEE, 2009.
  15. Lu, Yi-Hong, and Yan Huang. "Mining data streams using clustering." In 2005 International Conference on Machine Learning and Cybernetics, vol. 4, pp. 2079-2083. IEEE, 2005.
  16. Siddiqui, Mohammad Khubeb, and Shams Naahid. "Analysis of KDD CUP 99 dataset using clustering based data mining." International Journal of Database Theory and Application 6, no. 5 (2013): 23-34.
  17. Kim H. Pries and Robert Dunnigan, Big Data Analytics: A Practical Guide for Managers, 1st ed. Auerbach Publications, 2015, ch. 5, p. 161.
  18. John Schorling, Department of Medicine, University of Virginia School of Medicine, diabetes data set. Available: http://biostat.mc.vanderbilt.edu/DataSets