This code analyzes the Breast Cancer Wisconsin dataset, which consists of 569 samples of malignant and benign tumor cells. The features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. The analysis includes:
- Descriptive statistics calculation
- Pair plot generation
- Cluster analysis with Agglomerative Clustering

Running the Code

The code requires the following Python packages:
- matplotlib
- pandas
- scikit-learn (imported as sklearn)
- seaborn
- numpy
- factor_analyzer
- scipy
You can install these packages using pip.
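
Before running the script, it can help to confirm that every package is importable. The snippet below is a small sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of the project. Note that the names checked are import names, so scikit-learn appears as `sklearn` even though it is installed from PyPI as `scikit-learn`.

```python
import importlib.util

# Import names for the required packages (scikit-learn is imported as "sklearn").
REQUIRED = [
    "matplotlib", "pandas", "sklearn", "seaborn",
    "numpy", "factor_analyzer", "scipy",
]

def missing_packages(names):
    """Return the names from `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_packages(REQUIRED)
print("Missing packages:", ", ".join(missing) if missing else "none")
```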
Once the packages are installed, run the code with the following command:

    python breast_cancer_analysis.py

Output

The code will output:
- The shape of the data (number of rows and columns)
- Descriptive statistics for the data
- A pair plot showing relationships between pairs of variables, as well as the correlation matrix
- A dendrogram plot and a scatter plot showing the results of the cluster analysis
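
The cluster-analysis outputs could be produced along the following lines. This is a sketch under several assumptions that the description above does not confirm: the features are standardized first, Ward linkage is used, the number of clusters is set to 2, and the scatter plot uses the `mean radius` and `mean texture` columns. The file name `clusters.png` is likewise illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display

import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's bundled copy of the dataset, standardized so that
# no single feature dominates the distance computation.
X = load_breast_cancer(as_frame=True).data
X_scaled = StandardScaler().fit_transform(X)

fig, (ax_tree, ax_scatter) = plt.subplots(1, 2, figsize=(12, 5))

# Dendrogram: built from a SciPy linkage matrix (Ward linkage assumed),
# truncated to the last 30 merged clusters for readability.
Z = linkage(X_scaled, method="ward")
dendrogram(Z, truncate_mode="lastp", p=30, no_labels=True, ax=ax_tree)
ax_tree.set_title("Ward linkage dendrogram (truncated)")

# Agglomerative Clustering: n_clusters=2 is an assumption for this example.
labels = AgglomerativeClustering(n_clusters=2, linkage="ward").fit_predict(X_scaled)

# Scatter plot of two features, colored by assigned cluster.
ax_scatter.scatter(X["mean radius"], X["mean texture"], c=labels, s=10)
ax_scatter.set_xlabel("mean radius")
ax_scatter.set_ylabel("mean texture")
ax_scatter.set_title("Agglomerative clusters")

fig.savefig("clusters.png")
```

The dendrogram is drawn with SciPy because scikit-learn's `AgglomerativeClustering` assigns cluster labels but does not plot the merge tree itself.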