The clustering result of dbscan-on-spark does not seem right.
I used sklearn to generate 100 data points and ran sklearn's DBSCAN on them. The result differs from the one dbscan-on-spark gives. My experimental data and results are below.
Am I doing something wrong, or is there a problem in the dbscan-on-spark code?
The 100 data points follow. The first and second columns are x and y; the third is the cluster ID output by dbscan-on-spark with parameters maxPointsPerPartition=12, eps=0.3, minPoints=10.
-1.0158802720216003,0.147342211484412,0
-1.0146038471276722,0.03688725664273079,0
-0.9891239738704828,0.022213809184072942,0
-0.9638043562674901,0.19370487762607208,0
-0.9528948845748842,0.2606655111008295,0
-0.9333963520783779,0.3087590886721549,2
-0.9075869251302896,0.41694606375014176,2
-0.8897644802049538,0.3652758661784908,2
-0.880180077035953,0.47937007252773894,2
-0.877911880574905,0.586132545648214,2
-0.7753025837331028,0.6042502634413288,2
-0.7707474677539745,0.6849960870555312,2
-0.7311006936368126,0.7227208343955539,2
-0.6983566813753926,0.720816991846437,2
-0.5938859100499629,0.8422336677663952,2
-0.5846965684242103,0.7441899166472538,2
-0.5343754904916749,0.8321019363347932,2
-0.47424038807062,0.8758860679090658,2
-0.3878784206919627,0.915309044770749,2
-0.3289511302371282,0.9573149852131584,2
-0.2959324774808664,0.9325603737985263,2
-0.21870067123075473,0.9970866099086676,2
-0.11691604427847285,0.9778576019025226,2
-0.10195211460944066,0.9709550110857252,2
-0.029517735653622493,0.507649328605902,0
-0.023708638395017974,1.0069445660821037,2
-0.005849731617721753,1.0054354462743158,2
0.004878608148313203,0.33471329223760526,0
0.013060570355586543,0.43901096175191556,0
0.013738246924096587,0.24394131643956118,0
0.015089734435828273,0.3515797069345012,0
0.047196112670520006,0.2206600035440831,0
0.05576443581177526,0.15151041876834784,4
0.07973418606854518,0.054405388341981394,4
0.11072653052995088,0.9518445839926504,2
0.14329032472059547,0.9604927564401988,2
0.15054141777418886,0.03451008884781809,4
0.16128620811774508,-0.09013998774134449,4
0.18846566254628738,-0.05019026114410958,4
0.20397463243581263,0.9737643345398193,2
0.22359870290598133,-0.15409491439129092,0
0.2751214191247985,0.9347761383211116,2
0.287593102624648,-0.24638341424476687,0
0.36088367672771,-0.2411234194792873,3
0.37624208564171424,-0.27216645323731653,3
0.38265901863925844,0.9187882235965241,2
0.3887195306260235,0.9033706015027824,2
0.40914260501491323,-0.3033592048800921,3
0.43346670763470607,0.8748254472162161,2
0.4756292725872716,-0.35619531114932845,3
0.496985556291109,0.8394683116286269,2
0.5240072940588782,-0.415381768593055,3
0.5482086350246445,0.7973898622196185,2
0.5790022726885077,-0.43888757836793596,3
0.6062194604607748,0.7624055296463416,2
0.6168393841819796,-0.4254723492171427,3
0.6776865344598341,0.7288515644281421,2
0.7167688763117702,0.6947432873340993,2
0.7476336573593888,-0.43956394127856613,3
0.7572920472149428,-0.436367630296194,3
0.7740957194369665,0.6125896453298072,2
0.8011753211137118,0.57376379359662,2
0.8417223438915941,0.5526331446967239,2
0.8532880875726063,-0.49795719019316187,3
0.863385424874792,0.4747468867223078,2
0.880275606232329,-0.4927970362051949,3
0.9032177435804262,0.40648142738558235,2
0.9394614201995631,0.37705763226924555,2
0.9446468417742363,0.18276454590281604,2
0.974681291427821,0.04097122039961243,0
0.9767100737539655,-0.02241667025607359,0
0.981331751323225,0.31018095161038134,2
0.9895718078881754,0.1459207265521259,2
0.9924132799217354,0.2407697250631794,2
1.0185336471622448,-0.5338521320792655,3
1.0284991718568057,-0.5059328635382044,3
1.0737327870386202,-0.5064573873771955,3
1.155279215360569,-0.49812019299551963,3
1.2290952646189792,-0.46969221135524847,3
1.2916449149878702,-0.4412009995446211,3
1.3291052136060821,-0.44642696791708286,3
1.4442640860583817,-0.4319727123957646,3
1.4632640963338075,-0.3945052322897577,3
1.476901232024859,-0.3357710068006996,3
1.5790424791153852,-0.2927273617730516,1
1.6839521844358782,-0.2766017069970544,1
1.7288137825879888,-0.24294452147350262,1
1.7460881272956796,-0.19290929215511726,1
1.7596749110664895,-0.1514486544693371,1
1.7891687192218997,-0.10216824559688067,1
1.8414791557100734,0.007705154849708373,1
1.87128705469536,-0.024324243892344288,1
1.8961568159672442,0.14023128237704588,1
1.9140044814018815,0.025879420371462798,1
1.9458748140234137,0.1348425674970267,1
1.9629414250510497,0.2391046347601252,1
1.9968692003241,0.525507288705618,1
2.0040410445424053,0.3872492455105901,1
2.012481427686388,0.4616919308479025,1
2.013488317583532,0.2822368891808566,1
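
For anyone who wants to cross-check the labels above without either library, here is a minimal brute-force DBSCAN sketch in plain Python. It is neither the sklearn nor the dbscan-on-spark implementation. The helper names `parse_rows` and `dbscan` are made up for illustration, and it assumes a point counts as its own neighbour (as sklearn's `min_samples` does); whether dbscan-on-spark counts `minPoints` the same way is something I have not verified.

```python
import math


def parse_rows(text):
    """Parse 'x,y,cluster' lines into a list of (x, y) points and a list of labels."""
    points, labels = [], []
    for line in text.strip().splitlines():
        x, y, label = line.split(",")
        points.append((float(x), float(y)))
        labels.append(int(label))
    return points, labels


def dbscan(points, eps, min_points):
    """Brute-force O(n^2) DBSCAN. Returns one label per point; -1 means noise.

    Assumption: a point is included in its own neighbourhood count.
    """
    n = len(points)
    # Precompute the eps-neighbourhood of every point (including itself).
    neighbours = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n  # None = not visited yet
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_points:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_points:
                queue.extend(neighbours[j])  # only core points expand the cluster
    return labels
```

Usage would be something like `points, spark_labels = parse_rows(data_text)` followed by `dbscan(points, eps=0.3, min_points=10)`, where `data_text` holds the rows above. Cluster IDs are arbitrary in both outputs, so only the grouping of points is comparable, not the numbers themselves.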
Plot of the dbscan-on-spark result:
So there are some differences between the two plots. Am I doing something wrong, or is it the code?
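
Since the two libraries number their clusters differently, comparing plots by eye can mislead. A small sketch of a stricter check is below: it tests whether two label lists describe the same grouping up to a renaming of cluster IDs. The function name `same_up_to_renaming` is hypothetical, and the check treats the noise label like any other ID, so it assumes each library uses one consistent value for noise.

```python
def same_up_to_renaming(labels_a, labels_b):
    """Return True if the two labelings group the points identically.

    Cluster IDs may differ between the lists, but the mapping between
    them must be one-to-one in both directions.
    """
    if len(labels_a) != len(labels_b):
        return False
    forward, backward = {}, {}
    for a, b in zip(labels_a, labels_b):
        # setdefault records the first pairing seen and returns it afterwards,
        # so any later conflicting pairing is detected here.
        if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
            return False
    return True
```

If this returns False for the sklearn labels and the third column above, the two results really do differ in how points are grouped, not just in numbering.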