Using Hadoop streaming, perform four iterations of KMeans manually with 6 centers (the initial centers are chosen at random). This requires passing a text file with the cluster centers via the -file option, opening that file in the mapper with open('[login to view URL]', 'r'), and assigning each point a key that identifies the center closest to it. The reducer then computes the new centers. At that point the iteration is done, and the reducer's output (the new centers) can be given to the next pass of the same code.
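One pass of this scheme could be sketched roughly as follows. This is only an illustration under several assumptions that are not in the text above: points and centers are stored one per line as comma-separated numbers, the centers file is called centers.txt (a hypothetical name, since the real file name is not shown), distance is squared Euclidean, and a single script serves as mapper or reducer depending on a "map" or "reduce" argument.

```python
#!/usr/bin/env python3
"""Sketch of one k-means pass for Hadoop streaming; run with 'map' or 'reduce'."""
import sys

# Hypothetical file name; the actual name is not given in the task text.
CENTERS_FILE = "centers.txt"


def parse_point(line):
    """Parse 'x1,x2,...' into a list of floats (assumed input format)."""
    return [float(v) for v in line.strip().split(",")]


def load_centers(path):
    """Read one center per line from the file shipped with -file."""
    with open(path, "r") as f:
        return [parse_point(line) for line in f if line.strip()]


def closest_center(point, centers):
    """Return the index of the center with the smallest squared Euclidean distance."""
    def sq_dist(center):
        return sum((p - c) ** 2 for p, c in zip(point, center))
    return min(range(len(centers)), key=lambda i: sq_dist(centers[i]))


def mapper(lines, centers):
    """Yield 'center_index<TAB>point' for every non-empty input line."""
    for line in lines:
        if line.strip():
            point = parse_point(line)
            yield "%d\t%s" % (closest_center(point, centers), line.strip())


def reducer(lines):
    """Average the points of each key (input sorted by key) and yield new centers."""
    current, sums, count = None, None, 0
    for line in lines:
        if not line.strip():
            continue
        key, value = line.rstrip("\n").split("\t", 1)
        point = parse_point(value)
        if key != current:
            # A new key starts: emit the mean of the previous group first.
            if current is not None:
                yield ",".join(str(s / count) for s in sums)
            current, sums, count = key, [0.0] * len(point), 0
        sums = [s + p for s, p in zip(sums, point)]
        count += 1
    if current is not None:
        yield ",".join(str(s / count) for s in sums)


if __name__ == "__main__" and len(sys.argv) > 1:
    if sys.argv[1] == "map":
        records = mapper(sys.stdin, load_centers(CENTERS_FILE))
    elif sys.argv[1] == "reduce":
        records = reducer(sys.stdin)
    else:
        records = []
    for record in records:
        print(record)
```

In this sketch the reducer prints only the new center coordinates, so its output has the same format as the centers file and can be shipped with -file to the next pass. Note that a center with no points assigned to it produces no output line, so the number of centers could drop below 6; how to handle that case is left open here.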
The only difference between the first and subsequent iterations is that in the first iteration you have to pick the initial centers yourself. From the 2nd iteration onward, the centers are given to you by the previous pass of KMeans.
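For the first iteration, one possible way to pick the initial centers is to sample 6 input points at random and write them to the centers file. The sketch below assumes the data is a local text file with one point per line that fits in memory; the function names, the file paths and the optional seed are all hypothetical and only serve the illustration.

```python
import random


def pick_initial_centers(points_path, k=6, seed=None):
    """Return k lines taken from distinct random positions in the data file."""
    with open(points_path, "r") as f:
        lines = [line.strip() for line in f if line.strip()]
    # random.sample raises ValueError if the file has fewer than k points.
    return random.Random(seed).sample(lines, k)


def write_centers(centers, centers_path):
    """Write one center per line, in the same format as the input points."""
    with open(centers_path, "w") as f:
        for center in centers:
            f.write(center + "\n")
```

Calling write_centers(pick_initial_centers("points.txt"), "centers.txt") before the first pass would produce a starting centers file; from the second pass on, that file would instead be built from the previous reducer output. If the data contains duplicate points, two of the sampled centers may coincide.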