I want to create a script that randomly shuffles the rows and columns of a large CSV file. For example, for an initial file f.csv:

    a, b, c, d
    e, f, g, h
    i, j, k, l

First, we shuffle the rows to obtain f1.csv:

    e, f, g, h
    a, b, c, d
    i, j, k, l

Then, we shuffle the columns to obtain f2.csv:

    g, e, h, f
    c, a, d, b
    k, i, l, j

To shuffle the rows, we can use this script, taken from here:

    awk 'BEGIN { srand() }
    { lines[++d] = $0 }
    END {
        while (1) {
            if (e == d) { break }
            RANDOM = int(1 + rand() * d)
            if (RANDOM in lines) {
                print lines[RANDOM]
                delete lines[RANDOM]
                ++e
            }
        }
    }' f.csv > f1.csv

But how do we shuffle the columns?
$0 when populating lines[]. Try to code it yourself; the logic is all there for you in your script.
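
For reference, here is one possible sketch of a column shuffle, not necessarily what the hint above had in mind. Instead of storing whole lines, it builds a single random permutation of the column indices when it reads the first row, then prints the fields of every row in that order. The function name shuffle_columns and the use of a Fisher-Yates shuffle are my own choices. It assumes every row has the same number of fields, that fields are separated by commas with optional surrounding spaces, and that no field contains a quoted comma.

```shell
# Hypothetical helper: shuffle the columns of a CSV file, applying the
# same random permutation to every row. Reads the file line by line,
# so it does not hold the whole file in memory.
shuffle_columns() {
    awk -F' *, *' -v OFS=', ' '
    BEGIN { srand() }
    NR == 1 {
        # Start from the identity permutation 1..n of column indices.
        n = NF
        for (i = 1; i <= n; i++) perm[i] = i
        # Fisher-Yates shuffle of the indices.
        for (i = n; i > 1; i--) {
            j = int(1 + rand() * i)
            tmp = perm[i]; perm[i] = perm[j]; perm[j] = tmp
        }
    }
    {
        # Emit the fields of this row in the permuted order.
        out = $(perm[1])
        for (i = 2; i <= n; i++) out = out OFS $(perm[i])
        print out
    }' "$1"
}
```

With the files from the question, it would be called as `shuffle_columns f1.csv > f2.csv`. Note that srand() with no argument seeds from the current time, so two runs within the same second can produce the same permutation.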