
Try csvstat

The common CSV toolkits "csvkit" and "xsv" include some basic statistics features.

Input:

    $ echo 1 2 9 9 | tr " " "\n"
    1
    2
    9
    9

csvstat

csvstat output, human-readable:

    $ echo 1 2 9 9 | tr " " "\n" | csvstat --no-header-row
    /usr/lib/python2.7/site-packages/agate/table/from_csv.py:74: RuntimeWarning: Error sniffing CSV dialect: Could not determine delimiter
      1. "a"

            Type of data:          Number
            Contains null values:  False
            Unique values:         3
            Smallest value:        1
            Largest value:         9
            Sum:                   21
            Mean:                  5.25
            Median:                5.5
            StDev:                 4.349
            Most common values:    9 (2x)
                                   1 (1x)
                                   2 (1x)

    Row count: 4

csvstat output, machine-readable:

    $ echo 1 2 9 9 | tr " " "\n" | csvstat --no-header-row --csv
    /usr/lib/python2.7/site-packages/agate/table/from_csv.py:74: RuntimeWarning: Error sniffing CSV dialect: Could not determine delimiter
    column_id,column_name,type,nulls,unique,min,max,sum,mean,median,stdev,len,freq
    1,a,Number,False,3,1,9,21,5.25,5.5,4.349,,"9, 1, 2"

    $ echo 1 2 9 9 | tr " " "\n" | csvstat --no-header-row --csv | csvlook
    /usr/lib/python2.7/site-packages/agate/table/from_csv.py:74: RuntimeWarning: Error sniffing CSV dialect: Could not determine delimiter
    | column_id | column_name | type   | nulls | unique | min  | max | sum | mean | median | stdev | len | freq    |
    | --------- | ----------- | ------ | ----- | ------ | ---- | --- | --- | ---- | ------ | ----- | --- | ------- |
    | True      | a           | Number | False | 3      | True | 9   | 21  | 5.25 | 5.5    | 4.349 |     | 9, 1, 2 |

xsv stats

xsv stats is similar, but its default output does not include the median; passing --median (or --everything) adds it.

    $ echo 1 2 9 9 | tr " " "\n" | xsv stats --no-headers
    field,type,sum,min,max,min_length,max_length,mean,stddev
    0,Integer,21,1,9,1,1,5.25,3.766629793329841

    $ echo 1 2 9 9 | tr " " "\n" | xsv stats --no-headers | xsv table
    field  type     sum  min  max  min_length  max_length  mean  stddev
    0      Integer  21   1    9    1           1           5.25  3.766629793329841
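
Note that the two tools report different standard deviations for the same input: 4.349 from csvstat and 3.7666 from xsv. The numbers are consistent with csvstat using the sample formula (dividing by n - 1) and xsv using the population formula (dividing by n). The following is a minimal sketch to check this with plain awk; the `stdevs` function is a hypothetical helper written for this illustration, not part of csvkit or xsv.

```shell
# Hypothetical helper (not part of csvkit or xsv): reads one number per line
# from stdin and prints the sample and population standard deviations.
stdevs() {
  awk '
    { x[NR] = $1; sum += $1 }        # remember each value and accumulate the sum
    END {
      if (NR < 2) exit 1             # the sample formula needs at least two values
      mean = sum / NR
      for (i = 1; i <= NR; i++) {    # second pass: sum of squared deviations
        d = x[i] - mean
        ss += d * d
      }
      printf "sample=%.3f population=%.3f\n", sqrt(ss / (NR - 1)), sqrt(ss / NR)
    }'
}

printf '%s\n' 1 2 9 9 | stdevs
```

For the input above this prints `sample=4.349 population=3.767`, matching the csvstat and xsv figures respectively, so neither tool is wrong; they just answer slightly different questions.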