# Try csvstat
The common CSV toolkits [`csvkit`](https://csvkit.readthedocs.io/) and [`xsv`](https://github.com/BurntSushi/xsv) include some basic statistics features.

So just treat your one-record-per-line input as a single column of a header-less CSV file.

`csvkit` is older and better known, I think. `xsv` is newer and much faster for big inputs.

**Input:**
```
$ echo 1 2 9 9 | tr " " "\n"
1
2
9
9
```
## csvstat
The default csvstat output is for humans...
```
$ echo 1 2 9 9 | tr " " "\n" | csvstat --no-header-row
/usr/lib/python2.7/site-packages/agate/table/from_csv.py:74: RuntimeWarning: Error sniffing CSV dialect: Could not determine delimiter
 1. "a"

 Type of data: Number
 Contains null values: False
 Unique values: 3
 Smallest value: 1
 Largest value: 9
 Sum: 21
 Mean: 5.25
 Median: 5.5
 StDev: 4.349
 Most common values: 9 (2x)
 1 (1x)
 2 (1x)

Row count: 4
```
...but you can also get the output as CSV, which is better for further processing:
```
$ echo 1 2 9 9 | tr " " "\n" | csvstat --no-header-row --csv
/usr/lib/python2.7/site-packages/agate/table/from_csv.py:74: RuntimeWarning: Error sniffing CSV dialect: Could not determine delimiter
column_id,column_name,type,nulls,unique,min,max,sum,mean,median,stdev,len,freq
1,a,Number,False,3,1,9,21,5.25,5.5,4.349,,"9, 1, 2"
```

With single-column input like this, csvstat will always warn that it could not determine a delimiter. The warning goes to stderr, so to get rid of it just redirect stderr to `/dev/null` like so:
```
$ echo 1 2 9 9 | tr " " "\n" | csvstat --no-header-row --csv 2>/dev/null
column_id,column_name,type,nulls,unique,min,max,sum,mean,median,stdev,len,freq
1,a,Number,False,3,1,9,21,5.25,5.5,4.349,,"9, 1, 2"
```
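Since the `--csv` output is just a header line plus one data line, a single statistic can be pulled out by column name. Here is a minimal sketch using only `awk`. The helper name `pick_stat` is made up for illustration, and the plain comma split ignores CSV quoting, so it is only safe for the columns before the quoted `freq` field. If csvkit is installed anyway, its `csvcut -c median` is probably the more robust choice.

```shell
# pick_stat NAME: print the value of column NAME from two-line CSV on stdin.
# Hypothetical helper; splits on plain commas, so it ignores CSV quoting.
pick_stat() {
  awk -F, -v name="$1" '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == name) col = i }
    NR == 2 && col { print $col }
  '
}

# The csvstat --csv output shown above, pasted in so the sketch runs without csvkit.
stats='column_id,column_name,type,nulls,unique,min,max,sum,mean,median,stdev,len,freq
1,a,Number,False,3,1,9,21,5.25,5.5,4.349,,"9, 1, 2"'

printf '%s\n' "$stats" | pick_stat median
printf '%s\n' "$stats" | pick_stat stdev
```

In a real pipeline the pasted `stats` variable would be replaced by the `csvstat --no-header-row --csv 2>/dev/null` command itself.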

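And if you want a slightly more human-readable version, you can pipe the CSV output through `csvlook`. Note that `csvlook` runs its own type inference on that CSV, which appears to be why the `1` values in `column_id` and `min` show up as `True`: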
```
$ echo 1 2 9 9 | tr " " "\n" | csvstat --no-header-row --csv 2>/dev/null | csvlook
| column_id | column_name | type   | nulls | unique |  min | max | sum | mean | median | stdev | len | freq    |
| --------- | ----------- | ------ | ----- | ------ | ---- | --- | --- | ---- | ------ | ----- | --- | ------- |
|      True | a           | Number | False |      3 | True |   9 |  21 | 5.25 |    5.5 | 4.349 |     | 9, 1, 2 |
```

## xsv stats
For speed reasons `xsv stats` does not include median by default...
```
$ echo 1 2 9 9 | tr " " "\n" | xsv stats --no-headers
field,type,sum,min,max,min_length,max_length,mean,stddev
0,Integer,21,1,9,1,1,5.25,3.766629793329841
```

```
$ echo 1 2 9 9 | tr " " "\n" | xsv stats --no-headers | xsv table
field  type     sum  min  max  min_length  max_length  mean  stddev
0      Integer  21   1    9    1           1           5.25  3.766629793329841
```

...but you can enable it via the [`--everything`](https://github.com/BurntSushi/xsv#:~:text=in%20each%20column.-,The%20stats%20command,-will%20do%20this) switch. This will give you three extra columns: `median`, `mode`, and `cardinality`:
```
$ echo 1 2 9 9 | tr " " "\n" | xsv stats --no-headers --everything
field,type,sum,min,max,min_length,max_length,mean,stddev,median,mode,cardinality
0,Integer,21,1,9,1,1,5.25,3.766629793329841,5.5,9,3

$ echo 1 2 9 9 | tr " " "\n" | xsv stats --no-headers --everything | xsv table
field  type     sum  min  max  min_length  max_length  mean  stddev             median  mode  cardinality
0      Integer  21   1    9    1           1           5.25  3.766629793329841  5.5     9     3
```
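
One thing to watch out for: the two tools report different standard deviations for the same input (`4.349` from csvstat, `3.7666...` from `xsv stats`). The numbers are consistent with csvstat dividing by n-1 (sample standard deviation) and xsv dividing by n (population standard deviation). That reading is inferred from the outputs above, not taken from either tool's documentation. A rough `awk` sketch to check the arithmetic (the function name `stdevs` is made up here):

```shell
# stdevs: read one number per line, print sample and population standard deviation.
# Hypothetical helper; needs at least two input lines (n - 1 must not be zero).
stdevs() {
  awk '
    { n++; sum += $1; sumsq += $1 * $1 }
    END {
      mean = sum / n
      ss = sumsq - n * mean * mean   # sum of squared deviations from the mean
      printf "sample=%.3f population=%.3f\n", sqrt(ss / (n - 1)), sqrt(ss / n)
    }
  '
}

echo 1 2 9 9 | tr " " "\n" | stdevs
```

For the sample input this prints `sample=4.349 population=3.767`, matching the csvstat and xsv figures respectively.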