Dropping a row with at least one value larger than a particular value

Question

For instance, I have an M × N table. I want to drop any row that has an element larger than, say, one in the second or the third column.

I guess I have to do some sort of pattern matching, but I am not exactly sure how to do it. I can use a table to search for a value, find its Position, and then drop the corresponding row. There must be an easier way to do this without traversing each element using the Table function. Help please :)

Per my comment to Kguler, could you clarify: do you mean, in your example, any rows with columns other than the second or third holding a value greater than the second or third, or are you interested in checking against only one column?
– ciao, Commented Jun 3, 2014 at 23:37
Sorry for the late response and the confusion. I meant the second or third column (or, for that matter, any column) containing a value that is larger than a particular value (e.g., 10). Your suggestion is very exciting.
– user1188038, Commented Jun 5, 2014 at 22:11
OK, I'm actually more confused after re-reading your OP and reply. Say a row is {1,2,3,4,5,6,7} (each element is its column number in this case). Is it that you want to specify some value and then remove rows where only certain columns you specify have a value exceeding that, or is it that you want to specify some columns and delete any rows where any of the other columns in the row have a value exceeding any of the values in the specified columns (what I answered)?
– ciao, Commented Jun 5, 2014 at 22:18
I went ahead and updated my post with the alternative interpretation... let me know if that's what you meant.
– ciao, Commented Jun 5, 2014 at 22:29
I want to specify some value, and then remove rows where only certain columns I specify have a value exceeding that. Sorry for all the confusion! The other interpretation of my OP sounds too complicated :). But then I am not sure I was clear.
– user1188038, Commented Jun 6, 2014 at 16:30
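
The requirement as finally clarified in the comments above (choose a threshold, then drop every row in which any of the chosen columns exceeds it) might be sketched as follows. The names `data`, `cols`, `threshold`, `kept` and `kept2`, and all sample values, are illustrative assumptions and do not come from the thread.

```mathematica
(* Hand-made sample data; not taken from the original post *)
data = {{1, 2, 3, 4}, {5, 12, 0, 1}, {0, 3, 11, 2}, {20, 1, 1, 30}};
cols = {2, 3};     (* columns to check (assumed) *)
threshold = 10;    (* cut-off value (assumed) *)

(* Keep a row only if none of its checked entries exceeds the threshold *)
kept = Select[data, Max[#[[cols]]] <= threshold &]
(* {{1, 2, 3, 4}, {20, 1, 1, 30}} *)

(* Same idea without a per-row test function: take each row's maximum over
   the checked columns, flag rows whose maximum is <= threshold with 1 *)
kept2 = Pick[data, UnitStep[threshold - Max /@ data[[All, cols]]], 1]
(* {{1, 2, 3, 4}, {20, 1, 1, 30}} *)
```

Entries outside `cols` (such as the 20 and 30 in the last row) are deliberately ignored, which is what distinguishes this reading from the reference-column reading.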

kglr · Accepted Answer · 2014-06-04 02:03:34Z

    refcol = 3;
    dt = RandomInteger[10, {10, 5}];
    dt // TableForm


    Select[dt, Max[# - #[[refcol]]] <= 0 &]
    (* {{2, 7, 8, 5, 4}, {7, 1, 8, 4, 0}, {4, 10, 10, 3, 6}} *)

Alternatively,

    DeleteCases[dt, _?(Max[# - #[[refcol]]] > 0 &)]
    Cases[dt, _?(Max[# - #[[refcol]]] <= 0 &)]
    Pick[dt, (Max[# - #[[refcol]]] <= 0) & /@ dt]

all give the same output.
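
To see why the test `Max[# - #[[refcol]]] <= 0` selects the right rows, it may help to look at the intermediate differences on a small hand-written matrix. The matrix `small` and the names `diffs` and `keptRows` are assumptions made for illustration; the answer itself uses random data.

```mathematica
refcol = 3;
small = {{2, 7, 8, 5, 4}, {9, 1, 3, 0, 2}, {4, 10, 10, 3, 6}};

(* Subtract each row's reference entry from every entry of that row *)
diffs = (# - #[[refcol]] &) /@ small
(* {{-6, -1, 0, -3, -4}, {6, -2, 0, -3, -1}, {-6, 0, 0, -7, -4}} *)

(* A row survives when no difference is positive, i.e. nothing exceeds
   the reference column *)
keptRows = Select[small, Max[# - #[[refcol]]] <= 0 &]
(* {{2, 7, 8, 5, 4}, {4, 10, 10, 3, 6}} *)
```

The second row is dropped because its first entry, 9, exceeds the reference entry 3.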

Showing the deleted rows in red:

    If[Max[# - #[[refcol]]] > 0,
       Style[#, Bold, Red, 20] & /@ #,
       Style[#, Directive[Bold, 20]] & /@ #] & /@ dt // TableForm


Update: The four functions above work for the case of a single reference column. For the case where a row is deleted if any other entry exceeds any of the entries in multiple reference columns, one needs the following straightforward modifications:

    f1 = Function[{dt, cols},
       With[{rest = Complement[Range@Length@dt[[1]], cols]},
        Pick[dt, (Max[#[[rest]] - Min[#[[cols]]]] <= 0) & /@ dt]]];
    f2 = Function[{dt, cols},
       With[{rest = Complement[Range@Length@dt[[1]], cols]},
        Select[dt, Max[#[[rest]] - Min[#[[cols]]]] <= 0 &]]];
    f3 = Function[{dt, cols},
       With[{rest = Complement[Range@Length@dt[[1]], cols]},
        Cases[dt, _?(Max[#[[rest]] - Min[#[[cols]]]] <= 0 &)]]];
    f4 = Function[{dt, cols},
       With[{rest = Complement[Range@Length@dt[[1]], cols]},
        DeleteCases[dt, _?(Max[#[[rest]] - Min[#[[cols]]]] > 0 &)]]];

    f1[dt, {2, 3}] == f2[dt, {2, 3}] == f3[dt, {2, 3}] == f4[dt, {2, 3}]
    (* True *)

    f1[dt, {2, 3}]
    (* {{2, 7, 8, 5, 4}, {4, 10, 8, 0, 4}, {4, 10, 10, 3, 6}} *)

Am I missing something? How does this handle checking against multiple "target" columns as per the OP? Unless I misread the meaning of "...one in the second or the third column..." (emphasis mine).
– ciao, Commented Jun 3, 2014 at 23:32
@rasher, obviously not -- this does not check against your interpretation of "say, ... or ..." :) Let's wait for the OP's clarification.
– kglr, Commented Jun 3, 2014 at 23:47

Mr.Wizard · Accepted Answer · 2014-06-07 14:49:24Z

This is significantly faster than other options presented:

Pick[dt, Max /@ dt - dt[[All, refcol]], 0]
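
A small sketch of why the three-argument `Pick` works here: a row's maximum can never be smaller than its reference entry, so the difference is zero exactly when nothing in the row exceeds the reference column. The matrix `small` and the names `rowMax` and `ref` are illustrative assumptions, and the sketch assumes integer data as in the answer.

```mathematica
refcol = 3;
small = {{2, 7, 8, 5, 4}, {9, 1, 3, 0, 2}, {4, 10, 10, 3, 6}};

rowMax = Max /@ small          (* {8, 9, 10} *)
ref = small[[All, refcol]]     (* {8, 3, 10} *)
rowMax - ref                   (* {0, 6, 0} *)

(* Keep the rows whose selector entry matches the pattern 0 *)
picked = Pick[small, rowMax - ref, 0]
(* {{2, 7, 8, 5, 4}, {4, 10, 10, 3, 6}} *)
```

Both the row maxima and the reference column are computed as whole-list operations, so no per-row pure function is evaluated, which is presumably where the speed advantage comes from.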

Timings:

    refcol = 3;
    dt = RandomInteger[10, {50000, 7}];

    Select[dt, Max[# - #[[refcol]]] <= 0 &] // Timing // First
    DeleteCases[dt, _?(Max[# - #[[refcol]]] > 0 &)] // Timing // First
    Cases[dt, _?(Max[# - #[[refcol]]] <= 0 &)] // Timing // First
    Pick[dt, (Max[# - #[[refcol]]] <= 0) & /@ dt] // Timing // First
    Pick[dt, Max /@ dt - dt[[All, refcol]], 0] // Timing // First

    Select:                   0.1372
    DeleteCases:              0.1652
    Cases:                    0.1592
    Pick (mapped test):       0.1716
    Pick (0 pattern):         0.02932
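
Before relying on such timings, one might also confirm that the fast form returns the same rows as `Select`. The sketch below is an assumption-laden check, not part of the answer: the seed is arbitrary, the names `slow`, `fast`, `tSlow` and `tFast` are made up, and `AbsoluteTiming` measures wall-clock time rather than the CPU time reported by `Timing`.

```mathematica
SeedRandom[1];   (* arbitrary seed, assumed only for reproducibility *)
refcol = 3;
dt = RandomInteger[10, {50000, 7}];

(* AbsoluteTiming returns {elapsed seconds, result} *)
{tSlow, slow} = AbsoluteTiming[Select[dt, Max[# - #[[refcol]]] <= 0 &]];
{tFast, fast} = AbsoluteTiming[Pick[dt, Max /@ dt - dt[[All, refcol]], 0]];

slow === fast    (* should be True: same rows in the same order *)
{tSlow, tFast}   (* machine- and version-dependent; no values implied *)
```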
