1
$\begingroup$

So, for instance, I have an M × N table. I want to drop any row which has an element larger than, say, one in the second or the third column.

I guess I have to do some sort of pattern matching, but I am not exactly sure how to do it. I could use Table to search for a value, find its Position, and then drop the corresponding row. There must be an easier way to do this without traversing each element using the Table function. Help please :)

$\endgroup$
  • $\begingroup$ Per my comment to Kguler, could you clarify: do you mean in your example any rows with columns other than second or third with a value greater than the second or third, or are you interested in only checking against one column? $\endgroup$ Commented Jun 3, 2014 at 23:37
  • $\begingroup$ Sorry for this late response and the confusion. I meant second or third column (for that matter any column) containing a value which is larger than a particular value (e.g. 10, etc). Your suggestion is very exciting. $\endgroup$ Commented Jun 5, 2014 at 22:11
  • $\begingroup$ OK, I'm actually more confused re-reading your OP and reply. Say a row is {1,2,3,4,5,6,7} (element is col number in this case). Is it that you want to specify some value, and then remove rows where only certain columns you specify have a value exceeding that, or is it you want to specify some columns, and delete any rows where any of the other columns in the row have a value exceeding any of the values in the specified columns (what I answered)? $\endgroup$ Commented Jun 5, 2014 at 22:18
  • $\begingroup$ I went ahead and updated my post with the alternative interpretation... let me know if that's what you meant. $\endgroup$ Commented Jun 5, 2014 at 22:29
  • $\begingroup$ I want to specify some value, and then remove rows where only certain columns I specify have a value exceeding that. Sorry for all the confusion! The other interpretation of my OP sounds too complicated :). But then I am not sure I was clear. $\endgroup$ Commented Jun 6, 2014 at 16:30
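
The requirement as clarified in the last comment (pick a value, then drop rows where any of the chosen columns exceeds it) can be sketched as follows. This is only an illustration under that reading; the names `data`, `cols`, and `limit` are hypothetical and do not come from the thread.

```mathematica
(* Hypothetical sample data and parameters, chosen only for illustration *)
data = {{1, 2, 3, 4}, {5, 12, 1, 0}, {20, 3, 4, 30}, {7, 8, 11, 2}};
cols = {2, 3};   (* columns to check *)
limit = 10;      (* a row is dropped if a checked entry exceeds this *)

(* Keep a row only if its largest entry among the checked columns is <= limit *)
Select[data, Max[#[[cols]]] <= limit &]
(* {{1, 2, 3, 4}, {20, 3, 4, 30}} *)

(* Vectorized variant: UnitStep gives 1 where limit - max >= 0 *)
Pick[data, UnitStep[limit - Max /@ data[[All, cols]]], 1]
(* {{1, 2, 3, 4}, {20, 3, 4, 30}} *)
```

Note that the third row survives even though it contains 20 and 30, because those entries sit in columns that are not checked.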

2 Answers

2
$\begingroup$
 refcol = 3;
 dt = RandomInteger[10, {10, 5}];
 dt // TableForm


Select[dt, Max[# - #[[refcol]]] <= 0 &]
(* {{2, 7, 8, 5, 4}, {7, 1, 8, 4, 0}, {4, 10, 10, 3, 6}} *)

Alternatively,

DeleteCases[dt, _?(Max[# - #[[refcol]]] > 0 &)]
Cases[dt, _?(Max[# - #[[refcol]]] <= 0 &)]
Pick[dt, (Max[# - #[[refcol]]] <= 0) & /@ dt]

all give the same output.

Showing the deleted rows in red:

If[Max[# - #[[refcol]]] > 0,
    Style[#, Bold, Red, 20] & /@ #,
    Style[#, Directive[Bold, 20]] & /@ #] & /@ dt // TableForm


Update: The four functions above work for the case of a single reference column. For the case where a row is deleted if any entry in the remaining columns exceeds any entry in the multiple reference columns, one needs the following straightforward modifications:

 f1 = Function[{dt, cols},
   With[{rest = Complement[Range@Length@dt[[1]], cols]},
    Pick[dt, (Max[#[[rest]] - Min[#[[cols]]]] <= 0) & /@ dt]]];
 f2 = Function[{dt, cols},
   With[{rest = Complement[Range@Length@dt[[1]], cols]},
    Select[dt, Max[#[[rest]] - Min[#[[cols]]]] <= 0 &]]];
 f3 = Function[{dt, cols},
   With[{rest = Complement[Range@Length@dt[[1]], cols]},
    Cases[dt, _?(Max[#[[rest]] - Min[#[[cols]]]] <= 0 &)]]];
 f4 = Function[{dt, cols},
   With[{rest = Complement[Range@Length@dt[[1]], cols]},
    DeleteCases[dt, _?(Max[#[[rest]] - Min[#[[cols]]]] > 0 &)]]];

 f1[dt, {2, 3}] == f2[dt, {2, 3}] == f3[dt, {2, 3}] == f4[dt, {2, 3}]
 (* True *)

 f1[dt, {2, 3}]
 (* {{2, 7, 8, 5, 4}, {4, 10, 8, 0, 4}, {4, 10, 10, 3, 6}} *)
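
Because `dt` is random, the output above cannot be reproduced exactly. A small fixed example may make the multi-column rule easier to follow. The names `keepRows` and `fixed` are hypothetical, and the test is written as `Max[rest] <= Min[cols]`, which is the same condition as `Max[rest - Min[cols]] <= 0`.

```mathematica
(* Hypothetical helper: keep rows whose non-reference entries never exceed
   the smallest entry in the reference columns *)
keepRows = Function[{dt, cols},
   With[{rest = Complement[Range@Length@dt[[1]], cols]},
    Select[dt, Max[#[[rest]]] <= Min[#[[cols]]] &]]];

(* Fixed sample data, chosen only for illustration *)
fixed = {{1, 5, 6, 2}, {7, 5, 6, 2}, {3, 9, 4, 4}, {0, 8, 8, 9}};

keepRows[fixed, {2, 3}]
(* {{1, 5, 6, 2}, {3, 9, 4, 4}} *)
```

The second row is dropped because 7 exceeds 5, and the fourth because 9 exceeds 8; the third is kept since its largest remaining entry, 4, only equals the smallest reference entry.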
$\endgroup$
  • $\begingroup$ Am I missing something? How does this handle checking against multiple "target" columns as per the OP? Unless I misread the meaning of "...one in the second or the third column..." (emphasis mine). $\endgroup$ Commented Jun 3, 2014 at 23:32
  • $\begingroup$ @rasher, obviously no -- this does not check against your interpretation of "say, ... or ..." :) Let's wait for the OP's clarification. $\endgroup$ Commented Jun 3, 2014 at 23:47
0
$\begingroup$

This is significantly faster than other options presented:

Pick[dt, Max /@ dt - dt[[All, refcol]], 0] 
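
Why this works: a row's maximum can never be smaller than its entry in `refcol`, so the difference is zero exactly when no entry exceeds the reference entry. `Pick` then keeps the rows whose selector matches `0`. A small worked example follows, with hypothetical data (`small`, `rowMax`, and `diff` are illustrative names):

```mathematica
(* Fixed sample data, chosen only for illustration *)
small = {{2, 7, 8, 5, 4}, {9, 1, 3, 4, 0}, {4, 10, 10, 3, 6}};
refcol = 3;

rowMax = Max /@ small                  (* {8, 9, 10} *)
diff = rowMax - small[[All, refcol]]   (* {0, 6, 0} *)

(* Keep the rows whose difference is exactly 0 *)
Pick[small, diff, 0]
(* {{2, 7, 8, 5, 4}, {4, 10, 10, 3, 6}} *)
```

The subtraction acts on whole lists at once instead of applying a pure function to each row, which is a plausible reason for the speed difference reported in the timings.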

Timings:

refcol = 3;
dt = RandomInteger[10, {50000, 7}];
Select[dt, Max[# - #[[refcol]]] <= 0 &] // Timing // First
DeleteCases[dt, _?(Max[# - #[[refcol]]] > 0 &)] // Timing // First
Cases[dt, _?(Max[# - #[[refcol]]] <= 0 &)] // Timing // First
Pick[dt, (Max[# - #[[refcol]]] <= 0) & /@ dt] // Timing // First
Pick[dt, Max /@ dt - dt[[All, refcol]], 0] // Timing // First

Results in seconds, in the same order:

0.1372
0.1652
0.1592
0.1716
0.02932
$\endgroup$
