Let me preface this question by saying that I know very little about R. I'm importing a text file into R using read.table("file.txt", T). The text file is in the general format:

header1 header2
a 1
a 4
b 3
b 2

Each a is an observation from a sample and similarly each b is an observation from a different sample. I want to calculate various statistics of the sets of a and b which I'm doing with tapply(header2, header1, mean). That works fine.

Now I need to make qqnorm plots of a and b and add a reference line to each with qqline. I can use tapply(header2, header1, qqnorm) to make a quantile plot of each, BUT tapply(header2, header1, qqline) draws both best-fit lines on the last quantile plot. Programmatically that makes sense, but it doesn't help me.

So my question is: how can I convert the data frame to two vectors (one for all a and one for all b)? Does that make sense? Basically, in the above example, I'd want to end up with two vectors: a=(1,4) and b=(3,2).

Thanks!

  • This is a pretty basic R question. Perhaps you can read some tutorials on R, as that will be most helpful. Commented Mar 15, 2013 at 5:13
  • If I had been able to find the answer on my own, I wouldn't have asked the question... Commented Mar 15, 2013 at 5:30
  • I think geektrader was referring to: cran.r-project.org/manuals.html Commented Mar 15, 2013 at 5:31
  • Beginners to R should also look at the vignettes (i.e. intro notes). Link to one for data.table. With data.table you can address a column as DT$a, where a is your column name. data.table can also apply functions to groups, e.g. to get averages of b broken down by group a: ans <- DT[, mean(b), by = a] Commented Apr 12, 2015 at 7:09

1 Answer

Create a function that does both. qqline adds its line to whichever plot is currently active, and you can't (easily, at least) go back to an earlier plot once a new one has been started.

e.g.

with(dd, tapply(header2,header1, function(x) {qqnorm(x); qqline(x)})) 
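For a self-contained illustration, here is a minimal sketch that rebuilds the question's four-row example as dd. Building dd with data.frame is an assumption standing in for the read.table call, and the side-by-side layout via par(mfrow = ...) is an addition for readability, not part of the approach itself.

```r
# Assumed stand-in for read.table("file.txt", header = TRUE)
dd <- data.frame(header1 = c("a", "a", "b", "b"),
                 header2 = c(1, 4, 3, 2))

# Two panels side by side (added for illustration); keep the old settings
op <- par(mfrow = c(1, 2))

# For each level of header1: draw the Q-Q plot, then add its line
# while that plot is still the active one
res <- with(dd, tapply(header2, header1, function(x) {
  qqnorm(x)
  qqline(x)
}))

par(op)  # restore the previous graphics settings
```

Note that tapply does not pass the group name to the function, so both panels get the default title; looping over split(dd$header2, dd$header1) would be one way to label them.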

You could use data.table here for coding elegance (and speed)

You can pass the equivalent of a body of a function that is evaluated within the scope of the data.table e.g.

library(data.table)
DT <- data.table(dd)
DT[, {qqnorm(header2); qqline(header2)}, by = header1]

You don't really want to pollute your global environment with lots of objects (that will be inefficient).


4 Comments

Thank you very much. I thought there would be a simple way to create two vectors but using a function is fine with me.
There is, using split, lapply and list2env, or reshape and list2env, but the way above is more idiomatic for R.
This actually seemed to work: a = subset(dd, header1 == "a") but there is a high probability that I'm missing something important.
Yes, but you'd have to write that out for each level of header1. You could automate it easily, but you shouldn't need to do it at all in this case.
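
For readers who do want the separate vectors, the split route mentioned in the comments might look like the sketch below. It assumes the same dd as in the question's example (rebuilt here with data.frame so the snippet runs on its own); the name a_vec is hypothetical. Note that subset(dd, header1 == "a") returns a data frame with both columns rather than a vector.

```r
# Assumed stand-in for read.table("file.txt", header = TRUE)
dd <- data.frame(header1 = c("a", "a", "b", "b"),
                 header2 = c(1, 4, 3, 2))

# split() returns a named list with one numeric vector per level of header1
groups <- split(dd$header2, dd$header1)
groups$a  # the values 1 and 4
groups$b  # the values 3 and 2

# Direct indexing also gives a plain vector for a single level
a_vec <- dd$header2[dd$header1 == "a"]

# Optional, and discouraged in the answer above: copy each list element
# into the global environment as its own object
# list2env(groups, envir = globalenv())
```

Keeping the vectors together in one list means the code does not change when header1 gains more levels.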