
Using the mtcars dataset, I want to count how many cars have more than 6 cylinders (cyl).

I used length() after filtering and got a result of 11:

library(dplyr)
mtcars %>% filter(cyl > 6) %>% length()

However, the code provided by the tutorial looks like this and returns 14:

library(dplyr)
mtcars %>% filter(cyl > 6) %>% summarise(n())

Viewing the result directly after filtering also shows 14 rows.

I have since learned that summarise(n()) is the better approach (see "Count number of rows by group using dplyr"), and that there are other, better ways to count after filtering, but I am still confused about why my code returns a different result and where the 11 comes from.

Thanks

  • length() returns the number of columns, not rows. Commented May 12, 2024 at 12:28
  • Maybe mtcars %>% filter(cyl > 6) %>% count()? Commented May 12, 2024 at 13:33

2 Answers


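When applied to a data frame, length() returns the number of columns, not rows, because a data frame is stored as a list of columns. So your code counted the 11 columns of mtcars instead of the 14 rows left after filtering. You can, however, pull() a single variable out of the data frame and take its length(): a column is a vector with one element per row. Called without arguments, pull() extracts the last column.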

library(dplyr)

# Counts columns
mtcars %>% filter(cyl > 6) %>% length()
#> [1] 11

# Counts rows
mtcars %>% filter(cyl > 6) %>% summarise(n())
#>   n()
#> 1  14

# Counts rows
mtcars %>% filter(cyl > 6) %>% pull() %>% length()
#> [1] 14

Created on 2024-05-12 with reprex v2.1.0

Or you could simply have used count() instead of length().
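To make the source of the 11 concrete, here is a minimal base R sketch. The name big_engines is just an illustrative variable, not something from the question, and the base R subsetting is assumed to be equivalent to filter(cyl > 6) here because cyl has no missing values.

```r
# Base R equivalent of mtcars %>% filter(cyl > 6); big_engines is an illustrative name
big_engines <- mtcars[mtcars$cyl > 6, ]

is.list(big_engines)      # TRUE: a data frame is stored as a list of columns
length(big_engines)       # 11, one list element per column
ncol(big_engines)         # 11, the same number
length(big_engines$carb)  # 14, a single column has one element per row
```

carb is the last column of mtcars, which is the column pull() picks when called without arguments.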


As other users pointed out, length() returns the number of columns when a data frame or tibble is piped into it. Instead, we can use dim(), which returns the number of rows followed by the number of columns, and take its first value with [1] to get the row count.

So here's an alternative with minimal change to OP's code.

library(dplyr)
mtcars %>% filter(cyl > 6) %>% {dim(.)[1]}
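If the braces feel awkward, two other options that should give the same number are sketched below. This is only a suggestion, not part of the tutorial; n_rows and n_count are illustrative names. nrow() is the base R shortcut for dim(.)[1], and count() on an ungrouped data frame returns a one-row data frame whose column is named n.

```r
library(dplyr)

# nrow() returns the row count directly, so it drops into the pipe like length()
n_rows <- mtcars %>% filter(cyl > 6) %>% nrow()

# count() returns a one-row data frame; pull(n) extracts the number itself
n_count <- mtcars %>% filter(cyl > 6) %>% count() %>% pull(n)

n_rows   # 14
n_count  # 14
```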
