80

How can I extract the extension of a file given a file path as a character string? I know I can do this via the regular expression regexpr("\\.([[:alnum:]]+)$", x), but I am wondering whether there is a built-in function for this.

9 Answers

112

This is the sort of thing that is easily found with R's basic tools, e.g. ??path.

Anyway, load the tools package and read ?file_ext.


9 Comments

It doesn't show up with ??"extensions" although one would have expected that it would.
@DWin: "patience, grasshopper" :-). I would also recommend package:sos. It's very cool.
Witthof: Color me puzzled on two accounts: how does pkg:sos address the lack of appearance of tools::file_ext with ??() when a reasonable person would expect it to show up; and wouldn't one certainly need patience to obtain value from a search strategy that delivers 20 pages with 400 hits?
sos does a full text search. ?? only searches metadata (title, keywords, etc.). Furthermore, it's not that hard to skim the results. (I tried findFn("{file extension}"), "extract {file extension}", and "{extract file extension}"; the first was best.)
This would be more useful with an actual code sample
31

Let me extend the great answer from https://stackoverflow.com/users/680068/zx8754 a little.

Here is a simple code snippet:

    # 1. Load library 'tools'
    library("tools")

    # 2. Get extension for file 'test.txt'
    file_ext("test.txt")

The result should be 'txt'.

4 Comments

Please scroll up and read the accepted answer to this question.
Thank you, Rich! I read this comment and added this code just to show how it looks in a simple code snippet. Maybe it will be helpful for someone.
The other answer may have been first and accepted, but it is nice to see the solution written out. The accepted answer just tells you where to find the answer. This one actually answers the question.
Don't use library(tools) when you can simply use tools::file_ext, such as in tools::file_ext("test.txt").
12

A simple function with no package to load:

    getExtension <- function(file){
      ex <- strsplit(basename(file), split="\\.")[[1]]
      return(ex[-1])
    }

1 Comment

Nice basic function! It could be one line: getExtension <- function(file) strsplit(file, ".", fixed=T)[[1]][-1]. To avoid regex and increase performance, fixed = TRUE can be used.
4

The regexpr above fails if the extension contains non-alphanumeric characters (see e.g. https://en.wikipedia.org/wiki/List_of_filename_extensions). As an alternative, one may use the following function:

    getFileNameExtension <- function (fn) {
      # remove a path
      splitted <- strsplit(x=fn, split='/')[[1]]
      # or use .Platform$file.sep instead of '/'
      fn <- splitted[length(splitted)]
      ext <- ''
      splitted <- strsplit(x=fn, split='\\.')[[1]]
      l <- length(splitted)
      # the extension must be the suffix of a non-empty name
      if (l > 1 && sum(splitted[1:(l-1)] != '')) ext <- splitted[l]
      ext
    }

3 Comments

The functions basename and dirname obviate some of the work here
@Pisca46: I would like to use a function like this in an R package. Did you write the function? If not, could you add a reference in your answer?
Yes, I wrote the function myself.
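
As the first comment above suggests, basename() can replace the manual path splitting. The following is only a sketch under that assumption: the name get_ext_basename is hypothetical and not part of the answer, and it keeps the answer's rule that an extension must follow a non-empty name.

```r
# Hypothetical helper (not from the answer): same idea as
# getFileNameExtension, but basename() strips the directory part.
get_ext_basename <- function(path) {
  fn <- basename(path)
  parts <- strsplit(fn, ".", fixed = TRUE)[[1]]
  n <- length(parts)
  # An extension needs at least one dot and a non-empty name before it
  if (n > 1 && any(parts[-n] != "")) parts[n] else ""
}

get_ext_basename("dir/archive.tar.gz")
get_ext_basename("dir/.bashrc")
get_ext_basename("dir/noExtension")
```

With these inputs the calls give "gz", "" and "" respectively, so a hidden file such as .bashrc is treated as having no extension.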
3

A way would be to use sub.

    s <- c("test.txt", "file.zi_", "noExtension", "with.two.ext2", "file.with.final.dot.", "..", ".", "")
    sub(".*\\.|.*", "", s, perl=TRUE)
    #[1] "txt"  "zi_"  ""     "ext2" ""     ""     ""     ""

Assuming there is a dot (this fails when there is no extension, because the whole name is returned):

    sub(".*\\.", "", s)
    #[1] "txt"         "zi_"         "noExtension" "ext2"        ""
    #[6] ""            ""            ""

For comparison, here are tools::file_ext(s) and the regex code it uses internally:

    tools::file_ext(s)
    #[1] "txt"  ""     ""     "ext2" ""     ""     ""     ""

    pos <- regexpr("\\.([[:alnum:]]+)$", s)
    ifelse(pos > -1L, substring(s, pos + 1L), "")
    #[1] "txt"  ""     ""     "ext2" ""     ""     ""     ""

2

Extract the file extension only, without the dot:

tools::file_ext(fileName)

Extract the file extension with the dot:

paste0(".", tools::file_ext(fileName))
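
One thing to watch: paste0(".", ...) returns a lone "." for a name that has no extension. A small sketch that avoids this, assuming an empty string is the desired result in that case (ext_with_dot is a hypothetical name, not from the answer):

```r
# Hypothetical helper: prepend the dot only when an extension exists
ext_with_dot <- function(fileName) {
  ext <- tools::file_ext(fileName)
  ifelse(nzchar(ext), paste0(".", ext), "")
}

ext_with_dot(c("test.txt", "noExtension"))
# [1] ".txt" ""
```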

1

If you don't want to use any additional package, you could try:

    file_extension <- function(filenames) {
        sub(pattern = "^(.*\\.|[^.]+)(?=[^.]*)", replacement = "", filenames, perl = TRUE)
    }

If you like to be cryptic you could try to use it as a one-line expression: sub("^(.*\\.|[^.]+)(?=[^.]*)", "", filenames, perl = TRUE) ;-)

It works for zero (!), one or more file names (as a character vector or list) with an arbitrary number of dots, and also for file names without any extension, where it returns the empty string "".

Here are the tests I tried:

    > file_extension("simple.txt")
    [1] "txt"
    > file_extension(c("no extension", "simple.ext1", "with.two.ext2", "some.awkward.file.name.with.a.final.dot.", "..", ".", ""))
    [1] ""     "ext1" "ext2" ""     ""     ""     ""
    > file_extension(list("file.ext1", "one.more.file.ext2"))
    [1] "ext1" "ext2"
    > file_extension(NULL)
    character(0)
    > file_extension(c())
    character(0)
    > file_extension(list())
    character(0)

By the way, tools::file_ext() has trouble finding "strange" extensions with non-alphanumeric characters:

    > tools::file_ext("file.zi_")
    [1] ""

0

This function uses pipes:

    library(magrittr)

    file_ext <- function(f_name) {
      f_name %>%
        strsplit(".", fixed = TRUE) %>%
        unlist %>%
        extract(2)
    }

    file_ext("test.txt")
    # [1] "txt"

3 Comments

Can you comment on how this is an improvement over tools::file_ext?
You'd be better off using the tools function.
The proposed function works incorrectly if the file contains dots in the filename. The function splits the filename and outputs the second element, while it should output the last one. For the following filename 'file.name.txt' the output is 'name', not 'txt'. tools::file_ext works fine.
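
The fix described in the last comment, taking the last piece rather than the second, could be sketched as follows. This is a base R illustration without the pipe; the name last_ext is hypothetical, and returning "" for names without a dot is an assumption about the desired behavior.

```r
# Hypothetical variant: take the last dot-separated piece of each name,
# or "" when the name contains no dot to split on.
last_ext <- function(f_names) {
  parts <- strsplit(f_names, ".", fixed = TRUE)
  vapply(
    parts,
    function(p) if (length(p) > 1) p[length(p)] else "",
    character(1)
  )
}

last_ext(c("test.txt", "file.name.txt", "noExtension"))
# [1] "txt" "txt" ""
```

Unlike tools::file_ext, this sketch does not restrict the extension to alphanumeric characters and does not strip a leading directory, so it only makes sense for bare file names.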
0

Simplest way I've found with no additional packages:

    FileExt <- function(filename) {
      nameSplit <- strsplit(x = filename, split = "\\.")[[1]]
      return(nameSplit[length(nameSplit)])
    }
