
I wrote a function that filters my list. It looks like this:

    def listDirs(directory):
        listD = [d for d in [os.path.join(myPath, d1) for d1 in os.listdir(directory)] if os.path.isdir(d)]
        sortListD = np.unique(np.array([i[-10:] for i in listD]))
        if[ np.array([i.split("-")[1] for i in sortListD]) != "2017" ]:  # <- here
            sortListD.remove(i)
        return sortListD

As you can imagine, the marked line does not work, and I'm not really comfortable with Python loops.

What I want to do is check that each element i of sortListD is formatted as 'YYYY-MM-DD', and remove it if it is not.

My sortListD looks like this:

['2015-06-18' '2015-07-20' '2015-08-05' '2015-09-01' '2015-10-10' '2016-01-04' '2016-01-06' '2016-01-11' '2016-01-18' '2016-01-20' '2016-02-15' '2016-02-25' '2016-04-16' '2016-05-03' '2016-06-07' '2016-06-21' '2016-06-22' '2016-06-24' '2016-07-30' '2016-08-02' '2016-08-23' '2016-09-07' '2016-09-15' '2016-09-19' '2016-09-28' '2016-09-30' '2016-10-03' '2016-10-06' '2016-10-10' '2016-10-17' '2016-11-07' '2016-11-11' '2016-11-14' '2016-11-18' '2016-11-23' '2016-11-27' '2016-12-08' '2016-12-12' '2016-12-20' '2017-01-03' '2017-01-06' '2017-02-02' '2017-02-03' '2017-02-20' '2017-04-20' '2017-05-08' '2017-05-22' '2017-06-01' '2017-09-23' '2017-11-16' '2017-11-23' '2017-11-30' '**ata_Q1/old**' '**uonic_data**'] 

The elements in bold are the ones I want to remove.

  • You are not sorting, you are filtering. Commented May 11, 2018 at 8:52
  • Consider adding the tag numpy? Also is this python-2.7 or python-3.x? Commented May 11, 2018 at 8:52
  • So the output should just be a list again? Commented May 11, 2018 at 8:53
  • Yup filtering* sorry Commented May 11, 2018 at 8:54
  • And yeah, I want to have a list again, so I can get my YYYY-MM-DD elements separately. Commented May 11, 2018 at 8:55

2 Answers

Write a validation function and use it when walking through your list, for example:

    import datetime

    directories = [
        '2015-06-18', '2015-07-20', '2015-08-05', '2015-09-01', '2015-10-10',
        '2017-11-23', '2017-11-30', '**ata_Q1/old**', '**uonic_data**']

    def isValidDate(date_text):
        try:
            datetime.datetime.strptime(date_text, '%Y-%m-%d')
        except ValueError:
            # raise ValueError("Incorrect data format, should be YYYY-MM-DD")
            return False
        return True

    out = [d for d in directories if isValidDate(d)]
    print(out)

Or, to wrap it up in a function like in your example, adding the isdir() condition as well:

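    import os

    def listDirs(directory):
        # os.listdir() returns bare names, so join them with `directory`
        # before asking whether they are directories
        return [d for d in os.listdir(directory)
                if isValidDate(d) and os.path.isdir(os.path.join(directory, d))]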
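To see what the strptime-based check accepts and rejects, here is a minimal sketch run on a few sample strings. The `samples` list and the stricter `isStrictDate` helper are assumptions added for illustration; they are not part of the answer above. The length check is there because strptime also accepts months and days that are not zero-padded.

```python
import datetime

def isValidDate(date_text):
    # True if the text parses as a real calendar date in year-month-day form.
    try:
        datetime.datetime.strptime(date_text, '%Y-%m-%d')
    except ValueError:
        return False
    return True

def isStrictDate(date_text):
    # Hypothetical stricter variant: additionally require exactly 10
    # characters, which rules out inputs such as '2017-1-3'.
    return len(date_text) == 10 and isValidDate(date_text)

# Sample inputs (made up, apart from the two junk tails from the question).
samples = ['2017-11-30', '2017-02-30', '2017-1-3', 'uonic_data', 'ata_Q1/old']
results = {s: (isValidDate(s), isStrictDate(s)) for s in samples}

for text, (valid, strict) in results.items():
    print(text, valid, strict)
```

If the directory tails are always exactly 10 characters long, as with the `[-10:]` slice in the question, the two checks should behave the same on that data.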

1 Comment

The OP is also filtering the names on directories; the os.path.isdir() part is significant.

If you just wanted a sequence of unique date strings, use a set comprehension to extract the unique last 10 characters of each directory name, and then see if each string starts with 2017:

    import os

    def listDirs(directory):
        # a set of the last 10 characters of all directory names in `directory`
        paths = {p[-10:] for p in os.listdir(directory) if os.path.isdir(os.path.join(directory, p))}
        # a list of all those unique substrings that start with 2017-
        return [p for p in paths if p.startswith('2017-')]

This uses two comprehensions, loops that produce a Python container object directly. The first, {....} produces a set (where all contained elements are unique), and the second produces a list. The if ... parts filter the loop to limit what is included.
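As a small illustration of the two comprehension forms without touching the filesystem, here is a sketch on a made-up list of names. The `names` list is hypothetical and only stands in for what os.listdir() might return.

```python
# Hypothetical directory names; the last 10 characters carry the date.
names = ['run_2017-01-03', 'calib_2017-01-03', 'run_2016-12-20', 'muonic_data']

# Set comprehension: duplicates collapse, so '2017-01-03' appears only once.
tails = {n[-10:] for n in names}

# List comprehension with an `if` filter: keep only tails starting with '2017-'.
# The set is sorted first because a set has no defined order.
dates_2017 = [t for t in sorted(tails) if t.startswith('2017-')]

print(dates_2017)
```

Note that the set itself has no guaranteed order, so sorting is needed if the order of the result matters.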

The above returns just those 10-character long tails of the directory names. If you wanted the whole directory name, then there is no need to use a set (names in a directory are already guaranteed to be unique) and you only need to slice in the list comprehension filter. We can make the directory test loop a generator expression to avoid creating an intermediary list:

    def listDirs(directory):
        # a generator of all directory names in `directory`
        dirs = (p for p in os.listdir(directory) if os.path.isdir(os.path.join(directory, p)))
        return [d for d in dirs if d[-10:].startswith('2017-')]

The startswith test is pretty simplistic, and there can easily be false positives. You may want to test for any valid date string of the form YYYY-MM-DD, including testing that the date actually exists. That's easiest done by having the datetime library try and parse the string; if this fails, you know you don't have a valid date:

    from datetime import datetime

    def is_valid_date(string):
        try:
            datetime.strptime(string, '%Y-%m-%d')
            return True
        except ValueError:
            # parsing failed, not a valid date in the expected format
            return False

and use this instead of the .startswith() test:

    ... if is_valid_date(p)
    ... if is_valid_date(d[-10:])
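Putting the pieces together, the following is a rough end-to-end sketch, assuming Python 3 (for tempfile.TemporaryDirectory). The function name `list_date_dirs` and the throwaway directory names are made up for the demonstration; only the technique (isdir filter, last-10-characters slice, strptime validation) comes from the answer.

```python
import os
import tempfile
from datetime import datetime

def is_valid_date(string):
    # True if the string parses as a real date in year-month-day form.
    try:
        datetime.strptime(string, '%Y-%m-%d')
        return True
    except ValueError:
        return False

def list_date_dirs(directory):
    # Hypothetical name. Collect the unique 10-character tails of all
    # subdirectory names, then keep only those that are valid dates.
    tails = {
        name[-10:]
        for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    }
    return sorted(t for t in tails if is_valid_date(t))

# Build a throwaway directory tree to try it out.
with tempfile.TemporaryDirectory() as root:
    for name in ['2016-12-20', 'run_2017-01-03', 'muonic_data']:
        os.mkdir(os.path.join(root, name))
    # A plain file with a date-like name; the isdir() test skips it.
    open(os.path.join(root, '2017-05-08'), 'w').close()
    found = list_date_dirs(root)

print(found)
```

The result is sorted so that the output is deterministic; drop the sorted() call if order does not matter.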
