A good friend of mine had the challenge of trying to build a schedule using all available times and classes through a spreadsheet... by hand. He asked me if I could build a program to generate valid schedules, and with search algorithms being one of my favorite things to do, I accepted the task.
At first glance through my research, I believed this to be an Interval Scheduling problem; however, since the courses have unique timespans on multiple days, I needed a better way to represent my data. Ultimately I constructed a graph where the vertices are the sections of a class and the neighbors are compatible sections. This allowed me to use a DFS-like algorithm to find schedules.
I have never asked for a code review since I am yet to take CS classes, but I would like to know where I stand with my organization, usage of data structures, and general approaches. One thing I also want an opinion on is commenting, something I rarely do and one day it will come back to haunt me. This is actually the first time I wrote docstrings, which I hope you will find useful in understanding the code.
Anyways, I exported a spreadsheet of the valid courses into a .csv file. Below is the Python code I wrote to parse the file and generate schedules:
scheduler.py
import csv from collections import defaultdict from enum import Enum class Days(Enum): """ Shorthand for retrieving days by name or value """ Monday = 0 Tuesday = 1 Wednesday = 2 Thursday = 3 Friday = 4 class Graph: """ A simple graph which contains all vertices and their edges; in this case, the class and other compatible classes :param vertices: A number representing the amount of classes """ def __init__(self, vertices): self.vertices = vertices self.graph = defaultdict(list) def add_edge(self, u, v): self.graph[u].append(v) class Section: """ Used to manage different sections of a class Includes all times and days for a particular section :param section_str: A string used to parse the times at which the class meets Includes weekday, start time, and end time Format as follows: Monday,7:00,9:30/Tuesday,3:30,5:30/Wednesday,5:30,6:50 :param class_name: The name used to refer to the class (course) :param preferred: Preferred classes will be weighted more heavily in the search :param required: Search will force this class to be in the schedule """ def __init__(self, section_str, class_name='Constraint', preferred=False, required=False): self.name = class_name self.preferred = preferred self.required = required self.days = [] for course in section_str.rstrip('/').split('/'): d = {} data = course.split(',') day_mins = Days[data[0]].value * (60 * 24) d['start_time'] = self.get_time_mins(data[1]) + day_mins d['end_time'] = self.get_time_mins(data[2]) + day_mins self.days.append(d) """ Parses a time into minutes since Monday at 00:00 by assuming no class starts before 7:00am :param time_str: A string containing time in hh:mm :returns: Number of minutes since Monday 00:00 """ @staticmethod def get_time_mins(time_str): time = time_str.split(':') h = int(time[0]) if h < 7: h += 12 return 60 * h + int(time[1]) """ A (messy) method used to display the section in a readable format :param start_num: minutes from Monday 00:00 until the class starts :param end_num: minutes from Monday 00:00 until the class ends :returns: A string representing the timespan """ @staticmethod def time_from_mins(start_num, end_num): # 1440 is the number of minutes in one day (60 * 24) # This is probably the least clean part of the code? day = Days(start_num // 1440).name start_hour = (start_num // 60) % 24 start_min = (start_num % 1440) - (start_hour * 60) start_min = '00' if start_min == 0 else start_min start_format = 'am' end_hour = (end_num // 60) % 24 end_min = (end_num % 1440) - (end_hour * 60) end_min = '00' if end_min == 0 else end_min end_format = 'am' if start_hour > 12: start_hour -= 12 start_format = 'pm' time = f'{day} {start_hour}:{start_min}{start_format} => ' if end_hour > 12: end_hour -= 12 end_format = 'pm' time += f'{end_hour}:{end_min}{end_format}' return time """ Checks to see if two time ranges overlap each other :param other: Another section object to compare :returns: boolean of whether the sections overlap """ def is_overlapping(self, other): for range_1 in self.days: for range_2 in other.days: if range_1['end_time'] > range_2['start_time'] and range_2['end_time'] > range_1['start_time']: return True return False def __repr__(self): strs = [] for day in self.days: strs.append(self.time_from_mins(day['start_time'], day['end_time'])) return '\n'.join(strs) class Scheduler: """ This class powers the actual search for the schedule It makes sure to fill all requirements and uses a search algorithm to find optimal schedules :param graph: Instance of a Graph object :param num_courses: A constraint on the number of courses that the schedule should have :param num_required: A number to keep track of the amount of required classes """ def __init__(self, graph, num_courses=5, num_required=1): self.graph = graph.graph self.paths = [] self.num_courses = num_courses self.num_required = num_required self.schedule_num = 1 """ A recursive search algorithm to create schedules Nodes are Section objects, with their neighbors being compatible courses :param u: The starting node in the search :param visited: A boolean list to keep track of visited nodes :param path: List passed through recursion to keep track of the path :returns: None (modifies object properties for use in __repr__ below) """ def search(self, u, visited, path): num_courses = self.num_courses visited[u] = True path.append(u) if len(self.paths) > 1000: return if len(path) == num_courses and len([x for x in path if x.required is True]) == self.num_required: self.paths.append(list(path)) else: for section in self.graph[u]: if visited[section] == False and not any((x.is_overlapping(section) or (x.name == section.name)) for x in path): self.search(section, visited, path) path.pop() visited[u] = False def __repr__(self): out = '' for section in self.paths[self.schedule_num - 1]: out += f'{section.name}\n{"=" * len(section.name)}\n{repr(section)}\n\n' return out def main(): """ Setup all data exported into a .csv file, and prepare it for search """ data = {} # Parse csv file into raw data with open('classes.csv') as csvfile: csv_data = csv.reader(csvfile, dialect='excel') class_names = [] for j, row in enumerate(csv_data): for i, item in enumerate(row): if j == 0: if i % 3 == 0: # I believe there is a better way to read by columns name = item.strip('*') class_names.append(name) # Preferred classes are labelled with one asterisk, required with two preferred = item.count('*') == 1 required = item.count('*') == 2 data[name] = { 'sections_raw': [], 'sections': [], 'preferred': preferred, 'required': required } else: class_index = i // 3 data_ = data[class_names[class_index]] data_['sections_raw'].append(item) # Create Section objects which can be compared for overlaps for _class in data: # Personally class is more natural for me than course or lecture, but I could replace it sections_raw = data[_class]['sections_raw'] sections = [] cur_str = '' # Section strings are always in groups of three (class name, start time, end time) for i in range(0, len(sections_raw), 3): if sections_raw[i] != '': for x in range(3): cur_str += sections_raw[i + x] + ',' cur_str += '/' else: if cur_str != '': sections.append(Section(cur_str, _class, data[_class]['preferred'], data[_class]['required'])) cur_str = '' else: if cur_str != '': sections.append(Section(cur_str, _class, data[_class]['preferred'], data[_class]['required'])) cur_str = '' data[_class]['sections'] = sections # A friend asked me to prevent the scheduler from joining classes at specific times # I used my Section object as a constraint through the is_overlapping method constraint = Section('Monday,4:00,6:00/' + 'Tuesday,7:00,9:30/Tuesday,3:30,5:30/' + 'Wednesday,4:00,6:00/' + 'Thursday,7:00,9:30/Thursday,3:30,5:30/' + 'Friday,7:00,10:00') section_data = [] # Here we extract the compatible courses given the constraint for x in data.values(): for s in x['sections']: if not s.is_overlapping(constraint): section_data.append(s) graph = Graph(len(section_data)) for section in section_data: graph.graph[section] = [] start = None # Now we populate the graph, not allowing any incompatible edges for section in section_data: if start is None: start = section for vertex in graph.graph: if not section.is_overlapping(vertex) and section.name != vertex.name: graph.add_edge(vertex, section) scheduler = Scheduler(graph) visited = defaultdict(bool) scheduler.search(u=start, visited=visited, path=[]) # We use our search algorithm with courses as nodes # The scheduler doesn't actually weight the preferred classes, so we sort all our valid schedules using # the lambda function and reverse the order to show schedules with preferred classes first scheduler.paths = sorted(scheduler.paths, key= lambda path: (len([p for p in path if p.preferred])), reverse=True) return scheduler if __name__ == '__main__': # The scheduler object is created, and now we need a way for the user to view one of their schedules scheduler = main() n = int(input(f'There are {len(scheduler.paths)} found.\nWhich schedule would you like to see?\n#: ')) if not 1 <= n <= len(scheduler.paths): print(f'Enter a number between 1-{scheduler.paths}.') else: scheduler.schedule_num = n print(scheduler) The .csv file is generated from a spreadsheet that uses the following layout (visualizing it will help with understanding how I parse it):
classes.csv
SPAN 201,Start,End,POLS 110*,Start,End,ENVS 130,Start,End,ACT 210,Start,End,FSEM**,Start,End,QTM 100*,Start,End Tuesday,9:00,9:50,Tuesday,1:00,2:15,Tuesday,11:30,12:45,Monday,1:00,2:15,Tuesday,10:00,11:15,Monday,4:00,5:15 Thursday,9:00,9:50,Thursday,1:00,2:15,Thursday,11:30,12:45,Wednesday,1:00,2:15,Thursday,10:00,11:15,Wednesday,4:00,5:15 Friday,9:00,9:50,,,,,,,,,,,,,Friday,9:00,9:50 ,,,,,,,,,Monday,2:30,3:45,Monday,1:00,2:15,,, Tuesday,10:00,10:50,,,,,,,Wednesday,2:30,3:45,Wednesday,1:00,2:15,Monday,4:00,5:15 Thursday,10:00,10:50,,,,,,,,,,,,,Wednesday,4:00,5:15 Friday,10:00,10:50,,,,,,,Monday,4:00,5:15,Monday,10:00,10:50,Friday,11:00,11:50 ,,,,,,,,,Wednesday,4:00,5:15,Wednesday,10:00,10:50,,, Tuesday,12:00,12:50,,,,,,,,,,Friday,10:00,10:50,Monday,4:00,5:15 Thursday,12:00,12:50,,,,,,,Tuesday,8:30,9:45,,,,Wednesday,4:00,5:15 Friday,12:00,12:50,,,,,,,Thursday,8:30,9:45,,,,Friday,1:00,1:50 ,,,,,,,,,,,,,,,,, Tuesday,1:00,1:50,,,,,,,Tuesday,10:00,11:15,,,,Tuesday,8:30,9:45 Thursday,1:00,1:50,,,,,,,Thursday,10:00,11:15,,,,Thursday,8:30,9:45 Friday,1:00,1:50,,,,,,,,,,,,,Friday,10:00,10:50 ,,,,,,,,,Tuesday,1:00,2:15,,,,,, Tuesday,2:00,2:50,,,,,,,Thursday,1:00,2:15,,,,Tuesday,8:30,9:45 Thursday,2:00,2:50,,,,,,,,,,,,,Thursday,8:30,9:45 Friday,2:00,2:50,,,,,,,,,,,,,Friday,12:00,12:50 ,,,,,,,,,,,,,,,,, Tuesday,3:00,3:50,,,,,,,,,,,,,Tuesday,8:30,9:45 Thursday,3:00,3:50,,,,,,,,,,,,,Thursday,8:30,9:45 Friday,3:00,3:50,,,,,,,,,,,,,Friday,2:00,2:50 ,,,,,,,,,,,,,,,,, Monday,10:00,10:50,,,,,,,,,,,,,Tuesday,10:00,11:15 Wednesday,10:00,10:50,,,,,,,,,,,,,Thursday,10:00,11:15 Friday,10:00,10:50,,,,,,,,,,,,,Friday,11:00,11:50 ,,,,,,,,,,,,,,,,, Monday,9:00,9:50,,,,,,,,,,,,,Tuesday,10:00,11:15 Wednesday,9:00,9:50,,,,,,,,,,,,,Thursday,10:00,11:15 Friday,9:00,9:50,,,,,,,,,,,,,Friday,1:00,1:50 ,,,,,,,,,,,,,,,,, Monday,2:00,2:50,,,,,,,,,,,,,Monday,2:30,3:45 Wednesday,2:00,2:50,,,,,,,,,,,,,Wednesday,2:30,3:45 Friday,2:00,2:50,,,,,,,,,,,,,Friday,10:00,10:50 ,,,,,,,,,,,,,,,,, Monday,3:00,3:50,,,,,,,,,,,,,Monday,2:30,3:45 Wednesday,3:00,3:50,,,,,,,,,,,,,Wednesday,2:30,3:45 Friday,3:00,3:50,,,,,,,,,,,,,Friday,2:00,2:50 ,,,,,,,,,,,,,,,,, Monday,4:00,4:50,,,,,,,,,,,,,,, Wednesday,4:00,4:50,,,,,,,,,,,,,,, Friday,4:00,4:50,,,,,,,,,,,,,,, 