I'm trying to parse a directory containing a collection of XML files from RSS feeds. I have similar code working fine for another directory, so I can't figure out what the problem is. I want to return the items so I can write them to a CSV file. The error I'm getting is:

    xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, column 0

Here is the site I've collected RSS feeds from: https://www.ba.no/service/rss
It worked fine for: https://www.nrk.no/toppsaker.rss and https://www.vg.no/rss/feed/?limit=10&format=rss&categories=&keywords=
Here is the function for this RSS:

    import os
    import xml.etree.ElementTree as ET
    import csv

    def baitem():
        basepath = "../data_copy/bergens_avisen"
        table = []
        for fname in os.listdir(basepath):
            if fname != "last_feed.xml":
                files = ET.parse(os.path.join(basepath, fname))
                root = files.getroot()
                items = root.find("channel").findall("item")
                #print(items)
                for item in items:
                    date = item.find("pubDate").text
                    title = item.find("title").text
                    description = item.find("description").text
                    link = item.find("link").text
                    table.append((date, title, description, link))
        return table

I tested with print(items) and it prints all the item objects. Could the problem be in how the XML files are written?
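
Since the error points at line 1, column 0, the parser is rejecting the very first bytes of some file, so one way to narrow this down is to find out which file fails and what it starts with. Below is a minimal diagnostic sketch, not a fix. The helper name `find_unparseable` and the 40-byte preview length are my own choices for illustration, and it assumes the directory may contain files that are not valid XML at all.

```python
import os
import xml.etree.ElementTree as ET


def find_unparseable(basepath):
    """Return (filename, first_bytes, error_message) for each file that fails to parse.

    Hypothetical helper for diagnosis only; it does not repair anything.
    """
    failures = []
    for fname in sorted(os.listdir(basepath)):
        path = os.path.join(basepath, fname)
        if not os.path.isfile(path):
            continue  # skip subdirectories
        try:
            ET.parse(path)
        except ET.ParseError as err:
            # Read the raw bytes so that encoding problems or stray
            # leading bytes are visible exactly as they are on disk.
            with open(path, "rb") as fh:
                head = fh.read(40)
            failures.append((fname, head, str(err)))
    return failures
```

Calling `find_unparseable("../data_copy/bergens_avisen")` and printing each tuple should show whether the failing files begin with something other than `<?xml` or `<rss`, for example an empty file, a hidden system file, or non-XML content saved under an `.xml` name.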