I have this data (note: treat it as a plain text file, not as a JSON file):

{"tstp":1383173780727,"ststates":[{"nb":901,"state":"open","freebk":6,"freebs":14},{"nb":903,"state":"open","freebk":2,"freebs":18}]}{"tstp":1383173852184,"ststates":[{"nb":901,"state":"open","freebk":6,"freebs":14}]} 

I want to take all the values belonging to the first tstp only and stop when reaching the next tstp.

What I am trying to do is create a file for each tstp; each file should have nb, state, freebk, and freebs as columns.

expected output:

first tstp file:

    nb state freebk freebs
    901 open 6 14
    903 open 2 18

second tstp file:

    nb state freebk freebs
    901 open 6 14

Each output above belongs to one tstp. I want to create a separate file for each tstp in my data, so for the data provided 2 files will be created (because it contains only 2 tstp values).

Remark: don't treat this data as a JSON file; treat it as a plain text file.

  • If we should not treat the input as JSON, then please craft an MCVE with a normal text file. Commented Mar 12, 2022 at 9:14
  • @Zeitounator The input file as a whole is not a valid JSON file, but it clearly is JSON objects concatenated one after the other. Commented Mar 12, 2022 at 9:16

2 Answers

The approach below works for any "tstp" entries, including ones with whitespace between the opening brace and "tstp".

I used a regex to capture the start of each JSON object and build valid data from it. (This also works if the data in your file is not neatly organized.)

    import ast
    import re

    # Read the content of the text file
    with open("text.txt", "r") as file:
        data = file.read()

    # Turn the concatenated objects into a single list literal
    regex = r'{[\s]*"tstp"'
    replaced_content = ',{"tstp"'

    # Prefix the start of every {...} object with a comma: ,{...}
    data = re.sub(regex, replaced_content, data)
    data = "[" + data.strip()[1:] + "]"  # drop the unnecessary leading comma
    data = ast.literal_eval(data)  # convert the string to a list of dicts

    # Prepare the data for each file
    headings_data = "nb state freebk freebs"
    for count, entry in enumerate(data, start=1):
        # Replace this line with row = "" if you don't want the tstp value in the file.
        row = "File - {0}\n\n".format(entry["tstp"])
        row += headings_data
        for item in entry["ststates"]:
            row += "\n{0} {1} {2} {3}".format(
                item["nb"], item["state"], item["freebk"], item["freebs"])

        # Write a separate file for each tstp
        filename = "file-{0}.txt".format(count)
        with open(filename, "w") as file:
            file.write(row)

Output:

File 1

    File - 1383173780727

    nb state freebk freebs
    901 open 6 14
    903 open 2 18

File 2

    File - 1383173852184

    nb state freebk freebs
    901 open 6 14

And so on, one file for each "tstp" entry.

Note: We cannot rely on replacing "}{" in every situation. In your data, the closing and opening braces may be on different lines.
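If the braces can be separated by newlines or spaces, one alternative is to let a JSON decoder find where each object ends instead of relying on a separator. The sketch below is not the regex approach from this answer: it swaps in `json.JSONDecoder.raw_decode` from the standard library and assumes every entry is a valid JSON object, which holds for the sample data. The helper name `iter_concatenated_objects` and the `sample` string are made up for illustration.

```python
import json


def iter_concatenated_objects(text):
    """Yield each top-level JSON value found back to back in text."""
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # raw_decode does not skip leading whitespace, so do it by hand
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        # raw_decode returns the parsed value and the index where it ended
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


# Sample data with a newline and spaces between the two objects
sample = (
    '{"tstp":1383173780727,"ststates":['
    '{"nb":901,"state":"open","freebk":6,"freebs":14},'
    '{"nb":903,"state":"open","freebk":2,"freebs":18}]}\n'
    '  {"tstp":1383173852184,"ststates":['
    '{"nb":901,"state":"open","freebk":6,"freebs":14}]}'
)

entries = list(iter_concatenated_objects(sample))
print([entry["tstp"] for entry in entries])
```

Each item in `entries` is a plain dict, so the file-writing loop above can iterate over it unchanged.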

4 Comments

Use ast.literal_eval instead of eval.
(also, see enumerate instead of managing count by hand.)
Okay, I added enumerate and ast in code. Thank you @AKX for correction. I forgot the enumerate function.
@abdeali004 enumerate accepts an optional second argument for start, so you don't need count+1 :)

Well, it looks like }{ is a nice separator for the entries, so let's (ab)use that fact. Better formatting of the output is left as an exercise to the reader.

    import ast

    # (0) could be read with f.read()
    data = """{"tstp":1383173780727,"ststates":[{"nb":901,"state":"open","freebk":6,"freebs":14},{"nb":903,"state":"open","freebk":2,"freebs":18}]}{"tstp":1383173852184,"ststates":[{"nb":901,"state":"open","freebk":6,"freebs":14}]}"""

    # (1) split data by `}{`
    entries = data.replace("}{", "}\n{").splitlines()

    # (2) read each entry (since we were told it's not JSON,
    #     don't use JSON but ast.literal_eval, but the effect is the same)
    entries = [ast.literal_eval(ent) for ent in entries]

    # (3) print out some ststates!
    for ent in entries:
        print("nb\tstate\tfreebk\tfreebs")
        for ststate in ent.get("ststates", []):
            print("{nb}\t{state}\t{freebk}\t{freebs}".format_map(ststate))
        print("---")

The output is

    nb	state	freebk	freebs
    901	open	6	14
    903	open	2	18
    ---
    nb	state	freebk	freebs
    901	open	6	14
    ---
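For the per-file output the question asks for, the same entries can be written out one file at a time. This is only a sketch under a few assumptions: the file name pattern `tstp-<tstp>.txt`, the helper names `format_table` and `write_entry_files`, and the use of a temporary directory are illustrative choices, not something the question specifies.

```python
import ast
import tempfile
from pathlib import Path

COLUMNS = ("nb", "state", "freebk", "freebs")


def format_table(ststates):
    """Return a header row plus one row per ststate, padded into columns."""
    rows = [COLUMNS] + [tuple(str(s[c]) for c in COLUMNS) for s in ststates]
    widths = [max(len(row[i]) for row in rows) for i in range(len(COLUMNS))]
    return "\n".join(
        "  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in rows
    )


def write_entry_files(entries, out_dir):
    """Write one text file per entry (hypothetical name: tstp-<tstp>.txt)."""
    out_dir = Path(out_dir)
    paths = []
    for ent in entries:
        path = out_dir / f"tstp-{ent['tstp']}.txt"
        path.write_text(format_table(ent.get("ststates", [])) + "\n")
        paths.append(path)
    return paths


data = """{"tstp":1383173780727,"ststates":[{"nb":901,"state":"open","freebk":6,"freebs":14},{"nb":903,"state":"open","freebk":2,"freebs":18}]}{"tstp":1383173852184,"ststates":[{"nb":901,"state":"open","freebk":6,"freebs":14}]}"""

# Same splitting trick as above: break the text at every `}{`
entries = [ast.literal_eval(ent) for ent in data.replace("}{", "}\n{").splitlines()]

out_dir = tempfile.mkdtemp()  # temporary directory, for demonstration only
paths = write_entry_files(entries, out_dir)
print(paths[0].read_text())
```

Note that two entries with the same tstp would map to the same file name here, so the later one would overwrite the earlier one; a counter in the name avoids that if duplicates are possible.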
