Read a text file and parse it in Python

I have a text file (.txt) that looks like this:


Date, Day, Sect, 1, 2, 3

1, Sun, 1-1, 123, 345, 678

2, Mon, 2-2, 234, 585, 282

3, Tue, 2-2, 231, 232, 686


With this data, I want to do the following:

1) Read the text file line by line, with each line as a separate element in a list

  • Split the elements by comma

  • Delete unnecessary elements ('\n') from the list

For these two steps, I wrote the following:

file = open('abc.txt', mode = 'r', encoding = 'utf-8-sig')
lines = file.readlines()
file.close()
my_dict = {}
my_list = []
for line in lines:
    line = line.split(',')
    line = [i.strip() for i in line]

2) Use the first row (Date, Day, Sect, 1, 2, 3) as the keys and the other rows as the values of a dictionary.

    my_dict['Date'] = line[0]
    my_dict['Day'] = line[1]
    my_dict['Sect'] = line[2]
    my_dict['1'] = line[3]
    my_dict['2'] = line[4]
    my_dict['3'] = line[5]

The above code has two issues: 1) it also turns the first (header) row into a dictionary; 2) if I append the dictionary to the list as shown in step 3, every element in the list ends up being the last row.

3) Create a list including the dictionary as elements.

 my_list.append(my_dict) 

4) Select the subset of elements that I want.

I couldn't write any code from here on. What I want to do is select the elements that meet a condition, for example the dictionaries where Sect is 2-2. The desired result would then be as follows:

>> [{'Date': '2', 'Day': 'Mon', 'Sect': '2-2', '1': '234', '2': '585', '3': '282'}, {'Date': '3', 'Day': 'Tue', 'Sect': '2-2', '1': '231', '2':'232', '3':'686'}]

Thanks,


4 Answers

@supremed14, you can also try the below code to prepare the list of dictionaries after reading the file.

data.txt

Because the text file contains extra whitespace, the strip() method defined on strings is used to remove it.

Date, Day, Sect, 1, 2, 3
1, Sun, 1-1, 123, 345, 678
2, Mon, 2-2, 234, 585, 282
3, Tue, 2-2, 231, 232, 686

Source code:

Here you do not need to worry about closing the file. The with statement closes it automatically.

import json
my_list = []
with open('data.txt') as f:
    lines = f.readlines()  # list containing lines of file
    columns = []  # to store column names
    i = 1
    for line in lines:
        line = line.strip()  # remove leading/trailing white spaces
        if line:
            if i == 1:
                columns = [item.strip() for item in line.split(',')]
                i = i + 1
            else:
                d = {}  # dictionary to store file data (each line)
                data = [item.strip() for item in line.split(',')]
                for index, elem in enumerate(data):
                    d[columns[index]] = data[index]
                my_list.append(d)  # append dictionary to list
# pretty printing list of dictionaries
print(json.dumps(my_list, indent=4))

Output:

[
    {
        "Date": "1",
        "Day": "Sun",
        "Sect": "1-1",
        "1": "123",
        "2": "345",
        "3": "678"
    },
    {
        "Date": "2",
        "Day": "Mon",
        "Sect": "2-2",
        "1": "234",
        "2": "585",
        "3": "282"
    },
    {
        "Date": "3",
        "Day": "Tue",
        "Sect": "2-2",
        "1": "231",
        "2": "232",
        "3": "686"
    }
]
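As a side note on the two issues raised in the question: the list ends up holding the last row everywhere because one and the same my_dict object is appended on every iteration, so each list element refers to the same dictionary. Building a new dictionary per row avoids that, and a list comprehension can then do the selection from step 4. A minimal sketch, assuming the sample data from the question (the parse_rows helper and the in-memory SAMPLE string are made up for illustration; a real file opened with open() can be passed in the same way):

```python
import io

# Sample data from the question, held in memory so the sketch is self-contained.
SAMPLE = """Date, Day, Sect, 1, 2, 3

1, Sun, 1-1, 123, 345, 678

2, Mon, 2-2, 234, 585, 282

3, Tue, 2-2, 231, 232, 686
"""


def parse_rows(lines):
    """Return one new dict per data row, keyed by the header row."""
    # Split every non-blank line on commas and strip each cell.
    rows = [[cell.strip() for cell in line.split(',')]
            for line in lines if line.strip()]
    # The first row only supplies the keys; it is not turned into a record.
    header, data = rows[0], rows[1:]
    # dict(zip(...)) creates a fresh dictionary for each row.
    return [dict(zip(header, row)) for row in data]


my_list = parse_rows(io.StringIO(SAMPLE))

# Step 4: keep only the dictionaries whose 'Sect' value is '2-2'.
selected = [d for d in my_list if d['Sect'] == '2-2']
print(selected)
```

Because the values are kept as strings, the comparison is against the string '2-2', which matches the desired output in the question.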

Using pandas this is pretty easy:

Input:

$ cat test.txt
Date, Day, Sect, 1, 2, 3
1, Sun, 1-1, 123, 345, 678
2, Mon, 2-2, 234, 585, 282
3, Tue, 2-2, 231, 232, 686

Operations:

import pandas as pd
df = pd.read_csv('test.txt', skipinitialspace=True)
df.loc[df['Sect'] == '2-2'].to_dict(orient='records')

Output:

[{'1': 234, '2': 585, '3': 282, 'Date': 2, 'Day': 'Mon', 'Sect': '2-2'}, {'1': 231, '2': 232, '3': 686, 'Date': 3, 'Day': 'Tue', 'Sect': '2-2'}]

If your .txt file is in the CSV format:

Date, Day, Sect, 1, 2, 3
1, Sun, 1-1, 123, 345, 678
2, Mon, 2-2, 234, 585, 282
3, Tue, 2-2, 231, 232, 686

You can use the csv library:

from csv import reader
from pprint import pprint
result = []
with open('file.txt') as in_file:
    # create a csv reader object
    csv_reader = reader(in_file)

    # extract headers
    headers = [x.strip() for x in next(csv_reader)]

    # go over each line
    for line in csv_reader:
        # if line is not empty
        if line:
            # create dict for line
            d = dict(zip(headers, map(str.strip, line)))

            # append dict if it matches your condition
            if d['Sect'] == '2-2':
                result.append(d)
pprint(result)

Which gives the following list:

[{'1': '234', '2': '585', '3': '282', 'Date': '2', 'Day': 'Mon', 'Sect': '2-2'}, {'1': '231', '2': '232', '3': '686', 'Date': '3', 'Day': 'Tue', 'Sect': '2-2'}]
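A slightly shorter variant of the same idea is csv.DictReader, which reads the header row itself and yields one dictionary per data row. This is only a sketch under a couple of assumptions: the rows_with_sect function name is made up for illustration, the sample data is fed from an in-memory string instead of file.txt, and skipinitialspace=True only drops whitespace that directly follows a comma, so trailing whitespace in a cell would still need strip().

```python
import csv
import io

# In-memory copy of the sample file; a real file object works the same way.
SAMPLE = (
    "Date, Day, Sect, 1, 2, 3\n"
    "1, Sun, 1-1, 123, 345, 678\n"
    "2, Mon, 2-2, 234, 585, 282\n"
    "3, Tue, 2-2, 231, 232, 686\n"
)


def rows_with_sect(in_file, sect):
    """Return the rows whose 'Sect' column equals sect, as plain dicts."""
    # DictReader uses the first row as keys; skipinitialspace removes
    # the blank that follows each comma.
    csv_reader = csv.DictReader(in_file, skipinitialspace=True)
    return [dict(row) for row in csv_reader if row['Sect'] == sect]


result = rows_with_sect(io.StringIO(SAMPLE), '2-2')
print(result)
```

With a file on disk, the call would look like rows_with_sect(open_file, '2-2') inside a with open('file.txt', newline='') as open_file: block.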

If you are allowed to use pandas, you can simply achieve your task by:

import pandas as pd
df = pd.read_csv('abc.txt', skipinitialspace=True) # reads your CSV file into a DataFrame
d = df.loc[df['Sect'] == '2-2'].to_dict('records') # keeps the records whose `Sect` value is '2-2' and returns them as a list of dictionaries

To install pandas run:

python3 -m pip install pandas

Assuming the contents of abc.txt are what you have provided, d will be:

[{'Date': 2, 'Day': 'Mon', 'Sect': '2-2', '1': 234, '2': 585, '3': 282}, {'Date': 3, 'Day': 'Tue', 'Sect': '2-2', '1': 231, '2': 232, '3': 686}]
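Note that pandas infers integer columns here, so the values come back as numbers (234) and not as the strings shown in the question ('234'). If strings are wanted, passing dtype=str to read_csv is one way to keep them. A minimal sketch, assuming pandas is installed and using an in-memory copy of the sample data in place of abc.txt:

```python
import io

import pandas as pd

# In-memory stand-in for abc.txt so the sketch runs on its own.
SAMPLE = """Date, Day, Sect, 1, 2, 3
1, Sun, 1-1, 123, 345, 678
2, Mon, 2-2, 234, 585, 282
3, Tue, 2-2, 231, 232, 686
"""

# dtype=str turns off numeric type inference, so every cell stays a string.
df = pd.read_csv(io.StringIO(SAMPLE), skipinitialspace=True, dtype=str)

# Filter on the 'Sect' column and convert the result to a list of dicts.
records = df.loc[df['Sect'] == '2-2'].to_dict(orient='records')
print(records)
```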