I am reading values from a text file and trying to find the index of substrings, as shown below:
df=pd.read_csv('break_sent.txt', index_col=False,encoding='utf-8',delimiter="\n",names=['sent'])
#print(df[:50])
#df.index = list(df.index)
df1= df[40:50]
print(len(df))
print(df1.index)
print("-------------------------------------------")
for i,row in df1.iterrows():
    string = row['sent']
    #print("string",string)
    d = df1[df1.sent.str.match(string)]
    # if the result includes more than 1 value then we know that substring and its matching parent string are present, then I will eliminate the substring from the dataframe
    if len(d.index > 2):
        index_val = df.index(string)
        df.drop(df.index(string),inpace=True)
        df.reset_index(level=None, drop=True, inplace=True)

When I run this code, I get the error below:
Traceback (most recent call last):
  File "process.py", line 15, in <module>
    index_val = df.index(string)
TypeError: 'RangeIndex' object is not callable

I tried to convert the RangeIndex to a list:
df.index = list(df.index)

but then I got an "Int64Index is not callable" error. How can I get the index of the string?
2 Answers
Try changing
df.drop(df.index(string),inpace=True)
to
df.drop(index=string, inplace=True)

df.index is an attribute of the DataFrame, not a method, so you cannot call it with your search string. Index into it with a boolean mask instead. So:
matched_rows = df1.index[df1.sent.str.match(string)]

will give you the index labels of the rows in df1 that match your string (the mask is built from df1, so it has to be applied to df1.index). You should then be able to pass that output to df.drop:
if len(matched_rows) > 2:
    df.drop(matched_rows, inplace=True)
    df.reset_index(level=None, drop=True, inplace=True)

I may not have grasped the exact details of what you're trying to do, but hopefully that points you in the right direction.
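
To make the idea concrete, here is a minimal sketch of the boolean-mask approach on a small in-memory DataFrame. The sample sentences are made up and stand in for the contents of break_sent.txt. It also assumes the goal is to drop a sentence when another row starts with it (more than one match, as the comment in the question describes), and it collects the labels first and drops once after the loop instead of modifying the frame while iterating. That may not be exactly what the asker needs.

```python
import re

import pandas as pd

# Made-up sample data standing in for the rows of break_sent.txt.
df = pd.DataFrame(
    {"sent": ["the cat", "the cat sat on the mat", "a dog", "birds fly"]}
)

to_drop = []
for label, row in df.iterrows():
    # str.match treats its argument as a regex anchored at the start of
    # each value, so escape the sentence to match it as literal text.
    pattern = re.escape(row["sent"])

    # df.index is an attribute: index it with a boolean mask to get the
    # labels of the matching rows.
    matched_rows = df.index[df["sent"].str.match(pattern)]

    # More than one match means some other row starts with this sentence,
    # so this row is the shorter string and gets dropped.
    if len(matched_rows) > 1:
        to_drop.append(label)

# Drop once, after the loop, then renumber the index.
result = df.drop(to_drop).reset_index(drop=True)
print(result)
```

Note that str.match only matches at the beginning of each value. If the shorter string can appear anywhere inside the longer one, df["sent"].str.contains(row["sent"], regex=False) could be used to build the mask instead.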