TypeError: expected string or buffer

I have this simple code:

import re, sys
f = open('findallEX.txt', 'r')
lines = f.readlines()
match = re.findall('[A-Z]+', lines)
print match

I don't know why I am getting the error:

'expected string or buffer'

Can anyone help?

2

5 Answers

lines is a list. re.findall() doesn't take lists.

>>> import re
>>> f = open('README.md', 'r')
>>> lines = f.readlines()
>>> match = re.findall('[A-Z]+', lines)
Traceback (most recent call last): File "<input>", line 1, in <module> File "/usr/lib/python2.7/re.py", line 177, in findall return _compile(pattern, flags).findall(string)
TypeError: expected string or buffer
>>> type(lines)
<type 'list'>

From help(file.readlines). I.e. readlines() is for loops/iterating:

readlines(...) readlines([size]) -> list of strings, each a line from the file.

To find all uppercase characters in your file:

>>> import re
>>> re.findall('[A-Z]+', open('README.md', 'r').read())
['S', 'E', 'A', 'P', 'S', 'I', 'R', 'C', 'I', 'A', 'P', 'O', 'G', 'P', 'P', 'T', 'V', 'W', 'V', 'D', 'A', 'L', 'U', 'O', 'I', 'L', 'P', 'A', 'D', 'V', 'S', 'M', 'S', 'L', 'I', 'D', 'V', 'S', 'M', 'A', 'P', 'T', 'P', 'Y', 'C', 'M', 'V', 'Y', 'C', 'M', 'R', 'R', 'B', 'P', 'M', 'L', 'F', 'D', 'W', 'V', 'C', 'X', 'S']

lines is a list of strings, re.findall doesn't work with that. try:

import re, sys
f = open('findallEX.txt', 'r')
lines = f.read()
match = re.findall('[A-Z]+', lines)
print match

readlines() will return a list of all the lines in the file, so lines is a list. You probably want something like this:

for line in f.readlines(): # Iterates through every line and looks for a match
#or
#for line in f: match = re.findall('[A-Z]+', line) print match

Or, if the file isn't too large you can grab it as as single string:

lines = f.read() # Warning: reads the FULL FILE into memory. This can be bad.
match = re.findall('[A-Z]+', lines)
print match
1

'lines' term from your snippet consists of set of strings.

 lines = f.readlines() match = re.findall('[A-Z]+', lines)

You cannot send entire lines into the re.findall('pattern',<string>)

You can try to send line by line

 for i in lines: match = re.findall('[A-Z]+', i) print match

or to convert the entire lines collection into single line (each line seperated by space)

 NEW_LIST=' '.join(lines) match=re.findall('[A-Z]+' ,NEW_LIST) print match

This might help you

re.findall finds all the occurrence of the regex in a string and return in a list. Here, you are using a list of strings, you need this to use re.findall

Note - If the regex fails, an empty list is returned.

import re, sys
f = open('picklee', 'r')
lines = f.readlines()
regex = re.compile(r'[A-Z]+')
for line in lines: print (re.findall(regex, line))
0

Your Answer

Sign up or log in

Sign up using Google Sign up using Facebook Sign up using Email and Password

Post as a guest

By clicking “Post Your Answer”, you agree to our terms of service, privacy policy and cookie policy

You Might Also Like