I currently have working code that scrapes subreddits by name and pulls out the latest 1,000 submissions from each, inserting their data into a DB.
Now I want to do something similar, but different. I want to grab the last 1,000 submissions (posts, not comments) by a USER (if I could do this by user per subreddit, that'd be better, but I don't think the API allows it).
I have mostly accomplished this, and I think I'm doing it the "right way", except for a couple of pieces of data that I think I'm accessing incorrectly. I've reviewed PRAW's docs and done my due diligence, but I can't find the right way to access them. Let me show you.
Here is my working "grab by subreddit" code:

    for subreddit in _PRAW_SUBREDDITS:
        for submission in reddit.subreddit(subreddit).new(limit=_PRAW_LIMIT):
            cursor.execute(
                """INSERT INTO reddit (
                       name, created_utc, author, link_flair_text, num_comments,
                       score, subreddit, permalink, title, selftext)
                   VALUES(?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
                   ON CONFLICT(name) DO UPDATE SET
                       num_comments=excluded.num_comments,
                       score=excluded.score,
                       selftext=excluded.selftext
                """,
                (
                    submission.name,
                    int(submission.created_utc),
                    str(submission.author),
                    submission.link_flair_text,
                    submission.num_comments,
                    submission.score,
                    str(submission.subreddit),
                    submission.permalink,
                    submission.title,
                    submission.selftext,
                ),
            )

And here is a new version of the same code that I'm trying to create a new function with that will grab "by user":

    for submission in reddit.redditor(_PRAW_REDDITOR).submissions.new(limit=1):
        print(
            f"{submission.name=}"
            f"{submission.created_utc=}"
            f"{submission.author=}"
            f"{submission.link_flair_text=}"
            f"{submission.num_comments=}"
            f"{submission.score=}"
            f"{submission.subreddit=}"
            f"{submission.permalink=}"
            f"{submission.title=}"
            f"{submission.selftext=}"
        )

Most of my results come out normal, except the reference to the user and the subreddit.

    # This is the output to console:
    submission.author=Redditor(name='JoeBloeUsername')
    submission.link_flair_text=None
    submission.num_comments=10
    submission.score=137
    submission.subreddit=Subreddit(display_name='u_JoeBlowUsername')

This is how I'm creating the reddit instance at the start of my code:

    # Create Reddit instance in PRAW
    reddit = praw.Reddit(
        client_id="[REDACTED]",
        client_secret="[REDACTED]",
        user_agent="Windows 10:randoapp:0.00002",
    )

Clearly, I'm having a hard time wrapping my head around how to utilize the different Submission, Redditor, Comment instances here. I thought I had a grasp on them, but then it falls apart when I try to use them, in my example.
If anyone has enough experience with PRAW to enlighten me, you'd have my gratitude.
1 Answer
submission.author is a Redditor object that has many attributes.
If you want to store the Redditor's name, use submission.author.name.
submission.subreddit is very similar:
you can use submission.subreddit.display_name to get the subreddit name. (A name starting with u_ means the submission was posted to a user profile, not a subreddit.)
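As a minimal sketch of the two attribute lookups above, the helper below flattens a submission into the tuple of plain values that the question's INSERT statement expects. The `=` specifier in the question's f-strings prints the repr of each value, which is why the objects appear as `Redditor(name=...)` and `Subreddit(display_name=...)` rather than as plain names. The function name `submission_to_row` is hypothetical, the `SimpleNamespace` objects only stand in for PRAW objects so the snippet runs without network access, and the `None` check assumes that a submission from a deleted account reports `author` as `None`.

```python
from types import SimpleNamespace


def submission_to_row(submission):
    """Flatten a PRAW-style submission into plain values for a DB insert."""
    # author is a Redditor object; .name is the plain username string.
    # Assumption: author can be None (e.g. a deleted account), so guard it.
    author = submission.author.name if submission.author is not None else None
    # subreddit is a Subreddit object; .display_name is the plain name.
    subreddit = submission.subreddit.display_name
    return (
        submission.name,
        int(submission.created_utc),
        author,
        submission.link_flair_text,
        submission.num_comments,
        submission.score,
        subreddit,
        submission.permalink,
        submission.title,
        submission.selftext,
    )


# Stand-in object with only the attributes used above (not a real PRAW object).
fake = SimpleNamespace(
    name="t3_abc123",
    created_utc=1650000000.0,
    author=SimpleNamespace(name="JoeBloeUsername"),
    link_flair_text=None,
    num_comments=10,
    score=137,
    subreddit=SimpleNamespace(display_name="u_JoeBloeUsername"),
    permalink="/r/u_JoeBloeUsername/comments/abc123/example/",
    title="Example title",
    selftext="",
)
row = submission_to_row(fake)
print(row[2], row[6])
```

The returned tuple has the same column order as the question's INSERT, so it could be passed as the parameter tuple to `cursor.execute`.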
The PRAW documentation is great and covers all of these attributes with examples. I highly recommend you read it.
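For the "by user per subreddit" wish in the question, one possible approach (an assumption on my part, not something confirmed above) is to iterate over the user's submissions and filter client-side on `display_name`. The sketch below uses a hypothetical helper, `submissions_in_subreddit`, and stand-in objects with made-up data in place of real PRAW submissions.

```python
from types import SimpleNamespace


def submissions_in_subreddit(submissions, subreddit_name):
    """Yield only the submissions posted to the named subreddit."""
    wanted = subreddit_name.lower()
    for submission in submissions:
        # display_name is the plain subreddit name; compare case-insensitively.
        if submission.subreddit.display_name.lower() == wanted:
            yield submission


# Stand-ins for PRAW submissions (hypothetical data, for illustration only).
posts = [
    SimpleNamespace(title="a", subreddit=SimpleNamespace(display_name="Python")),
    SimpleNamespace(title="b", subreddit=SimpleNamespace(display_name="u_JoeBloeUsername")),
    SimpleNamespace(title="c", subreddit=SimpleNamespace(display_name="python")),
]
titles = [p.title for p in submissions_in_subreddit(posts, "python")]
print(titles)
```

With a real client, the first argument would be the listing from the question, `reddit.redditor(_PRAW_REDDITOR).submissions.new(limit=...)`. Because the limit applies before filtering, the filtered result can contain fewer items than the limit.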