
[–]d3rr 4 insightful - 1 fun (12 children)

I think it's doable. Their API only goes back 1000 posts, but it could be screen-scraped.
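For context, Reddit-style listings page with an `after` cursor and stop handing out cursors around the 1000-post mark. A minimal sketch of that paging loop, with the HTTP call injected as a plain callable so the shape is clear (the URL format follows Reddit's public `.json` listings; everything else here is illustrative):

```python
def paged_listing(fetch, subreddit, limit=100):
    """Walk a Reddit-style listing with an `after` cursor.

    `fetch(url)` is a stand-in for a real HTTP call and must return a
    dict shaped like Reddit's listing JSON. Reddit stops returning an
    `after` cursor around 1000 posts, which is the cap mentioned above.
    """
    posts, after = [], None
    while True:
        url = f"https://www.reddit.com/r/{subreddit}/new.json?limit={limit}"
        if after:
            url += f"&after={after}"
        data = fetch(url)["data"]
        posts.extend(child["data"] for child in data["children"])
        after = data.get("after")
        if not after:  # no cursor means we've hit the end of the listing
            break
    return posts
```

Because `fetch` is injected, the loop can be exercised against canned pages before pointing it at the live site.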

[–]magnora7 4 insightful - 2 fun (5 children)

I've always wanted it so the top 3 posts on /r/bad_cop_no_donut automatically get posted to /s/policemisconduct as new posts. I think that sort of feature could be useful all over saidit to help fill out some of the newer subs.

[–]d3rr 4 insightful - 1 fun (1 child)

Oh, that's an interesting idea.

[–]renlok 2 insightful - 1 fun (2 children)

If saidit had an API, it would be pretty easy to set something up to do this automatically, although it would look really barren if you had loads of posts but no comments on anything.

[–]magnora7 3 insightful - 1 fun (1 child)

Good points. That's why I would only want to transplant a few a day; that way we'd have the opportunity to comment on them and it wouldn't just be a firehose.

[–][deleted] 1 insightful - 1 fun (0 children)

Maybe copy the front page once a day?

[–][deleted] 4 insightful - 1 fun (5 children)

We at /r/internetcollection maintain a stickied list of links to previous posts, so that isn't a problem. The important bit is the text in the posts, which contains a short description, archive and source links, and categorisation-related info.

[–]d3rr 3 insightful - 1 fun (4 children)

Yeah, wow, you guys are seriously organized.

[–][deleted] 4 insightful - 1 fun (3 children)

/u/snallygaster deserves the credit; I just became an approved submitter fairly recently. He's the one who maintains the list and posted most of the linked stuff.

Anyway, I'm thinking a Python script would be sufficient. The problem is that it's nearly 300 posts; I need a method that won't use up my bandwidth downloading them, i.e. I need to get familiar with web scraping and the Reddit API.
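As a starting point, Reddit serves any listing as JSON if you append `.json` to the URL, so the post titles and self-texts can be pulled with nothing but the standard library. A hedged sketch under that assumption (the user agent string and function names are my own placeholders, not an official client):

```python
import json
import urllib.request

def listing_url(subreddit, limit=100, after=None):
    # Reddit exposes any listing as JSON by appending `.json`.
    url = f"https://www.reddit.com/r/{subreddit}/new.json?limit={limit}"
    return url + (f"&after={after}" if after else "")

def extract_posts(payload):
    """Pull title + selftext out of one page of listing JSON."""
    return [
        {"title": c["data"]["title"], "selftext": c["data"].get("selftext", "")}
        for c in payload["data"]["children"]
    ]

def fetch_page(subreddit, after=None):
    # A descriptive User-Agent matters: Reddit throttles the default one.
    req = urllib.request.Request(
        listing_url(subreddit, after=after),
        headers={"User-Agent": "archive-script/0.1 (personal use)"},
    )
    with urllib.request.urlopen(req) as resp:
        return extract_posts(json.load(resp))
```

At ~300 posts that's only a handful of pages at 100 posts per request, so bandwidth shouldn't be the bottleneck.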

[–]d3rr 2 insightful - 1 fun (2 children)

In Python world I recommend Beautiful Soup for scraping, and I'd put a delay in there or they'll block your IP. Sounds like a fun project. I'd help, but I'm overwhelmed with this site already.
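A minimal Beautiful Soup sketch of the extraction step, parsing an inline snippet so it runs offline (the class names here are made up for illustration, not Reddit's real markup):

```python
from bs4 import BeautifulSoup

# Stand-in for one fetched listing page; real Reddit markup differs.
html = """
<div class="thing"><a class="title" href="/r/x/1">First post</a></div>
<div class="thing"><a class="title" href="/r/x/2">Second post</a></div>
"""

soup = BeautifulSoup(html, "html.parser")
# Pull (title, link) pairs out of each post container via a CSS selector.
posts = [(a.get_text(), a["href"]) for a in soup.select("a.title")]
print(posts)  # [('First post', '/r/x/1'), ('Second post', '/r/x/2')]
```

The same `select`-then-extract pattern applies once the selectors are swapped for whatever the live page actually uses.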

[–][deleted] 2 insightful - 1 fun (1 child)

I'm a bit busy myself; I'll post it here once I'm done with it. It can probably be generalised into a Reddit archival tool. Do you know what the delay should be?

[–]d3rr 1 insightful - 1 fun (0 children)

If you're scraping at your leisure, I'd put it high, like a random 30 seconds to 2 minutes between requests.

Yeah man, throw her up on GitHub; it could prove very useful to a lot of people.
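The throttling advice above can be sketched as a tiny wrapper that sleeps a random 30-120 seconds between calls (all names here are illustrative; `sleep` is injectable so the wrapper can be exercised without actually waiting):

```python
import random
import time

def polite(fetch, min_delay=30.0, max_delay=120.0, sleep=time.sleep):
    """Wrap a fetch function so every call after the first waits a
    random 30s-2min, per the suggestion above."""
    first = True

    def wrapped(url):
        nonlocal first
        if not first:
            # Randomized gaps look less bot-like than a fixed interval.
            sleep(random.uniform(min_delay, max_delay))
        first = False
        return fetch(url)

    return wrapped
```

Usage would be `fetch = polite(fetch_page_fn)` and then calling `fetch` in a loop as normal; the delay bookkeeping stays out of the scraping logic.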