[–][deleted] 4 insightful - 2 fun -  (15 children)

My problem is that if reddit goes under, we're going to lose /r/internetcollection, as well as lots of other neat communities.

[–]magnora7 5 insightful - 2 fun -  (14 children)

Perhaps eventually someone could develop an automated process to port those subreddits over to saidit subs

[–][deleted] 5 insightful - 2 fun -  (12 children)

I think it's doable. Their API only goes back 1000 posts, but it could be screen scraped.
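Roughly, something like this — walking the public .json listing with a polite delay. Untested sketch; the user agent string is made up, and the listing still stops around 1000 posts, so anything older would need actual page scraping:

    # Walk a subreddit's /new listing via the public .json endpoint.
    # Note: reddit caps listings at roughly 1000 items, so this only reaches recent posts.
    import time
    import requests

    HEADERS = {"User-Agent": "saidit-migration-sketch/0.1"}  # made-up UA; reddit wants *something* here

    def fetch_listing(subreddit, limit_per_page=100, delay=2):
        """Yield post dicts from /r/<subreddit>/new.json, following the 'after' cursor."""
        after = None
        while True:
            params = {"limit": limit_per_page}
            if after:
                params["after"] = after
            resp = requests.get(f"https://www.reddit.com/r/{subreddit}/new.json",
                                headers=HEADERS, params=params, timeout=30)
            resp.raise_for_status()
            data = resp.json()["data"]
            for child in data["children"]:
                yield child["data"]          # title, selftext, url, permalink, etc.
            after = data.get("after")
            if not after:                    # listing exhausted (or the ~1000 cap reached)
                break
            time.sleep(delay)                # be polite between pages

    if __name__ == "__main__":
        for post in fetch_listing("internetcollection"):
            print(post["title"])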

[–]magnora7 5 insightful - 3 fun -  (5 children)

I've always wanted the top 3 posts on /r/bad_cop_no_donut to automatically get posted to /s/policemisconduct as new posts. I think that sort of feature could be useful all over saidit to help fill out some of the newer subs.

[–][deleted] 5 insightful - 2 fun -  (1 child)

oh that's an interesting idea.

[–]renlok 3 insightful - 2 fun -  (2 children)

If saidit had an API, it would be pretty easy to set something up that would do this automatically, although it would just look really barren if you had loads of posts but no comments on anything.

[–]magnora7 4 insightful - 2 fun -  (1 child)

Good points. That's why I would only want to transplant a few a day; that way we'd have the opportunity to comment on them and it wouldn't just be a firehose.

[–][deleted] 2 insightful - 2 fun -  (0 children)

Maybe copy the front page once a day?
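Roughly, something like that could be run once a day from cron. Reading the top of the day is just the public .json endpoint, but submit_to_saidit() here is a made-up placeholder, since (as noted above) saidit doesn't expose a posting API yet:

    # Daily mirror sketch: grab the day's top 3 posts from a subreddit and hand
    # them to a placeholder submit function. submit_to_saidit() is hypothetical --
    # it would need filling in if/when saidit gets an API.
    import requests

    HEADERS = {"User-Agent": "saidit-mirror-sketch/0.1"}

    def top_posts_today(subreddit, count=3):
        resp = requests.get(f"https://www.reddit.com/r/{subreddit}/top.json",
                            headers=HEADERS, params={"t": "day", "limit": count}, timeout=30)
        resp.raise_for_status()
        return [child["data"] for child in resp.json()["data"]["children"]]

    def submit_to_saidit(sub, title, url_or_text):
        # Placeholder: just print, since there is no saidit submit API to call yet.
        print(f"Would post to /s/{sub}: {title} -> {url_or_text}")

    if __name__ == "__main__":
        for post in top_posts_today("Bad_Cop_No_Donut", count=3):
            submit_to_saidit("policemisconduct", post["title"],
                             post.get("url") or post.get("selftext", ""))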

[–][deleted] 5 insightful - 2 fun -  (5 children)

We at /r/internetcollection maintain a stickied list of links to previous posts, so that isn't a problem. The important bit is the text in the posts, which contains a short description, archive and source links, and categorisation-related info.

[–][deleted] 4 insightful - 2 fun -  (4 children)

yeah wow, you guys are seriously organized.

[–][deleted] 5 insightful - 2 fun -  (3 children)

/u/snallygaster deserves the credit; I only became an approved submitter fairly recently, and he's the one who maintains the list and posted most of the linked stuff.

Anyways, I'm thinking a Python script would be sufficient. The problem is that it's nearly 300 posts; I need a method that won't use up my bandwidth downloading it all, aka I need to get familiar with website scraping and the Reddit API.
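Roughly what I have in mind — assuming the permalinks come from the stickied index (or a listing walk), each post's .json is tiny, so ~300 of them is barely any bandwidth. Untested sketch; the field names are the standard ones in Reddit's post JSON:

    # Archive one post's text (description, archive/source links, categorisation
    # info) plus basic metadata to a JSON file on disk.
    import json
    import pathlib
    import time
    import requests

    HEADERS = {"User-Agent": "internetcollection-archive-sketch/0.1"}
    OUT_DIR = pathlib.Path("archive")

    def archive_post(permalink, delay=5):
        """Fetch one post's .json and write the interesting fields to disk."""
        url = f"https://www.reddit.com{permalink.rstrip('/')}.json"
        resp = requests.get(url, headers=HEADERS, timeout=30)
        resp.raise_for_status()
        post = resp.json()[0]["data"]["children"][0]["data"]
        record = {
            "title": post["title"],
            "author": post["author"],
            "created_utc": post["created_utc"],
            "selftext": post["selftext"],    # the description, links, and category info
            "permalink": permalink,
        }
        OUT_DIR.mkdir(exist_ok=True)
        (OUT_DIR / f"{post['id']}.json").write_text(json.dumps(record, indent=2))
        time.sleep(delay)                    # stay well clear of the rate limit

    # usage: for link in permalinks_from_the_stickied_index: archive_post(link)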

[–][deleted] 3 insightful - 2 fun -  (2 children)

In Python world I'd recommend Beautiful Soup for scraping, and I'd put a delay in there or they will block your IP. Sounds like a fun project. I'd help, but I'm overwhelmed with this site already.
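That said, for the Beautiful Soup route something like this might be a starting point — untested; the a.title selector is for old-style reddit listing pages and might need adjusting:

    # Pull post titles off old-style HTML listing pages with Beautiful Soup,
    # sleeping between requests so the IP doesn't get blocked.
    import time
    import requests
    from bs4 import BeautifulSoup

    HEADERS = {"User-Agent": "scrape-sketch/0.1"}

    def scrape_listing_pages(urls, delay=30):
        titles = []
        for url in urls:
            resp = requests.get(url, headers=HEADERS, timeout=30)
            resp.raise_for_status()
            soup = BeautifulSoup(resp.text, "html.parser")
            # old.reddit.com marks post title links with class "title"
            titles += [a.get_text(strip=True) for a in soup.select("a.title")]
            time.sleep(delay)                # delay between requests
        return titles

    print(scrape_listing_pages(["https://old.reddit.com/r/internetcollection/"]))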

[–][deleted] 3 insightful - 2 fun -  (1 child)

I'm a bit busy myself; I'll post it here once I'm done with it. Can probably be generalised to a reddit archival tool. Do you know what the delay should be?

[–][deleted] 2 insightful - 2 fun -  (0 children)

If you're scraping at your leisure, I'd put it high, like a random 30 seconds to 2 minutes between requests.
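In Python that's literally just (untested, but there's not much to get wrong):

    import random
    import time

    time.sleep(random.uniform(30, 120))   # random wait of 30 seconds to 2 minutes between requests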

Yeah man, throw her up on GitHub; it could prove very useful to a lot of people.

[–][deleted] 3 insightful - 2 fun -  (0 children)

Yeah, because in lieu of bookmarks, /u/snallygaster and I have been submitting links to /r/internetcollection, and I'd hate to see my old net history stuff disappear.