[–]GetNifty

How were you able to fetch usernames from comments? I'm the person that made the archive btw, so I would be interested to know so I can do it for other subs.

[–]AFutureConcern[S]

If you're on Unix:

cd DebateAltRight
grep -roh 'user/[0-9A-Za-z_-]*\.html' . | sed 's:^user/::;s:\.html$::' | sort | uniq -c | sort -nr

[–]GetNifty

Thanks. Can you tell me exactly which command extracts comments? I could only fetch usernames from people who posted threads.

[–]AFutureConcern[S]

This:

grep -roh 'user/[0-9A-Za-z_-]*\.html' .

It just does a text search for links to user pages in every file. Some of those links don't work (if the user has not posted a thread), but a link is still attached to each comment. The rest of the command strips each link down to the bare username, counts the number of links to each user, and sorts by that count, highest first.
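If it helps for other subs, here is a rough sketch of the same pipeline wrapped in a function, one stage per line. The function name count_user_links and its directory argument are made up for illustration, and it assumes the other archives use the same user/NAME.html link layout as this one.

```shell
# Hypothetical helper (not part of the archive): count_user_links DIR
# Prints "<count> <username>" lines, most-linked user first.
count_user_links() {
    dir=${1:-.}    # assumption: default to the current directory

    # -r: recurse into subdirectories
    # -o: print only the matched text, not the whole line
    # -h: leave out the file name each match came from
    grep -roh 'user/[0-9A-Za-z_-]*\.html' "$dir" |
        # strip the "user/" prefix and the ".html" suffix, leaving the name
        sed 's:^user/::; s:\.html$::' |
        # put identical names next to each other so uniq can count them
        sort |
        # prefix each distinct name with its number of occurrences
        uniq -c |
        # numeric sort on that count, largest first
        sort -nr
}
```

Running count_user_links DebateAltRight from the directory that contains the archive should give the same listing as the one-liner. The match is on the link text only, so it picks up users whether the link sits in a thread file or a comment file.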

[–]GetNifty

Ah I get it, so you basically search in every file of the archive for a /user/user.html mention (even inside comment files), then fetch it, right?

[–]AFutureConcern[S]

Exactly.