• Welcome to SC4 Devotion Forum Archives.
 


Backing up threads/topics

Started by JoeST, August 06, 2008, 09:58:43 AM

JoeST

I have a script that downloads each page of a thread (from ST at the moment) and saves the HTML in a txt file.
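
For anyone curious, a page downloader like that could be sketched in Python roughly as follows. This is not the actual script. The `page` query parameter, the `page_N.txt` file names and all the function names are assumptions made up for illustration, and the real ST pagination may work differently.

```python
from pathlib import Path
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
from urllib.request import urlopen


def page_url(first_page_url, page, param="page"):
    """Build the URL of page `page`. The `page` parameter name is an assumption."""
    parts = urlsplit(first_page_url)
    query = dict(parse_qsl(parts.query))
    query[param] = str(page)
    return urlunsplit(parts._replace(query=urlencode(query)))


def fetch_page(url):
    """Download one page as a guest and return its HTML as text."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")


def backup_thread(first_page_url, num_pages, out_dir, fetch=fetch_page):
    """Save every page of a thread as out_dir/page_N.txt and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for n in range(1, num_pages + 1):
        html = fetch(page_url(first_page_url, n))
        path = out / f"page_{n}.txt"
        path.write_text(html, encoding="utf-8")
        saved.append(path)
    return saved
```

The first-page link and the page count are exactly the two things requested below, which is why they are the only required arguments.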

I also have a script that will make a simple HTML table of all the posts in a downloaded file.
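
As a rough idea of the table step, here is a minimal Python sketch. It assumes the posts have already been pulled out of the downloaded file as (author, post HTML) pairs. The function name and the column layout are made up for illustration and are not the actual script.

```python
from html import escape


def posts_to_table(posts):
    """Turn a list of (author, post_html) pairs into one simple HTML table."""
    rows = ['<table border="1">', "<tr><th>#</th><th>Author</th><th>Post</th></tr>"]
    for number, (author, post_html) in enumerate(posts, start=1):
        # The author name is escaped; the post body is kept as HTML on purpose.
        rows.append(
            f"<tr><td>{number}</td><td>{escape(author)}</td><td>{post_html}</td></tr>"
        )
    rows.append("</table>")
    return "\n".join(rows)
```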

I am offering this service to anyone, for free.

Just post here with a link to the first page of the thread, the number of pages in said thread and (if applicable) permission from the author of the thread.

I have already done David's (dedgren) CJ, Three Rivers Region and the results will be appearing in the 3RR-ST board in the 3RR section of this site :)

Joe
Copperminds and Cuddleswarms

Haljackey

Ah, so that's what David is doing in those new ST 3RR threads!   ::)

Would you be able to do this for my CJ at Simtropolis?  (The Greater Terran Region).  Although I am now updating it here, the MD is missing dozens and dozens of updates that I just don't have time to bring over.  If you would be able to save it as an HTML file, that would be wonderful!

You can find the start of the thread here:
http://www.simtropolis.com/forum/messageview.cfm?catid=36&threadid=95625&enterthread=y
Since I am the author, I grant you permission  :P




I think another good one to do would be the RHW thread at ST.  However, the author (qurlix) has not been active for some time so I don't know how you can get authorization.  The link is here:
http://www.simtropolis.com/forum/messageview.cfm?catid=124&threadid=67624&enterthread=y

Good luck! 

Best,
-Haljackey

Edit:  You're right, I did forget the number of pages.  It is currently 31 pages long.

JoeST

You forgot a vital part... the number of pages :D, but as it's my first I will forgive you

I will leave the RHW thread until there is some "higher" authentication...but your CJ is coming right as I type this :)

... Done, in less than 2 mins :)

now I just need to get the parser to write to files :)

Joe

Edit: WOOO my parser works too :)
Copperminds and Cuddleswarms

JoeST

Hey Hal, thanks for being my test subject  ::)

I got you that backup you wanted, don't know how to get it to you tho

and as a bonus for being my first customer (and a request by CasperVg ;)) I got your Show Us Your Interchanges thread as well :) Downloaded and  parsed in under 2 mins :o

Joe
Copperminds and Cuddleswarms

Haljackey

Quote from: JoeST on August 06, 2008, 02:57:32 PM

and as a bonus for being my first customer (and a request by CasperVg ;)) I got your Show Us Your Interchanges thread as well :)

:angrymore: What?  Without my permission?  :bomb: I am the author of that thread!

Nah, thanks!  :P  Glad to see that you are having luck with it. 

You don't know how to get it to me???  Just take your time with it.   ;)

Great work!   :thumbsup:

Best,
-Haljackey

JoeST

while i am at it, do you want your Multi-RHW Guide? :D

and yeah i got yours, but only for personal use...  :P

so, like email you it?

* JoeST waits for a customer  :)

Joe
Copperminds and Cuddleswarms

Jonathan

* A customer arrives *

You probably never heard of backing up these threads before this post ;D

You know you could backup the Discovery threads in the Modding - R&D board.

All are 2 pages long except the RUL and SC4Paths threads, which have 6.

Jonathan

JoeST

Hey Jon

guess what.... I was just looking at those threads :D

I will get you them asap.

Joe
Copperminds and Cuddleswarms

deathtopumpkins

I know this thread's kinda dead...

but another customer is here if you're still open. There are several private threads on Simtropolis that I would like to have an html copy of, if you could. I can give you links, pages, etc., but you'll need to be added to the member list first, which I can't do, as I didn't start the threads, though I have permission to back them up.

Tell me if you're willing to do it and I'll give you links.

Thanks,
DTP
NAM Team Member | 3RR Collaborator | Virgin Shores

JoeST

yeah I will back them up, I just have to find the script and get it up and running, might be a few days before i am ready :)

joe
Copperminds and Cuddleswarms

deathtopumpkins

OK, thanks! That's fine. I'll have you added to the member lists and PM you the links then.

EDIT: Joe, what's your username on ST? I can't seem to find you...
NAM Team Member | 3RR Collaborator | Virgin Shores

JoeST

Copperminds and Cuddleswarms

deathtopumpkins

NAM Team Member | 3RR Collaborator | Virgin Shores

JoeST

oh, just realised, I won't be able to auto back up... my script acts as a guest, so it can't see private threads...

if you want the parsing thing, then I would suggest saving the pages individually as full HTML and I can then run them through the parsing script as a batch ;)
Copperminds and Cuddleswarms

JoeST

Right, I am back and ready for business.

I have a bash script that downloads each page of a topic from ST, and a py script that splits them (roughly) into posts. I can skim for a particular poster's contributions too. It currently backs up the whole thread as pages (to `./pages/page_#`), can then split each page into posts (`./posts/post_#-$AUTH`, saved as HTML, not just the post content), and can filter out only those by a specific author.
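
To give a rough idea of the split-and-filter step, here is a minimal Python sketch. It is not the actual py script. It assumes every post is wrapped in a made-up `<div class="post" data-author="NAME">` block with no nested `<div>` inside, which real ST markup will not match, and the function names are hypothetical too. Only the `./pages/page_#` and `./posts/post_#-$AUTH` naming comes from the description above.

```python
import re
from pathlib import Path

# ASSUMPTION: hypothetical post markup, with no nested <div> inside a post.
POST_RE = re.compile(r'<div class="post" data-author="([^"]*)">.*?</div>', re.DOTALL)


def split_posts(page_html):
    """Return (author, post_html) pairs in the order they appear on the page."""
    return [(m.group(1), m.group(0)) for m in POST_RE.finditer(page_html)]


def page_number(path):
    """Pull the number out of a name like page_12 so pages sort numerically."""
    return int(re.search(r"\d+", path.name).group())


def write_posts(pages_dir, posts_dir, only_author=None):
    """Split saved pages into posts_dir/post_N-AUTHOR files and return the paths."""
    out = Path(posts_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    number = 0
    for page in sorted(Path(pages_dir).glob("page_[0-9]*"), key=page_number):
        for author, post_html in split_posts(page.read_text(encoding="utf-8")):
            number += 1  # posts are numbered across the whole thread
            if only_author is not None and author != only_author:
                continue
            safe_author = re.sub(r"[^\w.-]", "_", author)  # keep file names safe
            path = out / f"post_{number}-{safe_author}"
            path.write_text(post_html, encoding="utf-8")
            written.append(path)
    return written
```

Calling `write_posts("pages", "posts", only_author="JoeST")` would keep only one poster's contributions, while leaving `only_author` out writes every post. Post numbers count every post in the thread, so a filtered run keeps the original numbering.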

Any particular threads anyone wants?

Joe
Copperminds and Cuddleswarms