English Amiga Board


Old 19 November 2014, 22:52   #1
Anubis
Maj. Voodoo

Join Date: Jan 2005
Location: #DrainTwitterSpam
Age: 45
Posts: 2,284
How to leech from Archive.org

If memory serves, this place has some keen leechers who would leech everything possible on the net... Here are the short instructions I wrote on L64 on how to leech mags from Archive.org. With small adjustments, the command can be changed to leech anything you want from the Archive.



Here are the instructions for downloading Zzap!64 (or any other mag).

If you use Windows, you need to install Cygwin.

https://cygwin.com/install.html

When you install it, pick something simple for the root directory (mine is set to c:\cygwin64), set where the installer will download its files (on my computer, a cygwin64 folder inside the download folder), select any mirror from the mirror list and make sure to select wget (under the Web category) by clicking the circle next to the word 'Skip'.

Once this is installed, you should have a shortcut that starts a command-prompt-like shell that accepts *NIX commands (only those you selected while installing).

Now go to Archive.org, select the Zzap!64 archive and click All items (most recently added first). In this window, next to Search, you should see something like 'collection:zzap64-magazine'; click advanced search.

On the advanced search page, somewhere in the middle, there will be a section 'Advanced Search returning JSON, XML, and more' with identifier already selected. Leave identifier selected, click CSV format and change the number of results from the default 50 to anything you want that is greater than the total number of items in the collection (106 in this case). I set mine to 500 (just add a 0 - lazy me).
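
The form on that page boils down to a single URL, so the same CSV can probably be requested directly. Here is a minimal sketch that builds that URL. The helper name search_url is made up for illustration; the URL layout is the one that appears in the bash one-liner later in this thread, and the default of 500 rows just mirrors the value chosen above.

```shell
# search_url is a hypothetical helper, not part of any tool.
# Usage: search_url COLLECTION [ROWS]
search_url() {
    collection=$1
    rows=${2:-500}   # should be larger than the number of items in the collection
    # %3A is a URL-encoded ':' and fl%5B%5D is a URL-encoded 'fl[]'
    printf 'https://archive.org/advancedsearch.php?q=collection%%3A%s&fl%%5B%%5D=identifier&rows=%s&output=csv\n' \
        "$collection" "$rows"
}

# Print the search URL for the Zzap!64 collection
search_url zzap64-magazine 500
```

Pasting the printed URL into a browser should bring up the same save dialog for search.csv.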

You will get a note saying how many results you queried for (500 in my case) and then a save dialog for search.csv. Create a 'download' folder in the Cygwin folder on the C drive and save the file there. Open the file in a text editor, remove the first line, which contains identifier, and remove every " by selecting one and replacing it with an empty string. Save the file and rename it to zzap.txt.
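
If you would rather not edit the file by hand, the same cleanup can be sketched in the Cygwin terminal. The function name csv_to_list is made up for illustration, and the sketch assumes search.csv has one quoted identifier per line below a single header line, as described above.

```shell
# csv_to_list is a hypothetical helper.
# Usage: csv_to_list INPUT.csv OUTPUT.txt
csv_to_list() {
    # tail -n +2 drops the first line (the "identifier" header);
    # tr -d removes every double quote, plus any carriage returns
    # in case the file was saved with Windows line endings.
    tail -n +2 "$1" | tr -d '"\r' > "$2"
}

# Example, run from the download folder:
# csv_to_list search.csv zzap.txt
```

The result should be a plain list with one identifier per line, which is what the wget command expects in zzap.txt.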

Now start the Cygwin terminal, cd to the download folder and type the following command:

wget -r -H -nc -np -nd -P 'Zzap64' -A .pdf -e robots=off -l1 -i ./zzap.txt -B 'http://archive.org/download/'

Get a coffee or a beer and watch the program download every Zzap!64 issue on Archive.org, placing them in the Zzap64 folder inside the download folder.

The same command can be used to fetch any other collection (or file format) by making the appropriate changes to it.
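
To make those "appropriate changes" easier to see, here is a rough sketch that spells out what each wget option does and prints the command for a given list file, target folder and file extension. The function name leech_cmd is made up for illustration; the options are the ones from the command above, and the sketch assumes file and folder names without quote characters.

```shell
# leech_cmd is a hypothetical helper that only PRINTS the wget command,
# so it can be checked before anything is downloaded.
# Usage: leech_cmd LISTFILE TARGETDIR EXTENSION
leech_cmd() {
    list=$1; dest=$2; ext=$3
    # -r              recursive retrieval
    # -l1             limit the recursion depth to one level
    # -H              allow following links onto other hosts
    # -nc             no-clobber: skip files that already exist locally
    # -np             no-parent: never climb to the parent directory
    # -nd             no directories: save all files flat in one folder
    # -P DIR          folder the files are saved into
    # -A .EXT         accept only files with this suffix
    # -e robots=off   ignore robots.txt
    # -i FILE         read the entries to fetch from FILE
    # -B URL          base URL used for relative entries in FILE
    printf "wget -r -H -nc -np -nd -P '%s' -A .%s -e robots=off -l1 -i '%s' -B 'http://archive.org/download/'\n" \
        "$dest" "$ext" "$list"
}

# Print the command for the Zzap!64 PDFs
leech_cmd ./zzap.txt Zzap64 pdf

# To actually start the download, pipe the printed line to sh:
# leech_cmd ./zzap.txt Zzap64 pdf | sh
```

For another collection, only the list file, the target folder and the extension need to change.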

It might sound like a lot, but it is much less work than downloading every single mag individually, and you can learn a thing or two about *NIX if you don't already know much about it...


Edit - you can open the list in Excel and sort it if you would like to download in that order...
Old 19 November 2014, 22:57   #2
prowler
Global Moderator

Join Date: Aug 2008
Location: Sidcup, England
Posts: 10,278
Thanks for the great tip, Anubis!

You now have a sticky thread.
Old 19 November 2014, 23:05   #3
Anubis
Maj. Voodoo

Join Date: Jan 2005
Location: #DrainTwitterSpam
Age: 45
Posts: 2,284
Quote:
Originally Posted by prowler View Post
Thanks for the great tip, Anubis!

You now have a sticky thread.
Thank you!

And I was wondering if this might be against some rules I don't know about...

If anyone needs help or screenshots, I can provide them. It is easy and lets you get all the mags that are available on archive.org.
Old 20 November 2014, 13:13   #4
Hewitson
Registered User
Join Date: Feb 2007
Location: Melbourne, Australia
Age: 35
Posts: 2,235
Why install cygwin? A Windows binary of wget has existed for years.
Old 20 November 2014, 14:31   #5
Anubis
Maj. Voodoo

Join Date: Jan 2005
Location: #DrainTwitterSpam
Age: 45
Posts: 2,284
I use it for more than this, which is why I took this approach. I have never used the Windows wget binary. (And Cygwin is easy to keep updated.)
Old 21 November 2014, 00:16   #6
jbl007
Registered User
 
Join Date: Mar 2013
Location: Leipzig/Germany
Posts: 355
A one-liner for bash with no need to save the .csv:
Code:
for i in  $(curl 'https://archive.org/advancedsearch.php?q=collection%3Azzap64-magazine&fl%5B%5D=identifier&rows=500&output=csv'); do wget -r -H -nc -np -nd -P 'Zzap64' -A .pdf -e robots=off -l1  http://archive.org/download/${i:1:-1}; done
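
Two details of the one-liner may be worth knowing: the first line of the CSV is the header "identifier", so the loop also tries to fetch an item with that name, and the ${i:1:-1} expansion with a negative length needs, as far as I know, bash 4.2 or newer. A possible variant that avoids both is sketched below. The function name ids_to_urls is made up for illustration, and the sketch assumes the CSV holds one quoted identifier per line.

```shell
# ids_to_urls is a hypothetical filter: CSV on stdin, download URLs on stdout.
ids_to_urls() {
    cr=$(printf '\r')
    # "|| [ -n "$line" ]" also handles a last line without a trailing newline
    while IFS= read -r line || [ -n "$line" ]; do
        line=${line%"$cr"}   # drop a trailing carriage return, if any
        id=${line#\"}        # strip the leading double quote
        id=${id%\"}          # strip the trailing double quote
        case $id in
            ''|identifier) continue ;;   # skip empty lines and the CSV header
        esac
        printf 'http://archive.org/download/%s\n' "$id"
    done
}

# Example (needs network access); "-i -" makes wget read the URLs from stdin:
# curl -s 'https://archive.org/advancedsearch.php?q=collection%3Azzap64-magazine&fl%5B%5D=identifier&rows=500&output=csv' \
#   | ids_to_urls \
#   | wget -r -H -nc -np -nd -P 'Zzap64' -A .pdf -e robots=off -l1 -i -
```

The prefix and suffix removal used here is plain POSIX shell, so it should not depend on the bash version.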
Old 12 January 2015, 05:59   #7
rootboy
 
Posts: n/a
Quote:
Originally Posted by jbl007 View Post
A one-liner for bash with no need to save the .csv:
Code:
for i in  $(curl 'https://archive.org/advancedsearch.php?q=collection%3Azzap64-magazine&fl%5B%5D=identifier&rows=500&output=csv'); do wget -r -H -nc -np -nd -P 'Zzap64' -A .pdf -e robots=off -l1  http://archive.org/download/${i:1:-1}; done
Works just fine on Mint. Thanks!
 