The Maben / Amaus HOLOCAUST
This is the log of my effort to mirror the infamous S100 archive, called MABEN / Amaus.org.
I have most of it on my disk; it was mirrored some time ago. Now something has been added and I want to download only what's missing.
No rsync is available, and wget probes every fucking single file, which is very annoying.
So:
File listing
I download allfiles.txt:
{{{
wget amaus.org/static/S100/allfiles.txt
}}}
Check for existing files
Get rid of the first and last lines; they are just descriptions:
{{{
asbesto@rover:~$ head -1 allfiles.txt
All files from S100 directory listed here at Thu Oct 11 16:00:01 BST 2018
asbesto@rover:~$ tail -1 allfiles.txt
All files listing completed at Thu Oct 11 16:13:18 BST 2018
asbesto@rover:~$
}}}
{{{
sed '$d' < allfiles.txt | sed "1d" > allfiles2.txt
}}}
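To double-check (a quick sketch): allfiles2.txt should be exactly two lines shorter than allfiles.txt and start with a real entry:
{{{
wc -l allfiles.txt allfiles2.txt
head -1 allfiles2.txt
}}}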
Filter and shape the file
Format of the file is:
{{{
-rwxrwxrwx 1 root root 2346222 May 1 2009 /data/static/S100/avo/Avometer Model 8 Mk II Working Instructions.pdf
}}}
We want something like
{{{
wget -options /data/static/S100/avo/Avometer Model 8 Mk II Working Instructions.pdf
}}}
only for FILES, no directories! They will be created later.
Get rid of directories:
{{{
grep -v drwx allfiles2.txt > allfiles3.txt
}}}
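A stricter variant (a sketch): match the d flag only at the start of the line, so a filename that happens to contain "drwx" doesn't get thrown away too:
{{{
grep -v '^d' allfiles2.txt > allfiles3.txt
}}}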
Cut out the first 8 columns to leave only the filename:
{{{
awk '{$1=$2=$3=$4=$5=$6=$7=$8=""; print}' allfiles3.txt > allfiles4.txt
}}}
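Note that blanking the fields leaves 8 leading spaces on every line (one per emptied column); they get cleaned up below. One caveat: awk rejoins the fields with single spaces, so a run of several spaces inside a filename would collapse into one. If you'd rather strip the leading spaces right away (a sketch; it changes the editor step below and carries the same caveat):
{{{
awk '{$1=$2=$3=$4=$5=$6=$7=$8=""; sub(/^ +/, ""); print}' allfiles3.txt > allfiles4.txt
}}}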
Now shape the filenames etc.
Add quotes around every line:
{{{
cp allfiles4.txt allfiles5.txt
sed -i 's/.*/"&"/' allfiles5.txt
}}}
and use an editor to transform the leading " (8 spaces) /data/ into "./ in every line. The 8 spaces are left over from the blanked awk fields, and "data" MUST GO (the mirror's paths don't contain it).
Now we have allfiles5.txt with lines like these:
{{{
"./static/S100/extensys/photos/Extensys RM64 64K RAM.txt"
"./static/S100/microdesign/photos/Microdesign MR 8 RAM PROM card.jpg"
"./static/S100/kontron/systems/z80a-ecb-e1_kontron_ger_bwr.pdf"
}}}
CHECK FOR DUPES
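A quick way to spot them (a sketch):
{{{
sort allfiles5.txt | uniq -d
}}}
No output means no dupes.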
'''NOTE: some files contain $. So find them and replace, in joe, "$" with "\\\$" to escape them!''' To substitute $ in vi, use:
{{{
press : and type %s/\$/\\$/g
}}}
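A non-interactive equivalent of that vi substitution (a sketch; adjust the filename to whichever revision of the list you are escaping):
{{{
sed -i 's/\$/\\$/g' allfiles5.txt
}}}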
Now we have allfiles8.txt, so copy it into the correct dir and do:
{{{
cd /media/asbesto/BALAZZO/amaus.org
while read -r line; do ls "$line"; done < allfiles8.txt 1>outo 2>erro
}}}
outo will contain the existing files, erro the missing ones.
The wc line counts of outo and erro must sum to the line count of allfiles8.txt:
{{{
root@rover:/media/asbesto/BALAZZO/amaus.org# wc outo erro
  80156   278481  5208866 outo
 123829  1325067 14779570 erro
 203985  1603548 19988436 total
root@rover:/media/asbesto/BALAZZO/amaus.org# wc allfiles8.txt
 203985  612916 14165878 allfiles8.txt
root@rover:/media/asbesto/BALAZZO/amaus.org#
}}}
That's it!
Now we can use "erro" to download what we FUCKING NEED.
What the FUCK to use to download
So add the site's URL prefix to every path; you must end up with lines like:
{{{
'https://amaus.org/static/S100/zilog/z80/older/The Z80 microcomputer handbook cover.PDF'
'https://amaus.org/static/S100/zilog/z80/older/The Z80 microcomputer handbook William Barden.PDF'
'https://amaus.org/static/S100/zilog/z80/The Z80 microcomputer handbook William Barden.PDF'
'https://amaus.org/static/S100/zilog/z80/Z80 Assembly subroutines Leventhal.pdf'
}}}
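One way to build that list (a sketch: it assumes erro has first been reduced to the bare quoted "./..." paths, since ls wraps each missing name in an error message, and it names the result missingfiles1.txt as used below):
{{{
sed "s|^\"\./|'https://amaus.org/|; s|\"\$|'|" erro > missingfiles1.txt
}}}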
for all files, and you need to add wget -r -c -np in front of every fucking line, because this SHIT down here doesn't work!
{{{
while read -r line; do wget -r -c -np $line ; done < missingfiles1.txt
}}}
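For what it's worth, the loop most likely fails for two reasons: $line is unquoted, so the spaces in the filenames split every URL into several arguments, and the single quotes in missingfiles1.txt are literal characters, not shell quoting. A variant that accounts for both (a sketch, untested against the live site):
{{{
while read -r line; do
    line=${line#\'}; line=${line%\'}   # strip the literal single quotes
    wget -r -c -np "$line"             # double-quote so spaces survive
done < missingfiles1.txt
}}}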