05 June 2020, 22:24 | #1 |
Zone Friend
Join Date: Apr 2005
Location: London
Posts: 1,180
|
Checking for undesirable contents
It sounds like you guys are going through a manual process of checking for unwanted files on the file server. How are you doing this, by file name?
How about you get a list of MD5 hashes of all the files you want to exclude, then run a script to unpack every archive on the server to a temp directory in turn, and check the MD5 of each of the archive's contents against the list of unwanted files. You could run it against all the current contents and also against any new upload as a routine. Just a suggestion guys. |
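A minimal sketch of the checking step described above, assuming each archive has already been unpacked to a temp directory and that the unwanted hashes live in a plain text file with one lowercase hex MD5 per line. The function name and the file layout are hypothetical, and `md5sum` from GNU coreutils is assumed to be installed.

```bash
# Hypothetical helper (not from the thread): print every file under DIR whose
# MD5 appears in BLOCKLIST. BLOCKLIST is assumed to hold one lowercase hex MD5
# per line. Requires md5sum (GNU coreutils).
check_against_blocklist() (
    dir=$1
    blocklist=$2
    # The inner sh receives the blocklist as $1 and the found files after it.
    find "$dir" -type f -exec sh -c '
        blocklist=$1
        shift
        for file in "$@"; do
            hash=$(md5sum < "$file" | cut -d" " -f1)
            # -x: whole-line match, -F: fixed string, -q: exit status only
            if grep -qxF "$hash" "$blocklist"; then
                printf "%s\n" "$file"
            fi
        done
    ' sh "$blocklist" {} +
)
```

Something like `check_against_blocklist /tmp/unpacked unwanted_md5.txt` (both paths are placeholders) would then list the offending files for one unpacked archive. Unpacking the various archive formats is a separate problem and is not covered here.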
06 June 2020, 10:28 | #2 |
Banned
Join Date: Aug 2005
Location: London / Sydney
Age: 47
Posts: 20,420
|
Heya rare_j,
I wish it would be that easy... Unfortunately it has to be checked manually as there are .ADF / .DMS / .IPF / LHAs / LZXs / CD ISO / ZIPs / RARs / other formats. Impossible to get MD5 hashes of all the files we want to exclude i.e. any KS ROMs / Workbench / still for sale software etc... |
06 June 2020, 15:01 | #3 |
Registered User
Join Date: Dec 2008
Location: Ursviken
Posts: 146
|
There's no easy way to make this a fully automatic process. You could simply limit which file formats are allowed, but it would still be possible to share a kickstart ROM file using base64 or another encoding inside a PDF file.
Good luck in checking the just-above-million files manually. |
06 June 2020, 19:01 | #4 |
Zone Friend
Join Date: Apr 2005
Location: London
Posts: 1,180
|
Well it's a complex problem but I still think it could be done using a combination of host utilities, and script driven emulation to examine contents of adf/dms etc.
There's also the complication of recursive archives of various formats. An iso containing a zip containing an adf containing an lzx of a kickstart. But it can be done. You're right - it's not easy, and could be easily defeated intentionally. But intentional distribution of kickstart files and other materials through the eab file server is not really the issue here. What I'm talking about is a system to check for kickstarts and other files which have been uploaded by accident. Then at least we could say we are taking reasonable steps not to distribute these materials and nobody can lightly accuse us of wilfully distributing this stuff. I'll do some research at least. |
06 June 2020, 21:38 | #5 |
Registered User
Join Date: Dec 2019
Location: Preston
Posts: 100
|
thinking around rare_j's suggestion, could you check against known MD5s for files you *can* keep, and use this information to rule out files that would otherwise require manual checking? that would reduce the size of the problem, maybe.
e.g. any MD5s of TOSEC files could be assumed to be okay. similarly IPF files should have an associated checksum. |
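A rough sketch of that triage, assuming two text files: a list of known-good MD5s (for example taken from TOSEC data, one lowercase hash per line) and a manifest of the server in `md5sum` output format (`hash  path`). The function name and file names are made up for illustration.

```bash
# Hypothetical helper: print the paths from MANIFEST whose hash is NOT in the
# KNOWN_GOOD list, i.e. the files that still need a manual look.
#   $1 = KNOWN_GOOD (one hash per line)
#   $2 = MANIFEST   (md5sum format: "<hash>  <path>")
needs_review() (
    awk 'FILENAME == ARGV[1] { ok[$1] = 1; next }      # remember good hashes
         !($1 in ok) { sub(/^[^ ]+ [ *]/, ""); print } # strip hash, keep path
        ' "$1" "$2"
)
```

The manifest itself could be produced once with something like `find . -type f -exec md5sum {} + > manifest.txt`, after which `needs_review known_good.txt manifest.txt` prints only the remainder. How much this shrinks the manual work depends entirely on how much of the server matches the known-good set.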
06 June 2020, 21:59 | #6 |
Banned
Join Date: Aug 2005
Location: London / Sydney
Age: 47
Posts: 20,420
|
|
07 June 2020, 11:59 | #7 | |
Registered User
|
Hello DamienD!
Quote:
My idea to find KS files: do a binary search of each extracted file for the string "exec.library" + NULL byte and the ROM tag.
BTW: To get all file extensions in use, sorted alphabetically, starting from the current dir, you can do in bash:
Code:
find . -type f -name '*.*' | sed 's|.*\.||' | sort -iu
Last edited by BastyCDGS; 07 June 2020 at 12:05. Reason: Added bash example for getting file extensions |
|
08 June 2020, 13:18 | #8 |
Banned
Join Date: Aug 2005
Location: London / Sydney
Age: 47
Posts: 20,420
|
That sounds great BastyCDGS, appreciate the offer of assistance
We are also looking for any Workbench disk; all versions. Finally we are looking for games / software that isn't allowed, but there isn't really any of this. |
08 June 2020, 19:15 | #9 | |
Registered User
|
Hi DamienD!
Quote:
BTW: Here is a bash one-liner which searches from the current directory for all files containing the ROM tag AND the NULL-terminated string 'exec.library'. It will work on all files that are uncompressed and not encrypted, i.e. it will work through most ISOs, uncompressed TARs and also LHA/LZX/ZIP/etc. archives which store data instead of using compression. It prints one line, with full path, for each file containing the magic ROM tag (0x1111) as well as the NULL-terminated 'exec.library' string.
Code:
find . -type f -exec $(which grep) -qP "\x11\x11" {} \; -exec $(which grep) -alP "exec\.library\00" {} \;
NOTE: This might also flag expansion ROMs and AROS, so always check first before deleting! For automatically parsing compressed stuff, I suggest you send me a list of all file extensions used, so I can determine what decrunchers are needed and implement it. You can use the bash one-liner in my previous post for it! |
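A stricter variant for standalone ROM files might look like the sketch below. It rests on an assumption that is not confirmed in this thread: that a raw Kickstart image carries the 0x1111 marker in its first two bytes, so the check looks only at offset 0 instead of anywhere in the file. It also drops the trailing NULL from the 'exec.library' match, because a NULL byte cannot be passed as a fixed-string argument. All function names are hypothetical.

```bash
# Succeeds if the first two bytes of the file are 0x11 0x11.
# Assumption: a raw ROM image starts with this marker.
starts_with_rom_marker() (
    # od prints the first two bytes as hex; tr strips the surrounding blanks.
    magic=$(od -An -tx1 -N2 "$1" | tr -d ' \n')
    [ "$magic" = "1111" ]
)

# Succeeds if the file contains the bytes "exec.library" anywhere.
mentions_exec_library() (
    # -a: treat binary as text, -F: fixed string, -q: exit status only
    grep -aqF 'exec.library' "$1"
)

# Both conditions together: a candidate that deserves a manual look.
likely_raw_kickstart() {
    starts_with_rom_marker "$1" && mentions_exec_library "$1"
}
```

This only helps for ROM files stored on their own. A ROM sitting inside an ISO or an uncompressed archive does not start at offset 0 of the container, so the one-liner above is still needed for those, and every hit should still be checked by hand before anything is deleted.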
|
12 June 2020, 21:20 | #10 |
Registered User
Join Date: Dec 2008
Location: Ursviken
Posts: 146
|
Any progress? Will the EAB FTP ever be the same after this?
How about opening up the checked (clean) part of the FTP? And while at it, this could be a great time to reduce the number of duplicates (and maybe add the MD5s and timestamps to the file list, as I suggested a long time ago). |
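For the duplicates part, a small sketch that assumes a manifest in `md5sum` output format (`hash  path`, one file per line) already exists. The function name is invented for illustration, and timestamps are left out.

```bash
# Hypothetical helper: read an md5sum-style manifest and print each hash that
# occurs more than once, followed by the indented paths that share it.
list_duplicates() (
    awk '{
            hash = $1
            sub(/^[^ ]+ [ *]/, "")              # strip the hash, keep the path
            count[hash]++
            paths[hash] = paths[hash] "\n  " $0
         }
         END {
            for (h in count)
                if (count[h] > 1)
                    print h paths[h]
         }' "$1"
)
```

With a manifest built by something like `find . -type f -exec md5sum {} + > manifest.txt`, running `list_duplicates manifest.txt` prints one block per duplicated hash. The order of the blocks is not defined, so pipe the result through a sort step if stable output matters.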
13 June 2020, 10:35 | #11 |
Registered User
Join Date: Oct 2012
Location: Italy
Age: 49
Posts: 2,984
|
I don't think the EAB file server could be opened in parts.
I mean, it's either the whole thing or nothing ..... but I could be wrong |
13 June 2020, 19:45 | #12 |
Registered User
Join Date: Nov 2010
Location: South Wales
Age: 47
Posts: 947
|
can easily give/remove permissions to certain directories/files. But this would be a load of extra work, when they could just wait till it's all done.
|
13 June 2020, 19:51 | #13 |
Global Moderator
Join Date: Nov 2001
Location: Derby, UK
Age: 48
Posts: 9,355
|
The file server will be opened when we are happy and satisfied that it is okay to do so. Ian and I need to have a discussion about what's left to do. I am back to work now so have a lot less free time.
Please bear with us guys/gals |
23 June 2020, 09:26 | #14 | |
Registered User
Join Date: Nov 2006
Location: Eugene,Oregon,USA
Posts: 61
|
Quote:
let me know if I can be of help |
|
23 June 2020, 23:16 | #15 |
Registered User
Join Date: Dec 2008
Location: Ursviken
Posts: 146
|
@alanwall are you sure that you got that "request" for pirated workbench floppies correct ?
|
30 June 2020, 00:41 | #16 |
Registered User
Join Date: May 2019
Location: USA
Posts: 57
|
You know what? I'm gonna say a little something here. Just a simple thank you. Why? Because you guys are doing everything you can to ensure the rules of this place are upheld to the best of your ability. Sure, it's a downer that the FTP server is down until this is done, but it shows the dedication of the moderation team (and Damien) to ensuring that everything is running as clean as possible on all fronts.
|