Today at work, I was tasked with moving a website from an external host to one of our servers. They gave me the FTP username and password, so I started downloading the whole website. I then grabbed the database info from the configuration and made an export of the database as well. I uploaded everything to our server, restored the database, and voilà, almost everything worked.
The website in question is a news website. Its content is basically text plus images. The text, obviously, was in the database. The images were, obviously, on the filesystem. So logic would suggest that transferring the files and the database would do the trick, right? (Ignore the DNS for a moment, okay?)
And sure, normally it would. All the files were there, except for the images. Taking a closer look, it turned out that some of the images (the first 1998, to be exact) were, in fact, there. Taking an even closer look, I noticed that part of the response text when doing an FTP LIST of the images directory was “TRUNCATING RESULT AT 2000 MATCHES”. Pretty fucking awesome, right: they limited their FTP LIST to 2000 items (‘.’ and ‘..’ being 1 and 2, which leaves 1998 actual files).
I called the support phone number, but they couldn’t help me; apparently increasing that 2000 would be bad for server performance. They suggested I grab 2000 files, delete them on the server, and then get the next 2000. Nice idea, except that would mean the website would be broken while I was getting the files. (The actual switch will be Monday/Tuesday, so removing all but 2000 images now is not an option.)
Long live FTP access though: I uploaded a small PHP script that lists (links to) all the files in the directory, installed the Firefox plugin ‘DownThemAll’, and about two hours later I had all the files sitting nicely on my hard drive. (And I’m sure the 14,000 HTTP requests were a much bigger hit on their server than letting me grab the files through FTP would ever have been.)
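The script itself was PHP and isn't reproduced here. As a rough sketch of the same idea, here is a Python version that reads the directory straight from the filesystem (so the FTP LIST limit never comes into play) and returns a page with one link per file. The function name `build_link_index` and the `directory` and `base_url` parameters are made up for illustration, and it assumes the images are served over HTTP under `base_url`.

```python
import html
import os
from urllib.parse import quote


def build_link_index(directory, base_url):
    """Return an HTML page with one link per regular file in `directory`.

    `directory` and `base_url` are hypothetical inputs: the folder that
    holds the images and the public URL it is served under.
    """
    # Read the directory on the server itself; this is not subject to
    # any limit the FTP daemon puts on LIST responses.
    names = sorted(
        entry.name for entry in os.scandir(directory) if entry.is_file()
    )

    prefix = base_url.rstrip("/") + "/"
    links = []
    for name in names:
        # Percent-encode the file name for the URL, then HTML-escape
        # both the attribute value and the visible link text.
        href = html.escape(prefix + quote(name), quote=True)
        links.append('<a href="{}">{}</a><br>'.format(href, html.escape(name)))

    return (
        "<!DOCTYPE html>\n<html><body>\n"
        + "\n".join(links)
        + "\n</body></html>\n"
    )
```

Serving the returned page from the old host and pointing a bulk downloader at it gives the downloader every file name in one go, which is all the plugin needs.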
So, in the end, this was just a dumb limitation that (in this specific case) did more harm than good.