Uploaded files - database vs filesystem, when using Grails and MySQL


I know that this is something of a "classic question", but does MySQL / Grails (deployed on Tomcat) put a new spin on how to approach storing users' uploaded files?

I like using the database for everything (simpler architecture, and scaling is just a matter of scaling the database). But using the file system means we do not weigh MySQL down with binary files. Some might also argue that Apache (httpd) is faster than Tomcat at serving binary files, although I have seen numbers suggesting that serving directly from Tomcat can actually be faster than going through an Apache (httpd) proxy.

How should I choose where to place the user's uploaded files?

Thank you for your time and thoughts.

I do not know if anyone can make general observations about this kind of decision, because it really depends on what you are trying to do and how high NFRs such as performance and response time sit on your application's priority list.

If you have a lot of users uploading a large number of binary files, and a system serving a large number of those uploaded files, then the costs of storing the files in the database include:

  • Large binary files

The benefits are:

  • Makes things atomic
  • Scaling comes with the database (though w...

Storing the files on the file system has costs of its own:

  • The need to know where in the system you store the files
  • Scaling
  • File name management
  • Mapping the files on disk to the related records in the DB (and all the code surrounding this)
  • Looking after your Apache configuration so that it serves the files from the file system

We had this kind of problem to solve for our Grails site, where the content editors were uploading hundreds of pictures a day. We knew that pushing all of that demand through the application, when it could be doing other processing, was not ideal (given that page views were expected to be in the hundreds of thousands per week, we definitely did not want serving images to cripple the site).

We ended up with an upload -> file system solution. For each uploaded file, a DB metadata record was created and managed in tandem with the upload process (and that record was read back when generating the link to the image in the GSP content). The images themselves were served directly off disk by Apache, based on the link requested by the browser. But, and there is always a but, remember that with things like the file system, the content only exists on the machine it was written to.
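A minimal Java sketch of that upload flow, under stated assumptions, and not the code from the site described above. The class `UploadStore`, its methods, the in-memory map standing in for the DB metadata table, and the `/static/uploads` URL prefix are all hypothetical; a real Grails app would more likely use a domain class backed by MySQL.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: class, field, and method names are illustrative only.
class UploadStore {

    // One metadata row per uploaded file; a real app would persist this in the DB.
    static final class FileMeta {
        final String id;
        final String originalName;
        final String relativePath;
        final long sizeBytes;

        FileMeta(String id, String originalName, String relativePath, long sizeBytes) {
            this.id = id;
            this.originalName = originalName;
            this.relativePath = relativePath;
            this.sizeBytes = sizeBytes;
        }
    }

    private final Path root;          // directory the web server serves from (assumed)
    private final String urlPrefix;   // URL prefix mapped onto root (assumed)
    private final Map<String, FileMeta> metaTable = new ConcurrentHashMap<>(); // stand-in for the DB table

    UploadStore(Path root, String urlPrefix) {
        this.root = root;
        this.urlPrefix = urlPrefix;
    }

    // Stores the bytes on disk under a generated name, then records the metadata.
    FileMeta save(String originalName, byte[] content) throws IOException {
        String id = UUID.randomUUID().toString();
        String relativePath = id + safeExtension(originalName);
        Files.createDirectories(root);
        // Write the file first, so a metadata row never points at a missing file.
        Files.write(root.resolve(relativePath), content);
        FileMeta meta = new FileMeta(id, originalName, relativePath, content.length);
        metaTable.put(id, meta);
        return meta;
    }

    // Builds the link a page would embed; the web server resolves it straight to disk.
    String linkFor(String id) {
        FileMeta meta = metaTable.get(id);
        if (meta == null) {
            throw new IllegalArgumentException("unknown file id: " + id);
        }
        return urlPrefix + "/" + meta.relativePath;
    }

    // Keeps only a short alphanumeric extension, so user input never shapes the path.
    static String safeExtension(String name) {
        int dot = name.lastIndexOf('.');
        if (dot < 0 || dot == name.length() - 1) {
            return "";
        }
        String ext = name.substring(dot + 1).toLowerCase(Locale.ROOT);
        return ext.matches("[a-z0-9]{1,8}") ? "." + ext : "";
    }
}
```

Naming the file after a generated id rather than the uploaded name is one way to sidestep the file name management cost; the original name survives only in the metadata record.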

We then had the headache of re-synchronizing the images across every server, because the file system is bound to physical locations on a server, whereas the DB sits behind the cluster and lets every node in the cluster behave the same way.

There is another problem you can run into with the file system: folder content size. When you start to have folders with literally thousands of files in them, folder scans at the OS level really start to drag. To get around this, we had to write code that managed image uploads into a date-based folder structure (along the lines of YYYY/MM/DD/imagename.jpg), so that no single folder ended up holding hundreds of thousands of images.
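A rough Java sketch of that kind of date-based layout, assuming the leading path component is the four-digit year. The class `DatedPaths` and both method names are hypothetical, and this is only one way to split uploads across folders.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical sketch: spreads uploads over year/month/day folders.
class DatedPaths {

    // Assumed layout: four-digit year, two-digit month, two-digit day.
    private static final DateTimeFormatter DAY_DIRS = DateTimeFormatter.ofPattern("yyyy/MM/dd");

    // Relative, URL-style path for a file uploaded on the given date.
    static String relativePathFor(LocalDate uploadDate, String fileName) {
        return uploadDate.format(DAY_DIRS) + "/" + fileName;
    }

    // Resolves the path under the upload root and creates the day folder if needed.
    static Path targetFor(Path root, LocalDate uploadDate, String fileName) throws IOException {
        Path dayDir = root.resolve(uploadDate.format(DAY_DIRS));
        Files.createDirectories(dayDir);
        return dayDir.resolve(fileName);
    }
}
```

With this layout a folder only ever holds one day's uploads. If a single day can still produce too many files, a further level (hour, or a hash prefix of the file name) could be added in the same way.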

So what I am saying is that, while we got the performance we wanted by not using the DB for BLOB storage, it came at the cost of extra system management overhead.

