Our experience with EFS as a FS for applications
AWS EFS is a network file system that can be shared across multiple EC2 instances (or on-premises servers).
It has virtually no limit on storage capacity, and you can provision extra throughput or configure it as burstable.
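For example, switching an existing file system from bursting to provisioned throughput can be done with the AWS CLI (a sketch; the file system ID and throughput value below are placeholders, not from our setup):

```shell
# Move an EFS file system from burstable to a fixed provisioned
# throughput of 128 MiB/s (fs-12345678 is a placeholder ID)
aws efs update-file-system \
  --file-system-id fs-12345678 \
  --throughput-mode provisioned \
  --provisioned-throughput-in-mibps 128
```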
It seems ideal for traditional applications migrated to AWS with autoscaling, because not every application is designed to handle the ephemeral nature of an EC2 instance's storage.
In an autoscaling group, an EC2 instance may simply disappear, and along with it its associated (EBS) storage.
EFS seems like a nice place to store your application: just let the application server embedded in your instance (in our case, NginX/PHP-FPM) point to the EFS. When a new instance pops up, it will simply point to the shared location.
No special deployment procedure needed.
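For reference, attaching an instance to the shared location amounts to an NFS mount along these lines (a sketch using the mount options AWS recommends; the file system ID, region and mount point are placeholders):

```shell
# Install the NFS client (Amazon Linux; package names vary per distro)
sudo yum install -y nfs-utils

# Mount the EFS file system over NFSv4.1
# (fs-12345678 and eu-west-1 are placeholders for your own file system)
sudo mkdir -p /mnt/efs
sudo mount -t nfs4 \
  -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 \
  fs-12345678.efs.eu-west-1.amazonaws.com:/ /mnt/efs
```

The web server's document root is then pointed at a directory under /mnt/efs.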
We learned the hard way it's not a good solution.
Of course, EFS is slower than EBS block storage.
We thought configuring the opcache so that application files are seldom read from disk would tackle this. Additionally, we installed a file system cache for NFS on the EC2 instances.
(The opcache holds application files in memory in a precompiled state, reducing both interpretation and file system access.)
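For illustration, opcache settings along these lines (in php.ini) minimize disk reads; the exact values are our own guesses for a sketch, not the settings from the original setup:

```ini
; Keep compiled PHP scripts in shared memory
opcache.enable=1
opcache.memory_consumption=256
opcache.max_accelerated_files=20000

; Never stat files on disk to check for changes; this requires an
; opcache reset (e.g. a php-fpm reload) on every deployment
opcache.validate_timestamps=0
```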
Perhaps it would, but in our use case, other problems arose even before then.
DEPLOYMENT: GIT, COMPOSER, CACHE WARMING
Just pulling our application from the git server onto the EFS and setting it up with Composer (the PHP package manager) turned out to be enough to deplete the EFS's burst credits.
The cause was the sheer number of files pulled from the git repo, as well as the number of files installed from third parties by Composer (and the existence checks it performs on them).
Another important cause was the 'cache warming' done by the deployment script, which wrote many files to a cache directory on EFS.
This soon turned out to be an unworkable situation: a deployment would take hours, and we didn't even have the application up and running in the application server yet.
OPTIMIZING: THE OPCACHE
Once we did have the application running, its performance degraded to 10% of what it had been on the original infrastructure, even with the opcache enabled. (We suspect the application cache: when we moved that cache to local storage for testing, performance rose, but it was never very good.)
So we very quickly abandoned the concept of using EFS for our application.
Unfortunately, the application was not well suited to the standard solutions AWS offers (Elastic Beanstalk or CodeDeploy).
In the end we decided to host the application locally on each EC2 instance. Each time an EC2 instance pops up, it installs the latest deployment locally on its EBS storage using our custom deployment script.
For deployment across the autoscaling group, we wrote a script that executes the deployment script on each instance separately, using the AWS CLI to target tagged instances.
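One way to implement that rollout step is with SSM Run Command (a sketch, not our exact script; the tag name and script path are placeholders):

```shell
# Run the local deployment script on every instance carrying a given tag.
# AWS-RunShellScript is a built-in SSM document; the tag key/value and
# the script path are placeholders for illustration.
aws ssm send-command \
  --document-name "AWS-RunShellScript" \
  --targets "Key=tag:Role,Values=webserver" \
  --parameters 'commands=["/opt/deploy/deploy.sh"]' \
  --comment "Roll out latest release to all web servers"
```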
This works like a charm: deployment times are acceptable, and the application runs as speedy as on the original infrastructure (of course with the added benefit of autoscaling).
For another case, EFS turned out to be better suited to host the application.
We had a small microservice (not a 500MB git repository) with infrequent deployments. The application was NOT intended to run from within a web server; instead, it was designed to do a lot of heavy media processing, driven by jobs we stored in an SQS queue.
As long as we do not store the resulting files or intermediate temporary files on EFS (but on EBS first, then on S3), everything works perfectly well, and we are nowhere near exhausting the EFS's burst credits.
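Sketched for a single job, the pattern looks like this (the processing command, bucket and paths are placeholders, not from our actual pipeline):

```shell
# Do the heavy processing against fast local (EBS) storage first
# (placeholder command; the real job does media processing)
process-media /mnt/input/job-42.mov -o /mnt/scratch/job-42.mp4

# ...then push only the final result to S3; nothing touches EFS
aws s3 cp /mnt/scratch/job-42.mp4 s3://my-media-bucket/results/job-42.mp4
rm /mnt/scratch/job-42.mp4
```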
When you use EFS to host your application, you should be very aware of exactly how your application is used and how it makes use of storage.
For some cases, using EFS for application storage within an autoscaling group may be feasible, but if it isn't, the consequences can be drastic.
We would not recommend using EFS for web servers in any way, whether for application storage or file storage, even though it might seem a cheap solution with a low TCO at first. It's not.
Use EBS and S3 with a good deployment procedure instead.