Downloading all of your activity data from Strava has been a tedious exercise, since you had to visit and export each activity individually. To make it easier to access your data we’ve added a new bulk export feature. Found in the lower-right corner of your Settings page, “Download your data” converts all of your Strava activities to GPX format and delivers them as a single zip file. Once the conversion and compression are finished you’ll receive an email with a link to your file. The file can take several minutes to generate, depending on how many activities you have.
On the engineering side there’s not much magic going on here (unlike an upcoming engineering post on Route Master). The AWS S3 SDK has some nice options for generating unique, expiring URLs for objects in an S3 bucket, which lets us offload the transport of these files from our servers. In addition, the SDK supports streaming downloads and chunked writes, which makes these large zip files manageable. Finally, we discovered that the standard Ruby zip library is a massive memory hog, so we used the filesystem zip instead.
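A minimal sketch of what using the filesystem zip from Ruby could look like, assuming the Info-ZIP `zip` binary is on the PATH. The helper names `zip_command` and `zip_files` are hypothetical and only illustrate the idea. The archive is built by an external process, so the Ruby process never holds the file contents in memory.

```ruby
require "open3"

# Build the argument vector for the system `zip` binary.
# -q silences progress output; -j stores files without their directory paths.
# (Hypothetical helper, not taken from the original implementation.)
def zip_command(archive_path, input_paths)
  ["zip", "-q", "-j", archive_path, *input_paths]
end

# Run zip as a child process and raise if it exits with a failure status.
# Passing the arguments as separate strings avoids shell interpolation.
def zip_files(archive_path, input_paths)
  _stdout, stderr, status = Open3.capture3(*zip_command(archive_path, input_paths))
  raise "zip failed: #{stderr}" unless status.success?
  archive_path
end
```

Because `-j` drops directory paths, this sketch assumes every input file has a unique base name, for example one file per activity ID.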
There is a bit of workflow to manage, in that we want to convert your activities in parallel but ultimately need to combine them into a single file. We kick off potentially hundreds of parallel jobs to fetch, convert, and temporarily store your individual activities in S3, which is where most of the time is spent. When this step finishes, a single job grabs the converted files from S3, downloads them to a temporary location on a single box, then zips them and uploads the result back to S3 as a single file. All of this transient workflow state is managed in Redis (my favorite engineering quote: “If the answer isn’t Redis, you’re probably asking the wrong question.”).
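The fan-out and fan-in described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual implementation: threads stand in for the parallel jobs, a hash stands in for S3, and the `PendingCounter` class is an in-memory stand-in for a counter that could be kept in Redis (for example with its atomic DECR command). All names here are hypothetical.

```ruby
# In-memory stand-in for a shared counter; a real system might keep this
# in Redis and decrement it atomically.
class PendingCounter
  def initialize(count)
    @count = count
    @lock = Mutex.new
  end

  # Atomically decrement and return the number of jobs still pending.
  def decrement
    @lock.synchronize { @count -= 1 }
  end
end

# Hypothetical conversion step; a real job would fetch the activity
# and render it as GPX.
def convert_activity(id)
  "<gpx><!-- activity #{id} --></gpx>"
end

def export_activities(activity_ids)
  converted = {}                       # stand-in for temporary S3 storage
  store_lock = Mutex.new
  counter = PendingCounter.new(activity_ids.size)
  archive = nil

  workers = activity_ids.map do |id|
    Thread.new do
      gpx = convert_activity(id)
      store_lock.synchronize { converted[id] = gpx }
      # Each job stores its result before decrementing, so whichever job
      # brings the counter to zero knows every result is available and
      # runs the single combine step.
      if counter.decrement.zero?
        archive = store_lock.synchronize { activity_ids.map { |i| converted[i] } }
      end
    end
  end

  workers.each(&:join)
  archive
end
```

Calling `export_activities([1, 2, 3])` returns the three converted documents in the original activity order, regardless of which job finished last.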
Finally, as with nearly everything we build these days, this process is monitored using StatsD and Graphite, and alerts are managed with Nagios and delivered with PagerDuty. We love StatsD and Graphite; introducing them into our stack several months ago has been a game changer. Look for an upcoming post on our Graphite topology, as we’ve found scaling Graphite to be non-trivial.
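For readers unfamiliar with StatsD, instrumenting a job amounts to sending small text datagrams over UDP. The sketch below formats metrics in the StatsD line protocol (`name:value|type`) and sends them to a local daemon. The metric names, the helper names, and the host are assumptions made for illustration; 8125 is the port StatsD listens on by default.

```ruby
require "socket"

# Format a metric in the StatsD line protocol: <name>:<value>|<type>
# Common types: "c" (counter), "ms" (timer), "g" (gauge).
def statsd_line(name, value, type)
  "#{name}:#{value}|#{type}"
end

# Fire-and-forget send over UDP; no reply is expected from the daemon.
def send_metric(line, host: "127.0.0.1", port: 8125)
  socket = UDPSocket.new
  socket.send(line, 0, host, port)
ensure
  socket&.close
end

# Hypothetical metric names for the export workflow.
completed = statsd_line("export.jobs.completed", 1, "c")
zip_time  = statsd_line("export.zip.duration", 4200, "ms")
```

Passing `completed` or `zip_time` to `send_metric` would emit the datagram. Because the transport is UDP, a missing or slow StatsD daemon does not block the job being measured.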