As a freshman at Rutgers, I brought my self-built desktop computer with an aging copy of Microsoft Office 2011. It quickly became a pain that I could only get work done in my room at my desktop, so sophomore year I bought a MacBook to see if I could benefit from using a *nix-like system, since I was already comfortable with bash on my Windows computer via Cygwin. I transitioned to doing about 95% of my college reports, slides, etc. on Google Drive.
Here's a tree-like example of my directory structure:

```
~/Documents/school
|--2017-Spring-RU
| |--Digital Logic Design
| | |--syllabus.gdoc
| | |--report-1.gdoc
...
```
And the setup worked really well! I had some decent filename conventions and some useful symlinks that mirrored my folders across Dropbox and Google Drive (maybe I'll add a self-hosted ownCloud in the future) for cloud redundancy. But at the end of the semester, I realized there's a problem with doing everything in Google Drive: archiving those files is a huge pain.
Let me explain: when I completed a semester, I cleaned out my school folder by zipping up each course folder and transferring it to my portable SSD. However, Google Drive files are hosted online, and the locally stored *"files"* are just URL pointers to the actual files online. To get the actual content, you need to export each file to a common file format. For one report I worked on last week, that's not too hard to do. For 50+ documents accumulated over an entire semester? Yeeeeeaaaaaahhhhh, no.
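To make that concrete: if you open one of those pointer *"files"* in a text editor, you'll see something like this (the exact fields here are an assumption on my part - they vary between versions of the Drive client):

```json
{
  "url": "https://docs.google.com/open?id=<the-file-id>",
  "doc_id": "<the-file-id>",
  "email": "you@example.com"
}
```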
So I did what any self-respecting programmer would do: I automated it. I wrote hardcopy - a command line utility that traverses a given directory (presumably a Google Drive folder/subfolder) and converts Google Document/Sheet/Slide files to their respective offline equivalents (.pdf's and .xlsx's). I used Node.js and the Google Drive npm packages to do this. I chose Node because I ran into Commander (a nifty npm package for building CLIs), and it made me want to build CLIs since it was so easy with that package.
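To give a rough idea of the approach, here's a simplified sketch (not hardcopy's actual source - it assumes the current `googleapis` and `commander` packages, and `getAuthClient()` is a hypothetical helper standing in for the OAuth2 setup):

```js
const fs = require('fs');
const path = require('path');
const { program } = require('commander');
const { google } = require('googleapis');

// Map local pointer-file extensions to an offline export MIME type.
const EXPORT_TYPES = {
  '.gdoc': 'application/pdf',
  '.gsheet': 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
  '.gslides': 'application/pdf',
};

// Recursively collect Google Drive pointer files under `dir`.
function findPointers(dir) {
  return fs.readdirSync(dir, { withFileTypes: true }).flatMap((entry) => {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) return findPointers(full);
    return EXPORT_TYPES[path.extname(entry.name)] ? [full] : [];
  });
}

// Export one pointer file to an offline copy sitting next to it.
async function exportPointer(drive, file) {
  // The pointer is just JSON referencing the online file; dig the Drive
  // file ID out of it (again, the exact JSON shape is an assumption).
  const pointer = JSON.parse(fs.readFileSync(file, 'utf8'));
  const fileId = pointer.doc_id || new URL(pointer.url).searchParams.get('id');
  const ext = path.extname(file);
  const res = await drive.files.export(
    { fileId, mimeType: EXPORT_TYPES[ext] },
    { responseType: 'stream' }
  );
  const out = fs.createWriteStream(file + (ext === '.gsheet' ? '.xlsx' : '.pdf'));
  await new Promise((resolve, reject) =>
    res.data.pipe(out).on('finish', resolve).on('error', reject)
  );
}

program
  .name('hardcopy')
  .argument('<dir>', 'Google Drive folder to export')
  .action(async (dir) => {
    const auth = await getAuthClient(); // hypothetical: load OAuth2 credentials
    const drive = google.drive({ version: 'v3', auth });
    for (const file of findPointers(dir)) {
      console.log(`Found "${path.relative(dir, file)}"`);
      await exportPointer(drive, file);
    }
  });

program.parseAsync(process.argv);
```

Commander handles the argument parsing and help text for you, which is most of what made building the CLI feel so painless.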
There's a bit of acrobatics you need to go through to obtain the right Google Drive API keys, but once that's done, using it goes something like this:
```
$ hardcopy ~/Google\ Drive/school/2017-RU-Spring
Found "Digital Logic Design/report-1.gdoc"
...
```
And so on. It's not the prettiest, most user-friendly CLI ever, but it works for me. It exports the files from `<filename>.gdoc` to `<filename>.gdoc.pdf`, so I know which files were originally Google Documents.
My initial version felt pretty fast (totally baseless, I didn't actually run any benchmarks lol) - it could export ~20 documents in a minute or so.
I was curious to see if I could use multithreading to speed up the file export, so I rewrote the CLI in Python - figured it'd be worth a shot. However, I ran into a byte-streaming issue with the Google Drive library for Python that hadn't been fixed, so I was left with a single-threaded, blocking version of hardcopy. Needless to say, it was a lot slower than the non-blocking Node version.
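For what it's worth, the non-blocking behavior on the Node side comes almost for free: each export is just promise-based network I/O, so you can kick them all off at once and let the event loop overlap the requests. A sketch, reusing the hypothetical helpers from above:

```js
// Fire off every export concurrently instead of awaiting them one by
// one; Node overlaps the network I/O without any threads involved.
const files = findPointers(dir);
await Promise.all(files.map((file) => exportPointer(drive, file)));
```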
So yeah, hardcopy and hardcopy-python are both hosted on my GitHub. Check 'em out, clone/fork/do what you want with them. I do plan to revisit the Python version and see if I can get the multithreading to work - hopefully the recent work in turning Google Drive into Backup and Sync fixed the chunk-streaming issue. Probably not, but who knows? :)