Jul 4, 2020

Python, I wish I could quit you

I want to stop using Python. I really do. There are lots of interesting languages out there to learn. It's just that Python is so convenient.

Tonight, in an ... hour? or two?, I built an inverted-index search tool for my email. It reads every email in a reasonably sized (4.2GiB) email folder, decodes MIME, chooses the best format (plaintext or HTML) to index, extracts the text if it's in HTML, extracts the words minus stopwords, and generates a data structure which is a relatively small 42MiB on disk. Searching this data structure for emails produces correct results in milliseconds. It's several orders of magnitude faster than Apple Mail or MailMate. And the entire source code, including comments and blank lines, is 183 lines.
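To give a feel for why this is so short in Python, here is a minimal sketch of that pipeline using only the standard library. It is not the actual 183-line tool, just an illustration under some assumptions: the tiny `STOPWORDS` list, the "prefer plaintext, fall back to HTML" rule, the in-memory `dict` of sets, and every function name below are made up for the example.

```python
import email
import re
from collections import defaultdict
from email import policy
from html.parser import HTMLParser

# Assumption: a tiny illustrative stopword list; a real tool would use a longer one.
STOPWORDS = frozenset({"a", "an", "and", "in", "is", "it", "of", "or", "the", "to"})


class _TextExtractor(HTMLParser):
    """Collects text nodes, ignoring the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def html_to_text(html):
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def message_text(raw_bytes):
    """Decode MIME and return the body text of one raw email."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    # Assumption: "best format" here means plaintext if present, else HTML.
    body = msg.get_body(preferencelist=("plain", "html"))
    if body is None:
        return ""
    content = body.get_content()
    if body.get_content_type() == "text/html":
        content = html_to_text(content)
    return content


def tokenize(text):
    """Lowercased set of words, minus stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def build_index(messages):
    """messages maps a message id to raw bytes; returns word -> set of ids."""
    index = defaultdict(set)
    for msg_id, raw in messages.items():
        for word in tokenize(message_text(raw)):
            index[word].add(msg_id)
    return index


def search(index, query):
    """Ids of messages containing every non-stopword in the query."""
    words = tokenize(query)
    if not words:
        return set()
    return set.intersection(*(index.get(w, set()) for w in words))
```

Feed `build_index` a dict of message ids to raw email bytes, then call `search(index, "some words")`. Lookups are just dictionary hits plus a set intersection, which is why an inverted index answers in milliseconds even over a large mailbox.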

I'd love to say that I wrote this so quickly and so well because I'm a genius programmer, but unfortunately it's pretty clearly just because Python gives me an excellent standard library and a huge ecosystem of extensions, many of which are very well polished (parsing email and parsing HTML are both notoriously tricky tasks which only look easy). Is there an equivalent ecosystem for any other language?