Sunday, January 5, 2014

Mathematics, Statistics, Computer Science

More so than any other disciplines, mathematics, statistics, and computer science are tools for working with abstract information. When physics teachers preach the importance of units, they often point out that unitless equations like "2 + 3 = 5" are meaningless. 5 what? 5 potatoes? 5 light years? The optimistic view is that 2 + 3 = 5 no matter what we are talking about. That is, an understanding of this abstract, "meaningless" space can be applied to countless real-world situations. How cool is that? As long as you can transform a real problem into an abstract computational one, you can use all the tools these disciplines have to offer.

One great example is graphs, or networks: a set of vertices, and edges between them. They might be weighted or unweighted, and the edges might be directed or undirected. A particular network might represent cities and roads, atoms and bonds, or politicians and alliances. Clustering algorithms may be run on Facebook or LinkedIn, but they are also used to identify cancer pathways.
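To make that concrete, here is a minimal Python sketch in which one graph structure and one algorithm serve two unrelated domains. The helper names `add_edge` and `hops`, the town names, and the road weights are all made up for illustration, and breadth-first search stands in for whatever algorithm you actually need.

```python
from collections import deque


def add_edge(graph, u, v, weight=1.0, directed=False):
    """Add an edge u -> v (and v -> u unless directed) to an adjacency dict."""
    graph.setdefault(u, {})[v] = weight
    graph.setdefault(v, {})  # make sure v exists as a vertex
    if not directed:
        graph[v][u] = weight


def hops(graph, start, goal):
    """Fewest edges from start to goal (breadth-first search), or None."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbor in graph.get(node, {}):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


# Cities and roads (hypothetical names, weights as distances).
roads = {}
add_edge(roads, "Springfield", "Shelbyville", weight=30.0)
add_edge(roads, "Shelbyville", "Capital City", weight=75.0)

# Atoms and bonds in a water molecule (unweighted).
bonds = {}
add_edge(bonds, "O", "H1")
add_edge(bonds, "O", "H2")

print(hops(roads, "Springfield", "Capital City"))  # 2
print(hops(bonds, "H1", "H2"))                     # 2
```

The function `hops` never learns whether it is looking at towns or atoms. It only sees vertices and edges, which is the whole point.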

Another universal structure is a sequence, for which we have tons of models and algorithms. Amino acids in a protein. Weather patterns over a week. Die rolls in a casino. Letters in words, and words in sentences. Stock market prices over a year. These are all sequences! Anything with some order, in time or in position, is fair game. Hidden Markov Models (HMMs) can be used just as well for meteorological predictions as for financial ones. Sequence alignment is as valuable for mapping genes on your genome as it is for the natural language processing on your phone!
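As a small illustration, here is a Python sketch of edit distance, a simple relative of sequence alignment. It is not the algorithm a genome mapper or a phone keyboard actually uses, and the function name `edit_distance` and the sample inputs are my own choices. The same function works on letters, nucleotides, or whole words, because all it needs is a sequence of comparable items.

```python
def edit_distance(a, b):
    """Minimum number of insertions, deletions, and substitutions
    needed to turn sequence a into sequence b (Levenshtein distance)."""
    # prev[j] holds the distance between the first i-1 items of a
    # and the first j items of b.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # turning i items into an empty sequence costs i deletions
        for j, y in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if x == y else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


# Letters in words.
print(edit_distance("kitten", "sitting"))  # 3

# Nucleotides in a (toy) DNA fragment.
print(edit_distance("ACGT", "AGT"))  # 1

# Words in sentences.
print(edit_distance(["the", "cat", "sat"], ["the", "dog", "sat"]))  # 1
```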

Yet beyond being useful tools for people in different walks of life, each of these disciplines has its own important lessons to teach. Mathematics gives us a language in which to precisely discuss and understand beautifully pure forms we will never see, as well as patterns we experience every day. Statistics teaches us how to decide whether something is important or significant, and how to change our beliefs based on new evidence. Computer science instills in our culture the faith that data we will never be able to process ourselves can still be understood, that the problems different people face are often more similar than they may seem, and that those problems can be solved ever more quickly and efficiently if we try.