It is common knowledge that neuroscience -- particularly experimental neuroscience -- uses MATLAB more than any other programming language. I have always just taken this as a given, and pointed to existing toolboxes, legacy lab codebases, and social pressure as the reasons why this continues to be the case, even though concrete reasons to switch are widely known.

I want to go past these general reasons and pin down the exact causes of MATLAB's hold on neuroscience. Things like

  • file formats that can only be read in MATLAB
  • acquisition systems that are highly coupled to MATLAB
  • toolboxes that are widely regarded as the "gold standard" for something
  • labs or companies that train many researchers on MATLAB and advocate for it
  • introductory neuroscience programs / textbooks that teach concepts through MATLAB examples

or any other tangible things to point to would be excellent.

tbekolay
  • At least for EEG, fMRI and MEG analysis, the dominant toolboxes - SPM, EEGLAB, Fieldtrip - are all MATLAB-based. These toolboxes are most likely dominant because they combine 1. open-source tech, 2. being the first to provide, or at least make accessible, a wide range of now gold-standard analysis tools, 3. allowing both basic and advanced analyses. Also, the most typical form of data for multivariate neuroscience work is matrices, which the MATrix LABoratory happens to handle adequately. – jona Sep 17 '14 at 19:53
  • And then, aren't we all praying every day for the adequate Python-based solution to finally arrive?.. – jona Sep 17 '14 at 19:53
  • @jona Yes, but who'll pay for it? MATLAB toolboxes represent man years of software development. I'd love to see more in Python. – James Sep 19 '14 at 10:59
  • @James, Many labs are already producing modernized versions of Matlab toolboxes, written in Python. They're certainly years behind, but there's a huge momentum behind Python for both ideological reasons and practical (read: monetary) ones. In short, the answer to your question is "labs are rerouting funds for Matlab licenses towards Python development and supplementing it with grant money." Exhibit A. Exhibit B. – Louis Thibault Sep 25 '14 at 13:39
  • @blz Certainly, developing the toolboxes licensed by my employer ourselves would have cost many times their license fee, but nonetheless I am very pleased that organisations are doing this. – James Sep 26 '14 at 00:35

2 Answers

I think the major reason is inertia, in the sense that many labs use Matlab, so many Matlab toolboxes are available, so many labs train people in the use of Matlab...

However, as a trained software engineer turned Neuroscientist who has been programming in various languages for close to 30 years, there are several reasons why I actually enjoy using Matlab. One can also make arguments for why it's a good fit for labs filled with people who aren't programmers.

(As an aside, since version 7.3 the .mat file format has actually been based on a standard file format called HDF5. So I don't think that file format lock-in is a real reason.)

Things Matlab does well

  1. Very good cross-platform support. Matlab runs out of the box on every major platform. In research, when you want something to just work, spending days getting a Python distribution configured with a mutually-compatible set of package versions is just ridiculous. Caveat: I haven't tried Anaconda, which hopefully has solved this problem.

  2. Very low housekeeping overhead. No other serious language (is BASIC a serious language?) lets you start writing code without having to include headers, define namespaces, allocate memory... For biologists without programming experience, this is a godsend.

  3. Integrated debugging environment. Non-programmers will not easily be able to learn a language that does not come with an IDE. R, I'm looking at you. Python has very nice options now.

  4. Intrinsic vectorisation of operations. for loops are kind of ugly and error-prone, when what you really want to do is perform an operation on every element of a matrix or perform matrix operations. I know that Python has some add-ins to accomplish matrix operations, but Python itself is not vectorised.

  5. Support for object-oriented software design. As a legacy language, Matlab has a whole bunch of baggage. Mathworks is slowly improving the design of the language and adding other niceties (the parallel computing toolbox is pretty great): the class system; tables with named columns; the ability to ignore return and input arguments in function calls; function handles; namespaces...

  6. Ability to work across machine abstraction levels. Matlab used to be slow, but the JIT compilation works wonders and keeps improving. You can operate at a very high level of abstraction with classes and objects; if you need something fast and low-level, you can write a small amount of C code and call it natively from Matlab.
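
To make the vectorisation point (item 4) concrete from the Python side, here is a minimal sketch of what the NumPy add-in provides, assuming NumPy is installed. The array `signal` and its values are made up for illustration; the Matlab expressions in the comments are rough equivalents, not exact translations.

```python
import numpy as np

# Hypothetical data: 3 trials x 4 time samples of a recorded signal.
signal = np.array([[1.0, 2.0, 3.0, 4.0],
                   [2.0, 4.0, 6.0, 8.0],
                   [0.0, 1.0, 0.0, 1.0]])

# Element-wise operation on the whole matrix, no explicit loop
# (roughly `signal .* 2 + 1` in Matlab).
scaled = signal * 2 + 1

# Reduction along an axis: mean across trials for each time sample
# (roughly `mean(signal, 1)` in Matlab).
trial_mean = signal.mean(axis=0)

# Matrix product (roughly `signal * signal'` in Matlab).
gram = signal @ signal.T

# The same element-wise operation on plain nested lists needs explicit loops;
# this is the sense in which the core Python language is not vectorised.
scaled_loop = [[2 * x + 1 for x in row] for row in signal.tolist()]
```

The vectorised forms live in the library rather than in the language itself, which is the distinction drawn in item 4.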

Things Matlab does badly

  1. Graphics and GUIDE. Oh my god. Handle graphics is pretty horrible. I know they are working on version 2, and it can't come soon enough. matplotlib in Python looks much better, and R makes amazing graphs out of the box. GUIs are pretty ugly to design in any language, but GUIDE makes it very easy to write some really horrible code.

  2. Global namespace. This is sort-of true, and sort-of makes sense for convenience's sake. It's an inherent tradeoff between ease of use and nice encapsulation. Matlab does provide namespace support now, but I think it's still true that the vast majority of toolboxes don't use it.

  3. First-class functions. If the anonymous function system could be upgraded slightly to handle multiple return arguments and to handle currying... Also, intrinsic support for named arguments and parameters would be greatly helpful, especially considering that many Matlab functions accept named parameters as part of their calling syntax.

  • The absence of namespaces is generally considered to be a Very Bad Thing, and forgoing these (small) hurdles is only a win in the most near-sighted sense; it's akin to saying a car without seatbelts is better for novice drivers because remembering to buckle up is a pain in the ass. I would also have placed the OOP support under the heading of "Things Matlab does badly" ;) Lastly, the Anaconda distributions have made it so that Python Just Works, so that is by and large a problem of the past. (In all fairness though, I agree with your other points!) – Louis Thibault Sep 24 '14 at 15:57
  • As another extensive MATLAB user, I don't agree with many of your claims. Just some examples: 1, Python and R, being Open Source and free, are practically easier to deploy. 2, neither R nor Python require any of these. 3, RStudio is arguably better than the MATLAB IDE (which I personally never use). 4, The appropriate comparison is Numpy, which offers all the vectorization MATLAB does; R also vectorizes easily. 5, MATLAB tends to get these (e.g., formula syntax, DataFrame formats) after other languages. 6, MATLAB is slower than R and Python in many benchmarks. – jona Sep 24 '14 at 20:46
  • Sorry, I don't agree with free == easy to deploy. Matlab is trivial to deploy: run an installer, everything just works. Anaconda aside, Python has been miserable to deploy at least until very recently. I don't see how Anaconda can possibly resolve the issue Python has with Python package incompatibilities and the tangled web of requirements. R seems pretty straightforward to deploy, but I haven't used it much. Numpy is the matrix-operation Python add-in I referred to. The core language of Python is not vectorised. – Dylan Richard Muir Sep 24 '14 at 21:12
  • Point 5. So? Like I said, Matlab comes from the era of Fortran. 6. The benchmarks I've seen (Python vs Matlab: http://wiki.scipy.org/PerformancePython; R vs Matlab: http://www.sciviews.org/benchmark/ but I can't find a more recent benchmark) don't show a huge difference. They should all be using low-level optimised BLAS routines anyway. Besides, if you want blinding speed you should be writing everything in C. Which is not going to happen in Neuroscience. – Dylan Richard Muir Sep 24 '14 at 21:13
  • @DylanRichardMuir, Concerning your response to jona's 5th point: Matlab's heritage is not an excuse. The complaint is that Matlab is a poorly-designed and slow-to-evolve language. Of course there are reasons for this, but it doesn't change the end result! :) Your argument is akin to driving a Ford Model T and exclaiming "Of course my car sucks! It's from the 1900's!" The issue isn't that of machine time, but that of developer time. That's what matters in the overwhelming majority of cases, and it's why nobody uses Fortran anymore. [Note: no disrespect intended! I'm enjoying the arg!] – Louis Thibault Sep 25 '14 at 09:51
  • @DylanRichardMuir, which package incompatibilities are you referring to? Anaconda really does solve the deployment issue, at least to the standard we've come to expect from Matlab. There's really no "tangled web of requirements" and there never was. The previous packaging woes came from something entirely different. Matlab also has its fair share of cross-platform issues (cough cough psychtoolbox). For clarity: I don't mean to imply that these are deal-breakers for Matlab, and similarly, they aren't for Python. These arguments strike me as a rephrasing of "the devil I know..." :) – Louis Thibault Sep 25 '14 at 10:01
  • @blz Jona's complaint was that Matlab gets these things after other languages. Yes. But it has them now. Matlab is evolving as a language; I don't know how you'd compare "speed of evolution" between languages, but every language evolves slowly or you very quickly throw out the entire body of existing code... My "argument" is that if two languages have the same features now, does it matter which one had them first? – Dylan Richard Muir Sep 25 '14 at 12:36
  • @blz I'm not going to engage with the deployment debate. I have had to get Python plus some set of packages up and running on several systems over the last several years, and it's been excruciating every time. It's getting easier, but I think you will have a hard time convincing anybody that Python is as easy to deploy right now. Like I said, maybe Anaconda does a good job (haven't tried it), but I have tried the Enthought distributions previously which promised to take care of everything. They work reasonably well, but can't possibly fix mutual issues between theano/numpy / blas, for example. – Dylan Richard Muir Sep 25 '14 at 12:42
  • @DylanRichardMuir, Agreed, but I think you're still missing the point. My point (and I suspect jona's as well) is that Matlab has two problems. (1) Its closed-source nature means that development occurs at the discretion of its core developers, which incurs extra development lag while offering no tangible advantage. (2) Because it's an antiquated language, we usually find poor implementations (cf. OOP) or worse, intentionally crippled implementations in an attempt to sell licenses (cf. parallelization). What Matlab has now is what other languages had 5 years ago, but crappier. – Louis Thibault Sep 25 '14 at 12:44
  • @blz In any case, the question was not about why is Matlab better or worse than Python, the question was about its prevalence in neuroscience. Life is a series of decisions about efficiency: What's the fastest way to get some code written that maybe myself and others in the lab can use? Switching platforms costs a lot, and researchers have little time to waste anyway. Starting from scratch in Python might be worthwhile. Switching over with a huge body of code to port probably isn't. – Dylan Richard Muir Sep 25 '14 at 12:45
  • "maybe Anaconda does a good job (haven't tried it)". You should, if the only thing holding you back from Python is the difficulty in deployment! Again, though, I eagerly concede that there are some times where Matlab is the better, more sensible solution. I simply contend that these are corner-cases and no longer a general rule. =) – Louis Thibault Sep 25 '14 at 12:46
  • @DylanRichardMuir, "the question was not about why is Matlab better or worse than Python, the question was about its prevalence in neuroscience". I completely agree. I think this is the point that was raised about inertia. There's certainly a lot of value in having a system that's known to work -- perhaps I got a bit proselytic, in which case I do apologize! – Louis Thibault Sep 25 '14 at 12:47
  • @blz For me and for most programmers, open source vs closed source is utterly irrelevant. I don't want to spend time developing the language! I want a tool that works. Whether the language developers are part of a paid company or a loose-knit cadre of open-source programmers doesn't make much difference to me. Edit: I mean in a practical programming sense. Matlab licenses are expensive, which sucks. – Dylan Richard Muir Sep 25 '14 at 12:49
  • @DylanRichardMuir, again I think I may have been unclear (character limits don't help!). I don't mean that open-source software is good because you personally can develop the language. I mean that small groups of researchers can make modifications to the core language or leverage internal implementation details to produce domain-specific code that wouldn't be given the time of day by a for-profit tool. The open-source issue is therefore relevant for researchers -- it's about having a responsive community. Matlab doesn't have this, and that's why it's slowly and steadily losing momentum. – Louis Thibault Sep 25 '14 at 12:58
  • @blz The Matlab file exchange is open source and pretty extensive. And I think you'll have a hard time arguing that Matlab is not very extensible. – Dylan Richard Muir Sep 25 '14 at 13:05
  • @DylanRichardMuir, Indeed, but this doesn't even begin to touch on core language development, which is what I'm talking about. It's also not unique to Matlab (and neither is the writing of C/++ extensions). Again, Matlab has some very strong points but its closed-source development is a decidedly very weak point, even if you're not directly involved with improving the language (or particularly aware of the problem). [With the shift in tone, I feel the need to once again insist that I mean no disrespect, and that I'm finding this debate quite enjoyable :) ] – Louis Thibault Sep 25 '14 at 13:08
  • @blz See my comment above about language evolution. Matlab's core language is evolving. It can't evolve very quickly, for reasons of not obsoleting everyone's code. But neither can any language with a large developer base. And the ugliness of Matlab is clearly not sufficient to drive existing developers away en masse. I think it will be a process of slow attrition, as new developers choose to start with a more modern (or post-modern i.e Perl :) ) language. – Dylan Richard Muir Sep 25 '14 at 13:12
  • @DylanRichardMuir, "But neither can any language with a large developer base." I think this is our point of disagreement. The good thing here is that time will tell =) In any case, Matlab certainly isn't going to disappear overnight and it's so extremely useful in some cases that I still (begrudgingly) use it myself for certain things. – Louis Thibault Sep 25 '14 at 13:29

I completely agree with most of the factors you've identified, but before I suggest some additional points, I'd like to correct one of yours:

file formats that can only be read in MATLAB

Unless you're talking about some obscure format that I'm unaware of (entirely possible!), MATLAB files are readable by non-MATLAB tools. In particular, scipy.io provides flawless I/O from/to .mat files (especially since yours truly submitted a minor patch ;) ).
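
As a rough illustration of the round trip described above, here is a minimal sketch that assumes SciPy and NumPy are installed. The variable names (`spike_times`, `fs`) and the file name are hypothetical. Note that `scipy.io.loadmat` handles .mat files up to v7.2; files saved in the v7.3 format are HDF5-based and need an HDF5 library instead.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat

# Hypothetical variables, as one might save them from an analysis script.
data = {"spike_times": np.array([0.012, 0.250, 0.731]),
        "fs": 30000.0}

path = os.path.join(tempfile.mkdtemp(), "example.mat")

# Write a .mat file that Matlab can open with `load`.
savemat(path, data)

# Read it back; loadmat returns a dict mapping variable names to arrays.
loaded = loadmat(path)

# Matlab-style arrays are at least 2-D, so a 1-D array comes back with an
# extra dimension and is flattened here for convenience.
spike_times = loaded["spike_times"].ravel()
fs = float(loaded["fs"].item())
```

The same `loadmat` call works on files written by Matlab itself, which is what makes it practical to mix Python code with an existing Matlab codebase.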

I mention this because it's an excellent argument for making the switch towards the scientific python stack; my Ph.D advisor only knows MATLAB, but she has become much less reluctant to work with Python since I explained that everything I did could be trivially made to interact with existing MATLAB code.

Additional (Python-centric) reasons why MATLAB remains king:

  1. There's what I call "battered spouse syndrome": researchers are used to the pain and suffering caused by MATLAB and wrongly assume that it's just part of the programming landscape. In other words, they don't know any better. They don't realize that there are tools that are more pleasant to use, more robust, (often) faster, more reliable, free and extensible. It's a classic case of thinking that all programming languages (eligible bachelors) are as abusive as your current tool (spouse).
  2. Researchers are put off by IDLE and aren't aware of such great tools as:
    1. IPython
    2. IPython Notebook (this will really rock your socks if you haven't seen it yet)
    3. Spyder (for those seeking a MATLAB-esque graphical interface)
  3. People haven't seen the Pandas library in action. I know it's a bit presumptuous to claim that a single library can convert steadfast MATLAB users, but Pandas, especially when combined with the IPython notebook, makes dealing with labeled matrices downright enjoyable. It offers such things as:
    • string-labeled columns and/or rows
    • groupby, split & merge operations
    • baked-in plotting with fine-grained control
    • baked-in summary & inferential statistics
  4. Python has historically been quite a pain in the ass to install on Windows and OS X, and most potential users are still unaware that the Anaconda distribution has all but completely solved the problem.

My reason for bringing up point 3 is more than just a shameless plug for the Pandas library. I wanted to draw your attention to the fact that Pandas manages to combine MATLAB's most sought-after features (fast numerical arrays, logical indexing, plotting & myriad 3rd party libraries) with some of the most useful features of the R language (DataFrame-like structures, summary stats, split/combine/apply operations, and missing data management such as fill-forward and interpolation). With this library, you can essentially make the argument that you're replacing two languages for most common purposes.
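
For readers who haven't seen Pandas, here is a minimal sketch of the features listed in point 3, assuming Pandas and NumPy are installed. The table `trials`, its column names and its values are invented purely for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical experiment table: one row per trial, with string-labeled columns.
trials = pd.DataFrame({
    "subject": ["s1", "s1", "s2", "s2", "s2"],
    "condition": ["a", "b", "a", "b", "b"],
    "rt_ms": [310.0, np.nan, 295.0, 402.0, 388.0],
})

# Missing-data management: fill each gap with the previous valid value.
filled = trials["rt_ms"].ffill()

# Split/apply/combine: mean reaction time per condition (NaN is skipped).
mean_rt = trials.groupby("condition")["rt_ms"].mean()

# Logical indexing, much like Matlab's `trials(trials.rt_ms > 350, :)` on a table.
slow = trials[trials["rt_ms"] > 350]

# Baked-in summary statistics (count, mean, std, quartiles, ...).
summary = trials["rt_ms"].describe()
```

Each step is a single labeled operation on the table, which is the part that tends to win over people used to tracking column indices by hand.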


And now a real shameless plug: I'm one of the moderators on Reddit's /r/pystats. Anybody trying to ditch Matlab is more than welcome around these parts ;)

Louis Thibault
  • I really like Pandas, but regardless of how great Pandas is, it's a long way from having scipy.io and Pandas to performing SPM or doing EEG analyses. To say otherwise would honestly be nothing but ignorant of the great work done by Friston, Makeig/Delorme, et al. Yes, somebody could implement all of that in Python; but "could" means something like "spend half a decade or more coding". Thanks for fixing .mat IO in scipy either way! – jona Sep 23 '14 at 20:31
  • @jona, I don't disagree with your statement, and I certainly didn't mean to insinuate that every use-case was covered in Python. There are some very clear use-cases for using Matlab, especially in the neurosciences. That having been said, there are very promising projects for SPM and EEG analysis that aren't far from giving Matlab a good run for its money; I'm of course referring to NIPy and py-MNE. But I share your sentiment insofar as programming tools should make the job easier, not harder, and in any case, half a decade is only 5 years ;) – Louis Thibault Sep 23 '14 at 21:33
  • IPython Notebook is very awesome, and looks like it should make Python more accessible to non-programmers and the Mathematica crowd. – Dylan Richard Muir Sep 24 '14 at 14:47
  • I think there is a bit of a dogmatic push to move to Python, which has started (and continued) by re-implementing huge chunks of Matlab functionality. While there are issues with the design of Matlab's language (it comes from the Fortran era, after all), I'm not convinced that the whole sphere of Python is so much more appealing to justify the huge reimplementation project. – Dylan Richard Muir Sep 24 '14 at 14:49
  • @DylanRichardMuir while I certainly agree that there's (what I would call) an ideological push for Python, it's also hard to deny that Python is a better-designed language. Matlab has a tendency to generate illegible code beyond what one expects from novice programmers. Again, I want to stress that this is because of language design: Matlab is designed with the (false) premise that all data is best represented as matrices. PHP's double-clawed hammer analogy seems appropriate, here. – Louis Thibault Sep 24 '14 at 15:54
  • @DylanRichardMuir, I just now saw your comment about IPython-notebooks. Yes, they're extremely cool, even more so now that they're being ported to Julia! =) – Louis Thibault Sep 24 '14 at 16:10
  • Ideological is a much better word :) Julia sounds very interesting; I haven't had the cycles to check it out yet (inertia, yeah?) – Dylan Richard Muir Sep 24 '14 at 21:24
  • I tried making the switch to Python and I admit there are some sexy things about it. However, in the end, MATLAB was always faster for me for development; so many packages come by default, and getting useful stuff off the Matlab File Exchange has solved every issue I couldn't program myself. For me, developing is the bottleneck because I'm not developing programs that are robust to every kind of input. I frequently develop new programs for new tasks, new data formats, and new research projects, and they're never used again after the result is obtained. – Keegan Keplinger Sep 24 '14 at 21:46
  • @KeeganKeplinger, Again, there are some use cases for which Matlab is clearly the right tool, but nothing you've described is exclusive to Matlab. I work very much in the same manner as you. I only say this so as to not propagate the notion that Python is only suited to writing complex, robust programs. =) – Louis Thibault Sep 25 '14 at 09:40
  • @DylanRichardMuir, I've only played around with it for a few hours but it seems like a very promising project. I have a nagging feeling that I'll be proselytizing for the Julia camp in a few years. Thankfully, there's a great deal of interoperability between Python and Julia code, IIRC. – Louis Thibault Sep 25 '14 at 09:53
  • @DylanRichardMuir, Julia is like MATLAB, but without the neuroscience-specific good parts (tons of great packages already available) and the bad parts (antique, convoluted syntax, slow). Hope that makes sense ... – jona Sep 25 '14 at 10:12
  • @Jona You don't make it sound so appealing... – Dylan Richard Muir Sep 25 '14 at 12:51
  • @DylanRichardMuir, I'm with Jona on this one. Julia has some nice ideas but is severely lacking in maturity. The hope is that its compatibility with Python will enable it to leverage the (more) mature Python scientific stack, but it'll be a few years before it becomes a viable solution. Right now Julia is a cool, modern, well-designed language, but impractical for the purposes of "doing science". – Louis Thibault Sep 25 '14 at 13:33