Challenges of academic code

Structural/institutional

  • Money for software dev. is scarce (see Magma, SAGE)

  • Bibliometric evaluation of researchers rarely includes code

  • Unclear citation guidelines (user-facing code vs. libraries)

  • Who owns the code?

    One researcher: idiosyncratic code style, “scratch my current itch”, “left for industry”

    Research group: fast turn-over of maintainers, lack of software mentorship

Contextual

  • It’s research! Requirements/scope evolve fast

    Code bases become piles of quick’n’dirty additions/fixes

    Documentation, if written, gets outdated fast.

    Different levels of code stability in the same project

  • Rabbit hole: language features as mathematical puzzles

    E.g. type systems. Luckily Python is a sweet spot.

  • Programming skills variance

  • REPL/Notebook inspired coding

    Lack of modularity

    Mixing of computation and plot/GUI code

  • Big dataset / long analysis time (astronomy)

    Impedes code+examples distribution / testing
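
One way to address the "mixing of computation and plot/GUI code" point above is to keep the numerical step in a pure function and hand its result to a separate presentation function. The sketch below is only an illustration: the names `Summary`, `summarize` and `render` are hypothetical, and `render` returns text as a stand-in for a real plotting or GUI layer. It assumes Python 3.9 or later.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class Summary:
    """Result of the computation step: plain data, no plotting state."""
    n: int
    mean: float
    std: float


def summarize(samples: list[float]) -> Summary:
    """Pure computation: testable without a display or a GUI toolkit."""
    return Summary(n=len(samples), mean=mean(samples), std=stdev(samples))


def render(summary: Summary) -> str:
    """Presentation only: a text stand-in for the plot/GUI layer."""
    return f"n={summary.n} mean={summary.mean:.2f} std={summary.std:.2f}"


if __name__ == "__main__":
    print(render(summarize([1.0, 2.0, 3.0, 4.0])))
```

With this split, `summarize` can be covered by fast unit tests on small inputs, while the presentation layer can change (notebook, script, GUI) without touching the numerics.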

Python

  • One way to do it … but Python is 30 years old

    Diversity of approaches in packaging/publishing, Python environment management, IDEs, tooling

  • Interpreted language with GIL

    Multiprocessing, integrating/publishing C/C++/Fortran code

  • Type annotations and numerical/data processing code

    Ongoing process (as of Apr 2022): astropy is a mixed bag

    Pandas dataframes are difficult to annotate and document

    NumPy typing is in its early stages and slows down code analysis
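
To illustrate the GIL point above: CPU-bound pure-Python work does not speed up with threads, so a common workaround is to spread it over processes, each with its own interpreter and GIL. The following is a minimal sketch using only the standard library; `count_primes` and `count_primes_parallel` are hypothetical names and the workload is a toy. It assumes Python 3.9 or later.

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit: int) -> int:
    """CPU-bound toy workload: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_parallel(limits: list[int], workers: int = 2) -> list[int]:
    """Run one task per limit in separate processes, each with its own GIL."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_primes, limits))


if __name__ == "__main__":
    # The guard matters on platforms that start workers with "spawn".
    print(count_primes_parallel([10_000, 20_000]))
```

Arguments and results are pickled between processes, so this pattern tends to pay off only when each task does substantial work compared with the data it exchanges.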