blog.willbadart.com

This is the blog of Will Badart.

Archive Series Tags Feed

Domain-Level Errors

15 Jan 2020

  • #data science
  • #software engineering

Over the last year and a half, I’ve had the opportunity to work with a lot of different data scientists, all from varying backgrounds and with different levels of experience. During this time, I’ve observed a pain point that seems to afflict these practitioners without regard to expertise or experience: bad error messages.


Ghastly Echoes in Python

14 Jan 2020

  • #data science
  • #software engineering

The last three posts gradually introduced “Ghosts of Departed Proofs” (GDP) as a framework for designing safe, ergonomic data science libraries. GDP and the related approaches leverage Haskell’s expressive type system to enable the compiler to detect user errors which violate the preconditions of an algorithm. Unfortunately, not many data scientists are using Haskell today, so to use GDP directly, most teams would have to switch languages.


GDP for Data Science

14 Jan 2020

  • #data science
  • #software engineering
  • #functional programming

In the last post, we used types to record interesting properties of the terms in our program. Equivalently, from a function’s perspective, we used types to encode preconditions on the arguments to a function, letting the compiler reject misuse of the algorithm it implements. Such misuse constitutes a mistake at the conceptual or domain level, and ruling out error classes at this level of abstraction can be very strong evidence for the correctness of a program.


The Argument from Types

13 Jan 2020

  • #data science
  • #software engineering

In the last post, we examined how software engineers construct arguments for correctness. In doing so, we identified “well-typedness,” the property of a program absent type errors, as a particularly strong argument due to its ability to rule out entire classes of error wholesale. But we encountered a hiccup when basic data types alone didn’t rule out all misuses of a function (in the given example, division by zero). In other words, we couldn’t statically enforce the precondition that the second argument to division must be nonzero.


Arguing for Correctness

10 Jan 2020

  • #data science
  • #software engineering

One of the most important jobs of a software engineer is convincing people that a program is correct with respect to some specification or other. Perhaps most often this person is ourself, but it can also be a teammate, project manager, or client. It would be lovely if we could formulate “correctness” as a logical conclusion, which we could deductively argue using properties of our programs as premises. That is, if we could actually prove that a program is correct. The field of formal methods studies this formulation, but the current techniques for “proving” a program tend to be heavy-weight and inaccessible to the everyday programmer.



1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16