Many apparently difficult problems can be solved by reduction to linear programming. Such problems are often subproblems within larger systems. When gradient optimisation of the entire larger system is desired, it is necessary to propagate gradients through the internally-invoked LP solver. For instance, when an intermediate quantity z is the solution to a linear program involving constraint matrix A, a vector of sensitivities dE/dz will induce sensitivities dE/dA. Here we show how these can be efficiently calculated, when they exist. This allows algorithmic differentiation to be applied to algorithms that invoke linear programming solvers as subroutines, as is common when using sparse representations in signal processing. Here we apply it to gradient optimisation of overcomplete dictionaries for maximally sparse representations of a speech corpus. The dictionaries are employed in a single-channel speech separation task, leading to 5 dB and 8 dB target-to-interference ratio improvements for same-gender and opposite-gender mixtures, respectively. Furthermore, the dictionaries are successfully applied to a speaker identification task. © 2006 IEEE.
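
As an illustration of the sensitivity propagation described above, the following is a minimal sketch, not the paper's own derivation or code. It assumes a standard-form linear program, min c.z subject to A z = b, z >= 0, whose optimum is a unique, nondegenerate basic solution with basis B, so that z_B = A_B^{-1} b and the nonbasic entries of z are zero. Under that assumption dE/db = A_B^{-T} (dE/dz)_B, the basic columns of dE/dA equal -(dE/db) z_B^T, and the nonbasic columns of dE/dA are zero. The function names, the brute-force solver and the numerical values are hypothetical, chosen only to keep the example self-contained.

```python
import itertools

import numpy as np


def solve_lp_by_enumeration(c, A, b):
    """Solve min c.z s.t. A z = b, z >= 0 by trying every basis.

    Only suitable for tiny illustrative problems; a real system would
    call an LP solver and read the optimal basis from it.
    """
    m, n = A.shape
    best = None
    for basis in itertools.combinations(range(n), m):
        cols = list(basis)
        A_B = A[:, cols]
        if abs(np.linalg.det(A_B)) < 1e-12:
            continue  # singular: not a valid basis
        z_B = np.linalg.solve(A_B, b)
        if np.any(z_B < -1e-12):
            continue  # infeasible basic solution
        cost = c[cols] @ z_B
        if best is None or cost < best[0]:
            best = (cost, cols, z_B)
    if best is None:
        raise ValueError("no feasible basic solution found")
    _, cols, z_B = best
    z = np.zeros(n)
    z[cols] = z_B
    return z, cols


def lp_backward(A, z, basis, dE_dz):
    """Map dE/dz to (dE/dA, dE/db).

    Assumes the optimal basis is unique and nondegenerate, so that it
    does not change under small perturbations of A and b.
    """
    A_B = A[:, basis]
    dE_db = np.linalg.solve(A_B.T, dE_dz[basis])   # A_B^{-T} (dE/dz)_B
    dE_dA = np.zeros_like(A)                       # nonbasic columns stay zero
    dE_dA[:, basis] = -np.outer(dE_db, z[basis])   # -(dE/db) z_B^T
    return dE_dA, dE_db


# Hypothetical example: two inequality constraints with slack variables.
A = np.array([[2.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([-3.0, -2.0, 0.0, 0.0])

z, basis = solve_lp_by_enumeration(c, A, b)
dE_dz = np.array([1.0, -2.0, 0.5, 3.0])  # sensitivities arriving from downstream
dE_dA, dE_db = lp_backward(A, z, basis, dE_dz)
```

When the optimal basis is degenerate or the optimum is not unique, z is not differentiable in A and these formulas do not apply, which is one reading of the "when they exist" caveat above.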