Dynamic sparsity: enabling efficient, interpretable and generalizable sequence models