State-constrained offline deep reinforcement learning