Towards End-to-End Spoken Language Understanding