Developing a text-based corpus of the language of Japanese comics (manga)