html2text is a Python package that converts a page of HTML into clean, easy-to-read plain ASCII text. Better yet, that ASCII also happens to be valid Markdown (a text-to-HTML format).

It was originally written by Aaron Swartz.

The code is licensed under the GPL v3.

The module is based on the HTML parser in the Python standard library (`html.parser`), so any valid input for the parser is valid input for the library.
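
To give a rough sense of what building on the standard-library parser means, here is a minimal sketch that uses only `html.parser.HTMLParser`. It is not html2text's actual implementation: the `TextCollector` class and the `to_text` function are hypothetical names made up for this illustration, and the sketch handles only emphasis tags.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper (not part of html2text): gathers text content
    and wraps emphasis tags in Markdown-style markers."""

    # Assumed mapping for this sketch only; all other tags are dropped.
    MARKERS = {"em": "_", "i": "_", "strong": "**", "b": "**"}

    def __init__(self):
        # convert_charrefs=True makes the parser decode entities like &amp;
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        self.parts.append(self.MARKERS.get(tag, ""))

    def handle_endtag(self, tag):
        self.parts.append(self.MARKERS.get(tag, ""))

    def handle_data(self, data):
        self.parts.append(data)


def to_text(html):
    """Feed a string of HTML to the parser and return the collected text."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


print(to_text("<p>Hello <em>world</em> &amp; <b>friends</b></p>"))
# prints: Hello _world_ & **friends**
```

The standard-library parser is event-driven and does not require well-formed documents, so a fragment with unclosed tags such as `<p>unclosed <b>bold` is also processed without raising an error.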