MarkItDown: Python Tool for Converting Files and Office Documents to Markdown

Nifty new convert-to-Markdown library from a small indie development shop named Microsoft:

The MarkItDown library is a utility tool for converting various
files to Markdown (e.g., for indexing, text analysis, etc.)

It presently supports:

PDF (.pdf)
PowerPoint (.pptx)
Word (.docx)
Excel (.xlsx)
Images (EXIF metadata, and OCR)
Audio (EXIF metadata, and speech transcription)
HTML (special handling of Wikipedia, etc.)
Various other text-based formats (csv, json, xml, etc.)

The API is simple:

from markitdown import MarkItDown

markitdown = MarkItDown()
result = markitdown.convert("test.xlsx")
print(result.text_content)
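
The same two calls extend naturally to a whole folder. What follows is a minimal sketch, not anything from the MarkItDown docs: the `convert_folder` helper, the `SUPPORTED` set, and the `docs` / `docs_md` directory names are all made up for illustration. It assumes only what the snippet above shows, namely that `convert()` takes a path and returns an object with a `text_content` attribute.

```python
from pathlib import Path

# Hypothetical filter, loosely based on the supported-formats list above.
SUPPORTED = {".pdf", ".pptx", ".docx", ".xlsx", ".html", ".csv", ".json", ".xml"}


def convert_folder(converter, src_dir, out_dir):
    """Convert each supported file in src_dir to a .md file in out_dir.

    `converter` is any object with a convert(path) method whose result
    has a text_content attribute, as in the snippet above.
    """
    src, out = Path(src_dir), Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(src.iterdir()):
        # Skip subdirectories and anything outside the illustrative filter.
        if not path.is_file() or path.suffix.lower() not in SUPPORTED:
            continue
        result = converter.convert(str(path))
        # Keep the original extension in the name so report.pdf and
        # report.docx do not overwrite each other.
        target = out / (path.name + ".md")
        target.write_text(result.text_content, encoding="utf-8")
        written.append(target)
    return written


def main():
    # Imported here so the helper above works without the package installed.
    from markitdown import MarkItDown

    for target in convert_folder(MarkItDown(), "docs", "docs_md"):
        print(target)
```

Passing the converter in as an argument, rather than constructing it inside the helper, keeps the loop easy to try out with a stand-in object before pointing it at real files.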

Via Stephan Ango (CEO of the excellent, popular Markdown writing and note-taking app Obsidian), who also points out that Google Docs added Markdown export a few months ago. I’ve never used Google Docs other than to read documents created by others, but MarkItDown seems like a library I might make great use of. “MarkItDown” is even a great name. What a world.

Not bad for a 20-year-old syntax.

