seanpuppy
seanpuppy t1_j37gis3 wrote
Reply to comment by EvitanRelta in HTML-to-Markdown converter that adaptively preserve HTML when needed (eg. when center-aligning, or resizing images) by EvitanRelta
I have some experience with it… basically it converts markup languages to an AST, and you can convert that AST to lots of things. It does not preserve everything.
E.g.: I write markdown with * as bullet points, but if I convert from markdown -> AST -> markdown it will be formatted a little differently.
I've been working on a side project to extend / modify said AST to be able to "insert" markdown into existing markdown.
I'm on mobile right now, half a cup of coffee in, and I've got a meeting in a few mins, but I can try and show an example command later if I remember.
edit: Work is over and I found an online pandoc tool
seanpuppy t1_j3305h6 wrote
Reply to HTML-to-Markdown converter that adaptively preserve HTML when needed (eg. when center-aligning, or resizing images) by EvitanRelta
Have you compared how pandoc does it?
seanpuppy t1_j3aeh8f wrote
Reply to comment by EvitanRelta in HTML-to-Markdown converter that adaptively preserve HTML when needed (eg. when center-aligning, or resizing images) by EvitanRelta
It most likely can't keep the HTML tag as HTML. An <h1> tag literally is a header with one # in markdown. Markdown was meant to be a more human-readable and human-writable form of HTML, one that translates directly to HTML.
For "inserting markdown" I wish I had a better example ready. I haven't open-sourced this thing yet (or finished it), but it started as an idea to make my existing note system more powerful and easier to use without actually opening a file.
All my day-to-day notes are in markdown, and let's say I use a template with something like below, so that I have a dedicated note file for every single day. (I use a cool VS Code extension called Dendron, which is similar to Obsidian for markdown notes.)
I want to be able to quickly jot down an idea or todo item, but I don't want to have to actually do the mental context switch of switching windows and finding the daily file. My work computer is a Mac laptop, and I've found Alfred to be a very powerful and flexible tool to do basically anything from any context.
So ideally I could have an Alfred command for "ideas" or "todos" etc... that would insert a string of text into my daily notes into the right spot. So in this case something like `inst $todo email joe about that thing` would insert "email joe about that thing" into a list block under a header tag called 'todo'
with an output like:
TODO
Ideas
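To make the insertion idea concrete, here is a minimal sketch that works on the raw markdown text rather than on an AST. The function name `insert_under_heading` and the sample note are made up for illustration; this is not the side project's code. It assumes ATX-style headings (`#`, `##`, ...) and does not skip `#` lines inside fenced code blocks.

```python
import re

# Matches ATX headings such as "# Title" or "## TODO".
HEADING_RE = re.compile(r"^(#{1,6})\s+(.*?)\s*$")


def insert_under_heading(markdown: str, heading: str, item: str) -> str:
    """Append "- item" to the section under the first heading named `heading`.

    Hypothetical helper: the match is case-insensitive, and the section is
    taken to end at the next heading of any level.
    """
    lines = markdown.split("\n")

    # Find the heading we want to insert under.
    start = None
    for i, line in enumerate(lines):
        m = HEADING_RE.match(line)
        if m and m.group(2).lower() == heading.lower():
            start = i
            break
    if start is None:
        raise ValueError(f"no heading named {heading!r}")

    # The section runs until the next heading or the end of the file.
    end = len(lines)
    for j in range(start + 1, len(lines)):
        if HEADING_RE.match(lines[j]):
            end = j
            break

    # Step back over trailing blank lines so the new item joins the list.
    insert_at = end
    while insert_at > start + 1 and lines[insert_at - 1].strip() == "":
        insert_at -= 1

    lines.insert(insert_at, f"- {item}")
    return "\n".join(lines)


note = "## TODO\n- buy milk\n\n## Ideas\n"
updated = insert_under_heading(note, "todo", "email joe about that thing")
print(updated)
```

A text-level approach like this is fine for a flat daily note, but it has no notion of which heading sits under which, and that gap is what pushes toward a tree.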
But that got me thinking: there's potential for a powerful, flexible system that converts markdown into a tree-like AST, which would let me reference different levels of the note, similar to how one could reference nested JSON.
So I started exploring pandoc, which converts all sorts of things into an AST (abstract syntax tree), which is almost what I want, except it's flat. No hierarchy except bulleted lists. To me, an <h2> below an <h1> is a second level in - BUT pandoc would treat it as a flat list of different markdown elements.
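To illustrate the flatness, here is a small hand-written stand-in for the `blocks` list found in pandoc's JSON output (`pandoc -t json`). The document content is invented; the `{"t": ..., "c": ...}` shape follows pandoc's JSON encoding, where a `Header` carries its level, attributes, and inline text. Note that the level-2 header is a sibling of the level-1 header, not a child of it.

```python
# Hand-written example of the top-level "blocks" list in pandoc's JSON AST.
# The content is made up; only the structure mirrors pandoc's output.
blocks = [
    {"t": "Header", "c": [1, ["notes", [], []], [{"t": "Str", "c": "Notes"}]]},
    {"t": "Para", "c": [{"t": "Str", "c": "intro"}]},
    {"t": "Header", "c": [2, ["todo", [], []], [{"t": "Str", "c": "TODO"}]]},
    {"t": "Para", "c": [{"t": "Str", "c": "item"}]},
]


def describe(block):
    """Return a short label for a block; headers also show their level."""
    if block["t"] == "Header":
        return f"Header(level={block['c'][0]})"
    return block["t"]


# Every block sits at the same depth in the list, whatever its header level.
summary = [describe(b) for b in blocks]
print(summary)
# → ['Header(level=1)', 'Para', 'Header(level=2)', 'Para']
```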
I started out trying to write a Python pandoc filter (see https://pandoc.org/filters.html) but realized its intended design couldn't do what I want. That doesn't really matter, though, as pandoc can read and write its own AST, so any Python script that reads in and spits out a compatible tree will work fine.
So I created a Python script that can handle SOME markdown aspects, turn them into a nested tree, and spit back out a flat tree, which can then be run through pandoc again to get markdown back out. Once I have that tree, I can start to design a syntax for specifying a part of the tree and the text I want to add, resulting in a modified nested tree that can still ultimately be converted back to markdown.
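Since the script itself isn't public, the following is only a guess at the general shape of the nest / flatten round trip, not the actual implementation. The names `nest`, `flatten`, `title`, and `find` are hypothetical, as is the path syntax (a list of header titles). It operates on the `blocks` list of pandoc's JSON AST and groups blocks under the nearest preceding header of a lower level.

```python
def nest(blocks):
    """Group a flat list of pandoc blocks into sections keyed by header level."""
    root = {"header": None, "level": 0, "children": []}
    stack = [root]
    for block in blocks:
        if block["t"] == "Header":
            level = block["c"][0]
            # Close any open sections at the same or a deeper level.
            while stack[-1]["level"] >= level:
                stack.pop()
            section = {"header": block, "level": level, "children": []}
            stack[-1]["children"].append(section)
            stack.append(section)
        else:
            stack[-1]["children"].append(block)
    return root


def flatten(section):
    """Turn a nested section back into the flat block list pandoc expects."""
    out = [] if section["header"] is None else [section["header"]]
    for child in section["children"]:
        if "children" in child:  # a nested section, not a raw pandoc block
            out.extend(flatten(child))
        else:
            out.append(child)
    return out


def title(header):
    """Plain-text title of a Header block (handles only Str and Space inlines)."""
    parts = []
    for inline in header["c"][2]:
        if inline["t"] == "Str":
            parts.append(inline["c"])
        elif inline["t"] == "Space":
            parts.append(" ")
    return "".join(parts)


def find(section, path):
    """Walk a list of header titles, e.g. ["Notes", "TODO"], to one section."""
    for name in path:
        for child in section["children"]:
            if "children" in child and title(child["header"]) == name:
                section = child
                break
        else:
            raise KeyError(name)
    return section


def header(level, text):
    return {"t": "Header", "c": [level, ["", [], []], [{"t": "Str", "c": text}]]}


def para(text):
    return {"t": "Para", "c": [{"t": "Str", "c": text}]}


blocks = [header(1, "Notes"), header(2, "TODO"), para("a"),
          header(2, "Ideas"), para("b")]

tree = nest(blocks)
round_trip = flatten(tree)  # same blocks, same order

# Add a block to Notes > TODO, then flatten for pandoc to render.
find(tree, ["Notes", "TODO"])["children"].append(para("new"))
modified = flatten(tree)
```

Because `flatten` walks the tree depth-first in document order, nesting and then flattening an unmodified tree gives back the original block list, so pandoc can render the result as before.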
Unfortunately I haven't open-sourced it yet. I haven't finished it, but realistically it's got enough functionality to be worth sharing as a WIP. I hope all this made sense; I'm not sure I've explained this project to anyone in this much detail yet.