EvitanRelta OP t1_j36l6kf wrote

Oh have I compared this to Pandoc?
No actually, I've never tried Pandoc before.

I'm not sure if Pandoc can be configured to do the same adaptive-preserving of HTML in markdown, like converting this HTML:

<h1><i>Italicised heading</i></h1>
<h1 align="center">
  <i>Centered italicised heading</i>
</h1>

to this markdown:

# _Italicised heading_

<h1 align="center">
  <i>Centered italicised heading</i>
</h1>

Anyone here with Pandoc experience who's tried this?
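
To make the "adaptive preserving" rule in the example above concrete, here is a minimal Python sketch. It assumes the heading has already been parsed into a level, an attribute dict, and its converted inner text. The helper name `heading_to_markdown` and its parameters are hypothetical and are not the converter library's actual API.

```python
def heading_to_markdown(level, attrs, inner_markdown, original_html):
    """Convert a heading to markdown only when nothing would be lost.

    Hypothetical helper for illustration; not the converter library's API.
    """
    if attrs:
        # Plain markdown headings cannot carry attributes such as
        # align="center", so keep the original HTML untouched.
        return original_html
    return "#" * level + " " + inner_markdown


plain = heading_to_markdown(
    1, {}, "_Italicised heading_", "<h1><i>Italicised heading</i></h1>"
)

centered_html = '<h1 align="center">\n  <i>Centered italicised heading</i>\n</h1>'
centered = heading_to_markdown(
    1, {"align": "center"}, "_Centered italicised heading_", centered_html
)

print(plain)
print(centered)
```

The first call yields a markdown heading, and the second returns the HTML unchanged because of the `align` attribute.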

3

seanpuppy t1_j37gis3 wrote

I have some experience with it… basically it converts markup languages to an AST, and you can convert that AST to lots of things. It does not preserve everything.

Eg: I write markdown with * as bullet points, but if I convert from markdown -> ast -> markdown it will be formatted a little differently

I've been working on a side project to extend / modify said AST to be able to “insert” markdown into existing markdown.

I'm on mobile right now, half a cup of coffee in, and I've got a meeting in a few mins, but I can try and show an example command later if I remember

edit: Work is over and I found an online pandoc tool

Pandoc convert from html to markdown with your example

1

EvitanRelta OP t1_j3ac48y wrote

Oh, in the link that u gave, the "align" attribute was transformed to:

{#centered-italicised-heading align="center"}

Is it possible to customise this to instead output it as an HTML tag attribute? like:

<h1 align="center">
  ...
</h1>

what do u mean by "insert markdown into existing markdown"? could u give an example?

1

seanpuppy t1_j3aeh8f wrote

It most likely can't keep the HTML tag as HTML. An <h1> tag literally is a header with one # in markdown. Markdown was meant to be a more human-readable and human-writable form of HTML, and it's meant to translate directly to HTML.

For "inserting markdown" I wish I had a better example ready, I haven't open-sourced this thing yet (or finished it) but it started as an idea to make using my existing note system more powerful / easy to use without actually opening a file.

All my notes from day to day are in markdown, and let's say I use a template with something like below, so that I have a dedicated note file for every single day. (I use a cool VS Code extension called Dendron, which is similar to Obsidian for markdown notes)

# TODO
* cure cancer
* take a nap

# Ideas
* turn water to wine

# Meetings

## Meeting 1
blah blah

## Meeting 2...
blah

I want to be able to quickly jot down an idea or todo item, but I don't want to have to actually do the mental context switch of switching windows and finding the daily file. My work computer is a Mac laptop, and I've found Alfred to be a very powerful and flexible tool to do basically anything from any context.

So ideally I could have an Alfred command for "ideas" or "todos" etc. that would insert a string of text into my daily notes at the right spot. So in this case something like `inst $todo email joe about that thing` would insert "email joe about that thing" into a list block under a header called 'todo'

with an output like:

# TODO
* cure cancer
* take a nap
* email joe about that thing

# Ideas
* turn water to wine

But that got me thinking, there's potential for a powerful / flexible system of converting markdown into a tree-like AST syntax that would let me reference different levels of the note, similar to how one could reference nested JSON.

So I started exploring pandoc, which converts all sorts of things into an AST (abstract syntax tree), which is almost what I want, except it's flat. No hierarchy except bulleted lists. To me, an <h2> below an <h1> is a second level in, BUT pandoc would treat it as a flat list of different markdown elements.
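
To illustrate the flat-versus-nested point, here is a minimal Python sketch that groups a flat list of blocks into sections by header level. It assumes blocks shaped like pandoc's JSON AST (a `Header` block stores its level as the first element of `c`). The helper names and the section dict layout are made up for this example and are not from the actual project.

```python
def to_inlines(text):
    """Build a simplified pandoc-style inline list (Str and Space only)."""
    out = []
    for word in text.split():
        if out:
            out.append({"t": "Space"})
        out.append({"t": "Str", "c": word})
    return out


def header(level, text):
    return {"t": "Header", "c": [level, ["", [], []], to_inlines(text)]}


def para(text):
    return {"t": "Para", "c": to_inlines(text)}


def nest(blocks):
    """Group a flat block list into a tree of sections keyed by header level."""
    root = {"header": None, "level": 0, "content": [], "children": []}
    stack = [root]
    for block in blocks:
        if block["t"] != "Header":
            # Non-header blocks belong to the most recently opened section.
            stack[-1]["content"].append(block)
            continue
        level = block["c"][0]
        # Close every open section at the same or a deeper level.
        while stack[-1]["level"] >= level:
            stack.pop()
        section = {"header": block, "level": level, "content": [], "children": []}
        stack[-1]["children"].append(section)
        stack.append(section)
    return root


def text_of(header_block):
    return "".join(i["c"] if i["t"] == "Str" else " " for i in header_block["c"][2])


def outline(node, depth=0):
    """Return indented header titles, one per line, to show the nesting."""
    lines = []
    for child in node["children"]:
        lines.append("  " * depth + text_of(child["header"]))
        lines.extend(outline(child, depth + 1))
    return lines


flat = [
    header(1, "Meetings"),
    header(2, "Meeting 1"),
    para("blah blah"),
    header(2, "Meeting 2"),
    para("blah"),
]
tree = nest(flat)
print("\n".join(outline(tree)))
```

With this layout, the two level-2 headers end up as children of "Meetings" instead of siblings in one flat list.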

I started out trying to write a Python pandoc filter (see https://pandoc.org/filters.html ) but realized its intended design couldn't do what I want. That doesn't really matter though, since pandoc can read and write its own AST, so any Python script that reads in and spits out a compatible tree will work fine.

So I created a Python script that can handle SOME markdown aspects, turn them into a nested tree, and spit back out a flat tree, which can then be run through pandoc again to get markdown back out. Once I have that tree, I can start to design a syntax for specifying a part of the tree and the text I want to add, resulting in a modified nested tree, which can still ultimately be converted back to markdown.
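
A rough Python sketch of that last step, under stated assumptions: the nested tree is a dict with `header`, `content`, and `children` keys (a layout invented for this example, not the real script's), blocks follow pandoc's JSON AST shapes, and a "path" is just a list of header titles. None of these helper names come from the actual project.

```python
def to_inlines(text):
    """Simplified pandoc-style inlines: Str and Space only."""
    out = []
    for word in text.split():
        if out:
            out.append({"t": "Space"})
        out.append({"t": "Str", "c": word})
    return out


def plain_text(inlines):
    return "".join(i["c"] if i["t"] == "Str" else " " for i in inlines)


def section(title, items):
    """Build one level-1 section holding a single bullet list."""
    bullets = [[{"t": "Plain", "c": to_inlines(item)}] for item in items]
    return {
        "header": {"t": "Header", "c": [1, ["", [], []], to_inlines(title)]},
        "content": [{"t": "BulletList", "c": bullets}],
        "children": [],
    }


def find_section(node, path):
    """Walk down the tree by header title (case-insensitive)."""
    for title in path:
        for child in node["children"]:
            if plain_text(child["header"]["c"][2]).lower() == title.lower():
                node = child
                break
        else:
            raise KeyError(title)
    return node


def add_bullet(sec, text):
    """Append to the section's last bullet list, creating one if needed."""
    item = [{"t": "Plain", "c": to_inlines(text)}]
    for block in reversed(sec["content"]):
        if block["t"] == "BulletList":
            block["c"].append(item)
            return
    sec["content"].append({"t": "BulletList", "c": [item]})


def flatten(node):
    """Turn the nested tree back into the flat block list pandoc expects."""
    blocks = [] if node["header"] is None else [node["header"]]
    blocks.extend(node["content"])
    for child in node["children"]:
        blocks.extend(flatten(child))
    return blocks


tree = {
    "header": None,
    "content": [],
    "children": [
        section("TODO", ["cure cancer", "take a nap"]),
        section("Ideas", ["turn water to wine"]),
    ],
}
add_bullet(find_section(tree, ["todo"]), "email joe about that thing")
flat = flatten(tree)
print([plain_text(item[0]["c"]) for item in flat[1]["c"]])
```

The flat list would then go into the `blocks` field of a pandoc JSON document and back through something like `pandoc -f json -t markdown` to get markdown out.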

Unfortunately I haven't open-sourced it yet. I haven't finished it, but realistically it's got enough functionality to be worth sharing as a WIP. I hope all this made sense, I'm not sure if I've explained this project to anyone in this much detail yet.

1

EvitanRelta OP t1_j463a01 wrote

sflr!

Then I guess my converter library has that edge over Pandoc. Specifically, this library can preserve the HTML better than Pandoc.

So, what I'm getting is that:

  • u want to make a tasklist app
  • that stores the notes in markdown
  • with a command-line function that inserts a string as a list item under a specific header

sounds like it can be done by just using a bash script to parse the markdown file, find the headers, and just insert the list item.

for the nested AST idea, I'm not sure what it'd be useful for.
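
A minimal sketch of that line-based approach, written in Python rather than bash for readability. It assumes headers are lines starting with `#` and list items start with `* ` or `- `, and it ignores edge cases such as `#` lines inside code blocks. The function name `insert_list_item` is made up for this example.

```python
def insert_list_item(markdown, header, item):
    """Insert `item` as a bullet under the first header whose title matches."""
    lines = markdown.splitlines()

    # Find the header line, ignoring the leading #'s and letter case.
    start = None
    for i, line in enumerate(lines):
        if line.startswith("#") and line.lstrip("#").strip().lower() == header.lower():
            start = i
            break
    if start is None:
        raise ValueError(f"header not found: {header}")

    # The section runs until the next header line or the end of the file.
    end = len(lines)
    for j in range(start + 1, len(lines)):
        if lines[j].startswith("#"):
            end = j
            break

    # Insert after the last bullet in the section, or right after the header.
    pos = start + 1
    for j in range(start + 1, end):
        if lines[j].startswith(("* ", "- ")):
            pos = j + 1

    lines.insert(pos, "* " + item)
    return "\n".join(lines) + "\n"


notes = "# TODO\n* cure cancer\n* take a nap\n\n# Ideas\n* turn water to wine\n"
updated = insert_list_item(notes, "todo", "email joe about that thing")
print(updated)
```

This treats every header level the same, so it works for the flat daily-note template but has no notion of one section being nested inside another.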

1