Submitted by cm_34978 t3_100rbhp in MachineLearning
ypanagis t1_j2nkyk0 wrote
Reply to comment by cm_34978 in [D] Data cleaning techniques for PDF documents with semantically meaningful parts by cm_34978
I was about to propose the same. For those who are interested, this seems to work on macOS too, but Windows is definitely the go-to. A VBA script can also come in handy if you need to take several PDFs, open them in Word, and save them as TXT.
cm_34978 OP t1_j2nsi8g wrote
Definitely. With Windows, you get the advantage of the win32com library, whereas with macOS you need to play with AppleScript, which (in my hands) can be brittle and finicky.
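
For anyone who wants to try the win32com route, here is a minimal sketch of the "open each PDF in Word and save as TXT" loop. It assumes Windows, the pywin32 package, and a Word version that can open PDFs (2013 or later, as far as I know). The names txt_path_for and convert_pdfs are made up for illustration, and Word may still show its PDF conversion prompt on some setups even with alerts turned off.

```python
from pathlib import Path

WD_FORMAT_TEXT = 2  # Word's WdSaveFormat value for plain text (wdFormatText)


def txt_path_for(pdf_path: Path) -> Path:
    """Return the .txt path that sits next to the given PDF (hypothetical helper)."""
    return pdf_path.with_suffix(".txt")


def convert_pdfs(folder: str) -> list[Path]:
    """Open every PDF in `folder` with Word and save it as plain text.

    Assumption: Windows with Word installed and pywin32 available.
    """
    # Imported here so the helper above still works on non-Windows machines.
    import win32com.client  # provided by the pywin32 package

    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    word.DisplayAlerts = 0  # wdAlertsNone; may not suppress every dialog
    written = []
    try:
        # Word wants absolute paths, hence resolve().
        for pdf in sorted(Path(folder).resolve().glob("*.pdf")):
            # Positional arguments: FileName, ConfirmConversions, ReadOnly
            doc = word.Documents.Open(str(pdf), False, True)
            try:
                out = txt_path_for(pdf)
                doc.SaveAs2(str(out), WD_FORMAT_TEXT)
                written.append(out)
            finally:
                doc.Close(False)  # close without saving changes to the source
    finally:
        word.Quit()
    return written
```

Calling convert_pdfs on a folder of PDFs should leave one .txt file next to each PDF. How well the layout and the semantically meaningful parts survive depends on Word's PDF conversion, so expect to spot-check the output.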