Submitted by aicharades t3_10vnkj8 in MachineLearning
aicharades OP t1_j7ihm0p wrote
Reply to comment by __lawless in [P] ChatGPT without size limits: upload any pdf and apply any prompt to it by aicharades
The Map prompt splits the text into chunks, then summarizes each chunk. There's no reduce step.
Here's more detail on how it works: https://langchain.readthedocs.io/en/latest/modules/chains/combine_docs.html
__lawless t1_j7ii2h1 wrote
I see. It is not a prompt per se; it is the analogue of a map operation in ETL.
aicharades OP t1_j7ii63v wrote
Exactly. The prompt is what the LangChain library uses to manage the text instructions for OpenAI.
AccidentBackground72 t1_j7kl42h wrote
Any chance you could explain how to use Step 1 a little more clearly? I understood the premise, but I'm not quite sure how that would translate to the instruction in Step 1. As an example, I'm trying to perform a content analysis of a document with 7 chapters and identify 10-15 core themes in each chapter.
aicharades OP t1_j7kpzzn wrote
Of course! Step 1 breaks up your document and runs the prompt on each section. Try it with the Map section vs. Map Reduce (the main page).
Here's an example flow for Map:
1. Input a Book PDF
2. Convert the PDF to Text
3. Split the Book into Chunks: Book[pg1,pg2,pg3] -> pg1, pg2, pg3
4. Run the Prompt on Each Chunk: pg1, pg2, pg3 -> prompt(pg1), prompt(pg2), etc.
5. Output the Summarized Chunks
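The Map flow above can be sketched in a few lines of Python. This is only an illustration under assumptions, not how the tool or LangChain actually implements it: `split_into_chunks`, `run_map`, and `echo_llm` are hypothetical names, chunking is by fixed character count rather than by page, and the model call is a stand-in function so the example runs without an API key.

```python
from typing import Callable, List


def split_into_chunks(text: str, chunk_size: int) -> List[str]:
    """Split text into consecutive pieces of at most chunk_size characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def run_map(chunks: List[str], prompt_template: str,
            llm: Callable[[str], str]) -> List[str]:
    """Fill the template with each chunk and send each one to the model separately."""
    return [llm(prompt_template.format(text=chunk)) for chunk in chunks]


def echo_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; returns the prompt uppercased."""
    return prompt.upper()


if __name__ == "__main__":
    # Steps 1-2 (PDF upload and PDF-to-text conversion) are assumed done already.
    book_text = "pg1 pg2 pg3 "
    chunks = split_into_chunks(book_text, chunk_size=4)   # step 3
    outputs = run_map(chunks, "Summarize: {text}", echo_llm)  # step 4
    for output in outputs:                                 # step 5
        print(output)
```

Swapping `echo_llm` for a function that calls a real model keeps the rest unchanged, since each chunk is processed independently and there is no reduce step.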
Here's a prompt you could use (lots of room for improvement!):
The words in <<*>> are comments; please remove them from the final prompt.
Goal: I'm trying to perform a content analysis of a document with 7 chapters and identify 10-15 core themes in each chapter.
Sample Map Prompt:
'INSTRUCTIONS': You are a writer <<BEST ROLE FIT??>> performing a content analysis of a document <<DOCUMENT TYPE??>>. You have been given a section of a larger document. You will identify 10-15 core themes in each chapter and output each theme.
'INPUT': {text}
'OUTPUT':
Sample Reduce Prompt:
'INSTRUCTIONS': You are a copyeditor. You will need to edit a list of summaries into one. Please combine the inputs and merge any duplicate core themes. Please maintain the context of the document.
'INPUT': {text}
'OUTPUT':
Sample Input: document
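To show how the two sample prompts fit together, here is a rough Python sketch of Map followed by Reduce. It is an assumption-laden illustration, not the tool's or LangChain's actual code: `map_reduce` is a hypothetical helper, the prompts are shortened versions of the samples above with the <<*>> comments removed, the reduce is a single pass over all mapped outputs joined by blank lines, and `llm` is any function that takes a prompt string and returns text.

```python
from typing import Callable, List

# Shortened from the sample prompts above, with the <<*>> comments removed.
MAP_PROMPT = (
    "'INSTRUCTIONS': You are a writer performing a content analysis of a "
    "document. You have been given a section of a larger document. "
    "You will identify 10-15 core themes and output each theme.\n"
    "'INPUT': {text}\n"
    "'OUTPUT':"
)

REDUCE_PROMPT = (
    "'INSTRUCTIONS': You are a copyeditor. Combine the inputs and merge any "
    "duplicate core themes. Maintain the context of the document.\n"
    "'INPUT': {text}\n"
    "'OUTPUT':"
)


def map_reduce(chunks: List[str], llm: Callable[[str], str],
               map_prompt: str = MAP_PROMPT,
               reduce_prompt: str = REDUCE_PROMPT) -> str:
    """Run the map prompt on every chunk, then one reduce prompt on the results."""
    # Map: one independent model call per chunk.
    mapped = [llm(map_prompt.format(text=chunk)) for chunk in chunks]
    # Reduce: a single call over all mapped outputs (assumes they fit in one prompt).
    combined = "\n\n".join(mapped)
    return llm(reduce_prompt.format(text=combined))


if __name__ == "__main__":
    def fake_llm(prompt: str) -> str:
        # Hypothetical stand-in for a real model call.
        return f"[model output for a {len(prompt)}-character prompt]"

    print(map_reduce(["chapter 1 text", "chapter 2 text"], fake_llm))
```

For the 7-chapter example, passing one chunk per chapter would give seven map calls and one reduce call; if the combined themes were too long for a single prompt, the reduce step would need to be split further.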
AccidentBackground72 t1_j7kub42 wrote
That's an incredibly helpful overview! For the kind of work I do this is a really awesome tool.
aicharades OP t1_j7kvuh1 wrote
It was really awesome to see how OpenAI handles all forms of text when I uploaded the DNC email file. It took the raw emails and created a narrative from them, pretty unreal.
You could do this with a bunch of other historical documents and create stories and chatbots and such.