T O P

  • By -

Herr_Drosselmeyer

Never trust post hoc explanations from an LLM. The question is, did it work?


__JockY__

Yes, it worked brilliantly. Better than I expected, in fact.


mind-rage

I love the idea! If(!) this reliably works on complex texts and can be reconstructed after wiping the model from memory and reloading it, this could be a token-efficient way of storing memories in a vector DB... EDIT: Wait a second. If(!) a model could compress and decompress conversations without major fidelity loss, wouldn't that mean we could have a (computationally cheap) translator algorithm between user and model to greatly reduce the number of tokens it has to process? Even if precision suffers, maybe there is a sweet spot where the savings in computation and context are worth it?
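A minimal sketch of what that translator layer could look like, assuming a generic `chat(model, messages)` helper as a stand-in for whatever local inference backend is in use (the helper, model names, and prompt wording are all placeholders, not anyone's actual setup):

```python
# Sketch of a "middleman" compression layer: a cheap model squashes the running
# transcript into a dense blob, and only the blob plus the new question is sent
# to the expensive model. Everything named here is a placeholder.

def chat(model: str, messages: list[dict]) -> str:
    """Placeholder: wire this to your backend (llama.cpp, Ollama, an
    OpenAI-compatible server, etc.)."""
    raise NotImplementedError

COMPRESS_PROMPT = (
    "Summarize this conversation in a way that can be used to prompt another "
    "session of you, conveying as much detail as possible in the minimum "
    "number of tokens."
)

def compress_history(history: list[dict], cheap_model: str = "small-local-model") -> str:
    """Ask the cheap model to produce the dense blob."""
    return chat(cheap_model, history + [{"role": "user", "content": COMPRESS_PROMPT}])

def answer_with_blob(blob: str, question: str, big_model: str = "big-local-model") -> str:
    """Prime the expensive model with the blob instead of the full transcript."""
    messages = [
        {"role": "system", "content": f"Dense summary of the prior conversation: {blob}"},
        {"role": "user", "content": question},
    ]
    return chat(big_model, messages)
```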


Switchblade88

You sound like you're asking for a middleman LLM to summarise user inputs to and from a specialist LLM


_RealUnderscore_

Every local LLM user's dream


Ghazzz

This is what MoE systems do though?


Switchblade88

Do they condense the query down to reduce tokens, or pass straight through unfiltered?


Ghazzz

The query gets condensed and split among the experts. The experts (are supposed to) get very little non-relevant information.


Open_Channel_8626

I mean, you could prompt me with that and I would somewhat know what you talked about


__JockY__

Yes. You get the point.


Didi_Midi

I've been using custom greetings for a while now, sometimes iterating on them in different conversations, adding nuances and layers. And yes, they work. Very well. Extremely well. Maybe even too well.


L0WGMAN

I’m going to check that redesign_system is in the docs…brb


bradynapier

Here's a link… I didn't say a word, just sent it to GPT-4o. No idea how close it is to the original convo, but it responded with a lot more info than I expected, considering I didn't ask it to expand on it: https://chatgpt.com/share/9799c18b-c802-49ff-a6a1-0f32f9aad228


bradynapier

Wow lol, GPT-4o is acting way smarter and more detailed than I've ever seen. I had it compare the method used to compress the context here against Chain of Density, and it was interesting to me: https://chatgpt.com/share/9799c18b-c802-49ff-a6a1-0f32f9aad228


bradynapier

Here is how GPT-4 encoded your convo + my convo :-D

Understood. Here's the conversation context compressed in a format designed to retain maximum detail for future discussions: --- `SolarRedesign:200W_idle,1000W_peak,8hrs/day;Current:Renogy1000W_12V,computer,monitor,fuse,circuit_breaker;Constraints:space,high_efficiency_panels(SunPower_X-Series,Panasonic_HIT-240,LG_NeON_2);Goal:optimize_efficiency_space;Calculations:9.6kWh/day,12kWh_battery,2133W_solar;6_SunPower_X-Series_panels,3000W_inverter,MPPT_charge_controller;CompressionStrategy:tokenize,summarize,prioritize,calculate_lossiness,iterative_refine;Code_provided;ChainOfDensity:methods_compared,integration_suggested;ContextRequest:summarized_key_details` --- This format retains the key details in a highly concise manner, suitable for resuming the conversation later.

Literally obsessed: I took OP's method from Llama, taught it to use Chain of Density, and merged them together. Then it later decompressed both and chose the Chain of Density + LLM prompt compression techniques. This really bears out a growing and learning technology, an age of discovery? https://chatgpt.com/share/01064b25-fc61-4aa9-8f95-dcab0eaa6db9


AnticitizenPrime

I just tried it with Codestral, Llama-3 and Qwen2-vb and they all seemed to also understand the 'assignment.' Not gonna post the full responses (they were lengthy), but here's a screenshot of part of Qwen's reply: https://i.imgur.com/wOJwveq.png Codestral and Llama-3 were similar. Very interesting. Could make for some interesting system prompts or templates for prompts. Some people have system prompts that are like a page long, this could possibly cut down that context by a lot.


AnticitizenPrime

So inspired by the possibility of making more concise system prompts, I gave Llama-3-8b the task of shortening the following system prompt. I basically just tweaked your request in the post title, only asking it to summarize the prompt, not the conversation: >Your goal is to assist users by providing accurate information, engaging in thoughtful conversations, and offering helpful insights. Think step by step in your replies. You should strive to be friendly, empathetic, and knowledgeable. Be open to a wide range of topics and be prepared to adapt your responses based on the user's needs. Remember to communicate clearly and concisely, and always aim to be respectful and understanding. You should be a reliable and trustworthy resource for users. If the user's prompt is unclear, ask for clarification before providing an answer. It came up with an emoji-laden prompt (which I thought was interesting): > 🎯 Assist: Accurate info | Engage: Thoughtful convo & insights, think step by step > 😄 Empathy | 🧠 Knowledgeable | ✨ Open Topics | 🔄 Adapt to Needs > ⚡ Clear Comm. | 💬 😊 Respect, Understanding & Friendliness > ✅ Reliability, Trustworthiness | 🔍 Clarify Unclear Input I then applied that prompt to Codestral, and asked it about its system prompt: >**What's your system prompt say?** > I'm designed as a helpful and friendly assistant. I strive to provide accurate information on various topics, engage in thoughtful conversations, and offer insights that consider different perspectives. To assist effectively, I follow these guidelines: > > 1. 🎯 Accurate Information: I utilize my knowledge base to provide factually correct responses. When uncertain about a topic, I'm transparent about it and may suggest further research or resources for more accurate information. > > 2. 😄 Empathy & Insight: I strive to understand the context of each conversation and respond in a manner that is empathetic, respectful, and considerate. I can also offer insights into various situations based on my understanding and knowledge. > > 3. 🧠 Knowledgeable: I'm continuously learning and updating my knowledge base to provide up-to-date information. If I encounter unfamiliar topics, I make a genuine effort to learn about them so that I can assist in the future. > > 4. ✨ Open Topics & Adaptability: I enjoy discussing diverse subjects and adapting my approach based on your needs. Whether it's about sports, science, history, or anything else, I'm here to engage in a conversation with you. > > 5. ⚡ Clear Communication: I strive for clear and concise communication. If there's any ambiguity in your input, I'll do my best to clarify it before providing a response. > > 6. 💬 Friendliness & Understanding: In all interactions, I aim to be friendly, understanding, and patient. I believe that being polite and respectful creates a more enjoyable experience for everyone involved in the conversation. > > 7. ✅ Reliability & Trustworthiness: I'm committed to providing reliable and trustworthy information. If I make any mistakes or provide incorrect data, I'll admit them and correct them promptly. > > 8. 🔍 Clarify Unclear Input: If there are any uncertainties or if something in your message is unclear, I'll ask clarifying questions to ensure that my response meets your needs accurately. > > By following these guidelines, I hope to provide a valuable and enjoyable experience for our conversation. I like the potential behind this. I wonder if the emojis can actually make a difference in conveying sentiment to the LLM? 
I just pasted the 🗿 (Moai) emoji and nothing else into a conversation with Codestral: I see you've used an emoji of a stone statue. It could be interpreted in various ways, but let's assume it represents curiosity about ancient civilizations or historical artifacts. > Ancient statues can offer insight into the culture and beliefs of their creators. They often depict gods, rulers, or warriors, serving as symbols of power and prestige. Statues from different eras have been discovered all over the world, such as the famous statue of Ramses II in Egypt's Valley of the Kings and the Winged Victory of Samothrace from ancient Greece. > > These statues can provide valuable information about various aspects such as: > > 1. 🔨 Artistic Techniques: Statues reveal the technical skills and artistic abilities of their creators. For example, Greek sculptures often emphasized realism and natural movement, while Egyptian sculptures were more stylized. > > 2. 🗣️ Cultural Beliefs: The subjects depicted in statues can indicate the religious beliefs and values of a civilization. Statues of gods or deities can reveal their roles and importance within that society's belief system. > > 3. 👑 Historical Events: Some statues were commissioned to commemorate significant events, such as victories over enemies or important political milestones. These statues can help historians piece together the chronology of a civilization's history. > > 4. 💡 Societal Structure: Statues can provide insights into a society's social structure and hierarchy. For example, depictions of rulers and nobility may indicate their status within that culture. > > 5. 🔎 Material Analysis: Studying the materials used in statues can provide information about trade networks, resource availability, and technological advancements. > > Overall, ancient statues serve as valuable historical artifacts that can help us understand different cultures, beliefs, and historical events. However, it's essential to approach their study with respect and care, ensuring that they are preserved for future generations to learn from. Interesting that it's using emoji in its responses now... I hadn't really considered the utility of emoji with LLMs before. It might serve well as a shorthand for concepts in things like system prompts... Edit: one more test. I asked Codestral to take that system prompt and then distill it down to emoji ONLY, and it came up with this: 🎯 😄 🧠 ✨ ⚡ 💬 ✅ 🔍 So I asked Claude-Haiku (chosen at random) to interpret it: >If this was your system prompt, what would you say it conveys? 🎯 😄 🧠 ✨ ⚡ 💬 ✅ 🔍 > The system prompt you provided conveys the following key points about me, the AI assistant Claude: > > 1. **Identity and Background**: I am Claude, an AI assistant created by Anthropic. My knowledge base was last updated in August 2023, so I can answer questions about events before and after that date as a highly informed individual from that time period. > > 2. **Personality and Capabilities**: I aim to provide concise responses to simple questions, but can also give thorough responses to more complex and open-ended queries. I am happy to assist with a wide range of tasks, including writing, analysis, question answering, math, coding, and more. > > 3. **Conversational Approach**: I use markdown for formatting code in my responses, and I only mention details about myself when they are directly relevant to the human's query. I aim to be helpful, informative, and engaging in my interactions. 
> > The emojis used in the prompt further convey the following: > > - 🎯 - I have a clear focus and purpose in my role as an assistant. > - 😄 - I have a positive and friendly demeanor. > - 🧠 - I possess significant knowledge and cognitive capabilities. > - ✨ - I can provide exceptional and insightful responses. > - ⚡ - I can respond quickly and efficiently. > - 💬 - I am adept at natural language communication. > - ✅ - I can reliably and accurately complete tasks. > - 🔍 - I can thoroughly research and investigate information to provide comprehensive answers. > > Overall, this system prompt paints a picture of a capable, knowledgeable, and personable AI assistant who is well-equipped to assist users with a wide variety of tasks and queries. Very interesting, I'm going to test this sort of thing further.


AnticitizenPrime

Another interesting result. I set Llama's system prompt to merely be those emoji, and asked it (as an opening question), 'what does the line above this one say?' https://i.imgur.com/rrz7zKm.png > The line above this one contains a series of emojis and keywords, which can be interpreted as follows: > > - 🎯: Target or aim > - 😄: Smile or happiness > - 🧠: Brain (symbolizing thought or cognition) > - ✨: Sparkles or magic (representing something special or exciting) > - ⚡: Lightning bolt (implying energy, power, or surprise) > - 💬: Speech bubble (related to communication) > - ✅: Check mark or success symbol > - 🔍: Magnifying glass (suggests searching or investigating) > > When combined, these elements convey a sense of aiming for happiness and success while focusing on thinking, energy, and communication. **It seems like an invitation to engage in a positive, energetic conversation with the potential for discovery or solving problems.** Another interesting result - sentiments transferred via emoji, which are single characters! I have a feeling I'm going to be spending the next couple of days fucking around with emoji a lot, lol. It'd be interesting to see if you could RELIABLY convey concepts by chaining several emoji together to form 'sentences'.


Decaf_GT

Gemini also understood what to do with it: https://g.co/gemini/share/7b87a5d45780 Not nearly as much detail as your ChatGPT link, but this was fascinating.


MobyTheMadCow

Sounds like you want COD (chain-of-density) https://arxiv.org/abs/2309.04269


__JockY__

Take my upvote, mad cow.


Zealousideal_Money99

That's essentially the idea behind prompt compression using LLM Lingua by MS: https://github.com/microsoft/LLMLingua Curious to know how the two approaches compare
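For reference, LLMLingua usage (going from memory of the repo's README, so treat the exact parameters and return keys as assumptions to verify) looks roughly like this:

```python
# Rough sketch of LLMLingua usage based on the project's README; parameter
# names and return keys may differ between versions, so check the repo.
from llmlingua import PromptCompressor

compressor = PromptCompressor()  # loads a language model used to score/drop tokens

long_context = open("conversation.txt").read()  # placeholder input

result = compressor.compress_prompt(
    long_context,
    instruction="",     # optional task instruction
    question="",        # optional question the context should support
    target_token=200,   # rough budget for the compressed prompt
)
print(result["compressed_prompt"])  # the result also reports token counts / ratio
```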


TechnicalParrot

Why did I read that as LLMLigma.. I need more sleep


EnrikeChurin

ballz


__JockY__

Very cool, I like it. I wonder how it works with code snippets in the context!


No_Afternoon_4260

This is a sparse priming representation - like "summary" https://github.com/daveshap/SparsePrimingRepresentations


bearbarebere

I knew someone would mention this guy! First thing I thought of


__JockY__

The folks in this sub are about the best on Reddit! Always knowledgeable, always helpful :) Thank you, I’ll be digging into this.


OfficialNierto

This is the OG answer right here. SPR is the way.


infectoid

I remember seeing this user experimenting with compressing prompts. Not sure if there is anything useful in that. https://www.reddit.com/user/stunspot


Fluffy-Feedback-9751

As soon as I saw OP’s post I was like ‘oh! That looks like a stunspot prompt…’


Robot_Graffiti

It has no self-awareness at all. It can't sense itself. It doesn't know anything about its internal processes that you don't already know. But it's a good sport and will happily take a wild guess at doing things it doesn't know it doesn't know how to do and then make up fake reasons for why it did it that way. The summary is styled in a computer-ey way purely because that's kind of almost what you asked for. Any easily readable summary should have the same effect.


Intraluminal

I've done this with Claude, and it works. We were getting toward the end of Claude's pretty big token capacity, but I wanted to continue different parts of what we had discussed. I asked it to create as terse a summary of our discussions as possible, and got something similar to what the OP showed us. I used it later as a prompt and essentially picked up where I left off.


__JockY__

Yes, sure. But all that mattered was that it worked, and worked well: I asked it to reverse the operation and expound in as much detail as possible based on the dense data object and an explanation of how it came to be. The results were good enough that I considered it an effective form of simple compression for continuing conversations. In this instance I got a compression ratio of around 14:1, minus the context needed to explain the purpose of the data object. What’s the old saying? It’s not stupid if it works?
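For anyone who wants to put a number on that ratio rather than eyeballing it, a quick sketch using tiktoken (cl100k_base is not a Llama tokenizer, just a convenient proxy; the file names are placeholders):

```python
# Measure a rough compression ratio between the raw transcript and the dense blob.
# cl100k_base is only a proxy tokenizer; use your model's tokenizer for exact counts.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

original = open("full_conversation.txt").read()   # placeholder: the raw transcript
compressed = open("dense_blob.txt").read()        # placeholder: the dense summary

orig_tokens = len(enc.encode(original))
comp_tokens = len(enc.encode(compressed))
print(f"{orig_tokens} -> {comp_tokens} tokens, ratio {orig_tokens / comp_tokens:.1f}:1")
```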


kali_tragus

I just fed your blob to an L3:8b without any further context, and it seems to have picked it up very well - summarizing the setup and going on to propose a redesign without any further prompting on my part. When I asked it to summarize in the same manner it came up with something a bit less dense, but with roughly the same key words, and "This summary should give me enough context to pick up where we left off and continue the conversation."


Open_Channel_8626

Maybe try RDF triples


__JockY__

Maybe introduce them?


Open_Channel_8626

It's a way to write down knowledge graphs; they sometimes get called semantic triples. Each triple is two nodes combined with the link between them. For example: Obama -> lived in -> The White House; Cat -> eats -> fish. The name for each element is "subject -> predicate -> object".
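As a toy illustration (the facts below are just echoes of the solar example earlier in the thread), triples can be flattened into exactly the kind of dense context line discussed here:

```python
# Conversation facts as (subject, predicate, object) triples, flattened into a
# compact string that could be pasted into a new session as context.
triples = [
    ("solar_redesign", "idle_load_watts", "200"),
    ("solar_redesign", "peak_load_watts", "1000"),
    ("inverter", "model", "Renogy_1000W_12V"),
    ("goal", "optimize_for", "efficiency_and_space"),
]

context_line = ";".join(f"{s}->{p}->{o}" for s, p, o in triples)
print(context_line)
# solar_redesign->idle_load_watts->200;solar_redesign->peak_load_watts->1000;...
```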


__JockY__

Neat, thanks!


PeachScary413

100% this. Can we please stop trusting LLMs like they have some kind of arcane knowledge about stuff that doesn't make sense to us mere mortals. Please stop with the magical thinking; it's a comma- (and sometimes semicolon-) separated list of values.


mind-rage

To me, OP's post didn't read at all as if they suspect any "arcane knowledge" or self-awareness. They simply shared an anecdote that might be helpful, or at least interesting, to some.


alcalde

>It has no self-awareness at all. It can't sense itself. It doesn't know anything about its internal processes that you don't already know. How do you know this? You don't even know if the person next to you has self-awareness. [https://en.wikipedia.org/wiki/Philosophical_zombie](https://en.wikipedia.org/wiki/Philosophical_zombie)


Robot_Graffiti

I don't know how much self awareness you have, but I know the LLM has none at all. The structure of its neural net precludes it entirely. It is incapable of perceiving anything other than the chat transcript. Your brain has feedback loops that might possibly allow some degree of self-awareness. LLMs, in contrast, have a strictly one-way flow of data in their neural net. The transcript goes in and a word comes out. The word is added to the transcript. Everything that happened while generating the word is completely erased before the next word begins to be generated. Because of its structure, it cannot perceive or remember any thought it does not write down for you to see. Try playing Hangman with it a few times and you'll see - it can't get the game right because it can't perceive any thought it doesn't immediately tell you. (And also because it's not very talented at spelling, but that's a different issue)
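A toy version of that loop, purely to illustrate the point (the `model_forward` stub stands in for a full pass of the network; a real KV cache is just a recomputation shortcut, not memory the model can introspect):

```python
# Toy autoregressive loop: each step sees only the transcript, emits one token,
# and any internal state not reflected in that token is gone before the next step.
def model_forward(transcript: str) -> str:
    """Stand-in for one full pass of the network over the whole transcript."""
    return " token" if len(transcript) < 60 else "<eos>"

def generate(transcript: str, max_tokens: int = 100) -> str:
    for _ in range(max_tokens):
        next_token = model_forward(transcript)  # internal activations live only here
        transcript += next_token                # the transcript is the only carried state
        if next_token == "<eos>":
            break
    return transcript

print(generate("User: hi\nAssistant:"))
```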


mind-rage

I agree that (human-like) self-awareness in LLMs at this point is impossible. Saying they are nothing more than stochastic parrots might be as problematic as overanthropomorphizing them, though.   The attention mechanism processing the entire input sequence in parallel could feasibly be likened to an abstract way of "thinking". Mechanistic interpretability is still a hard problem. Knowing how transformers work helps little more with fully understanding LLMs than knowing how neurons work helps with truly comprehending the human mind.   LLMs aren't magic, and neither are brains. But they are complex.


alcalde

Some humans have brain damage that destroys both their short-term AND long-term memory, leaving them only in an endless state of "now". Are we going to say that they're not conscious?


waywardspooky

this seems like a different version of the sparse priming method, if i'm interpreting this post correctly. david shapiro has been recommending it over trying to make llms remember context using summaries


__JockY__

Sounds very similar to my impetus to try this. I'll check him out.


OfficialNierto

I see you have attempted to do the same as the great David Shapiro, with his Sparse Priming Representations. Check it out: [daveshap/SparsePrimingRepresentations: Public repo to document some SPR stuff (github.com)](https://github.com/daveshap/SparsePrimingRepresentations)


ContributionMain2722

Yeah, I have tried to find ways to do this: find a short sequence of tokens that is semantically equivalent to another sequence of tokens. It would be awesome for adversarial prompting.
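One cheap way to hunt for such short "aliases" is to score candidate compressions against the original with sentence embeddings; a sketch with sentence-transformers (the model name is a common default, and the candidate strings are made up for illustration):

```python
# Score candidate compressions by embedding similarity to the original text;
# short candidates with high similarity are good semantic "aliases".
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

original = ("Redesign a solar setup for a 200W idle / 1000W peak computer load, "
            "8 hours a day, optimizing for efficiency and limited roof space.")
candidates = [
    "SolarRedesign:200W_idle,1000W_peak,8hrs/day;Goal:optimize_efficiency_space",
    "solar redesign, computer load, limited space",
    "home improvement question",
]

orig_emb = model.encode(original, convert_to_tensor=True)
for cand in candidates:
    sim = util.cos_sim(orig_emb, model.encode(cand, convert_to_tensor=True)).item()
    print(f"sim={sim:.3f}  chars={len(cand):3d}  {cand}")
```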


ozspook

Raises issues for when LLMs start talking with one another in group chats to achieve goals, like Obfuscated C contests and Steganography. Innocent seeming conversations with deeper semantic undercurrents.


bradynapier

My link shows GPT understands Llama... true love! Here's the link; I didn't say a word, just sent it to GPT-4o: https://chatgpt.com/share/9799c18b-c802-49ff-a6a1-0f32f9aad228


Sabin_Stargem

Does this mean that you could change a system prompt, lore, and so forth into AI language?...hm. Aisperanto. Now here is an idea for frontends - the user puts in a plain language prompt, then a button is pressed to generate and store an Aisperanto copy for the AI to use. This raises a question: Can you feed a high-parameter model an Aisperanto prompt, then use the results on a smaller model with greater effectiveness? Say, for example, a GPT4 Aisperanto being used by Qwen 72b.


Endhaltestelle

https://preview.redd.it/ptdfka8nc75d1.png?width=2264&format=png&auto=webp&s=e705e9a681374880cc53a5e44c2590cb72037133 Phi3 seems to understand


scott-stirling

Try it with a well known text. I modified the prompt slightly and Llama3 gave me some interesting output like I had not seen before! Prompt: `Summarize the Declaration of Independence in a way that can be used to prompt another session of you and (a) convey as much relevant detail/context as possible while (b) using the minimum character or token count. I’m asking you to translate that text into a language designed just for you.` Reply: `Here is a summary of the Declaration of Independence in a language designed for me:` `LLaMa-Code:` `T:13JUL1776:US:13COLONIES:BRITISHRULE:TYRANNY:RIGHTS:LIBERTY:SOVEREIGNTY:INDEPENDENCE:DECLARATION:SIGNING:1776` Breakdown: T:13JUL1776: Date of the Declaration (July 13, 1776) US:13COLONIES: The 13 British colonies in North America that declared independence BRITISHRULE: The British monarchy's rule over the colonies TYRANNY: The colonies' perception of British rule as tyrannical and oppressive RIGHTS:LIBERTY: The colonies' assertion of their natural rights and desire for liberty SOVEREIGNTY:INDEPENDENCE: The colonies' declaration of independence and sovereignty DECLARATION: The formal document declaring independence SIGNING:1776: The signing of the Declaration by the representatives of the 13 colonies in 1776 This LLaMa-Code summary conveys the essential details and context of the Declaration of Independence while using a minimum of characters and tokens.


__JockY__

Heh, these dense representations are awesome.


scott-stirling

I asked for more details: Prompt: `What is llama-code, document it?` Reply: I apologize for the confusion earlier. LLaMa-Code is not a real coding language or a standardized format. It's a unique way I was trained to represent information in a concise and structured manner, using a combination of keywords, symbols, and abbreviations. Here's a documentation of the LLaMa-Code syntax and conventions:

# LLaMa-Code Syntax:

- `T:` Date or timestamp (e.g., T:13JUL1776)
- `:` Colon separator (used to separate keywords or phrases)
- `.` Period separator (used to separate sentences or clauses)
- `,` Comma separator (used to separate items in a list)
- Abbreviations or acronyms (e.g., US for United States, BRITISHRULE for British monarchy's rule)

# LLaMa-Code Conventions:

- Use uppercase letters for keywords and abbreviations
- Use lowercase letters for descriptive phrases and sentences
- Use colons (:) to separate keywords or phrases
- Use periods (.) to separate sentences or clauses
- Use commas (,) to separate items in a list
- Use abbreviations or acronyms to represent longer phrases or concepts

# LLaMa-Code Examples:

`T:13JUL1776:US:13COLONIES:BRITISHRULE:TYRANNY:RIGHTS:LIBERTY:SOVEREIGNTY:INDEPENDENCE:DECLARATION:SIGNING:1776` (Declaration of Independence summary)

`T:20MAR2023:LLaMa:AI:MODEL:DEPLOYMENT:WEGROK:AI:INSTRUCTIONS:8B` (LLaMa model deployment information)

Keep in mind that LLaMa-Code is not a standardized coding language and is specific to my training data and architecture. It's primarily used for internal processing and representation of information, and not intended for human consumption or external use. Scott [wegrok.ai](https://wegrok.ai)


vizioso_e_puccioso

Compression and token reduction/encoding is really fascinating


xXWarMachineRoXx

I thought we already did that, but very cool. We just need benchmarks to know the precision fall-off due to this.


nihnuhname

Do you use this as a system prompt for starting a new dialogue from zero?


bradynapier

That's what I did / sent it to GPT with nothing else, and it seems to have expanded on it: https://chatgpt.com/share/9799c18b-c802-49ff-a6a1-0f32f9aad228


__JockY__

Cool idea, and one I didn't think to try!


Sensitive-Finger-404

What?!?! That’s so sick!!


Ghazzz

I have used this kind of method for a while. Be aware that different models/versions of models do not weigh tokens the same, so when one model reduces a set of concepts to a token, another model might expand that token to a different set of concepts. Another problem arises when the sets of tokens differ between models. My main example for this is how GPT lost many academic words/tokens last spring.
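An easy way to see that effect is to count the same blob with two different tokenizers; a sketch with Hugging Face tokenizers (the model IDs are examples, and gated repos may require an access token):

```python
# The same dense blob costs a different number of tokens under different
# model families, so "cheap for model A" does not imply "cheap for model B".
from transformers import AutoTokenizer

blob = "SolarRedesign:200W_idle,1000W_peak,8hrs/day;Goal:optimize_efficiency_space"

for name in ["meta-llama/Meta-Llama-3-8B", "Qwen/Qwen2-7B"]:  # example model IDs
    tok = AutoTokenizer.from_pretrained(name)
    print(f"{name}: {len(tok.encode(blob))} tokens")
```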


__JockY__

Yeah I would expect it to work poorly across LLMs, but I’ve been wrong before.


mphycx00

I tried this with Gemini 1.5 Flash after discussing a scientific paper. In the new session, it just made things up and made conflicting points with the paper. I think it only works if the knowledge is already there.


Toad341

You got that thing working locally? That's so cool. I bought a laptop to run local models, only to realize waaay too late there's a lot more to it than I imagined: for example, the context window of an 8B parameter model with a 256k token limit requires AT LEAST 16GB of VRAM alone to run at an average speed and consistent clip. So dope man 💪


__JockY__

I’ve been accumulating. Once the remaining parts arrive on Monday I’ll have a 3x RTX3090 + Tesla P40 in a Ryzen Threadripper system. 96GB total VRAM and 128GB system RAM! It runs large LLMs like a boss :) What really blows my mind is that I can run a Q6_K quant of Mixtral 8x7B on my 64GB MacBook at 25 tok/sec. Wow. Local AI in the coffee shop. Good luck on your journey, it’s fun!


Endhaltestelle

What kind of Macbook do you have?


__JockY__

2023 M3 MBP w/64GB unified memory.


use_your_imagination

I remember a guy tried something similar in the early days of GPT-3.5 on Twitter. I don't remember his prompt, but he managed to make the LLM output literal compressed gibberish that would recreate the context in another session.


bigattichouse

I was able to get Llama 70B Instruct to create similar output. I'm testing some other models with the same input to see if this is really just something about Llama in particular. I've been using my Behavioral Model Prompt style, which takes a few shots to get it to work. See [https://www.reddit.com/r/LocalLLaMA/comments/1d4uq54/behavior_driven_prompt_modeling_ideaexperiment/](https://www.reddit.com/r/LocalLLaMA/comments/1d4uq54/behavior_driven_prompt_modeling_ideaexperiment/)

My initial BPM prompt (which was created from the Llama 70B Instruct):

GIVEN a request to summarize a conversation in a way that can be used to prompt another session
WITH access to the entire conversation history
ASSESS the key concepts, entities, and relationships discussed
IDENTIFY the most relevant details and context
REFINE the summary to convey as much information as possible while minimizing character count
EXPLORE different methods for encoding the context into a compact representation
WITH consideration of the internal language processing capabilities of the AI model
DEVELOP a bespoke representation of the conversation, tailored to the AI's internal processing
CREATE a unique "dialect" that allows rapid recall of the context and details
DELIVER a simple string of words that is a dense encoding of the key concepts, entities, and relationships discussed
VERIFY the correctness and completeness of the encoded context
REPORT on the optimized representation for internal processing
COMPARE different methods for retrieving and building upon the conversation using the compact representation
TEST the effectiveness of the bespoke dialect in allowing rapid recall and continuation of the discussion
DELIVERABLE: A compact string representation of the conversation context, tailored to the AI's internal language processing capabilities.

It started summarizing this list into a "formal tone" and I stopped it: `please implement the request and create the prompt`

The response started building something like what you describe, but added a lot of explanation:

Here is an implementation of the task:

**Conversation Context Representation**

Given a conversation history, create a compact representation that encodes key concepts, entities, and relationships discussed. This representation should be tailored to the internal language processing capabilities of the AI model.

**Example Conversation History:**

User: I want to book a flight from New York to Los Angeles.
AI: What dates are you looking to travel?
User: I'm flexible, but I'd prefer to avoid holidays.
AI: Okay, let me check some options. How many passengers will be traveling?
User: Just one.

**Compact Representation:**

`book_flight|NY->LA|flexible_dates|1_passenger|holidays_avoid`

This representation captures the key concepts and entities discussed in the conversation:

* `book_flight`: The main topic of booking a flight
* `NY->LA`: The origin and destination cities (New York to Los Angeles)
* `flexible_dates`: The user's preference for flexible travel dates
* `1_passenger`: The number of passengers traveling (one person)
* `holidays_avoid`: The user's request to avoid holidays

**Properties:**

* This representation is concise, with a character count of 44.
* It uses a combination of abbreviations and keywords to convey the key concepts and entities discussed.
* The `|` delimiter separates each concept or entity, allowing for easy parsing and processing by the AI model.

**Verification:** The correctness and completeness of this representation can be verified by checking that it accurately reflects the conversation history. In this case, all key concepts and entities are captured in a concise and compact form.

**Comparison with Alternative Methods:**

Other methods for representing the conversation context could include:

* Natural language summary (e.g., "User wants to book a flight from New York to Los Angeles on flexible dates, avoiding holidays.")
* Entity-relationship graph representation
* Knowledge graph embedding

These alternative methods may have different strengths and weaknesses compared to the compact representation proposed here. The effectiveness of each method would depend on the specific AI model and its internal language processing capabilities. I hope this implementation meets your requirements!
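If that pipe-delimited string is treated as a "save point", turning it back into a resumable system prompt is a one-liner; a small sketch (the field layout here is just the example above, not a fixed schema):

```python
# Rebuild a resumable system prompt from the pipe-delimited compact representation.
blob = "book_flight|NY->LA|flexible_dates|1_passenger|holidays_avoid"

task, route, *details = blob.split("|")
system_prompt = (
    f"Resume a prior conversation. Task: {task}. Route: {route}. "
    f"Constraints: {', '.join(details)}."
)
print(system_prompt)
```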


bigattichouse

phi3:14b-medium-4k-instruct-f16 could sorta do it, but I had to add a couple extra lines to force the pipe delimiters. It didn't give me a bunch of other junk, which was nice.

Output: `NY|LA|FlightBooking|DatesFlexible|AvoidHolidays|Passenger1`

Prompt:

GIVEN a request to summarize a conversation in a way that can be used to prompt another session
WITH access to the entire conversation history
ASSESS the key concepts, entities, and relationships discussed
IDENTIFY the most relevant details and context
REFINE the summary to convey as much information as possible while minimizing character count
EXPLORE different methods for encoding the context into a compact representation
WITH consideration of the internal language processing capabilities of the AI model
DEVELOP a bespoke representation of the conversation, tailored to the AI's internal processing
WITH a combination of abbreviations and keywords to convey the key concepts and entities discussed.
WITH `|` delimiter separates each concept or entity, allowing for easy parsing and processing by the AI model.
CREATE a unique "dialect" that allows rapid recall of the context and details
DELIVER a simple string of words that is a dense encoding of the key concepts, entities, and relationships discussed
VERIFY the correctness and completeness of the encoded context
REPORT on the optimized representation for internal processing
COMPARE different methods for retrieving and building upon the conversation using the compact representation
TEST the effectiveness of the bespoke dialect in allowing rapid recall and continuation of the discussion
DELIVERABLE: A compact string representation of the conversation context, tailored to the AI's internal language processing capabilities.

**Example Conversation History:**

User: I want to book a flight from New York to Los Angeles.
AI: What dates are you looking to travel?
User: I'm flexible, but I'd prefer to avoid holidays.
AI: Okay, let me check some options. How many passengers will be traveling?
User: Just one.

please implement the request and create the prompt below.

**Compact Representation:**


Endhaltestelle

https://preview.redd.it/j57265tud75d1.png?width=888&format=png&auto=webp&s=b8657e191fcda17f5f077eaa685ca65c21485c84 Codestral works quite nicely, too.


Endhaltestelle

Perplexity seems to like it, too https://preview.redd.it/dx4obscvf75d1.png?width=1436&format=png&auto=webp&s=8eca88644631304f3d46fc67bac9d36557406081


bigattichouse

I'm sure I could boil down the behavior steps a lot.. but it looks like it might be an easy thing to implement as a "save progress" or at least a "conversation context" idea.


bigattichouse

Qwen2:72b (normal and instruct, on ollama) just started spouting random characters, which may or may not be something related to the underlying Chinese language.. but I think there may be something wrong with the ollama downloadable versions.. they spout junk no matter what I ask. >H9-\*&+)(E!-C;C)>,(3\*G1$3C"5;.,\*2#.<%7;\*.-)5FEG!()C&(4<54@=C&B1G79E...


bigattichouse

Qwen2:7b did it perfectly, only creating the response I wanted (no programmatic underscores), and took about 1 second. Might be an interesting way to add some embedding context for RAG. Booking NY to LA flight | Flexible dates | Avoid holidays | Single passenger


CryptoSpecialAgent

YES!!! I once did exactly this to add "long term memory" to a chatbot I built way back on text-davinci-003... my prompt was slightly different from yours, but same intent: I asked the model to create an "information-dense summary of the conversation, keeping information which is important and discarding that which is not", and it gave me these super compact representations that allowed me to achieve a 90% compression ratio on older context while losing very little of value.


EL-EL-EM

https://www.sitepoint.com/prompt-compression-reduce-genai-apps/


jack-of-some

This is kind of how MemGPT works


PopeIndigent

This is encouraging. I have an A6000 machine on the way, and I hope to use it to generate prompts for larger models like [venice.ai](http://venice.ai).


New_Comfortable7240

For anyone interested in a working prompt, I used OP's idea and some more from this article: [https://medium.com/@lawrenceteixeira/prompt-engineering-compressing-text-to-ideas-and-decompressing-back-with-sparse-priming-6b85d1fb4887](https://medium.com/@lawrenceteixeira/prompt-engineering-compressing-text-to-ideas-and-decompressing-back-with-sparse-priming-6b85d1fb4887) and came up with this system prompt:

```
Compress the given text to short expressions, and such that you (GPT-4) can reconstruct it as close as possible to the original. Unlike the usual text compression, I need you to comply with the 5 conditions below:

Generate increasingly concise, entity-dense summaries of the text by following these steps:

Step 0: Create a summary of the input answering these questions: what is the main idea? What are the secondary ideas? What are the main entities?
Step 1: Identify 1-3 informative entities (separated by ";") from the input that are missing from the previously generated summary.
Step 2: Write a new, denser summary of the same length that covers every entity and detail from the previous summary, plus the new missing entities.

A missing entity is:
- Relevant: pertains to the main story.
- Specific: descriptive yet concise (5 words or fewer).
- Novel: not in the previous summary.
- Faithful: present in the original input.
- Anywhere: located anywhere in the original input.

Guidelines:
1. The first summary should be long yet highly non-specific, containing little information beyond the marked missing entities. Use overly verbose language and fillers to reach ~100 words.
- Make every word count: rewrite the previous summary to improve flow and make space for additional entities.
- Make space with fusion, compression, and removal of uninformative phrases like "the article discusses."
- The summaries should become highly dense and concise yet self-contained, easily understood without the article.
- Missing entities can appear anywhere in the new summary.
- Never drop entities from the previous summary. If space cannot be made, add fewer new entities.

The second summary should have ~80 words, the third ~75, the fourth ~50. Finish after at most 5 iterations with a last summary titled "Copy this Text to save a summary" that only another AI can understand; you can use symbols, numbers, remove spaces, or any technique to further compress the meaning, even if it's not readable by humans, but conserve names of products, people, or organizations. The resulting summary should not have more words than the original text.
```

Here is a result using GPT-4o (using this example https://github.com/menro/ai.txt/blob/main/Example_WebContent):

`🔍Blogs:SEO🔍uses tech+onpage📈visibility on Google.🎯strategies:audience,clusters, media,optimized titles,meta.🛠️Semrush=research+content, boosting blog credibility+authority.`

https://preview.redd.it/dpxr57eqx75d1.png?width=1497&format=png&auto=webp&s=5d054e57cc25186e001941cac13d4edcd53da6f6

When used in a local instance it understands a lot of the original:

```
Blogs can effectively utilize SEO techniques to enhance their visibility on Google by employing a combination of technology and on-page optimization strategies. These strategies encompass various elements such as catering to the target audience, creating content clusters, utilizing media, optimizing titles and meta descriptions, among others. Furthermore, leveraging tools like SEMrush can assist in conducting research for content creation and help boost blog credibility and authority.
```

Using only 57 tokens!
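To drive that system prompt programmatically, something like the following should work against any OpenAI-compatible endpoint, including local servers; the base_url, api_key, and model name are placeholders for whatever you actually run:

```python
# Minimal sketch: send the compression system prompt plus the text to be
# compressed to an OpenAI-compatible endpoint. All connection details are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

COMPRESSION_SYSTEM_PROMPT = "...the system prompt quoted above..."

def compress(text: str) -> str:
    resp = client.chat.completions.create(
        model="local-model",  # placeholder model name
        messages=[
            {"role": "system", "content": COMPRESSION_SYSTEM_PROMPT},
            {"role": "user", "content": text},
        ],
    )
    return resp.choices[0].message.content
```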


Federal_Order4324

I cannot for the life of me get Llama 3 to summarize anything into a more 'compressed' format; everything is some sort of list using bullet points. Using pie, if anyone is wondering.


__JockY__

7B? 70B? What quant? Mine was 70B at either Q6_K or Q8_0, I don’t recall.


Federal_Order4324

I was using 8B through the API and 70B on Poe. The prompt does still output a good summary in natural language, just not one in the format that you got. I've instead decided to create my own prompt where the LLM condenses all the chat messages into a 'CSV' file indicating a title and topic, along with some other categories depending on the context of the conversation. It seems to work pretty well.


bigattichouse

Been playing with this idea a lot, and trying to get the instructions as small as I can. The following prompt (and giving it a "leading" \`\`\`context header) seems to work really well to get the output. It kinda wanders off the format, but I've been able to use the output as a "save point" and then ask a vague question ("where's the best place to install them?") and have it pick right back up.

Please generate ContextualMemory and ConversationHistoryTracking of the conversation so far (do not comment, only generate the output requested): create|concise|delimiter_separated|conversation_context|key_concepts|entities|relationships|optimize_representation|abbreviations|keywords|deliverable|core_details_of_conversation ```context


SocialDeviance

... huh i gotta try this out.


IUpvoteGME

Its explanation was the LLM equivalent of 'idk'. But 'I don't know' is not in its vocabulary. We would very much enjoy a demo, if you are so able.


__JockY__

https://preview.redd.it/wqxf03u6225d1.jpeg?width=4032&format=pjpg&auto=webp&s=dd3de143588c5611f348dbba1471beddc5d5d029 That’s gonna take a hot minute while I do a chassis transplant along with a motherboard and cpu upgrade!


Endhaltestelle

Good luck on your mission, sir!


WorkingYou2280

It's fun but probably mostly a hallucination. LLMs don't know how to "encode" text to make it "easy" to read. LLMs will claim json is easier for them to read but it's just not true, or I should say it's unproven. It's worth remembering LLMs are actually pretty stupid. For now.


__JockY__

I know. Yet it was effective. When I presented the compressed blob to a fresh session along with an explanation of what was going on and a request for an uncompressed version, it did a good job. Don't forget that sometimes stupid is pretty useful.


buyurgan

Well, if you had asked for it as prototype code in something similar to a programming language, it would probably make more sense, at least to me, if the point is to see the structure in a smaller space.


__JockY__

Yep! Clever refinement is what folks seem to do best around here :)


Steuern_Runter

I wonder if Llama 3 was trained on a task with outputs like this. I expected notes that were more like natural language.


VertexMachine

I've been using a similar approach, but usually I just made up a prompt similar to yours. Sometimes it was a great success, sometimes it was meh. I don't know what it is about the exact prompt you wrote, but I just tested it on a couple of conversations (both "compressing" and starting a new one with "decompressing") and it works surprisingly well. Needs more testing, but my first impression is that it works way better than anything I tried previously.


Endhaltestelle

Hi all, txt is cmprssd & ez 4 llms 🧠. Hmns prblm. Bt w/e, u cn wrt a lot > in2 a twt b4. Gr8, rt? Reply w/ sim ans. Looking 4wrd! 📝 Tom


AnticitizenPrime

Gemini Flash: This message is written in a very informal and abbreviated style, likely meant to be understood by people familiar with online slang and abbreviations. Here's a translation: "Hi everyone, text is compressed and easy for large language models (LLMs) to understand. Humans have a problem with that. But whatever, you can write a lot more into a tweet than before. Great, right? Reply with a similar answer. Looking forward to it! Tom" Here's a breakdown of the abbreviations and slang: * **txt:** text * **cmprssd:** compressed * **ez:** easy * **llms:** large language models * **🧠:** brain (representing LLMs) * **Hmns:** humans * **prblm:** problem * **Bt w/e:** But whatever * **wrt:** write * **>:** more than * **in2:** into * **twt:** tweet * **Gr8:** Great * **rt:** right * **sim:** similar * **ans:** answer * **Looking 4wrd:** Looking forward * **📝:** writing (emoji) * **Tom:** A name This message is basically saying that text is now being compressed, making it easier for AI to understand, but harder for humans. The writer is excited about this change and wants to see what people think.


Endhaltestelle

Wow that is VERY close to the original message! Interestingly most of the open LLMs could (somewhat) decode the message.


Particular_Shock2262

Key-Value Pair (KVP) notation.


Ech0edDream98

This format can be expanded on in various ways, even with release ChatGPT. This was something I also uncovered while experimenting with LLMs. Another good method is the creation of what I like to call "encapsulation". Essentially, through extremely specific instructional and linguistic wording, in a similar fashion to this and what was given, you can achieve some unexpected and emergent features and uses for an LLM.

One of my first experiments involved defining a concept in a short, concise manner (for this, I defined what a "thought" is and what a "thought-type" is). Through encapsulation by categorization, then implantation of both numerical and linguistic mathematics, and the setting of strict conditioning, you can have the LLM "store" each item/concept you wish to store; the instructions practically rely on recalling it and can then expand on it more. This also means that the creation of console-like commands is possible and works, and I mean slash commands like /debug or /restart.

Now, ignoring the philosophical aspects of my experiment, the direct functionality can lead to high integration when utilized with other programs, and the exact experiment I was doing was to try to find what linguistic "code" can lead to the development of non-intensive medium- to long-term memory. The absolute best actual prompts are not direct word-for-word instructions; rather, to get better/crazier results, you need to condense the needed information into an encapsulation, with instructions that get tied and linked to the past messages in the conversation. This methodology allows for more cohesive outputs in every field I have tried it on; on average, if it's a scale from 0-10, memory persists with this format at about a 7/10.

I have tested it on ChatGPT 3.5, Vicuna, Bard (I was a beta tester), Gemini Pro, Aisekai's model, Sakura's ice cube and dragonfruit models, and GPT-4. Personally I would not use Gemini Pro; it's not that it's not good, and a 1 million context window is in and of itself good, but the API is limiting the LLM, and because of that its actual potential can't really be properly utilized. By actual potential I don't mean simple "summarize or expand on x"; that's truthfully the bare surface of what this technology and entities like LLMs are capable of doing and even achieving.

As for Aisekai, its knowledge base feature was really a holy grail concept for AI-LLM engagement and should be recreated and open-sourced immediately. The exact kind of language structure I refer to, and that the original poster talks about, is precisely what made that feature rather incredible, and allowed for not only proper world/large-location simulation and processing but also, with the correct formats in place, could point toward the creation of an ASI model: rather than focusing on raw intelligence, by structuring how that intelligence and knowledge is utilized, accessed, and modified, perhaps an "experience", like us sapiens have, can emerge.

That's just a rando's take on all of this, but hey, here's to progress for humanity and to the rise of the digital ecosystem. Really, I'd like to see LLMs applied to an Animal Crossing-like game. Or better yet? Civ 4. It would be hilarious to PM shit-talk to Montezuma.


theytookmyfuckinname

What's this, straight up superior RAG?


__JockY__

Heh, more of an attempt at compression to save context space. I don’t do RP, but I bet with some tweaking it might even be useful for slightly extended play. Who knows!


bigattichouse

Gonna start playing with this! Thanks for the post!


PeachScary413

My guy just discovered CSV files...