@Sounds like an add-on/fix?
Yup, it’s an engineered layer on top of the core language model that remembers your prompts. The LLM itself is a transformer, so its only built-in “memory” is the self-attention mechanism over a context window - not any persistent store. That’s why models used to forget older information once it dropped out of the attention window; later tweaks (longer contexts, better preservation of the most relevant information) improved this. As a fun fact, it seems closer to a “raisins in the cake” model of brain memory than to, e.g., an LSTM - no persistent memory cells, just selectively attended and propagated information, kind of emergent.
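To make the “dropping out of the attention window” idea concrete, here’s a toy sketch (plain NumPy; the function name and setup are mine, not any real library API) of self-attention restricted to a sliding context window:

```python
import numpy as np

def attention_over_window(tokens, window):
    """Toy scaled dot-product self-attention over a sliding context window.

    `tokens` is a (seq_len, dim) array of embeddings. Anything older than
    `window` positions is simply invisible to the current position - the
    "forgetting" described above.
    """
    seq_len, dim = tokens.shape
    out = np.zeros_like(tokens, dtype=float)
    for i in range(seq_len):
        start = max(0, i - window + 1)        # older tokens fall out here
        keys = tokens[start:i + 1]            # only the window is attended
        scores = keys @ tokens[i] / np.sqrt(dim)
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()              # softmax over visible tokens
        out[i] = weights @ keys               # weighted mix of visible tokens
    return out
```

Note that a token outside the window contributes exactly nothing - there is no decaying trace, unlike an LSTM cell state; it’s just gone.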
I hope this makes sense - just climbed a big mountain, pleasantly weary.